Question

I would like to copy a relatively short block of memory (less than 1 KB, typically 2-200 bytes) in a time-critical function. The best code for this on the CPU side seems to be rep movsd, but I cannot get my compiler to generate it. I hoped (and I vaguely remember seeing this) that memcpy would be expanded through a compiler built-in intrinsic, but disassembly and debugging show the compiler calling the memcpy/memmove library implementation instead. I also hoped the compiler might be smart enough to recognize the following loop and use rep movsd on its own, but it does not.

char *dst;
const char *src;
// ...
for (int r=size; --r>=0; ) *dst++ = *src++;

Is there some way to make the Visual Studio compiler generate a rep movsd sequence other than using inline assembly?

Answer

Using memcpy with a constant size

What I have found in the meantime:

The compiler uses the intrinsic when the size of the copied block is known at compile time. When it is not, it calls the library implementation. When the size is known, the generated code is very good and is selected based on the size: a single mov, a movsd, or a movsd followed by movsb, as needed.

It seems that if I really want to always use movsb or movsd, even with a "dynamic" size, I have to use inline assembly or a special intrinsic (see below). I know the size is "quite short", but the compiler does not, and I cannot communicate this to it. I even tried __assume(size<16), but that is not enough.

Demo code, compile with -Ob1 (inline expansion only for functions marked inline):

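  #include <string.h>  // memcpy, size_t

  // Size is only known at run time: the compiler calls the library memcpy.
  void MemCpyTest(void *tgt, const void *src, size_t size)
  {
    memcpy(tgt,src,size);
  }

  // Size is a compile-time constant: the compiler expands memcpy inline.
  template <int size>
  void MemCpyTestT(void *tgt, const void *src)
  {
    memcpy(tgt,src,size);
  }

  int main ( int argc, char **argv )
  {
    int src = 0;
    int dst;
    MemCpyTest(&dst,&src,sizeof(dst));
    MemCpyTestT<sizeof(dst)>(&dst,&src);
    return 0;
  }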
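
The finding above could be applied to a run-time size along the following lines. This is only a sketch: a switch routes small sizes to constant-size memcpy calls, which the compiler is then free to expand inline. CopyFixed and CopySmall are hypothetical names, the 8-byte cutoff is an arbitrary assumption, and whether this actually beats the library call would have to be measured.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: memcpy with a compile-time constant size, so the
// compiler can expand it inline instead of calling the library.
template <std::size_t N>
inline void CopyFixed(void *dst, const void *src)
{
  std::memcpy(dst, src, N);
}

// Hypothetical dispatcher: constant-size copies for 1-8 bytes (assumed
// cutoff), the general library call for everything else.
inline void CopySmall(void *dst, const void *src, std::size_t size)
{
  switch (size)
  {
    case 0: break;
    case 1: CopyFixed<1>(dst, src); break;
    case 2: CopyFixed<2>(dst, src); break;
    case 3: CopyFixed<3>(dst, src); break;
    case 4: CopyFixed<4>(dst, src); break;
    case 5: CopyFixed<5>(dst, src); break;
    case 6: CopyFixed<6>(dst, src); break;
    case 7: CopyFixed<7>(dst, src); break;
    case 8: CopyFixed<8>(dst, src); break;
    default: std::memcpy(dst, src, size); break;
  }
}
```

The switch itself costs a branch, so this probably only pays off when very small sizes dominate.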

Specialized intrinsics

I recently found that there is a very simple way to make the Visual Studio compiler copy characters using movsd: intrinsics. The following intrinsics may come in handy:

  • __movsb
  • __movsw
  • __movsd
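
A minimal sketch of how these intrinsics might be combined for a run-time size: copy as many 4-byte units as possible with __movsd, then the remaining 0-3 bytes with __movsb. CopyWithMovs and the HAVE_MOVS_INTRINSICS macro are hypothetical names. The intrinsics come from <intrin.h> and only exist on MSVC for x86/x64, so the sketch assumes a plain memcpy fallback on other compilers. The Count argument is in elements (dwords for __movsd, bytes for __movsb).

```cpp
#include <cstddef>
#include <cstring>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_MOVS_INTRINSICS 1  // hypothetical macro, used only in this sketch
#endif

// Hypothetical helper: copies `size` bytes; the regions must not overlap.
inline void CopyWithMovs(void *dst, const void *src, std::size_t size)
{
#ifdef HAVE_MOVS_INTRINSICS
  // Bulk part: size / 4 dwords via rep movsd.
  // const_cast so the call compiles whether or not the header declares
  // the source pointer as const.
  const std::size_t dwords = size / 4;
  __movsd(static_cast<unsigned long *>(dst),
          const_cast<unsigned long *>(static_cast<const unsigned long *>(src)),
          dwords);

  // Tail part: the remaining 0-3 bytes via rep movsb.
  const std::size_t done = dwords * 4;
  __movsb(static_cast<unsigned char *>(dst) + done,
          static_cast<const unsigned char *>(src) + done,
          size - done);
#else
  // Assumed fallback for compilers without these intrinsics.
  std::memcpy(dst, src, size);
#endif
}
```

For the very short copies described in the question, __movsb alone may be enough. Which variant is faster depends on the CPU and is worth measuring.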
