This article covers how to use the bts assembly instruction with the gcc compiler; it may be a useful reference for anyone facing the same problem.

Problem description

I want to use the bts and bt x86 assembly instructions to speed up bit operations in my C++ code on the Mac. On Windows, the _bittestandset and _bittest intrinsics work well, and provide significant performance gains. On the Mac, the gcc compiler doesn't seem to support those, so I'm trying to do it directly in assembler instead.

Here's my C++ code (note that 'bit' can be >= 32):

typedef unsigned long LongWord;
#define DivLongWord(w) ((unsigned)w >> 5)
#define ModLongWord(w) ((unsigned)w & (32-1))

inline void SetBit(LongWord array[], const int bit)
{
   array[DivLongWord(bit)] |= 1 << ModLongWord(bit);
}

inline bool TestBit(const LongWord array[], const int bit)
{
    return (array[DivLongWord(bit)] & (1 << ModLongWord(bit))) != 0;
}

The following assembler code works, but is not optimal, as the compiler can't optimize register allocation:

inline void SetBit(LongWord* array, const int bit)
{
   __asm {
      mov   eax, bit
      mov   ecx, array
      bts   [ecx], eax
   }
}

Question: How do I get the compiler to fully optimize around the bts instruction? And how do I replace TestBit by a bt instruction?

Solution

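inline void SetBit(LongWord* array, int bit) {
    // With bit >= 32, bts on a memory operand writes a word beyond array[0],
    // hence the "memory" clobber; "cc" because bts updates the carry flag.
    asm volatile("bts %1,%0" : "+m" (*array) : "r" (bit) : "cc", "memory");
}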
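
The question also asks how to express TestBit with bt, which the answer above does not show. One possible sketch in the same GCC extended-asm style follows: bt copies the selected bit into the carry flag, and setc moves that flag into a byte register the compiler is free to choose. The names Word32, SetBitSketch, TestBitSketch and BITOPS_USE_X86_ASM are hypothetical and not from the original post; Word32 is assumed to be exactly 32 bits wide, and a plain C++ fallback is used on non-x86 targets.

```cpp
#include <cstdint>

// Hypothetical names for this sketch. Word32 stands in for the question's
// LongWord and is fixed at 32 bits so the portable fallback matches the asm.
typedef std::uint32_t Word32;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define BITOPS_USE_X86_ASM 1
#else
#define BITOPS_USE_X86_ASM 0
#endif

inline void SetBitSketch(Word32* array, int bit) {
#if BITOPS_USE_X86_ASM
    // Same instruction as in the answer: set bit number `bit` of the bit
    // string starting at *array. The "memory" clobber covers bit >= 32.
    asm volatile("bts %1,%0" : "+m" (*array) : "r" (bit) : "cc", "memory");
#else
    array[(unsigned)bit >> 5] |= Word32(1) << ((unsigned)bit & 31u);
#endif
}

inline bool TestBitSketch(const Word32* array, int bit) {
#if BITOPS_USE_X86_ASM
    unsigned char carry;
    // bt loads the selected bit into CF; setc stores CF into a byte register.
    // "=q" asks for a byte-addressable register, which matters on 32-bit x86.
    asm volatile("bt %2,%1\n\t"
                 "setc %0"
                 : "=q" (carry)
                 : "m" (*array), "r" (bit)
                 : "cc", "memory");
    return carry != 0;
#else
    return (array[(unsigned)bit >> 5] & (Word32(1) << ((unsigned)bit & 31u))) != 0;
#endif
}
```

Because the register constraints leave allocation to the compiler, no fixed eax/ecx moves are needed. The memory-operand forms of bt and bts with a register index are reported to be comparatively slow on many x86 cores, so it is worth benchmarking this against the plain shift-and-mask version before adopting it.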


09-03 06:04