Problem description
I'm working on an x86 or x86_64 machine. I have an array unsigned int a[32], all of whose elements have the value 0 or 1. I want to set the single variable unsigned int b so that ((b >> i) & 1) == a[i] holds for all 32 elements of a. I'm working with GCC on Linux (which shouldn't matter much, I guess).
What's the fastest way to do this in C?
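As a reference point for what any faster version has to compute, the requirement can be written as a plain loop. This is only a baseline sketch, and the function name pack_bits_scalar is made up for illustration:

```c
#include <stdint.h>

/* Baseline sketch: bit i of the result equals a[i],
   assuming every a[i] is either 0 or 1. */
static uint32_t pack_bits_scalar(const unsigned int a[32]) {
    uint32_t b = 0;
    for (int i = 0; i < 32; i++) {
        /* Mask to one bit, then move it into position i. */
        b |= (uint32_t)(a[i] & 1u) << i;
    }
    return b;
}
```

Any SIMD variant can be checked against this loop on the same input.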
Recommended answer
The fastest way on recent x86 processors is probably to use the MOVMSKB family of instructions, which extract the most significant bit of each byte in a SIMD register and pack those bits into a general-purpose integer register.
I fear SIMD intrinsics are not really my thing, but something along these lines ought to work if you've got an AVX2-equipped processor:
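#include <immintrin.h>  /* AVX2 intrinsics; compile with -mavx2 */
#include <stdbool.h>
#include <stdint.h>
uint32_t bitpack(const bool array[32]) {
    __m256i tmp = _mm256_loadu_si256((const __m256i *) array);  /* unaligned load of all 32 bytes */
    tmp = _mm256_cmpgt_epi8(tmp, _mm256_setzero_si256());       /* bytes equal to 1 become 0xFF */
    return (uint32_t) _mm256_movemask_epi8(tmp);                /* one MSB per byte -> 32-bit mask */
}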
This assumes sizeof(bool) == 1. On older SSE2-only systems you will have to string together a pair of 128-bit operations instead. Aligning the array on a 32-byte boundary should save another cycle or so.
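The pair of 128-bit operations mentioned for SSE2 might look roughly like the following. This is a sketch under the same sizeof(bool) == 1 assumption, not a tuned implementation, and the name bitpack_sse2 is hypothetical:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdbool.h>
#include <stdint.h>

/* Sketch: pack 32 one-byte 0/1 values into 32 bits using only SSE2.
   Assumes sizeof(bool) == 1; the function name is illustrative. */
static uint32_t bitpack_sse2(const bool array[32]) {
    const __m128i zero = _mm_setzero_si128();

    /* Two unaligned 16-byte loads cover all 32 input bytes. */
    __m128i lo = _mm_loadu_si128((const __m128i *)array);
    __m128i hi = _mm_loadu_si128((const __m128i *)(array + 16));

    /* Bytes equal to 1 become 0xFF, so each byte's MSB mirrors the input. */
    lo = _mm_cmpgt_epi8(lo, zero);
    hi = _mm_cmpgt_epi8(hi, zero);

    /* Each movemask yields 16 bits; the upper half is shifted into place. */
    uint32_t lo_mask = (uint32_t)_mm_movemask_epi8(lo);
    uint32_t hi_mask = (uint32_t)_mm_movemask_epi8(hi);
    return lo_mask | (hi_mask << 16);
}
```

If the array is known to be 16-byte aligned, _mm_load_si128 could replace the unaligned loads.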