A great programming resource, Bit Twiddling Hacks, proposes (here) the following method to compute log2 of a 32-bit integer:
static const char LogTable256[256] =
{
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
};

unsigned int v; // 32-bit word to find the log of
unsigned r;     // r will be lg(v)
register unsigned int t, tt; // temporaries

if (tt = v >> 16)
{
    r = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
}
else
{
    r = (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
}
It notes that the lookup table method takes only about 7 operations to find the log of a 32-bit value and that, if extended to 64-bit quantities, it would take roughly 9 operations, but, alas, it doesn't give any additional info about how one should actually extend the algorithm to 64-bit integers.
Any hints about what a 64-bit algorithm of this kind would look like?
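One straightforward reading of "extended for 64-bit quantities" is to add a single extra step that picks the 32-bit half containing the highest set bit and then reuse the 32-bit lookup unchanged. The sketch below assumes that reading; it is not taken from Bit Twiddling Hacks. The name log2_table64 is made up for illustration, and the table is declared as signed char (rather than plain char) so that the -1 entry survives on platforms where char is unsigned.

```c
#include <stdint.h>

/* LogTable256[i] == floor(log2(i)) for i in 1..255; entry 0 is -1. */
static const signed char LogTable256[256] = {
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
#undef LT
};

/* Hypothetical helper: floor(log2(v)) for v > 0, and -1 for v == 0. */
static int log2_table64(uint64_t v)
{
    uint32_t hi = (uint32_t)(v >> 32);
    uint32_t w = hi ? hi : (uint32_t)v;  /* 32-bit half holding the top set bit */
    int base = hi ? 32 : 0;              /* bit offset of that half */
    uint32_t t, tt;

    /* From here on this is the 32-bit table lookup, applied to w. */
    if ((tt = w >> 16) != 0)
        return base + ((t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt]);
    return base + ((t = w >> 8) ? 8 + LogTable256[t] : LogTable256[w]);
}
```

For example, log2_table64(1ULL << 32) takes the upper half, sees the value 1 there, and returns 32 + 0.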
Intrinsic functions are really fast, but they are still insufficient for a truly cross-platform, compiler-independent implementation of log2. So in case anyone is interested, here is the fastest branch-free, CPU-abstract, DeBruijn-like algorithm I've arrived at while researching the topic on my own.
#include <stdint.h>

const int tab64[64] = {
    63,  0, 58,  1, 59, 47, 53,  2,
    60, 39, 48, 27, 54, 33, 42,  3,
    61, 51, 37, 40, 49, 18, 28, 20,
    55, 30, 34, 11, 43, 14, 22,  4,
    62, 57, 46, 52, 38, 26, 32, 41,
    50, 36, 17, 19, 29, 10, 13, 21,
    56, 45, 25, 31, 35, 16,  9, 12,
    44, 24, 15,  8, 23,  7,  6,  5};

int log2_64 (uint64_t value)
{
    /* copy the highest set bit into every lower position */
    value |= value >> 1;
    value |= value >> 2;
    value |= value >> 4;
    value |= value >> 8;
    value |= value >> 16;
    value |= value >> 32;
    /* keep only the highest bit, then hash it into the table */
    return tab64[((uint64_t)((value - (value >> 1))*0x07EDD5E59A4E28C2)) >> 58];
}
The part that rounds down to the next lower power of 2 was taken from Power-of-2 Boundaries, and the part that gets the number of trailing zeros was taken from BitScan (the (bb & -bb) code there singles out the rightmost bit that is set to 1, which is not needed once we've rounded the value down to a power of 2).
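To make the two borrowed pieces easier to see in isolation, here is a minimal sketch that splits the 64-bit function into its steps. The helper names highest_bit64 and debruijn_slot64 are made up for illustration; the shifts and the constant are the ones used in the code above.

```c
#include <stdint.h>

/* Hypothetical helper: round value down to a power of 2,
   i.e. keep only its most significant set bit (0 stays 0). */
uint64_t highest_bit64(uint64_t value)
{
    /* Smear the top set bit into all lower positions: 0b0010100 -> 0b0011111. */
    value |= value >> 1;
    value |= value >> 2;
    value |= value >> 4;
    value |= value >> 8;
    value |= value >> 16;
    value |= value >> 32;
    /* All ones minus the same pattern shifted right leaves just the top bit. */
    return value - (value >> 1);
}

/* Hypothetical helper: table slot for a value that is already a power of 2.
   Multiplying by 2^k shifts the constant left by k, so the top 6 bits of the
   product are a 6-bit window of the constant that is different for each k. */
unsigned debruijn_slot64(uint64_t power_of_two)
{
    return (unsigned)((power_of_two * UINT64_C(0x07EDD5E59A4E28C2)) >> 58);
}
```

With these, the return statement of log2_64 amounts to tab64[debruijn_slot64(highest_bit64(value))]. Because the input to the multiplication has exactly one bit set, the (bb & -bb) step from BitScan has nothing left to do.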
And the 32-bit implementation, by the way, is
#include <stdint.h>

const int tab32[32] = {
     0,  9,  1, 10, 13, 21,  2, 29,
    11, 14, 16, 18, 22, 25,  3, 30,
     8, 12, 20, 28, 15, 17, 24,  7,
    19, 27, 23,  6, 26,  5,  4, 31};

int log2_32 (uint32_t value)
{
    /* copy the highest set bit into every lower position */
    value |= value >> 1;
    value |= value >> 2;
    value |= value >> 4;
    value |= value >> 8;
    value |= value >> 16;
    /* here the all-ones pattern is hashed directly */
    return tab32[(uint32_t)(value*0x07C4ACDD) >> 27];
}
As with any other method of computing it, log2 requires the input value to be greater than zero.
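Note that a zero input does not fail loudly. Tracing the code above, log2_32(0) returns 0 (the same result as for 1) and log2_64(0) returns 63, so callers that may pass zero need their own check. Below is a minimal sketch of such a guard for the 32-bit version; the wrapper name log2_32_checked and the choice of -1 as the "undefined" result are assumptions for illustration, not part of the original code.

```c
#include <stdint.h>

static const int tab32[32] = {
     0,  9,  1, 10, 13, 21,  2, 29,
    11, 14, 16, 18, 22, 25,  3, 30,
     8, 12, 20, 28, 15, 17, 24,  7,
    19, 27, 23,  6, 26,  5,  4, 31};

/* Same algorithm as above: floor(log2(value)) for value > 0. */
static int log2_32(uint32_t value)
{
    value |= value >> 1;
    value |= value >> 2;
    value |= value >> 4;
    value |= value >> 8;
    value |= value >> 16;
    return tab32[(uint32_t)(value * 0x07C4ACDDU) >> 27];
}

/* Hypothetical wrapper: returns -1 for zero (an assumed convention),
   otherwise defers to log2_32. This costs one branch. */
static int log2_32_checked(uint32_t value)
{
    return value == 0 ? -1 : log2_32(value);
}
```

The guard gives up the branch-free property, so it probably belongs only at call sites where zero can actually occur.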