Problem description
I have a function of this form (from "Fastest Implementation of Exponential Function Using SSE"):
__m128 FastExpSse(__m128 x)
{
    static __m128  const a   = _mm_set1_ps(12102203.2f); // (1 << 23) / ln(2)
    static __m128i const b   = _mm_set1_epi32(127 * (1 << 23) - 486411);
    static __m128  const m87 = _mm_set1_ps(-87);
    // fast exponential function, x should be in [-87, 87]
    __m128  mask = _mm_cmpge_ps(x, m87);
    __m128i tmp  = _mm_add_epi32(_mm_cvtps_epi32(_mm_mul_ps(a, x)), b);
    return _mm_and_ps(_mm_castsi128_ps(tmp), mask);
}
I want to make it C compatible. However, when I use a C compiler, it doesn't accept the form static __m128i const b = _mm_set1_epi32(127 * (1 << 23) - 486411);
But I don't want the first 3 values to be recalculated on every function call. One solution is to inline the function (but compilers sometimes decline to do that).
Is there a C-style way to achieve this in case the function isn't inlined?
Thanks.
Recommended answer
Remove static and const.
Also remove them from the C++ version. const is OK, but static is horrible: it introduces guard variables that are checked on every call, plus a very expensive initialization the first time.
__m128 a = _mm_set1_ps(12102203.2f); is not a function call; it's just a way to express a vector constant. No time can be saved by "doing it only once": it normally happens zero times, because the constant vector is prepared in the data segment of the program and simply loaded at runtime, without the junk that static introduces around it.
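As a minimal sketch of that advice, the function could be written as below so that it compiles as both C and C++. The constants and the body are copied from the question; only static is dropped (const is kept, since it is harmless). The helper FastExpLane0 is hypothetical and not part of the original code; it exists only to make a quick scalar sanity check easy.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */

/* Fast approximate exp() on four floats; x should be in [-87, 87].
   The constants are ordinary locals: a compiler normally turns each
   _mm_set1_* with constant arguments into a load from read-only data. */
__m128 FastExpSse(__m128 x)
{
    const __m128  a   = _mm_set1_ps(12102203.2f);  /* (1 << 23) / ln(2) */
    const __m128i b   = _mm_set1_epi32(127 * (1 << 23) - 486411);
    const __m128  m87 = _mm_set1_ps(-87);

    __m128  mask = _mm_cmpge_ps(x, m87);  /* all-ones where x >= -87 */
    __m128i tmp  = _mm_add_epi32(_mm_cvtps_epi32(_mm_mul_ps(a, x)), b);
    return _mm_and_ps(_mm_castsi128_ps(tmp), mask);  /* zero lanes below -87 */
}

/* Hypothetical helper (an assumption for illustration only):
   broadcast one float, run FastExpSse, and return the lowest lane. */
float FastExpLane0(float v)
{
    return _mm_cvtss_f32(FastExpSse(_mm_set1_ps(v)));
}
```

Since this is only an approximation, FastExpLane0(1.0f) should land within a few percent of e rather than match expf exactly, and any input below -87 is masked to zero.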
Check the asm to be sure. Without static, this is what happens (from godbolt):
FastExpSse(float __vector(4)):
        movaps  xmm1, XMMWORD PTR .LC0[rip]
        cmpleps xmm1, xmm0
        mulps   xmm0, XMMWORD PTR .LC1[rip]
        cvtps2dq xmm0, xmm0
        paddd   xmm0, XMMWORD PTR .LC2[rip]
        andps   xmm0, xmm1
        ret
.LC0:
        .long   3266183168
        .long   3266183168
        .long   3266183168
        .long   3266183168
.LC1:
        .long   1262004795
        .long   1262004795
        .long   1262004795
        .long   1262004795
.LC2:
        .long   1064866805
        .long   1064866805
        .long   1064866805
        .long   1064866805