Hints and allegations abound that arithmetic with NaNs can be 'slow' in hardware FPUs. Specifically in the modern x64 FPU, e.g. on a Nehalem i7, is that still true? Do FPU multiplies get churned out at the same speed regardless of the values of the operands?


I have some interpolation code that can wander off the edge of our defined data, and I'm trying to determine whether it's faster to check for NaNs (or some other sentinel value) here, there, and everywhere, or just at convenient points.
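To make the "check only at convenient points" option concrete, here is a minimal C sketch. It assumes a simple 1-D lookup table and relies on NaN propagating through ordinary arithmetic; the names `lerp_table` and `sum_samples` are hypothetical and are not taken from the actual interpolation code (which runs on the CLR).

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical 1-D linear interpolation over a table of n >= 1 samples.
 * Returns NaN when x falls outside [0, n-1] (or is itself NaN). */
static double lerp_table(const double *table, size_t n, double x)
{
    if (!(x >= 0.0) || x > (double)(n - 1))   /* the negated test also catches NaN */
        return NAN;
    size_t i = (size_t)x;
    if (i == n - 1)                           /* exactly on the last sample */
        return table[i];
    double t = x - (double)i;
    return table[i] * (1.0 - t) + table[i + 1] * t;
}

/* Accumulates several interpolated samples and checks for NaN once,
 * at the end, instead of after every lookup.
 * Returns 1 and stores the sum in *out on success, 0 if any sample was NaN. */
static int sum_samples(const double *table, size_t n,
                       const double *xs, size_t count, double *out)
{
    double acc = 0.0;
    for (size_t k = 0; k < count; k++)
        acc += lerp_table(table, n, xs[k]);   /* a NaN here poisons acc */
    if (isnan(acc))
        return 0;
    *out = acc;
    return 1;
}
```

With this shape, a single `isnan` test per batch replaces one test per lookup; whether that is actually faster is what the benchmark would have to show.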


Yes, I will benchmark my particular case (it could be dominated by something else entirely, like memory bandwidth), but I was surprised not to see a concise summary somewhere to help with my intuition.


I'll be doing this from the CLR, if it makes a difference as to the flavor of NaNs generated.


For what it's worth, using the SSE instruction mulsd with NaN is pretty much exactly as fast as with the constant 4.0 (chosen by a fair dice roll, guaranteed to be random).


/* doubleValue is either NAN or 4.0 */
for (unsigned i = 0; i < 2000000000; i++) {
    double j = doubleValue * i;
}


generates this machine code (inside the loop) with clang (I assume the .NET virtual machine uses SSE instructions when it can too):

movsd     -16(%rbp), %xmm0    ; gets the constant (NaN or 4.0) into xmm0
movl      -20(%rbp), %eax     ; puts i into a register
cvtsi2sdq %rax, %xmm1         ; converts i to a double and puts it in xmm1
mulsd     %xmm0, %xmm1        ; multiplies xmm0 (the constant) with xmm1 (i)
movsd     %xmm1, -32(%rbp)    ; puts the result somewhere on the stack

And with two billion iterations, the NaN version (using the C macro NAN from <math.h>) took about 0.017 seconds less to execute on my i7. The difference was probably caused by the task scheduler.


So to be fair, they're exactly as fast.
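For anyone who wants to repeat the measurement, a timing harness along these lines should do. This is only a sketch, not the exact program used above: the function name `time_multiplies` is made up for illustration, and it measures CPU time with `clock()` from <time.h>. The `volatile` store is there so an optimizing compiler cannot drop the multiply.

```c
#include <math.h>
#include <time.h>

/* Hypothetical helper: multiplies `value` by each loop index and
 * returns the elapsed CPU time in seconds. */
static double time_multiplies(double value, unsigned iterations)
{
    volatile double sink = 0.0;   /* volatile store keeps the multiply alive */
    clock_t start = clock();
    for (unsigned i = 0; i < iterations; i++)
        sink = value * i;         /* i is converted to double, as in the loop above */
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_multiplies(NAN, 2000000000u)` and `time_multiplies(4.0, 2000000000u)` a few times each and comparing the results should give a rough idea of whether the operand value matters on a given machine.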

