问题描述
[我在全局范围内对问题进行了编辑,使其更加有用"和清晰]
[I globally edited the question to be more "useful" and clear]
我想知道在cmath中实现函数exp
的复杂性.
I was wondering about the complexity of the implementation of the function exp
in cmath.
就复杂度而言,如果可能的话,我指的是算法复杂度.否则,与浮点运算(例如,加法运算)相比,成本会更高
By complexity, I mean algorithmic complexity if possible. Otherwise cost compared to a floating point operation (addition for example)
以下几行:
double x = 3;
double y = std::exp(x);
编译为:
...
19,23d16
movq %rax, -40(%rbp)
movsd -40(%rbp), %xmm0
call exp
movsd %xmm0, -40(%rbp)
movq -40(%rbp), %rax
...
exp
必须在运行时动态加载,但是我找不到很多有关实现算法复杂性的信息.似乎没有对特殊处理器指令的调用(至少在我的带有gcc的x86_64平台上),因此必须有一个我找不到的实现.在我看来,该算法很可能会使用输入的二进制表示形式来具有非常弱的复杂性,但是我无法找到关于该主题的有价值的参考.
exp
has to be dynamically loaded at runtime but I can't find many information on the implementation algorithmic complexity. It seems there's no call to a special processor instruction (at least on my x86_64 platform with gcc) so there must be an implementation somewhere that I can't find.On my mind, the algorithm is likely to use the binary representation of the input to have a very weak complexity but I haven't been able to find a valuable reference on this topic.
也许在这种情况下实际上不可能实现算法复杂性,而我们所能做的只是测试(请参阅下面的答案),但是我不知道我们如何客观地量化浮点运算和调用之间的差异展开?
Maybe speaking of algorithmical complexity is not really possible in this case and all what we can do is testing (cf. answers below) but I don't know how we can quantify objectively the difference between a floating point operation and a call to exp ?
推荐答案
似乎复杂度实际上是恒定的,因为MSVC9编译器进行了一些位魔术,涉及特定的表,位掩码和偏差.因为毕竟分支很少,所以指令流水线应该会有所帮助.下面是它的实际作用.
It seems that the complexity is actually constant since MSVC9 compiler does some bit-magic involving specific tables, bitmasks and biases. As there are few branches after all instruction pipeline should help a lot. Below is what it does actually.
unpcklpd xmm0,xmm0
movapd xmm1,xmmword ptr [cv]
movapd xmm6,xmmword ptr [Shifter]
movapd xmm2,xmmword ptr [cv+10h]
movapd xmm3,xmmword ptr [cv+20h]
pextrw eax,xmm0,3
and eax,7FFFh
mov edx,408Fh
sub edx,eax
sub eax,3C90h
or edx,eax
cmp edx,80000000h
jae RETURN_ONE
mulpd xmm1,xmm0
addpd xmm1,xmm6
movapd xmm7,xmm1
subpd xmm1,xmm6
mulpd xmm2,xmm1
movapd xmm4,xmmword ptr [cv+30h]
mulpd xmm3,xmm1
movapd xmm5,xmmword ptr [cv+40h]
subpd xmm0,xmm2
movd eax,xmm7
mov ecx,eax
and ecx,3Fh
shl ecx,4
sar eax,6
mov edx,eax
subpd xmm0,xmm3
movapd xmm2,xmmword ptr Tbl_addr[ecx]
mulpd xmm4,xmm0
movapd xmm1,xmm0
mulpd xmm0,xmm0
addpd xmm5,xmm4
mulsd xmm0,xmm0
addsd xmm1,xmm2
unpckhpd xmm2,xmm2
movdqa xmm6,xmmword ptr [mmask]
pand xmm7,xmm6
movdqa xmm6,xmmword ptr [bias]
paddq xmm7,xmm6
psllq xmm7,2Eh
mulpd xmm0,xmm5
addsd xmm1,xmm0
orpd xmm2,xmm7
unpckhpd xmm0,xmm0
addsd xmm0,xmm1
add edx,37Eh
cmp edx,77Ch
ja ADJUST
mulsd xmm0,xmm2
sub esp,10h
addsd xmm0,xmm2
movlpd qword ptr [esp+4],xmm0
fld qword ptr [esp+4]
add esp,10h
ret
这篇关于相较于FLOP,以cmath表示的exp的复杂性/实际成本是多少?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!