acos(double) gives different results on x64 and x32 Visual Studio

Problem description

acos(double) gives different results on x64 and x32 Visual Studio.

printf("%.30g\n", double(acosl(0.49990774364240564)));
printf("%.30g\n", acos(0.49990774364240564));

on x64: 1.0473040763868076
on x32: 1.0473040763868078

on Linux 4.4, x32 and x64, with SSE enabled: 1.0473040763868078

Is there a way to make the VS x64 acos() give me 1.0473040763868078 as the result?

Recommended answer

TL;DR: this is normal, and there is no reasonable way for you to change it.

The 32-bit library may be using 80-bit FP values in x87 registers for its temporaries, avoiding rounding off to 64-bit double after every operation. (Unless there's a whole separate library, compiling your own code to use SSE doesn't change what's inside the library, or even the calling convention for passing data to the library. But since 32-bit passes double and float in memory on the stack, a library is free to load it with SSE2 or with x87. Still, you don't get the performance advantage of passing FP values in xmm registers unless it's impossible for non-SSE code to use the library.)
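One quick thing you can check from your own code is whether long double is any wider than double on a given toolchain. This is only a sketch, and the function names are made up for illustration. As far as I know, GCC on x86 Linux gives long double the x87 80-bit format with a 64-bit mantissa, while MSVC makes long double the same format as double, so acosl there can't hand you extra precision. It says nothing about what the library uses internally for its temporaries.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>

// Number of mantissa (significand) bits in double; 53 for IEEE754 binary64.
int double_mantissa_bits() { return std::numeric_limits<double>::digits; }

// Number of mantissa bits in long double; this depends on the toolchain.
int long_double_mantissa_bits() { return std::numeric_limits<long double>::digits; }

// Hypothetical helper: print both so the two builds can be compared.
void show_precision() {
    // long double is never narrower than double.
    assert(long_double_mantissa_bits() >= double_mantissa_bits());
    std::printf("double:      %d mantissa bits\n", double_mantissa_bits());
    std::printf("long double: %d mantissa bits\n", long_double_mantissa_bits());
}
```

Calling show_precision() from main() in each build tells you whether the double(acosl(...)) line in the question is really computing anything in a wider type.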

It's also possible that they're different simply because they use a different order of operations, producing different temporaries along the way. That's less plausible, unless they're separately hand-written in asm. If they're built from the same C source (without "unsafe" FP optimizations), then the compiler isn't allowed to reorder things, because of this non-associative behaviour of FP math.
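To make the non-associativity concrete, here is a minimal sketch (function names are mine, and the inputs are picked to make the effect obvious, not taken from any acos() implementation). It assumes the code is built without "unsafe" FP optimizations such as fast-math.

```cpp
#include <cassert>
#include <cstdio>

// Sum three doubles grouped as (a + b) + c.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// Sum the same three doubles grouped as a + (b + c).
double sum_right(double a, double b, double c) { return a + (b + c); }

// Print both groupings so the difference is visible.
void show_non_associativity() {
    const double a = 1e30, b = -1e30, c = 1.0;
    // b + c rounds back to -1e30 because 1.0 is far below one ulp of 1e30,
    // so the two groupings disagree.
    assert(sum_left(a, b, c) != sum_right(a, b, c));
    std::printf("(a + b) + c = %.17g\n", sum_left(a, b, c));
    std::printf("a + (b + c) = %.17g\n", sum_right(a, b, c));
}
```

Real library code differs by far less than this, of course, but the mechanism is the same: a different order of operations rounds different temporaries.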

glibc's libm (used on Linux) typically favours precision over speed, so it's giving you the correctly-rounded result out to the last bit of the mantissa in both the 32-bit and 64-bit builds. The IEEE FP standard only requires the basic operations (+ - * / FMA and FP remainder) to be "correctly rounded" out to the last bit of the mantissa (i.e. a rounding error of at most 0.5 ulp). The exact result, according to calc, is 1.047304076386807714.... Keep in mind that double (on x86 with normal compilers) is IEEE754 binary64, so internally the mantissa and exponent are in base 2. If you print enough extra decimal digits, though, you can tell that ...7714 should round up to ...78, although really you should print more digits in case they're not zero beyond that. I'm just assuming it's ...78000.
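If you want to see how coarse the last bit is at this magnitude, something like the following sketch works (the helper names are hypothetical; std::nextafter and DBL_EPSILON are standard). For any double in [1, 2) one ulp is 2^-52, about 2.2e-16, which is roughly the size of the gap between the two results quoted in the question.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdio>

// Distance from x to the next representable double above it (one ulp at x).
double ulp_above(double x) {
    assert(std::isfinite(x));
    return std::nextafter(x, INFINITY) - x;
}

// Print the value quoted in the question, its lower neighbour, and the ulp.
void show_ulp_near_result() {
    const double reported = 1.0473040763868078;  // literal copied from the question
    std::printf("value      = %.30g\n", reported);
    std::printf("next below = %.30g\n", std::nextafter(reported, 0.0));
    std::printf("one ulp    = %.30g\n", ulp_above(reported));
}
```

Printing with %.30g like this shows the full decimal expansion of each binary64 value, which is what you'd want before deciding which neighbour is closer to the exact result.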

So Microsoft's 64-bit library implementation produces 1.0473040763868076, and there's pretty much nothing you can do about it other than not use it (e.g. find your own acos() implementation and use that). But FP determinism is hard, even if you limit yourself to just x86 with SSE. See "Does any floating point-intensive code produce bit-exact results in any x86-based architecture?". If you limit yourself to a single compiler, determinism can be possible as long as you avoid complicated library functions like acos().

You might be able to get the 32-bit library version to produce the same value as the 64-bit version, if it uses x87 and changing the x87 precision setting affects it. But the other way around is not possible: SSE2 has separate instructions for 64-bit double and 32-bit float, and always rounds after every instruction, so there is no setting you can change that will increase the precision of the result. (You could change the SSE rounding mode, and that will change the result, but not in a good way!)
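For what "change the rounding mode" looks like in practice, here is a rough sketch using the standard <cfenv> interface. The function name is mine, and it only demonstrates a single division; I'm assuming the FE_UPWARD and FE_DOWNWARD macros are available, which they are on x86 toolchains I know of. Whether a library function like acos() behaves sensibly under a non-default mode is a separate question.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdio>

// Divide n by d under the given rounding mode, then restore the previous mode.
// The volatile variables keep the compiler from folding the division at
// compile time or moving it across the fesetround() calls.
double divide_with_rounding(double n, double d, int mode) {
    const int saved = std::fegetround();
    const int rc = std::fesetround(mode);
    assert(rc == 0);  // a non-zero return means the mode was not accepted
    (void)rc;
    volatile double vn = n, vd = d;
    volatile double q = vn / vd;
    std::fesetround(saved);
    return q;
}

// Print 1/3 under three rounding modes.
void show_rounding_modes() {
    std::printf("to nearest: %.17g\n", divide_with_rounding(1.0, 3.0, FE_TONEAREST));
    std::printf("upward:     %.17g\n", divide_with_rounding(1.0, 3.0, FE_UPWARD));
    std::printf("downward:   %.17g\n", divide_with_rounding(1.0, 3.0, FE_DOWNWARD));
}
```

The upward and downward results differ in the last bit, which is the point: a directed rounding mode moves results around, it doesn't make them more accurate.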

See also:

  • Intermediate Floating-Point Precision, and the rest of Bruce Dawson's excellent series of articles about floating point (see its table of contents).

The linked article describes how some versions of VC++'s CRT runtime startup set the x87 FP register precision to 53-bit mantissa instead of 80-bit full precision. Also that D3D9 will set it to 24, so even double only has the precision of float if done with x87.
