Problem description
I'd like to temporarily enable FTZ/DAZ modes to get a performance gain for some code where strict compliance with the IEEE 754 standard is not an issue, without changing the behaviour of other threads, which could be executing code where that compliance is important.
I've been reading this on how to enable/disable these modes and this on the performance impact of denormal handling, but unfortunately I've got mixed code in a multithreaded environment and I cannot enable these modes once and for all.
My understanding is that since the MXCSR register's flags determine the behaviour of the hardware, and since every thread has its own register context, setting these flags will only affect the behaviour of the current thread.
Is that correct?
Yes, MXCSR is part of the per-thread architectural state saved/restored on context switches, along with the xmm/ymm/zmm and x87 stack registers (using xsave/xrstor). Different threads have their own FPU state.
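A minimal sketch of checking this, assuming an x86-64 target, POSIX threads, and the SSE intrinsics headers. The helper names flush_worker and worker_flags_stay_private are made up for illustration. A worker thread turns FTZ and DAZ on for itself, and the calling thread then confirms that its own MXCSR bits are still clear:

```c
#include <pthread.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_FLUSH_ZERO_ON */
#include <pmmintrin.h>  /* _MM_DENORMALS_ZERO_ON */

#define FLUSH_BITS ((unsigned int)(_MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON))

/* Hypothetical worker: enables FTZ and DAZ in its own MXCSR and
 * reports which of the two bits it then observes. */
void *flush_worker(void *out) {
    _mm_setcsr(_mm_getcsr() | FLUSH_BITS);
    *(unsigned int *)out = _mm_getcsr() & FLUSH_BITS;
    return NULL;
}

/* Hypothetical check: returns 1 if the worker saw both bits set while
 * the caller's bits stayed clear, 0 otherwise, -1 if no thread could
 * be created. */
int worker_flags_stay_private(void) {
    const unsigned int saved = _mm_getcsr();
    unsigned int worker_bits = 0;
    unsigned int caller_bits;
    pthread_t thread;

    /* Start from a known state: FTZ and DAZ off in the calling thread. */
    _mm_setcsr(saved & ~FLUSH_BITS);

    if (pthread_create(&thread, NULL, flush_worker, &worker_bits) != 0) {
        _mm_setcsr(saved);
        return -1;
    }
    pthread_join(thread, NULL);

    /* The worker changed only its own copy of MXCSR. */
    caller_bits = _mm_getcsr() & FLUSH_BITS;

    _mm_setcsr(saved);  /* put the caller's original modes back */
    return worker_bits == FLUSH_BITS && caller_bits == 0;
}
```

The same save, set, restore pattern around _mm_getcsr and _mm_setcsr is one way to enable the modes temporarily inside a single thread. Build with -pthread where the toolchain needs it.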
Interesting idea. I'd always figured DAZ was only useful if you had denormal constants or something (or data from a file), but having other threads running without FTZ is another source of denormals.
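To make the FTZ/DAZ distinction concrete, here is a small sketch, assuming x86-64 with SSE scalar math (not x87) and the default masked exceptions. The type flush_demo and the function run_flush_demo are hypothetical names. FTZ acts on denormal results, DAZ acts on denormal inputs, so a denormal produced elsewhere (another thread, a file) still costs you unless DAZ is on:

```c
#include <float.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_FLUSH_ZERO_ON */
#include <pmmintrin.h>  /* _MM_DENORMALS_ZERO_ON */

/* Hypothetical result record for the four experiments below. */
typedef struct {
    float ftz_off;  /* tiny result with FTZ off           */
    float ftz_on;   /* same operation with FTZ on         */
    float daz_off;  /* denormal input used with DAZ off   */
    float daz_on;   /* same operation with DAZ on         */
} flush_demo;

flush_demo run_flush_demo(void) {
    const unsigned int saved = _mm_getcsr();
    const unsigned int base =
        saved & ~(unsigned int)(_MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON);
    /* volatile keeps the compiler from folding or reordering the math. */
    volatile float min_normal = FLT_MIN, three = 3.0f, four = 4.0f;
    volatile float denormal_in, a, b, c, d;
    flush_demo r;

    _mm_setcsr(base);                       /* FTZ off, DAZ off */
    denormal_in = min_normal / four;        /* an exact denormal value */
    a = min_normal / three;                 /* denormal result is kept */
    c = denormal_in * four;                 /* scales back to FLT_MIN */

    _mm_setcsr(base | _MM_FLUSH_ZERO_ON);   /* FTZ only */
    b = min_normal / three;                 /* tiny result flushed to 0 */

    _mm_setcsr(base | _MM_DENORMALS_ZERO_ON); /* DAZ only */
    d = denormal_in * four;                 /* input read as 0, so 0 * 4 */

    _mm_setcsr(saved);                      /* restore the caller's modes */

    r.ftz_off = a;
    r.ftz_on = b;
    r.daz_off = c;
    r.daz_on = d;
    return r;
}
```

In the last experiment the mathematically correct result is a normal number, so FTZ alone would not change it. Only DAZ turns it into zero.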
You might also want to compile some files with -ffast-math, or a subset of those options. Note that linking with -ffast-math in gcc will include a CRT function that sets DAZ/FTZ before main(), so don't do that.
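One way to notice that the link step pulled in that startup code is to read MXCSR before anything else has touched it. This is only a sketch: the function name startup_flush_bits and the file names in the comment are hypothetical, and it assumes an x86-64 gcc toolchain.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _MM_FLUSH_ZERO_ON */
#include <pmmintrin.h>  /* _MM_DENORMALS_ZERO_ON */

/*
 * Illustrative build (file names are made up):
 *   gcc -O2 -ffast-math -c hot_loop.c     fast-math for one file only
 *   gcc -O2 -c main.c
 *   gcc hot_loop.o main.o -o app          no -ffast-math at link time
 */

/* Hypothetical helper: returns the FTZ/DAZ bits currently set in the
 * calling thread's MXCSR. Called first thing in main(), a nonzero
 * result suggests something enabled the modes before main() ran. */
unsigned int startup_flush_bits(void) {
    return _mm_getcsr() &
           (unsigned int)(_MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON);
}
```

If the value is nonzero at the top of main(), check whether -ffast-math (or -Ofast) ended up on the link command line.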
The optimizations enabled by fast-math are mostly orthogonal to whether denormals are flushed to zero. Even just -fno-math-errno lets more math functions inline (better / at all), e.g. sqrtf, and is totally safe if you don't care about errno being set as well as getting a NaN result.