Does ICC satisfy the C99 specification for complex multiplication?

Question

Consider the following simple code:

#include <complex.h>
complex float f(complex float x) {
  return x*x;
}

If you compile it with -O3 -march=core-avx2 -fp-model strict using the Intel Compiler, you get:

f:
        vmovsldup xmm1, xmm0                                    #3.12
        vmovshdup xmm2, xmm0                                    #3.12
        vshufps   xmm3, xmm0, xmm0, 177                         #3.12
        vmulps    xmm4, xmm1, xmm0                              #3.12
        vmulps    xmm5, xmm2, xmm3                              #3.12
        vaddsubps xmm0, xmm4, xmm5                              #3.12
        ret

This is much simpler code than you get from either gcc or clang, and also much simpler than the code you will find online for multiplying complex numbers. It does not, for example, appear to deal explicitly with complex NaNs or infinities.

Recommended answer

The code is non-conforming.

Annex G, Section 5.1, Paragraph 4 reads:

— if one operand is an infinity and the other operand is a nonzero finite number or an infinity, then the result of the * operator is an infinity;

So if z = a + ib is infinite and w = c + id is infinite, the number z * w must be infinite.

The same annex, Section 3, Paragraph 1 defines what it means for a complex number to be infinite: a value with at least one infinite part is regarded as an infinity, even if its other part is a NaN.

So z is infinite if either a or b is.
This is indeed a sensible choice, as it reflects the mathematical framework.
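The classification above can be sketched in a few lines of C. This is only an illustration: the helper names are hypothetical, and the value is passed as two separate parts so that an infinity paired with a NaN can be expressed without relying on how a particular compiler builds complex constants.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: the complex value is given by its parts (re, im).
   One infinite part is enough, even if the other part is a NaN. */
bool is_complex_infinity(float re, float im)
{
    return isinf(re) || isinf(im);
}

/* Hypothetical helper: finite only if both parts are finite
   (neither infinite nor NaN). */
bool is_complex_finite(float re, float im)
{
    return isfinite(re) && isfinite(im);
}
```

Under this definition, NaN + i∞ counts as an infinity, while NaN + iNaN is neither infinite nor finite.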

However, if we let z = ∞ + i∞ (an infinite value) and w = i∞ (also an infinite value), the result for the Intel code is z * w = NaN + iNaN, due to the ∞ · 0 intermediates.

This suffices to label it as non-conforming.
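To make the ∞ · 0 intermediates concrete, here is a minimal sketch of the textbook formula written out by hand. The struct and function names are hypothetical, and the code assumes IEEE 754 arithmetic without fast-math style options; it shows the formula itself, not the Intel-generated code.

```c
#include <math.h>

/* Hypothetical type: a complex number stored as two separate parts. */
struct cparts { float re, im; };

/* Textbook formula (a + ib)(c + id) = (ac - bd) + i(ad + bc),
   with no special handling of infinities or NaNs. */
struct cparts naive_mul(struct cparts z, struct cparts w)
{
    struct cparts r;
    r.re = z.re * w.re - z.im * w.im;
    r.im = z.re * w.im + z.im * w.re;
    return r;
}

/* The operands from the text: z = inf + i*inf and w = 0 + i*inf. */
struct cparts example_product(void)
{
    struct cparts z = { INFINITY, INFINITY };
    struct cparts w = { 0.0f, INFINITY };
    return naive_mul(z, w);
}
```

For these operands the real part is ∞ · 0 − ∞ · ∞ and the imaginary part is ∞ · ∞ + ∞ · 0. Each contains a NaN term, so both parts come out as NaN, although both operands are infinities.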

We can further confirm this by looking at the footnote on the first quote (not reproduced here), which mentions the CX_LIMITED_RANGE pragma directive.

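Section 7.3.4, Paragraph 1 explains that the usual mathematical formulas for complex multiplication, division and absolute value are problematic because of their treatment of infinities and because of undue overflow and underflow, and that the CX_LIMITED_RANGE pragma can be used to tell the implementation that those formulas are acceptable.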

Here the standard committee is trying to alleviate the huge amount of work required for complex multiplication (and division).
In fact, GCC has a flag, -fcx-limited-range, to control this behaviour:

Also, there is no checking whether the result of a complex multiplication or division is NaN + I*NaN, with an attempt to rescue the situation in that case.

The default is -fno-cx-limited-range, but is enabled by -ffast-math.
This option controls the default setting of the ISO C99 CX_LIMITED_RANGE pragma.
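For reference, the pragma itself can be written in source as sketched below. The function name is hypothetical. Compilers differ in whether they honour this pragma, and some ignore it (possibly with a warning), so treat this as an illustration of the syntax rather than a guarantee of the code that will be generated.

```c
#include <complex.h>

/* Hypothetical function: squares x while telling the implementation
   that the usual formulas are acceptable inside this block. */
float complex square_limited(float complex x)
{
    /* Must precede all declarations and statements in the block. */
    #pragma STDC CX_LIMITED_RANGE ON
    return x * x;
}
```

For finite inputs the result is the same either way. The pragma only affects how infinities, NaNs and intermediate overflow or underflow may be treated.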

It is this setting alone that makes GCC generate slow code with additional checks. With -fcx-limited-range enabled, the code GCC generates has the same flaws as Intel's (I translated the source to C++):

f(std::complex<float>):
        movq    QWORD PTR [rsp-8], xmm0
        movss   xmm0, DWORD PTR [rsp-8]
        movss   xmm2, DWORD PTR [rsp-4]
        movaps  xmm1, xmm0
        movaps  xmm3, xmm2
        mulss   xmm1, xmm0
        mulss   xmm3, xmm2
        mulss   xmm0, xmm2
        subss   xmm1, xmm3
        addss   xmm0, xmm0
        movss   DWORD PTR [rsp-16], xmm1
        movss   DWORD PTR [rsp-12], xmm0
        movq    xmm0, QWORD PTR [rsp-16]
        ret

With the default -fno-cx-limited-range, the code is:

f(std::complex<float>):
        sub     rsp, 40
        movq    QWORD PTR [rsp+24], xmm0
        movss   xmm3, DWORD PTR [rsp+28]
        movss   xmm2, DWORD PTR [rsp+24]
        movaps  xmm1, xmm3
        movaps  xmm0, xmm2
        call    __mulsc3
        movq    QWORD PTR [rsp+16], xmm0
        movss   xmm0, DWORD PTR [rsp+16]
        movss   DWORD PTR [rsp+8], xmm0
        movss   xmm0, DWORD PTR [rsp+20]
        movss   DWORD PTR [rsp+12], xmm0
        movq    xmm0, QWORD PTR [rsp+8]
        add     rsp, 40
        ret

The __mulsc3 function is practically the same as the one the C99 standard recommends for complex multiplication.
It includes the above-mentioned checks.
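The kind of check involved can be sketched as follows. This is a float version modelled on the recovery pattern of the example in Annex G of C99, not the actual libgcc source; the struct and helper names are hypothetical. The result is returned as two separate parts to avoid rebuilding a complex value as x + I*y, which can itself reintroduce NaNs.

```c
#include <math.h>

/* Hypothetical type: a complex number stored as two separate parts. */
struct cparts { float re, im; };

/* Replace a NaN by a zero of the same sign; leave other values alone. */
static float nan_to_zero(float v)
{
    return isnan(v) ? copysignf(0.0f, v) : v;
}

/* "Box" a part: infinite becomes +/-1, anything else becomes +/-0. */
static float box(float v)
{
    return copysignf(isinf(v) ? 1.0f : 0.0f, v);
}

struct cparts checked_mul(struct cparts z, struct cparts w)
{
    float a = z.re, b = z.im, c = w.re, d = w.im;
    float ac = a * c, bd = b * d, ad = a * d, bc = b * c;
    struct cparts r = { ac - bd, ad + bc };

    /* Only a NaN + iNaN result needs a second look. */
    if (isnan(r.re) && isnan(r.im)) {
        int recalc = 0;
        if (isinf(a) || isinf(b)) {            /* z is infinite */
            a = box(a); b = box(b);
            c = nan_to_zero(c); d = nan_to_zero(d);
            recalc = 1;
        }
        if (isinf(c) || isinf(d)) {            /* w is infinite */
            c = box(c); d = box(d);
            a = nan_to_zero(a); b = nan_to_zero(b);
            recalc = 1;
        }
        if (!recalc && (isinf(ac) || isinf(bd) || isinf(ad) || isinf(bc))) {
            /* An infinity came from overflow of an intermediate product. */
            a = nan_to_zero(a); b = nan_to_zero(b);
            c = nan_to_zero(c); d = nan_to_zero(d);
            recalc = 1;
        }
        if (recalc) {
            r.re = INFINITY * (a * c - b * d);
            r.im = INFINITY * (a * d + b * c);
        }
    }
    return r;
}
```

With z = ∞ + i∞ and w = i∞, the first pass yields NaN + iNaN, and the recovery step then produces −∞ + i∞, which is an infinity as Annex G requires. The extra branches run only when both parts are NaN, so the common finite case costs one additional test.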

In the mathematical framework mentioned above, the modulus of a number is extended from the real case |z| to the complex one ‖z‖, keeping the definition of infinity as the result of unbounded limits. Simply put, in the complex plane there is a whole circumference of infinite values, and it takes just one infinite "coordinate" to get an infinite modulus.

The situation gets worse if we remember that z = NaN + i∞ and z = ∞ + iNaN are valid infinite values.

