
Question

I am planning to implement runtime detection of SIMD extensions. If I find that the processor supports AVX2, is it also guaranteed to support SSE4.2 and AVX?

Recommended answer

Support for a more recent Intel SIMD ISA extension implies support for the previous SIMD extensions.

AVX2 definitely implies AVX1.

I think AVX1 implies that all of the SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2 feature bits must also be set in CPUID. Even if this is not formally guaranteed, a lot of software makes this assumption, and a CPU that violated it would probably not be commercially viable for general use.
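For the runtime-detection case in the question, a cautious approach is to test each feature separately rather than rely on the implication. The following is a minimal sketch, assuming GCC or Clang on x86, where the builtins __builtin_cpu_init and __builtin_cpu_supports are available. The enum and the function name detect_simd_level are made up for illustration and are not part of any library.

```c
#include <stdbool.h>

/* Illustrative dispatch levels; these names are hypothetical. */
enum simd_level { SIMD_BASELINE = 0, SIMD_SSE42, SIMD_AVX, SIMD_AVX2 };

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
static enum simd_level detect_simd_level(void)
{
    /* Populates the compiler runtime's CPU feature data before querying it. */
    __builtin_cpu_init();

    bool sse42 = __builtin_cpu_supports("sse4.2") != 0;
    bool avx   = __builtin_cpu_supports("avx") != 0;
    bool avx2  = __builtin_cpu_supports("avx2") != 0;

    /* Require every lower level explicitly instead of assuming it. */
    if (avx2 && avx && sse42) return SIMD_AVX2;
    if (avx && sse42)         return SIMD_AVX;
    if (sse42)                return SIMD_SSE42;
    return SIMD_BASELINE;
}
#else
/* Non-x86 targets or other compilers: report the baseline only. */
static enum simd_level detect_simd_level(void)
{
    return SIMD_BASELINE;
}
#endif
```

On any real CPU the explicit checks should be redundant, but they cost nothing and also cover unusual virtual machine configurations that mask individual feature bits.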

Note that popcnt has its own feature bit, so in theory you could have a CPU with AVX2 and SSE4.2 but without popcnt. In practice, many things treat SSE4.2 as implying popcnt, so it's more like you can advertise support for popcnt without SSE4.2.
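Since popcnt is reported separately, code that uses it can check its own bit. This is a small sketch, assuming GCC or Clang and their cpuid.h header; the struct and function names are hypothetical. In CPUID leaf 1, ECX bit 20 reports SSE4.2 and ECX bit 23 reports POPCNT.

```c
#include <stdbool.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Hypothetical container for the two bits of interest. */
struct leaf1_features {
    bool sse42;
    bool popcnt;
};

static struct leaf1_features read_leaf1_features(void)
{
    struct leaf1_features f = { false, false };
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse42  = ((ecx >> 20) & 1u) != 0;  /* CPUID.1:ECX bit 20 */
        f.popcnt = ((ecx >> 23) & 1u) != 0;  /* CPUID.1:ECX bit 23 */
    }
#endif
    return f;
}

/* Decide on popcnt from its own bit, not from the SSE4.2 bit. */
static bool can_use_popcnt(void)
{
    return read_leaf1_features().popcnt;
}
```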

In theory you could make a CPU (or virtual machine) with AVX that didn't accept the non-VEX legacy-SSE encoding of SSE4.2 instructions like pcmpistri, but I think you'd be violating Intel's guarantees about what the AVX feature bit implies. I'm not sure whether that's formally written down in a manual, but most software will assume it.

But AVX1 does imply support for the VEX encoding of all SSE4.2 and earlier SIMD instructions, e.g. vpcmpistri or vminss.

gcc -mavx2 definitely implies AVX1 and the previous extensions, but will only emit code that uses the VEX encoding. It still defines the __SSE4_2__ macro and so on, so gcc does treat AVX2 as implying the earlier SSE extensions and popcnt, but not FMA, AES-NI, or PCLMUL. Those are separate features even for GCC.
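The macro behaviour can be inspected from inside a program. The sketch below collects the compile-time feature macros into a bitmask; the enum values and the function name compile_time_features are illustrative assumptions, while __SSE4_2__, __POPCNT__, __AVX__, __AVX2__ and __FMA__ are the macros GCC and Clang define for the corresponding -m options.

```c
/* Hypothetical flag values for reporting which macros were defined. */
enum {
    CT_SSE42  = 1u << 0,
    CT_POPCNT = 1u << 1,
    CT_AVX    = 1u << 2,
    CT_AVX2   = 1u << 3,
    CT_FMA    = 1u << 4
};

/* Returns a mask of the ISA macros the compiler defined for this build. */
static unsigned int compile_time_features(void)
{
    unsigned int mask = 0;
#ifdef __SSE4_2__
    mask |= CT_SSE42;
#endif
#ifdef __POPCNT__
    mask |= CT_POPCNT;
#endif
#ifdef __AVX__
    mask |= CT_AVX;
#endif
#ifdef __AVX2__
    mask |= CT_AVX2;
#endif
#ifdef __FMA__
    mask |= CT_FMA;
#endif
    return mask;
}
```

Building this with gcc -mavx2 alone should set the SSE4.2, popcnt, AVX and AVX2 flags but leave the FMA flag clear, in line with the description above. The mask only reflects what the compiler was told to target, not what the machine running the binary supports.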

(In practice you should use gcc -march=native or gcc -march=znver1 or whatever to enable all the features your CPU has and to set tuning options for it. Using just -mavx2 -mfma leaves tuning settings at bad defaults, like splitting every possibly-unaligned 256-bit load/store into 128-bit halves.)

(Note that MSVC doesn't have as many SIMD ISA detection macros; it has one for AVX but not for all of the earlier SSE* extensions. MSVC's model is designed around the assumption that programs will do runtime CPU detection instead of being compiled for the local machine, although MSVC does now have AVX and AVX2 options to use those as baselines.)

Note that AVX512 kind of breaks the tradition. AVX512F implies support for AVX2 and everything before it, but beyond that, AVX512DQ doesn't come "before" or "after" AVX512ER, for example. You can (in theory) have either, both, or neither. (In practice, Skylake-X/Cannonlake/etc. has only a bit of overlap with Xeon Phi (Knights Landing / Knights Mill) beyond AVX512F.) https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
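Because the AVX512 subsets are independent, each one needs its own check. The following is a rough sketch, assuming GCC or Clang with cpuid.h; the struct and function names are hypothetical. It reads the raw bits from CPUID leaf 7, subleaf 0 (EBX bit 16 for AVX512F, bit 17 for AVX512DQ, bit 27 for AVX512ER). It only reports what the CPU advertises and deliberately leaves out the separate check that the OS has enabled the AVX512 register state, which real detection code also needs.

```c
#include <stdbool.h>
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID7 1
#else
#define HAVE_X86_CPUID7 0
#endif

/* Hypothetical container: one independent flag per AVX512 subset. */
struct avx512_bits {
    bool f;   /* AVX512F:  foundation */
    bool dq;  /* AVX512DQ: doubleword/quadword instructions */
    bool er;  /* AVX512ER: exponential/reciprocal (Xeon Phi) */
};

static struct avx512_bits read_avx512_bits(void)
{
    struct avx512_bits b = { false, false, false };
#if HAVE_X86_CPUID7
    /* Leaf 7 exists only if the maximum basic leaf is at least 7. */
    if (__get_cpuid_max(0, NULL) >= 7) {
        unsigned int eax, ebx, ecx, edx;
        __cpuid_count(7, 0, eax, ebx, ecx, edx);
        b.f  = ((ebx >> 16) & 1u) != 0;
        b.dq = ((ebx >> 17) & 1u) != 0;
        b.er = ((ebx >> 27) & 1u) != 0;
    }
#endif
    return b;
}
```

A dispatcher would test exactly the subsets a given code path uses, for example requiring both f and dq for a kernel that uses AVX512DQ instructions, rather than inferring one subset from another.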

