What exactly is the difference between Intel's and AMD's ISAs (if any)?

Question

I know people have asked similar questions before, but there is so much conflicting information that I want to try to clear it up once and for all. I will attempt to do so by clearly distinguishing between the instruction set architecture (ISA) and the actual hardware implementation. First, my attempted clarifications:

1.) Currently there are intel64 and amd64 CPUs out there (among others, but these are the focus).

2.) Given that an ISA is the binary representation of one or more CPU instructions, an ISA is completely separate from its actual hardware implementation.

My question:

Do the differences between intel64 and amd64 CPUs have to do with different or extended x86-64 ISAs? Or with different hardware implementations of the x86-64 ISA? Or both?

Recommended answer

Yes, the ISA is a document / specification, not hardware. Implementing all of it correctly is what makes something an x86 CPU, rather than just something with similarities to x86.

See the x86 tag wiki for links to the official docs (Intel's manuals).

Intel's and AMD's implementations of the x86 ISA differ mainly in performance, and in which instruction-set extensions they support. Software can query what's supported using the CPUID instruction.
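As a rough illustration of querying extensions with CPUID, here is a minimal C sketch. It assumes GCC or Clang (for the `<cpuid.h>` helper `__get_cpuid`); the struct and function names are made up for this example. The bit positions used are the documented ones: SSSE3 is CPUID leaf 1 ECX bit 9, FMA3 is leaf 1 ECX bit 12, and FMA4 is leaf 0x80000001 ECX bit 16.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical container for the few feature bits this sketch looks at. */
typedef struct {
    bool ssse3;
    bool fma3;
    bool fma4;
} simd_features;

/* Pure decoding step: takes the ECX values returned by CPUID leaf 1 and
 * by extended leaf 0x80000001 and picks out the relevant bits. */
static simd_features decode_simd_features(uint32_t leaf1_ecx, uint32_t ext1_ecx)
{
    simd_features f;
    f.ssse3 = (leaf1_ecx >> 9) & 1u;  /* CPUID.1:ECX bit 9 */
    f.fma3  = (leaf1_ecx >> 12) & 1u; /* CPUID.1:ECX bit 12 */
    f.fma4  = (ext1_ecx >> 16) & 1u;  /* CPUID.80000001h:ECX bit 16 */
    return f;
}

/* Runs CPUID on x86 builds; on other targets every feature reads as absent. */
static simd_features query_simd_features(void)
{
    uint32_t leaf1_ecx = 0, ext1_ecx = 0;
#if defined(__x86_64__) || defined(__i386__)
    unsigned int a, b, c, d;
    if (__get_cpuid(1, &a, &b, &c, &d))
        leaf1_ecx = c;
    if (__get_cpuid(0x80000001u, &a, &b, &c, &d))
        ext1_ecx = c;
#endif
    return decode_simd_features(leaf1_ecx, ext1_ecx);
}
```

A program would call `query_simd_features()` once at startup and pick a code path from the result. Note that a set CPUID bit is not the whole story for AVX-class extensions such as FMA: real code should also confirm that the OS saves the wider register state (OSXSAVE plus XGETBV), which this sketch leaves out.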

That linked question has several answers which touch on some of the same things this question is asking about.

One of the major divergences here is that Intel, AMD, and VIA each have their own hardware-virtualization extensions, which don't even try to be compatible. So a VM like Xen needs separate "drivers" or "backend" code for each of these extensions. But those are still extensions, not part of baseline x86.
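The per-vendor backend idea can be sketched as a dispatch on CPUID feature bits. This is only an illustration, not how Xen is actually structured: the enum and function names are hypothetical. The bits themselves are the documented ones, with Intel VT-x (VMX) reported in CPUID leaf 1 ECX bit 5 and AMD-V (SVM) in leaf 0x80000001 ECX bit 2.

```c
#include <stdint.h>

/* Hypothetical backend identifiers for this sketch. */
typedef enum {
    VIRT_NONE,
    VIRT_INTEL_VMX,
    VIRT_AMD_SVM
} virt_backend;

/* Choose a backend from the ECX values of CPUID leaf 1 and of extended
 * leaf 0x80000001 (obtained elsewhere, e.g. via the CPUID instruction). */
static virt_backend pick_virt_backend(uint32_t leaf1_ecx, uint32_t ext1_ecx)
{
    if ((leaf1_ecx >> 5) & 1u) /* CPUID.1:ECX bit 5 = VMX */
        return VIRT_INTEL_VMX;
    if ((ext1_ecx >> 2) & 1u)  /* CPUID.80000001h:ECX bit 2 = SVM */
        return VIRT_AMD_SVM;
    return VIRT_NONE;
}
```

A hypervisor would then run entirely different setup code for each result. Firmware can also disable either extension even when CPUID advertises it, so a real implementation has more checks than this.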

SIMD extensions for use by user-space programs end up being available on both, often with a delay thanks to Intel's efforts to screw over AMD with anti-competitive practices. This wastes everyone else's time, and is often detrimental to the overall x86 ecosystem (e.g. SSSE3 could have been assumed as a baseline for more software by now), but helps Intel's bottom line. A good example: AMD Bulldozer supports FMA4, but Intel changed its mind at the last minute and implemented FMA3 in Haswell. AMD didn't support FMA3 until its next microarchitecture (Piledriver).

As for the claim that an ISA is "the binary representation of 1 or more CPU instructions": no, an ISA is much more than that. Everything that Intel documents as being guaranteed across all x86 CPUs is part of the ISA. This isn't just the detailed behaviour of every instruction, but also things like which control register does what, and the memory-ordering rules. Basically, it is everything in the manuals published by Intel and AMD that isn't prefaced by "on such and such a specific model of CPU".

I expect there are a few cases where Intel's and AMD's system programming guides differ on how x86 should work (and VIA's, if they publish their own for their x86 CPUs). I haven't checked, but I'm pretty sure user-space doesn't suffer from this: if there are differences, they're limited to privileged instructions that only work when the kernel runs them. Anyway, in that case I guess you could say the x86 ISA is the common subset of what Intel and AMD document.

Note that experimenting to find what real hardware does in practice is useful for understanding the docs, but NOT a replacement for reading them. You don't want your code to rely on how an instruction happens to behave on the CPU you tested.

However, Intel does test its new designs with real software, because being unable to run existing versions of Windows would be a commercial downside. For example, Windows 9x doesn't invalidate a TLB entry that could only have been filled speculatively (the rest of this example is just a summary of, and extrapolation from, that very detailed blog post). This was either a performance hack based on the assumption that it was safe (and it was safe on hardware at the time), or an unnoticed bug that couldn't have been detected by testing on hardware at the time.

Modern Intel CPUs do speculative page walks, but even as recently as Haswell they detect and shoot down mis-speculation, so code that assumes this doesn't happen will still work.

This means the real hardware gives a stronger ordering guarantee than the ISA specifies.

Still, depending on this stronger behaviour would be a mistake, unless you only do it on known microarchitectures. AMD K8/K10 is like Intel, but Bulldozer-family speculates without any detect+rollback mechanism to give coherence, so that Win9x kernel code isn't safe on that hardware. And future Intel hardware might drop the detect+rollback mechanism, too.

TL;DR: all the microarchitectures implement what the x86 ISA says, but some give stronger guarantees. If you're as big as Microsoft, Intel and AMD will design CPUs that reproduce the non-ISA-specified behaviour your code depends on, at least until the software is long obsolete. There's no guarantee that future Intel microarchitectures will keep the rollback mechanism.

A different example: the bsf instruction with an input of zero leaves its output undefined, according to the paper spec in Intel's instruction-set reference manual.

But for any specific CPU, there will be some pattern of behaviour, like setting the output to zero, or leaving it unchanged. On paper, it would be valid for an out-of-order-execution CPU to really give unpredictable results that were different for the same inputs, because of different microarchitectural state.

But the behaviour Intel chooses to implement in silicon is to always leave the destination unchanged when the bsf or bsr input is zero. AMD does the same, and even documents the behaviour. It's basically an unofficial guarantee that mov eax,32 / bsf eax,ebx will work exactly like tzcnt (except for flag setting, e.g. ZF is based on the input being 0 rather than on the output).
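To make the bsf/tzcnt equivalence concrete, here is a small software model in C. It is a sketch of the register results only, under the "destination unchanged on zero input" behaviour described above; it does not execute the real instructions, it ignores flags, and the function names are invented for this example.

```c
#include <stdint.h>

/* Index of the lowest set bit; the caller must pass a non-zero value. */
static uint32_t ctz32_nonzero(uint32_t x)
{
    uint32_t n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Model of 32-bit tzcnt: a zero source yields the operand size, 32. */
static uint32_t tzcnt32_model(uint32_t src)
{
    return src ? ctz32_nonzero(src) : 32u;
}

/* Model of 32-bit bsf on hardware that leaves the destination register
 * unchanged for a zero source. `dest` is the old destination value. */
static uint32_t bsf32_model(uint32_t dest, uint32_t src)
{
    return src ? ctz32_nonzero(src) : dest;
}
```

With the destination preloaded to 32, as in the mov eax,32 / bsf eax,ebx sequence, `bsf32_model(32, x)` returns the same value as `tzcnt32_model(x)` for every `x`, including zero. Code that relies on this is depending on observed (and, for AMD, documented) behaviour rather than on Intel's paper spec.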

This is why popcnt / lzcnt / tzcnt have a false dependency on the output register in Intel CPUs!

It's common for CPU vendors to go above and beyond the paper ISA spec to avoid breaking some existing code that depends on this behaviour (e.g. if that code is part of Windows, or other major pieces of software that Intel / AMD test on their new CPU designs).

Andy Glew discussed this in a comment thread about the coherent page-walk behaviour mentioned above, and about self-modifying code.
