Intel JCC Erratum: should JCC really be treated separately?

Question

Intel pushed a microcode update to fix an erratum called the "Jump Conditional Code (JCC) Erratum". The updated microcode makes some operations inefficient because it disables putting code into the ICache under certain conditions.

The published document, titled "Mitigations for Jump Conditional Code Erratum", lists not only JCC. It lists unconditional jumps, conditional jumps, macro-fused conditional jumps, calls, and returns.

The documentation for the MSVC switch /QIntel-jcc-erratum mentions:

The questions are:

  • Is there a reason to treat JCC separately from other jumps?
  • Is there a reason to treat macro-fused JCC separately from other JCC?

Answer

Macro-fused jumps have to be mentioned separately because fusion makes the whole cmp/jcc pair (or whatever instruction fuses with the jcc) vulnerable to this slowdown if the cmp touches the boundary, even when the jcc itself doesn't. That is because the uop cache holds a single uop for both x86 machine instructions together, tagged with the start address of the non-jump instruction.
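To make the fused-pair point concrete, here is a minimal sketch. It assumes that "touching" a 32-byte boundary means the instruction bytes either cross a boundary or end exactly on one; the helper name `touches_boundary` and the byte offsets are hypothetical, chosen only for illustration.

```python
BOUNDARY = 32  # bytes; the granularity discussed in this answer


def touches_boundary(start: int, length: int, boundary: int = BOUNDARY) -> bool:
    """Return True if bytes [start, start + length) cross a boundary
    or end exactly on one (assumed meaning of "touch")."""
    # The first byte and the byte just past the end fall in different
    # blocks exactly when the span crosses or ends on a boundary.
    return start // boundary != (start + length) // boundary


# Hypothetical layout: a 3-byte cmp at offset 29, then a 2-byte jcc at 32.
cmp_start, cmp_len = 29, 3
jcc_start, jcc_len = cmp_start + cmp_len, 2

# Looking at the jcc alone, it sits entirely inside the second block.
jcc_alone = touches_boundary(jcc_start, jcc_len)

# Macro-fused, the pair is one unit starting at the cmp's address.
fused_pair = touches_boundary(cmp_start, cmp_len + jcc_len)

print(jcc_alone, fused_pair)  # prints: False True
```

In this layout the jcc looks safe on its own, but the fused cmp/jcc unit spans offsets 29 through 33 and therefore touches the boundary at 32.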

If everyone only said "jumps", you'd expect that only the JCC / JMP / CALL / RET itself had to avoid touching a 32B boundary. So it's a good thing to highlight the interaction with macro-fusion.
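As a rough illustration of what "avoiding" the boundary could look like, the sketch below computes how many padding bytes would push a jump, or a whole macro-fused pair, to the start of the next 32-byte block. This is a hypothetical helper (`padding_before`), not how MSVC or any other toolchain actually implements its mitigation; it assumes the unit is shorter than 32 bytes and that "touching" means crossing or ending on a boundary.

```python
def padding_before(start: int, length: int, boundary: int = 32) -> int:
    """Bytes of padding to insert before a jump (or fused pair) so that
    it neither crosses nor ends on a boundary. Assumes length < boundary."""
    touches = start // boundary != (start + length) // boundary
    if not touches:
        return 0
    # Move the unit to the start of the next block.
    return (-start) % boundary


# Hypothetical fused cmp/jcc: 5 bytes total, starting at offset 29.
pad = padding_before(29, 5)
new_start = 29 + pad
print(pad, new_start)  # prints: 3 32
```

For a macro-fused pair, the length passed in should be the combined length of both instructions, and the start should be the address of the first (non-jump) instruction.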

This slowdown (for all jumps) is the result of a microcode mitigation / workaround for a hardware design flaw. Not being able to cache jumps that touch a 32-byte boundary in the uop cache is not the original erratum; it's a side effect of the cure.

The original erratum description doesn't say anything about affecting only conditional branches. Even if only conditional branches were the real problem, perhaps the best way Intel could find to make it safe with a microcode update unfortunately affected all jumps.

For example, in Skylake-Xeon (SKX), the original erratum is documented as SKX102 in Intel's "spec update" errata list for that uarch:

Problem: Under complex micro-architectural conditions involving branch instructions bytes that span multiple 64 byte boundaries (cross cache line), unpredictable system behavior may occur.

Implication: When this erratum occurs, the system may behave unpredictably.

Workaround: It is possible for BIOS to contain a workaround for this erratum. [i.e. a microcode update]

Status: No fix.


I suspect the "JCC erratum" name caught on because most branches in "hot" code are conditional. Compilers can usually avoid putting unconditional taken branches in the fast path. So it's likely that people noticed the performance problem with JCC instructions first, and the name simply stuck even though it's not accurate.

BTW, the question "32-byte aligned routine does not fit the uops cache" has a screenshot of the relevant diagram from the Intel PDF you linked, along with some other links and details about performance effects.
