How do I enable SSE for my freestanding bootable code?

Question

(This question was originally about the CVTSI2SD instruction, which I thought did not work on the Pentium M CPU. The real cause is that I'm using a custom OS and need to enable SSE manually.)

I have a Pentium M CPU and a custom OS that has so far used no SSE instructions, but I now need to use them.

Trying to execute any SSE instruction results in interrupt 6, illegal opcode (which on Linux would cause a SIGILL, but this isn't Linux). The Intel architectures software developer's manual (which I'll refer to as IASDM from now on) calls this #UD - Invalid Opcode (Undefined Opcode).

Edit: Peter Cordes identified the right cause and pointed me to the solution, which I summarize below:

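If you're running an ancient OS that doesn't support saving XMM regs on context switches, the SSE-enabling bit in one of the machine control registers won't be set.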

Indeed, the IASDM mentions this:

If the operating system does not provide adequate system-level support for SSE, executing an SSE or SSE2 instruction will also generate #UD.

Peter Cordes pointed me to the OSDev wiki page on SSE, which describes how to enable SSE by writing to both the CR0 and CR4 control registers:

clear the CR0.EM bit (bit 2) [ CR0 &= ~(1 << 2) ]
set the CR0.MP bit (bit 1) [ CR0 |= (1 << 1) ]
set the CR4.OSFXSR bit (bit 9) [ CR4 |= (1 << 9) ]
set the CR4.OSXMMEXCPT bit (bit 10) [ CR4 |= (1 << 10) ]
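
The four bit operations above can be sketched in C as pure functions that compute the values to write back. This is only an illustration: the macro and function names (`sse_cr0`, `sse_cr4`, and so on) are made up for this example, and actually reading and writing CR0 and CR4 requires privileged `mov` instructions that are not shown here.

```c
#include <stdint.h>

/* Bit positions taken from the list above. */
#define CR0_MP          (1u << 1)   /* CR0.MP, bit 1 */
#define CR0_EM          (1u << 2)   /* CR0.EM, bit 2 */
#define CR4_OSFXSR      (1u << 9)   /* CR4.OSFXSR, bit 9 */
#define CR4_OSXMMEXCPT  (1u << 10)  /* CR4.OSXMMEXCPT, bit 10 */

/* Hypothetical helper: given the current CR0 value, return the value
 * to write back, with EM cleared and MP set. */
static uint32_t sse_cr0(uint32_t cr0)
{
    cr0 &= ~CR0_EM;
    cr0 |= CR0_MP;
    return cr0;
}

/* Hypothetical helper: given the current CR4 value, return the value
 * to write back, with OSFXSR and OSXMMEXCPT set. */
static uint32_t sse_cr4(uint32_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}
```

Keeping the arithmetic separate from the register access makes it easy to check on a normal host before it goes into boot code. All other bits of each register are passed through unchanged.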

Note that writing to these registers in protected mode requires privilege level 0. A linked answer explains how to test for this: in protected mode, that is, when bit 0 (PE) of CR0 is set to 1, test bits 0 and 1 of the CS selector; both should be 0.
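
The privilege test described above might look like the following sketch. The function names are hypothetical, and the CR0 and CS values are passed in as plain integers because obtaining them is platform-specific. It also assumes the CPU is not in virtual-8086 mode, where CS holds a real-mode segment value and its low bits say nothing about privilege.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE (1u << 0)  /* CR0.PE, bit 0: protected mode enabled */

/* Bits 0 and 1 of the CS selector give the current privilege level. */
static unsigned cpl_from_cs(uint16_t cs)
{
    return cs & 3u;
}

/* Hypothetical check: outside protected mode there is no restriction;
 * in protected mode the code must run at privilege level 0.
 * Assumption: virtual-8086 mode is not in use. */
static bool can_write_control_regs(uint32_t cr0, uint16_t cs)
{
    if ((cr0 & CR0_PE) == 0)
        return true;
    return cpl_from_cs(cs) == 0;
}
```

For example, a selector of 0x08 yields privilege level 0, while 0x1B yields 3.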

Finally, the custom OS must handle the XMM registers properly during context switches, saving and restoring them when necessary.
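
The question does not say which mechanism to use for saving and restoring. One common choice, assumed here, is the FXSAVE/FXRSTOR instruction pair, which stores the x87, MMX, XMM, and MXCSR state in a 512-byte, 16-byte-aligned memory area. The sketch below assumes GCC or Clang inline assembly; `struct task` and the function names are hypothetical, and the instructions compile to no-ops on non-x86 targets.

```c
#include <stdint.h>

/* FXSAVE and FXRSTOR use a 512-byte image that must be 16-byte aligned. */
struct fxsave_area {
    _Alignas(16) uint8_t bytes[512];
};

/* Hypothetical per-task record holding the saved FPU/SSE state. */
struct task {
    struct fxsave_area fpu_state;
};

/* Save the current x87/MMX/XMM/MXCSR state into *area. */
static void fpu_state_save(struct fxsave_area *area)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("fxsave %0" : "=m"(*area));
#else
    (void)area;  /* not an x86 target: nothing to do in this sketch */
#endif
}

/* Reload the state previously stored in *area. */
static void fpu_state_restore(const struct fxsave_area *area)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("fxrstor %0" : : "m"(*area));
#else
    (void)area;  /* not an x86 target: nothing to do in this sketch */
#endif
}
```

A scheduler would call `fpu_state_save` on the outgoing task and `fpu_state_restore` on the incoming one. Only restore an area that an earlier save filled in, since FXRSTOR faults on an image with reserved MXCSR bits set.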

Recommended answer

If you're running an ancient or custom OS that doesn't support saving XMM regs on context switches, it won't have set the SSE-enabling bits in the machine control registers. In that case, all instructions that touch XMM regs will fault.

Took me a sec to find, but http://wiki.osdev.org/SSE explains how to alter CR0 and CR4 to allow SSE instructions to run on bare metal without #UD.

My first thought on the old version of your question was that you might have compiled your program with -mavx, -march=sandybridge, or equivalent, causing the compiler to emit the VEX-encoded version of everything.

CVTSI2SD   xmm1, r32/m32          ; SSE2
VCVTSI2SD  xmm1, xmm2, r32/m32    ; AVX

See https://stackoverflow.com/tags/x86/info for links, including one to Intel's instruction set reference manual.

Related: "Which versions of Windows support/require which CPU multimedia extensions?" has some details on how to check for AVX and AVX512 support (these also introduce new architectural state, so the OS has to set a bit or the hardware will fault). It comes at the problem from the other angle, but the links should indicate how to activate or disable AVX support.
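
As a rough sketch of the AVX check mentioned above: user code normally tests the OSXSAVE and AVX bits in CPUID leaf 1 (ECX bits 27 and 28), then reads XCR0 with XGETBV and confirms that the OS enabled both SSE state (bit 1) and AVX state (bit 2). The function below takes those two values as plain integers, since the instructions that obtain them are not shown; its name and the macro names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE           (1ull << 1) /* OS saves XMM state */
#define XCR0_AVX           (1ull << 2) /* OS saves upper YMM state */

/* Hypothetical helper: AVX is usable only if the CPU supports it and
 * the OS has enabled both SSE and AVX state in XCR0. */
static bool avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;  /* XGETBV is not available, so XCR0 is meaningless */
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return false;
    return (xcr0 & (XCR0_SSE | XCR0_AVX)) == (XCR0_SSE | XCR0_AVX);
}
```

On the OS side, the corresponding step is to set CR4.OSXSAVE (bit 18) and then write XCR0 with XSETBV, in the same spirit as the CR0/CR4 setup for SSE.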
