Question
I am building a 32-bit OS in assembly.
I have set up the IDT and I am handling program interrupts through the int instruction.
How can I enable the syscall and sysenter instructions, and how do I handle them and return from them?
Is it true that the syscall instruction isn't supported in 32-bit mode by Intel processors, so I can't use it? Is it true that the sysret instruction isn't safe? Is there a tutorial for this somewhere?
EDIT: My main question is how to enable the syscall and sysenter instructions! (Not a duplicate.)
Recommended answer
See the OSdev wiki for details on sysenter, including a note about how to avoid a security/safety problem. Also see the Intel / AMD manuals; they go into a lot of the detail that OS developers need. See the x86 tag wiki for links.
Overview of the various system-call instructions:
- int: available since forever (8086).
- Trap by executing an invalid instruction: apparently this was the fastest way to enter the kernel on the 80386. (That's no longer the case.)
- call gate (i.e. a far call). See the OSdev link for details on that and traps.
- sysenter (http://wiki.osdev.org/Sysenter): introduced by Intel before x86-64 existed, adopted by AMD not long after (many years ago). Available on all modern x86 CPUs. Very minimalist design; it requires user-space cooperation for the kernel to be able to return, because it doesn't save EIP, ESP, or EFLAGS anywhere.
Linux supports sysenter in both 32-bit and 64-bit kernels, but only for system calls from 32-bit processes. I don't know whether you could design a kernel that used it for 64-bit system calls as well, or instead. (I know that wasn't the question, but it's related.)
Using sysenter requires user-space cooperation to provide the return address and save its own ESP and EFLAGS. In Linux, the kernel exports a page of code which has the user-space side of this dance. User-space is expected to call this code instead of using sysenter directly, but feel free to design your OS however you want. Looking at Linux's code for both sides of this dance will probably be useful if you don't find an example somewhere else.
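As for actually enabling sysenter: the instruction is driven by three MSRs that the kernel writes once at boot with wrmsr, and the CPU derives all four segment selectors from IA32_SYSENTER_CS by fixed offsets, so the GDT has to be laid out in that order. The sketch below only models that selector arithmetic in plain C so it can be checked outside a kernel; the struct and function names are made up for illustration, and the wrmsr calls appear only in a comment because they need ring 0.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural MSR numbers for the sysenter/sysexit pair. */
#define IA32_SYSENTER_CS  0x174u /* kernel code selector (base of 4 GDT entries) */
#define IA32_SYSENTER_ESP 0x175u /* kernel stack pointer loaded on entry */
#define IA32_SYSENTER_EIP 0x176u /* kernel entry point loaded on entry */

/* Selectors the CPU derives from IA32_SYSENTER_CS (hypothetical struct name). */
struct sysenter_selectors {
    uint16_t kernel_cs; /* loaded by sysenter */
    uint16_t kernel_ss; /* kernel_cs + 8 */
    uint16_t user_cs;   /* loaded by 32-bit sysexit, RPL forced to 3 */
    uint16_t user_ss;   /* user_cs + 8 */
};

static struct sysenter_selectors sysenter_layout(uint16_t sysenter_cs)
{
    /* A null selector here makes sysenter raise #GP: that is the
     * "not enabled" state after reset. */
    assert((sysenter_cs & 0xFFFCu) != 0);

    struct sysenter_selectors s;
    s.kernel_cs = (uint16_t)(sysenter_cs & 0xFFFCu);
    s.kernel_ss = (uint16_t)(s.kernel_cs + 8);
    s.user_cs   = (uint16_t)((sysenter_cs + 16) | 3);
    s.user_ss   = (uint16_t)((sysenter_cs + 24) | 3);
    return s;
}

/*
 * In the kernel (ring 0), with a hypothetical wrmsr(msr, value) helper:
 *
 *     wrmsr(IA32_SYSENTER_CS,  KERNEL_CS);
 *     wrmsr(IA32_SYSENTER_ESP, kernel_stack_top);
 *     wrmsr(IA32_SYSENTER_EIP, (uint32_t)sysenter_entry);
 *
 * sysenter then enters the kernel with interrupts disabled, and the kernel
 * returns with sysexit after putting the user EIP in EDX and the user ESP
 * in ECX (values user-space had to hand over itself).
 */
```

With a kernel code segment at 0x08, this gives kernel SS 0x10, user CS 0x1B and user SS 0x23, i.e. GDT entries 1 through 4 must be kernel code, kernel data, user code, user data, in that order.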
syscall from 64-bit user-space: available everywhere, because Intel implemented it along with the rest of AMD64. It's a well-designed interface that masks RFLAGS (with a configurable mask) before entering the kernel, so you avoid the race window you would have if you had to disable interrupts manually with cli. It is used together with swapgs so the kernel can get access to its stack and so on.
On mainstream x86 OSes (like Linux), syscall is the only way to make 64-bit system calls.
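Enabling syscall comes down to setting the SCE bit in EFER and programming the STAR, LSTAR and FMASK MSRs. The following is a rough sketch in plain C that only models the values involved (how STAR is packed, and what the CPU does with RCX, R11 and RFLAGS on entry); the function and struct names are invented for illustration, and a real kernel would write the results with wrmsr from ring 0.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_EFER  0xC0000080u /* bit 0 (SCE) enables syscall/sysret */
#define MSR_STAR  0xC0000081u /* selector bases; low 32 bits: legacy-mode target EIP */
#define MSR_LSTAR 0xC0000082u /* entry RIP for syscall from 64-bit code */
#define MSR_FMASK 0xC0000084u /* RFLAGS bits cleared on entry */

#define EFER_SCE  (1ull << 0)
#define RFLAGS_TF (1ull << 8)
#define RFLAGS_IF (1ull << 9)
#define RFLAGS_DF (1ull << 10)

/* Pack STAR: bits 47:32 give the kernel CS (SS = CS + 8) used by syscall;
 * bits 63:48 give the base used by sysret (64-bit CS = base + 16,
 * SS = base + 8). */
static uint64_t make_star(uint16_t kernel_cs, uint16_t sysret_base)
{
    assert((kernel_cs & 3) == 0); /* kernel selector must have RPL 0 */
    return ((uint64_t)sysret_base << 48) | ((uint64_t)kernel_cs << 32);
}

/* Register state right after a 64-bit syscall (hypothetical model struct). */
struct syscall_entry_state {
    uint64_t rcx;    /* return RIP */
    uint64_t r11;    /* RFLAGS before masking */
    uint64_t rflags; /* RFLAGS after masking */
};

static struct syscall_entry_state model_syscall(uint64_t next_rip,
                                                uint64_t rflags,
                                                uint64_t fmask)
{
    struct syscall_entry_state st;
    st.rcx = next_rip;
    st.r11 = rflags;
    st.rflags = rflags & ~fmask;
    return st;
}
```

Putting IF in the mask is what closes the race window mentioned above: the handler starts with interrupts already off, before it has run swapgs or switched stacks. Masking TF and DF as well is a common choice, not a requirement.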
syscall from 32-bit user-space: a totally different instruction from long-mode syscall, and only available on AMD CPUs. The kernel-side interface is different for 32-bit kernels (legacy mode) vs. 64-bit kernels running 32-bit user-space (compat mode).
The Linux kernel has some useful comments on it:
/* ...
* - Most programmers do not directly target AMD CPUs, and the 32-bit
* SYSCALL instruction does not exist on Intel CPUs. Even on AMD
* CPUs, Linux disables the SYSCALL instruction on 32-bit kernels
* because the SYSCALL instruction in legacy/native 32-bit mode (as
* opposed to compat mode) is sufficiently poorly designed as to be
* essentially unusable.
Maybe a toy OS could use it without worrying about whatever problems make it unsuitable for Linux; I don't know. But unless you're just plain curious, don't waste your time on it. On the other hand, if you're interested in OS and CPU design, finding out what's wrong with the ISA design might be interesting.
BTW, when AMD was designing AMD64, they got some feedback from Linux kernel devs on the amd64 mailing list that improved the design of 64-bit syscall (making it configurably mask RFLAGS), because their initial design would have been problematic for Linux. Links to those archived mailing list posts are in this answer.
Recommendation: use sysenter for your 32-bit kernel. It should be usable everywhere, and has been available on AMD CPUs for years now. Ancient CPUs that don't support it can use the int 0x80 ABI (or whatever interrupt number you picked for your OS), if you want to add a second compatibility ABI.
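Choosing between sysenter and the int fallback at boot means checking CPUID first. Here is a small sketch of the decoding, assuming the caller has already executed cpuid and passes in the raw register values; the function names are invented for illustration. The family/model/stepping exception follows the check Intel documents for early Pentium Pro parts, which set the SEP bit without implementing the instructions.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_SEP         (1u << 11) /* CPUID.01H:EDX bit 11 */
#define CPUID_EXT1_EDX_SYSCALL (1u << 11) /* CPUID.80000001H:EDX bit 11 */

/* leaf1_eax / leaf1_edx are EAX and EDX as returned by cpuid with EAX=1. */
static bool has_sysenter(uint32_t leaf1_eax, uint32_t leaf1_edx)
{
    if (!(leaf1_edx & CPUID1_EDX_SEP))
        return false;

    uint32_t family   = (leaf1_eax >> 8) & 0xFu;
    uint32_t model    = (leaf1_eax >> 4) & 0xFu;
    uint32_t stepping = leaf1_eax & 0xFu;

    /* Early Pentium Pro: SEP is reported but sysenter/sysexit are missing. */
    if (family == 6 && model < 3 && stepping < 3)
        return false;
    return true;
}

/* ext1_edx is EDX as returned by cpuid with EAX=0x80000001. */
static bool has_syscall(uint32_t ext1_edx)
{
    return (ext1_edx & CPUID_EXT1_EDX_SYSCALL) != 0;
}
```

In a hosted test program the inputs could come from GCC/Clang's __get_cpuid in <cpuid.h>; a freestanding kernel would use its own inline-asm cpuid wrapper. Note that Intel CPUs report the syscall bit as 0 outside 64-bit mode, which matches the point that 32-bit syscall is AMD-only.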
The Linux kernel entry points are well commented and written fairly readably. While writing "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?", I had an easy time figuring out what was going on in the entry points into a 64-bit kernel using syscall (native 64-bit system calls), or int 0x80 or sysenter (32-bit system calls, normally from compat mode; int 0x80 is also supported for 64-bit processes, but it still invokes the 32-bit ABI!). There's a bunch of complicated stuff going on in case various kinds of tracing / debugging are enabled, but the other parts are fairly easy to follow. See that answer for a walk-through of some of Linux's system-call handling internals.
In arch/x86/entry, these are the main files of interest:
- entry_32.S: 32-bit kernel code for entry from user-space (legacy mode).
- entry_64_compat.S: 64-bit kernel code for entry from 32-bit user-space (compat mode -> long mode).
- entry_64.S: 64-bit kernel code for entry from 64-bit user-space (long mode -> long mode).
You should be able to find Linux's VDSO code for the user-space side of the sysenter dance, which passes the kernel the values it needs to return to user-space. Related: What is better "int 0x80" or "syscall"?, and The Definitive Guide to Linux System Calls, will give some useful info on the design choices Linux made.
Intel and AMD both have separate bugs with non-canonical RIP when returning to 64-bit user-space. For example, on Intel, Linux's entry_64.S describes it this way:
/*
* On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
* in kernel space. This essentially lets the user take over
* the kernel, since userspace controls RSP.
That can happen if a ptrace system call (e.g. made by a debugger) changed the saved value of the process's RIP to a non-canonical address. Linux checks whether it can use sysret, and if not, uses its iret return path. (The sysret path is fast enough that it's worth doing the extra work to check that it's safe.)
Note that if a system call blocks / sleeps, the "master copy" of user-space's integer register state is on its kernel stack, where the system call entry point pushed it. (In Linux. Other designs are possible!) But anyway, this is why it's possible to end up with weird saved state that user-space couldn't have run syscall with (because it would have faulted on jmp to a non-canonical address), or with saved_rcx != saved_RIP. (64-bit syscall sets RCX=RIP and R11=RFLAGS (before masking), so it clobbers RCX and R11 but allows the kernel to restore RIP and RFLAGS.)
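The "can we use sysret?" decision described above can be sketched roughly as follows. This is a simplified illustration, not Linux's actual logic (which also looks at segment selectors and some flag bits); the struct and function names are hypothetical, and the canonical check assumes 48-bit virtual addresses (4-level paging).

```c
#include <stdbool.h>
#include <stdint.h>

/* Saved user-space registers relevant to the return path (hypothetical). */
struct saved_user_regs {
    uint64_t rip;
    uint64_t rcx;
    uint64_t r11;
    uint64_t rflags;
};

/* Canonical with 48-bit virtual addresses: bits 63..47 are all equal. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the 17 bits 63..47 */
    return top == 0 || top == 0x1FFFFu;
}

/* sysret reloads RIP from RCX and RFLAGS from R11, so it can only reproduce
 * the saved state if those pairs match, and it is only safe if the target
 * RIP is canonical. Otherwise fall back to iret. */
static bool can_use_sysret(const struct saved_user_regs *r)
{
    return r->rcx == r->rip &&
           r->r11 == r->rflags &&
           is_canonical_48(r->rip);
}
```

A return path would call can_use_sysret() on the saved frame and jump to its iret path whenever it returns false, which is the slow but always-correct option.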
I don't know how 32-bit syscall works; sorry, I got off topic here. But I suspect that what you may have read about sysret being unsafe was talking about 64-bit kernels.
I don't know whether there are any similar bugs in 32-bit-kernel sysret, or in 64-bit-kernel sysret-to-compat-mode.