Problem description
I'm trying to define a byte in assembly language inside my .text section. I know data should go in the .data section, but I was wondering why defining it in .text gives me a segmentation fault. If I define the byte inside .data, I get no errors. I'm on a Linux machine running Mint 19.1 and using NASM + LD to assemble and link the executable.
This runs without a segmentation fault:
global _start
section .data
db 0x41
section .text
_start:
mov rax, 60 ; Exit(0) syscall
xor rdi, rdi
syscall
This gives me a segfault:
global _start
section .text
_start:
db 0x41
mov rax, 60 ; Exit(0) syscall
xor rdi, rdi
syscall
I'm using the following script to assemble and link it:
nasm -felf64 main.s -o main.o
ld main.o -o main
I expect the program to run without any segmentation fault, but it doesn't when I use DB inside .text. I suspect that .text is read-only and that this may be the cause of the problem. Am I correct? Can someone explain why my second code example segfaults?
Recommended answer
If you tell the assembler to assemble arbitrary bytes somewhere, it will. db is a pseudo-instruction that emits bytes, so mov eax, 60 and db 0xb8, 0x3c, 0, 0, 0 are pretty much exactly equivalent as far as NASM is concerned. Either one emits those 5 bytes into the output at that position.
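As a minimal sketch of that equivalence, the program below is the working exit(0) example with its first instruction written as raw bytes instead of a mnemonic. It assumes the same build commands as in the question (nasm -felf64 and ld); nothing else is introduced.

```nasm
; Sketch: exit(0), with the first instruction emitted via db.
global _start
section .text
_start:
    db 0xb8, 0x3c, 0, 0, 0   ; the same five bytes NASM emits for: mov eax, 60
    xor edi, edi             ; exit status 0
    syscall                  ; exit(0)
```

Disassembling the result with objdump -d -Mintel should show a mov eax,0x3c at _start, because the CPU and the disassembler only see bytes, not the source line that produced them.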
If you don't want data to be decoded as (part of) an instruction, don't put it anywhere that execution can reach.
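A minimal sketch of that advice, assuming you really do want the byte in .text: place it after the exit system call, where no execution path reaches it. The label my_byte is a hypothetical name chosen for illustration. The example also reads the byte, which works because a read-only section is still readable.

```nasm
; Sketch: a data byte in .text that is never executed.
global _start
section .text
_start:
    movzx edi, byte [rel my_byte] ; load the byte as the exit status (0x41 = 65)
    mov eax, 60                   ; __NR_exit
    syscall                       ; exit does not return, so nothing below is executed
my_byte:                          ; hypothetical label, not from the question
    db 0x41                       ; data in .text, off every execution path
```

Built with the commands from the question, running ./main and then echo $? should print 65. So the read-only nature of .text is not what caused the original crash; the position of the byte in the instruction stream is.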
Since you're using NASM, it optimizes mov rax,60 into mov eax,60, so the instruction doesn't have the REX prefix you'd expect from the source.
Your manually-encoded 0x41 byte acts as a REX prefix for that mov and changes it into a mov to R8D instead of EAX:
41 b8 3c 00 00 00    mov r8d,0x3c
(I checked with objdump -drwC -Mintel instead of looking up which bit is which in the REX prefix. I only remember that REX.W is 0x48. But 0x41 is a REX.B prefix in x86-64.)
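For reference, here is a short sketch of how the prefix byte breaks down and why the two spellings below assemble to the same six bytes. The bit layout in the comments is the standard x86-64 REX format; the snippet is a fragment meant to be assembled with nasm -felf64 and inspected with objdump, not run.

```nasm
; REX prefix layout: 0100 W R X B
;   0x48 = 0100 1000  -> W=1 (64-bit operand size)
;   0x41 = 0100 0001  -> B=1 (extends the register field: EAX becomes R8D)
bits 64
section .text
    db 0x41                  ; stray byte, decoded as a REX.B prefix
    mov eax, 60              ; b8 3c 00 00 00, so together: 41 b8 3c 00 00 00
    mov r8d, 60              ; NASM emits the same six bytes: 41 b8 3c 00 00 00
```

objdump -d -Mintel on the object file should list mov r8d,0x3c twice.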
So instead of making a sys_exit system call, your code runs syscall with EAX=0, which is __NR_read. (The Linux kernel zeros all the registers other than RSP before process startup, and in a statically-linked executable, _start is the true entry point with no dynamic linker code running first. So RAX is still zero.)
$ strace ./rex
execve("./rex", ["./rex"], 0x7fffbbadad60 /* 54 vars */) = 0
read(0, NULL, 0) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
And then execution falls through into whatever is after syscall, which in this case is 00 00 bytes that decode as add [rax], al. The read call returned 0, so RAX is 0 and that instruction accesses address 0, which is why strace reports si_addr=NULL. You would have seen this if you'd run your code inside GDB.
Footnote 1: If you'd used YASM, which doesn't optimize to 32-bit operand size:
Intel's manuals say that it's illegal to have 2 REX prefixes on one instruction. I expected an illegal-instruction fault (#UD machine exception -> kernel delivers SIGILL), but my Skylake CPU ignores the first REX prefix and decodes it as mov rax, sign_extended_imm32.
Single-stepping, it's treated as one long instruction, so I guess Skylake chooses to handle it like other cases of multiple prefixes, where only the last one of a type has an effect. (But remember that this is not future-proof; other x86 CPUs could handle it differently.)
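To try the double-REX case with NASM rather than YASM, one option is to write the unoptimized encoding as raw bytes so the assembler cannot shorten it. This is a sketch under that assumption; 48 c7 c0 3c 00 00 00 is the REX.W form of mov rax, 60 with a sign-extended 32-bit immediate.

```nasm
; Sketch: two REX prefixes in a row (0x41, then 0x48) on one instruction.
global _start
section .text
_start:
    db 0x41                              ; stray byte, a REX.B prefix
    db 0x48, 0xc7, 0xc0, 0x3c, 0, 0, 0   ; mov rax, 60 with REX.W, as raw bytes
    xor edi, edi                         ; exit status 0
    syscall
```

What happens at run time depends on the CPU: on the Skylake described above, the first prefix is ignored and the program exits normally, but other processors may decode it differently or raise #UD.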