Segmentation fault when using DB (define byte) inside a function

Question

I'm trying to define a byte in assembly language inside my .text section. I know data should go in the .data section, but I was wondering why it gives me a segmentation fault when I do this. If I define the byte inside .data, I don't get any errors, unlike with .text. I am using a Linux machine running Mint 19.1, and NASM + LD to assemble and link the executable.

This runs without a segmentation fault:

global _start
section .data
db 0x41
section .text
_start:
    mov rax, 60    ; Exit(0) syscall
    xor rdi, rdi
    syscall

This gives me a segmentation fault:

global _start
section .text
_start:
    db 0x41
    mov rax, 60     ; Exit(0) syscall
    xor rdi, rdi
    syscall

I'm using the following script to assemble and link it:

nasm -felf64 main.s -o main.o
ld main.o -o main

I expect the program to run without any segmentation fault, but it doesn't when I use DB inside .text. I suspect that .text is read-only and that this may be the cause of the problem; am I correct? Can someone explain why my second code example segfaults?

Answer

If you tell the assembler to assemble arbitrary bytes somewhere, it will. db is a pseudo-instruction that emits bytes, so mov eax, 60 and db 0xb8, 0x3c, 0, 0, 0 are pretty much exactly equivalent as far as NASM is concerned. Either one will emit those 5 bytes into the output at that position.

If you don't want your data to be decoded as (part of) an instruction, don't put it where execution can reach it.
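As a minimal sketch of that advice (the label name my_byte is made up for illustration and is not from the question), the byte can stay in .text as long as it sits somewhere execution never reaches, for example after the exit system call:

```nasm
global _start
section .text
_start:
    mov eax, 60         ; __NR_exit
    xor edi, edi        ; exit status 0
    syscall             ; exit does not return
my_byte:
    db 0x41             ; never executed, so never decoded as an instruction
```

This should assemble and link with the same nasm and ld commands as in the question. Because the exit system call does not return, the CPU never fetches the 0x41 byte as code.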

Since you're using NASM (see footnote 1), it optimizes mov rax,60 into mov eax,60, so the instruction doesn't have the REX prefix you'd expect from the source.

Your manually-encoded REX prefix for mov changes it into a mov to R8D instead of EAX:
41 b8 3c 00 00 00 mov r8d,0x3c

(I checked with objdump -drwC -Mintel instead of looking up which bit is which in the REX prefix. I only remember that REX.W is 0x48. But 0x41 is a REX.B prefix in x86-64.)
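To see the equivalence for yourself, here is a small sketch (the labels by_hand and by_mnemonic are hypothetical names chosen for this example) that emits the same instruction both ways, so the two disassemblies can be compared with the objdump command mentioned above:

```nasm
; REX prefix bytes are 0x40..0x4F, with bit layout 0100WRXB:
;   0x41 = 0100 0001 -> REX.B (extends the register field in the opcode)
;   0x48 = 0100 1000 -> REX.W (64-bit operand size)
section .text
by_hand:
    db 0x41             ; a lone REX.B byte, applied to the next instruction
    mov eax, 60         ; b8 3c 00 00 00
by_mnemonic:
    mov r8d, 60         ; 41 b8 3c 00 00 00
```

Both labels should disassemble as mov r8d,0x3c, because the CPU and the disassembler only see the bytes, not the source lines that produced them.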

So instead of making a sys_exit system call, your code runs syscall with EAX=0, which is __NR_read. (The Linux kernel zeros all the registers other than RSP before process startup, and in a statically-linked executable, _start is the true entry point with no dynamic linker code running first. So RAX is still zero.)

$ strace ./rex
execve("./rex", ["./rex"], 0x7fffbbadad60 /* 54 vars */) = 0
read(0, NULL, 0)                        = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++

Execution then falls through into whatever comes after syscall, which in this case is 00 00 bytes that decode as add [rax], al. The read call returned 0, so RAX is still 0 and this instruction accesses address 0, which segfaults (matching si_addr=NULL in the strace output). You would have seen this if you'd run your code inside GDB.

Footnote 1: If you had used YASM, which doesn't optimize to 32-bit operand-size:

Intel's manuals say that it's illegal to have 2 REX prefixes on one instruction. I expected an illegal-instruction fault (#UD machine exception -> kernel delivers SIGILL), but my Skylake CPU ignores the first REX prefix and decodes it as mov rax, sign_extended_imm32.

When single-stepping, it's treated as one long instruction, so I guess Skylake chooses to handle it like other cases of multiple prefixes, where only the last one of a type has an effect. (But remember that this is not future-proof: other x86 CPUs could handle it differently.)

