How do RIP-relative variable references like "[RIP + _a]" work in x86-64 GAS Intel syntax?

Question

Consider the following variable reference in x64 Intel assembly, where the variable a is declared in the .data section:

mov eax, dword ptr [rip + _a]

I have trouble understanding how this variable reference works. Since a is a symbol corresponding to the runtime address of the variable (with relocation), how can [rip + _a] dereference the correct memory location of a? Indeed, rip holds the address of the current instruction, which is a large positive integer, so the addition results in an incorrect address of a?

Conversely, if I use x86 syntax (which is very intuitive):

mov eax, dword ptr [_a]

I get the following error: 32-bit absolute addressing is not supported in 64-bit mode.

Is there an explanation for this?

int a = 5;

int main() {
    int b = a;
    return b;
}

Compiled with gcc -S -masm=intel abs_ref.c -o abs_ref:

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14
    .intel_syntax noprefix
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    push    rbp
    .cfi_def_cfa_offset 16
    .cfi_offset rbp, -16
    mov rbp, rsp
    .cfi_def_cfa_register rbp
    mov dword ptr [rbp - 4], 0
    mov eax, dword ptr [rip + _a]
    mov dword ptr [rbp - 8], eax
    mov eax, dword ptr [rbp - 8]
    pop rbp
    ret
    .cfi_endproc
                                        ## -- End function
    .section    __DATA,__data
    .globl  _a                      ## @a
    .p2align    2
_a:
    .long   5                       ## 0x5

.subsections_via_symbols

Recommended answer

GAS syntax for RIP-relative addressing looks like symbol + current_address (RIP), but it actually means symbol with respect to RIP.

This is inconsistent with numeric literals:

[rip + 10] or AT&T 10(%rip) means 10 bytes past the end of this instruction

[rip + a] or AT&T a(%rip) means to calculate a rel32 displacement to reach a, not RIP + symbol value. (The GAS manual documents this special interpretation)

[a] or AT&T a is an absolute address, using a disp32 addressing mode. This isn't supported on OS X, where the image base address is always outside the low 32 bits. (Or for mov to/from al/ax/eax/rax, a 64-bit absolute moffs encoding is available, but you don't want that).
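The difference between the literal case and the symbol case can be sketched in C. This models only the arithmetic described above; it is not how GAS is implemented, and the function names and example addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The CPU forms a RIP-relative address by adding the sign-extended rel32
   to the address of the NEXT instruction (insn_addr + insn_len). */
static uint64_t rip_rel_target(uint64_t insn_addr, unsigned insn_len, int32_t rel32)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)rel32;
}

/* [rip + 10]: a numeric literal is used as the displacement unchanged. */
static int32_t rel32_for_literal(int32_t literal)
{
    return literal;
}

/* [rip + a]: the displacement is chosen so that the access lands on a. */
static int32_t rel32_for_symbol(uint64_t sym_addr, uint64_t insn_addr, unsigned insn_len)
{
    int64_t delta = (int64_t)(sym_addr - (insn_addr + insn_len));
    assert(delta >= INT32_MIN && delta <= INT32_MAX); /* must fit in a rel32 */
    return (int32_t)delta;
}
```

With a 6-byte instruction at the made-up address 0x1000, rel32_for_literal(10) keeps the displacement at 10, so the access goes to 0x1010. For a symbol at 0x2000, rel32_for_symbol yields 0xffa, so the access goes to the symbol itself.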

Linux position-dependent executables do put static code/data in the low 31 bits (2GiB) of virtual address space, so you can/should use mov edi, sym there. On OS X, your best option is lea rdi, [sym+RIP] if you need an address in a register. (See the related question "Unable to move variables in .data to registers with Mac x86 Assembly".)
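As a rough illustration of why a [disp32] absolute address works in a Linux non-PIE executable but not on OS X, here is a small C check. The two example addresses are assumptions chosen for illustration (a typical low address for non-PIE ELF data, and an address just above a 4 GiB image base); they are not taken from the question's binary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A [disp32] absolute address is a 32-bit field that the CPU sign-extends
   to 64 bits, so it can only name addresses that survive that round trip. */
static bool fits_abs_disp32(uint64_t addr)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)addr == addr;
}

/* Assumed, illustrative addresses. */
#define LINUX_NON_PIE_DATA 0x404028ULL     /* low 2 GiB */
#define MACOS_DATA         0x100001000ULL  /* above 4 GiB */

static void check_examples(void)
{
    assert(fits_abs_disp32(LINUX_NON_PIE_DATA));  /* encodable as disp32 */
    assert(!fits_abs_disp32(MACOS_DATA));         /* needs more than 32 bits */
}
```

Calling check_examples() passes under these assumptions: the low address round-trips through 32 bits and the high one does not.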

(In OS X, the convention is that C variable/function names are prepended with _ in asm. In hand-written asm you don't have to do this for symbols you don't want to access from C.)

NASM is much less confusing in this respect:

[rel a] means RIP-relative addressing for [a]
[abs a] means [disp32].
default rel or default abs sets what's used for [a]. The default is (unfortunately) default abs, so you almost always want a default rel.

.intel_syntax noprefix
mov  dword ptr [sym + rip], 0x11111111
sym:

.equ x, 8
inc  byte ptr [x + rip]

.set y, 32
inc byte ptr [y + rip]

.set z, sym
inc byte ptr [z + rip]

gcc -nostdlib foo.s && objdump -drwC -Mintel a.out (on Linux; I don't have OS X):

0000000000001000 <sym-0xa>:
    1000:       c7 05 00 00 00 00 11 11 11 11   mov    DWORD PTR [rip+0x0],0x11111111        # 100a <sym>    # rel32 = 0; it's from the end of the instruction not the end of the rel32 or anywhere else.

000000000000100a <sym>:
    100a:       fe 05 08 00 00 00       inc    BYTE PTR [rip+0x8]        # 1018 <sym+0xe>
    1010:       fe 05 20 00 00 00       inc    BYTE PTR [rip+0x20]        # 1036 <sym+0x2c>
    1016:       fe 05 ee ff ff ff       inc    BYTE PTR [rip+0xffffffffffffffee]        # 100a <sym>

(Disassembling the .o with objdump -dr will show you that there aren't any relocations for the linker to fill in; they were all resolved at assemble time.)

Notice that only .set z, sym resulted in a with-respect-to calculation. x and y originally came from plain numeric literals, not labels, so even though the instruction itself used [x + RIP], we still got [RIP + 8].
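To double-check the targets objdump printed, the rel32 fields can be decoded by hand. The following C sketch uses the instruction bytes and addresses from the disassembly above; the helper names are hypothetical, and the rel32 offset of 2 assumes the opcode + ModRM layout seen in these four instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the little-endian rel32 stored at byte offset `off` of an instruction. */
static int32_t read_rel32(const uint8_t *insn, size_t off)
{
    uint32_t raw = (uint32_t)insn[off]
                 | (uint32_t)insn[off + 1] << 8
                 | (uint32_t)insn[off + 2] << 16
                 | (uint32_t)insn[off + 3] << 24;
    return (int32_t)raw;
}

/* Target = address of the next instruction + sign-extended rel32. */
static uint64_t rip_target(uint64_t addr, const uint8_t *insn, size_t len, size_t rel_off)
{
    assert(rel_off + 4 <= len); /* the rel32 field must lie inside the instruction */
    return addr + len + (uint64_t)(int64_t)read_rel32(insn, rel_off);
}

/* Bytes copied from the disassembly above. */
static const uint8_t mov_sym[] = {0xc7, 0x05, 0x00, 0x00, 0x00, 0x00, 0x11, 0x11, 0x11, 0x11};
static const uint8_t inc_x[]   = {0xfe, 0x05, 0x08, 0x00, 0x00, 0x00};
static const uint8_t inc_y[]   = {0xfe, 0x05, 0x20, 0x00, 0x00, 0x00};
static const uint8_t inc_z[]   = {0xfe, 0x05, 0xee, 0xff, 0xff, 0xff};
```

For example, rip_target(0x1016, inc_z, sizeof inc_z, 2) gives 0x100a, the address of sym, because the stored rel32 0xffffffee is -18 and the instruction ends at 0x101c. The mov is 10 bytes long, so its rel32 of 0 also points at 0x100a: the displacement counts from the end of the whole instruction, including the trailing imm32.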

(Linux non-PIE only): To address absolute 8 wrt. RIP, you'd need AT&T syntax incb 8-.(%rip). I don't know how to write that in GAS intel_syntax; [8 - . + RIP] is rejected with Error: invalid operands (*ABS* and .text sections) for '-'.

Of course you can't do that anyway on OS X, except maybe for absolute addresses that are in range of the image base. But there's probably no relocation that can hold the 64-bit absolute address to be calculated for a 32-bit rel32.
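The range problem can be made concrete with a small C check of whether a rel32 can span the distance from the next instruction to an absolute target. The instruction addresses below are assumptions for illustration only (a low non-PIE Linux text address, and a text address just above a 4 GiB image base).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `target` is reachable with a signed 32-bit displacement from the
   address of the next instruction (`next_insn`). */
static bool reachable_by_rel32(uint64_t next_insn, uint64_t target)
{
    int64_t delta = (int64_t)(target - next_insn);
    return delta >= INT32_MIN && delta <= INT32_MAX;
}

static void check_examples(void)
{
    /* Assumed Linux non-PIE code near 0x401000: absolute 8 is about 4 MiB away. */
    assert(reachable_by_rel32(0x401006ULL, 8));
    /* Assumed code just above a 4 GiB base: absolute 8 is about 4 GiB away. */
    assert(!reachable_by_rel32(0x100001006ULL, 8));
}
```

Under these assumptions, absolute address 8 is within rel32 range of low-address code but out of range for code loaded above 4 GiB, while nearby addresses in the same image stay reachable.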