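Problem description
I am following this tutorial about assembly. According to the tutorial (which I also tried locally, with similar results), the following source code: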
    int natural_generator()
    {
        int a = 1;
        static int b = -1;
        b += 1;              /* (1, 2) */
        return a + b;
    }
Compiles to these assembly instructions:
    (gdb) break natural_generator
    (gdb) run
(Line number comments (1), (2) and (1, 2) added by me.)
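Question: why is the address of the static variable b, in the compiled code, relative to the instruction pointer (RIP), which constantly changes (see lines (1) and (2)) and thus produces more complicated assembly code, rather than relative to the specific section of the executable where such variables are stored?
According to the mentioned tutorial, there is such a section:
(Emphasis mine.)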
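There are two main reasons why RIP-relative addressing is used to access the static variable b. The first is that it makes the code position independent, meaning that if it's used in a shared library or a position-independent executable, the code can be relocated more easily. The second is that it allows the code to be loaded anywhere in the 64-bit address space without requiring huge 8-byte (64-bit) displacements to be encoded in the instruction, which x86-64 instructions generally can't encode anyway (displacements are limited to 32 bits).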
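The arithmetic behind RIP-relative addressing can be sketched in C as follows. This is only an illustration: the function names are made up for this example, and any addresses fed to them are hypothetical rather than taken from a real binary.

```c
#include <assert.h>
#include <stdint.h>

/* Displacement encoded in the instruction: the target address minus the
   address of the *next* instruction (the value RIP holds while the
   current instruction executes). Names here are illustrative only. */
int32_t rip_displacement(uint64_t next_insn, uint64_t target)
{
    int64_t delta = (int64_t)(target - next_insn);
    /* The displacement field is 32 bits wide, so the delta must fit. */
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    return (int32_t)delta;
}

/* Address the CPU computes at run time: RIP plus the sign-extended
   32-bit displacement. */
uint64_t rip_effective_address(uint64_t next_insn, int32_t disp)
{
    return next_insn + (uint64_t)(int64_t)disp;
}
```

Two instructions at different addresses get different displacements for the same variable, which is why the offsets at (1) and (2) differ. If the whole image is moved by some load offset, both RIP and the variable move by the same amount, so the encoded displacements still resolve correctly without any patching.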
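You mention that the compiler could instead generate code that references the variable relative to the beginning of the section it lives in. While it's true that doing this would have the same advantages as given above, it wouldn't make the assembly any less complicated. In fact, it would make it more complicated. The generated code would first have to calculate the address of the section the variable lives in, since it would only know that location relative to the instruction pointer. It would then have to keep that address in a register, so that accesses to b (and any other variables in the section) could be made relative to it.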
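As a rough C-level model of that alternative, assuming a made-up struct that stands in for the data section and a helper that stands in for "work out where the section is", the function would look something like this. None of these names come from the compiler; they exist only for illustration.

```c
/* Hypothetical stand-in for the data section: each static lives at a
   fixed offset from the section's base address. */
struct data_section {
    int b;   /* plays the role of natural_generator's static b */
};

static struct data_section section = { .b = -1 };

/* Stand-in for the extra step: position-independent code would first
   have to compute where the section was loaded. */
static struct data_section *section_base(void)
{
    return &section;
}

int natural_generator_via_base(void)
{
    struct data_section *base = section_base();  /* ties up a register */
    int a = 1;
    base->b += 1;          /* access b as base + fixed offset */
    return a + base->b;
}
```

The behavior is the same as the original function (successive calls return 1, 2, 3, ...), but every access now goes through a base pointer that had to be obtained first.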
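Since 32-bit x86 code doesn't support RIP-relative addressing, your alternate solution is in fact what the compiler does when generating 32-bit position-independent code. It computes the address of the global offset table (GOT) and then accesses the variable b at a fixed offset from the GOT base (that is what the @GOTOFF suffix requests). Here's the assembly generated for your code when compiled with gcc -m32 -O3 -fPIC -S test.c: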
    natural_generator:
            call    __x86.get_pc_thunk.cx
            addl    $_GLOBAL_OFFSET_TABLE_, %ecx
            movl    b.1392@GOTOFF(%ecx), %eax
            leal    1(%eax), %edx
            addl    $2, %eax
            movl    %edx, b.1392@GOTOFF(%ecx)
            ret
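The call at the top places the address of the following instruction in ECX. The next instruction computes the address of the GOT by adding the GOT's offset relative to that instruction. The register ECX now holds the address of the GOT and is used as the base when accessing the variable b in the rest of the code.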
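The address arithmetic of those steps can be sketched in C. The function names are invented for this illustration, and any addresses passed in are hypothetical; only the two-step structure mirrors the listing.

```c
#include <stdint.h>

/* Steps 1 and 2: the call leaves its return address (the address of the
   following addl) in ECX, and the addl adds the link-time constant
   "GOT minus that address", turning ECX into the GOT base. */
uint32_t got_base(uint32_t return_address, int32_t got_minus_pc)
{
    return return_address + (uint32_t)got_minus_pc;
}

/* Step 3: b.1392@GOTOFF is the link-time constant "&b minus GOT", so
   every access to b is the GOT base plus that fixed offset. */
uint32_t gotoff_address(uint32_t got, int32_t var_minus_got)
{
    return got + (uint32_t)var_minus_got;
}
```

Both constants are differences between two addresses inside the same image, so they stay valid wherever the image is loaded. The cost is the extra call and add, plus a register that has to hold the base.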
Compare that to the 64-bit code generated by gcc -m64 -O3 -S test.c:
    natural_generator:
            movl    b.1745(%rip), %eax
            leal    1(%rax), %edx
            addl    $2, %eax
            movl    %edx, b.1745(%rip)
            ret
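(The code differs from the example in your question because optimization is turned on. In general it's a good idea to look only at optimized output, as without optimization the compiler often generates terrible code that does a lot of useless things. Also note that the -fPIC flag isn't needed here, as the compiler generates position-independent 64-bit code for this function regardless.)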
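Notice that the 64-bit version has two fewer assembly instructions, making it the less complicated one. It also uses one fewer register (ECX). That makes little difference in your code, but in a more complicated example ECX is a register that could have been used for something else, so the 32-bit approach becomes even more complicated as the compiler has to juggle registers more.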