How exactly does the call stack work?

Problem description

I'm trying to get a deeper understanding of how the low-level operations of programming languages work, and especially how they interact with the OS/CPU. I've probably read every answer in every stack/heap related thread here on Stack Overflow, and they are all brilliant. But there is still one thing that I don't fully understand yet.

Consider this function in pseudo code, which happens to be valid Rust code ;-)

fn foo() {
    let a = 1;
    let b = 2;
    let c = 3;
    let d = 4;

    // line X

    doSomething(a, b);
    doAnotherThing(c, d);
}

This is how I assume the stack looks at line X:

Stack

a +-------------+
  | 1           |
b +-------------+
  | 2           |
c +-------------+
  | 3           |
d +-------------+
  | 4           |
  +-------------+

Now, everything I've read about how the stack works is that it strictly obeys LIFO rules (last in, first out). Just like a stack datatype in .NET, Java or any other programming language.

But if that's the case, then what happens after line X? Because obviously, the next thing we need is to work with a and b, but that would mean that the OS/CPU (?) has to pop out d and c first to get back to a and b. But then it would shoot itself in the foot, because it needs c and d in the next line.

So, I wonder what exactly happens behind the scenes?

Another related question. Consider we pass a reference to one of the other functions like this:

fn foo() {
    let a = 1;
    let b = 2;
    let c = 3;
    let d = 4;

    // line X

    doSomething(&a, &b);
    doAnotherThing(c, d);
}

As I understand it, this would mean that the parameters in doSomething essentially point to the same memory addresses as a and b in foo. But then again, this means that no popping of the stack happens until we get to a and b.

Those two cases make me think that I haven't fully grasped how exactly the stack works and how it strictly follows the LIFO rules.

Solution

The call stack could also be called a frame stack.
The things that are stacked according to the LIFO principle are not the local variables but the entire stack frames ("calls") of the functions being called. The local variables are pushed and popped together with those frames, in the so-called function prologue and epilogue, respectively.
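As a rough illustration of frames, not variables, being the LIFO unit, here is a minimal C++ sketch of the question's foo. The FrameTrace helper and the global trace log are hypothetical names introduced only for this example; they log scope entry and exit, which approximates frame lifetime when the calls are not inlined.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace log: one entry each time a function is entered or left.
std::vector<std::string> trace;

// Illustrative RAII helper: logs when a function's scope begins and ends.
struct FrameTrace {
    std::string name;
    explicit FrameTrace(std::string n) : name(std::move(n)) {
        trace.push_back("enter " + name);
    }
    ~FrameTrace() { trace.push_back("leave " + name); }
};

void doSomething(int& a, int& b) {
    FrameTrace t("doSomething");
    a += b;  // writes into foo's locals through the reference; foo is still live
}

void doAnotherThing(int c, int d) {
    FrameTrace t("doAnotherThing");
    assert(c == 3 && d == 4);  // c and d were never popped to reach a and b
}

int foo() {
    FrameTrace t("foo");
    int a = 1, b = 2, c = 3, d = 4;
    doSomething(a, b);     // a new frame goes on top of foo's, then comes off
    doAnotherThing(c, d);  // likewise; foo's locals stay where they are
    return a;
}
```

Calling foo() leaves six entries in trace: foo is entered first and left last, and each callee is entered and left while foo is still active. Nothing is pushed or popped per variable.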

Inside the frame, the order of the variables is completely unspecified; compilers "reorder" the positions of local variables inside a frame to optimize their alignment, so the processor can fetch them as quickly as possible. The crucial fact is that the offset of each variable relative to some fixed address is constant throughout the lifetime of the frame. It therefore suffices to take an anchor address, say, the address of the frame itself, and work with offsets from that address to the variables. Such an anchor address is contained in the so-called base or frame pointer, which is stored in the EBP register. The offsets, on the other hand, are known at compile time and are therefore hardcoded into the machine code.

A graphic on Wikipedia shows how a typical call stack is structured.

Add the offset of the variable we want to access to the address contained in the frame pointer, and we get the address of our variable. In short, the code simply accesses variables directly via constant compile-time offsets from the base pointer; it's simple pointer arithmetic.
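The "base address plus constant offset" idea can be sketched in C++ with a plain struct standing in for a frame's local-variable area. ToyFrame, load_int and read_c_then_a are hypothetical names for this illustration; a real compiler chooses its own layout, and real frame offsets are usually negative relative to the base pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy stand-in for the local-variable area of a stack frame (not a real frame).
struct ToyFrame {
    int a;
    int b;
    int c;
    int d;
};

// Read the int located `offset` bytes from `base`, much as compiled code
// does with an operand such as -8(%rbp).
int load_int(const unsigned char* base, std::size_t offset) {
    assert(offset + sizeof(int) <= sizeof(ToyFrame));  // stay inside the toy frame
    int value;
    std::memcpy(&value, base + offset, sizeof value);
    return value;
}

int read_c_then_a() {
    ToyFrame frame{1, 2, 3, 4};
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&frame);
    // The offsets are compile-time constants, so c can be read before a
    // without popping d, or anything else, first.
    int c = load_int(base, offsetof(ToyFrame, c));
    int a = load_int(base, offsetof(ToyFrame, a));
    return c * 10 + a;
}
```

Here read_c_then_a() reads c and then a in arbitrary order, which is exactly what the question's foo needs: any local is reachable at any time through its fixed offset.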

Example

#include <iostream>

int main()
{
    char c = std::cin.get();
    std::cout << c;
}

gcc.godbolt.org gives us

main:
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $16, %rsp

    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >::get()
    movb    %al, -1(%rbp)
    movsbl  -1(%rbp), %eax
    movl    %eax, %esi
    movl    std::cout, %edi
    call    [... the insertion operator for char, long thing... ]

    movl    $0, %eax
    leave
    ret

... for main. I divided the code into three subsections. The function prologue consists of the first three operations:

  • The base pointer is pushed onto the stack.
  • The stack pointer is saved in the base pointer.
  • The stack pointer is decremented to make room for local variables.

Then cin is moved into the EDI register and get is called; the return value is in EAX.

So far so good. Now the interesting thing happens:

The low-order byte of EAX, designated by the 8-bit register AL, is taken and stored in the byte right below the base pointer: that is -1(%rbp), so the offset from the base pointer is -1. This byte is our variable c. The offset is negative because the stack grows downwards on x86. The next operation loads c back into EAX (movsbl sign-extends the byte to 32 bits). EAX is then moved to ESI, cout is moved to EDI, and the insertion operator is called with cout and c as its arguments.

Finally,

  • The return value of main is stored in EAX: 0. That is because of the implicit return statement. You might also see xorl %eax, %eax instead of movl.
  • The function leaves and returns to the call site. leave abbreviates the epilogue and implicitly
    • replaces the stack pointer with the base pointer, and
    • pops the base pointer.

After this operation and ret have been performed, the frame has effectively been popped, although the caller still has to clean up the arguments, as we're using the cdecl calling convention. Other conventions, e.g. stdcall, require the callee to tidy up, e.g. by passing the number of bytes to ret.
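The prologue and epilogue steps above can be mimicked with a toy model in C++. This is only a sketch under simplifying assumptions: ToyCpu and its members are hypothetical, "addresses" are indices into a small array of 8-byte slots, and the return address is a dummy value.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a downward-growing stack. An illustration, not an emulator.
struct ToyCpu {
    std::vector<std::uint64_t> mem = std::vector<std::uint64_t>(64, 0);
    std::uint64_t rsp = 64;  // one past the highest slot: the stack is empty
    std::uint64_t rbp = 0;

    void push(std::uint64_t v) { assert(rsp > 0); mem[--rsp] = v; }
    std::uint64_t pop() { assert(rsp < mem.size()); return mem[rsp++]; }

    // pushq %rbp; movq %rsp, %rbp; subq $N, %rsp
    void prologue(std::uint64_t local_slots) {
        push(rbp);
        rbp = rsp;
        assert(rsp >= local_slots);
        rsp -= local_slots;
    }

    // leave: movq %rbp, %rsp; popq %rbp
    void leave() {
        rsp = rbp;
        rbp = pop();
    }

    // Locals live at negative offsets from rbp (slot 1 is just below it).
    std::uint64_t& local(std::uint64_t slot_below_rbp) {
        return mem[rbp - slot_below_rbp];
    }
};

// Simulates a caller invoking a callee; returns true if both registers and
// the caller's local are unchanged once the callee's frame is gone.
bool call_restores_registers() {
    ToyCpu cpu;
    cpu.prologue(2);    // caller's frame with two local slots
    cpu.local(1) = 42;  // a local of the caller
    const std::uint64_t rsp_before = cpu.rsp;
    const std::uint64_t rbp_before = cpu.rbp;

    cpu.push(1234);     // stand-in for the return address pushed by `call`
    cpu.prologue(2);    // callee's frame
    cpu.local(1) = 7;   // a callee local, at a different address
    cpu.leave();        // the whole callee frame is discarded at once
    cpu.pop();          // `ret` pops the return address

    return cpu.rsp == rsp_before && cpu.rbp == rbp_before && cpu.local(1) == 42;
}
```

The callee's locals are never popped one by one: leave resets the stack pointer in a single step, and the caller's locals are again addressed through the restored base pointer.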

Frame Pointer Omission

It is also possible to use offsets from the stack pointer (ESP) instead of from the base/frame pointer. This makes the EBP register, which would otherwise contain the frame pointer value, available for arbitrary use, but it can make debugging impossible on some machines and will be implicitly turned off for some functions. It is particularly useful when compiling for processors with only a few registers, including x86.

This optimization is known as FPO (frame pointer omission) and is enabled by -fomit-frame-pointer in GCC and Clang (MSVC spells it /Oy); note that it is implicitly triggered by every optimization level > 0 if and only if debugging is still possible, since it doesn't have any costs apart from that. For further information see here and here.


As pointed out in the comments, the frame pointer is presumably meant to point to the address after the return address.

Note that the registers that start with R are the 64-bit counterparts of the ones that start with E. EAX designates the four low-order bytes of RAX. I used the names of the 32-bit registers for clarity.
