Question

Are there any guidelines in x86-64 for when a function should abide by the System V guidelines and when it doesn't matter? This is in response to an answer here which mentions using other calling conventions for simplifying an internal/local function.

# gcc 32-bit regparm calling convention
is_even:          # input in EAX, bool return value in AL
    not   %eax             # 2 bytes
    and   $1, %al          # 2 bytes
    ret
# custom calling convention:
is_even:   # input in RDI
           # returns in ZF.  ZF=1 means even
    test  $1, %dil         # 4 bytes.  Would be 2 for AL, 3 for DL or CL (or BL)
    ret

See that answer for context.

For example, is the System V convention:

  • Only needed when called by an external higher-level C function.
  • Only needed when that label/function is globl.

Or what's the best guideline as to when to use registers "as I please" and then when to use them according to the System V convention?

Recommended answer

It depends what kind of thing you're writing in asm. If you're writing a small self-contained asm program that's purely written in asm, such as a 16-bit bootloader, definitely go ahead and make up custom calling conventions for everything (if you make any functions at all, instead of just inlining). e.g. have a look at the disp_ax_hex function in @ecm's legacy BIOS bootloader as an interesting example, and see discussion in comments about letting disp_al clobber more registers.

I'd say generally do follow the standard calling convention in most other code (anything that is part of a larger program that includes compiler-generated code); x86-64 System V is quite well designed. Only consider using a custom convention for "private" helper functions, especially ones that are only called from different parts of one other function. Typically these have their callers all in one file, so they are not global.

Functions that can usefully return 2 separate values can definitely benefit from a custom calling convention, for the benefit of asm callers.
e.g. C memcmp doesn't return the position of the first difference, only - / 0 / +. This is really stupid and useless, depriving us of a good way to take advantage of the existing hand-optimized asm to find the mismatch position. In asm we can easily just return both, like a pointer to the position in RDI and a cmp result in FLAGS.

In that case, you could write a memcmp function that was 100% compatible with the x86-64 System V calling convention (so you'd need to zero-extend both bytes and do a dword sub, instead of just doing a byte cmp), with the RDI output as a bonus for asm callers.
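For comparison, here is a minimal C sketch of the "return both" idea. The names memcmp_result and memcmp_pos are hypothetical, and the plain byte loop merely stands in for a hand-optimized implementation. A 16-byte struct holding a pointer and an int is one way to express two return values in C; under x86-64 System V, a struct like this is returned in RAX and RDX, so a standard-compliant C-callable version is possible. The subtraction also shows the zero-extension step mentioned above: each unsigned char is promoted to int before subtracting.

```c
#include <stddef.h>

/* Hypothetical result type: mismatch position plus a memcmp-style result. */
struct memcmp_result {
    const unsigned char *pos;  /* first differing byte in a, or a + n if equal */
    int diff;                  /* negative, zero, or positive, like memcmp */
};

/* Hypothetical helper: compares n bytes and reports where they first differ. */
struct memcmp_result memcmp_pos(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    for (size_t i = 0; i < n; i++) {
        if (pa[i] != pb[i]) {
            /* Both bytes are zero-extended to int before the subtraction. */
            return (struct memcmp_result){ pa + i, pa[i] - pb[i] };
        }
    }
    return (struct memcmp_result){ pa + n, 0 };
}
```

A C caller that only wants the usual memcmp result can read .diff and ignore .pos.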

The part of my answer you linked was kind of a random thought I decided to mention. It's not something you normally do (although neither is writing asm by hand in the first place), and you'd never want to actually put test in a function by itself except as a solution to a code-golf exercise. That was the real idea behind it: most of the "cost" of that function is just because you made it a function, and in real life you'd always inline something that simple.

Usually you don't write tiny functions in the first place. You just implement the logic in a couple instructions in the middle of a larger function, just like a compiler would inline a small helper function. Then it's not costly to follow the platform ABI (x86-64 System V in this case) for all your functions.

Optimizing the logic to return a 0 / 1 int (not just an 8-bit bool), and sticking to the standard calling convention, could be a fun exercise but is often not useful unless it turns out your actual use-case wants to do something like even_count += is_even(x);. But in that case, you should do odds += x&1; and calculate the even count once at the end when you need it, as evens = total - odds. Besides removing the call/return overhead, inlining also allows thinking about optimizing the logic of a tiny function as part of the actual use-case.
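As a rough illustration of that last point, here is a small C sketch that counts odd values inline and derives the even count once at the end. The function name count_even and its array-plus-length interface are assumptions made for the example, not part of the original answer.

```c
#include <stddef.h>

/* Hypothetical helper: number of even values among v[0..n-1]. */
size_t count_even(const unsigned *v, size_t n)
{
    size_t odds = 0;

    for (size_t i = 0; i < n; i++)
        odds += v[i] & 1;   /* low bit is 1 for odd values: no call, no branch */

    return n - odds;        /* evens = total - odds, computed once */
}
```

The loop body is a single AND and an ADD per element, which is what the inlined is_even logic reduces to once it is folded into its caller.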

There is a use-case for private helper functions:

Sometimes you want to repeat a block of several instructions as a private "helper" function for a larger function, used like mov eax, 1 / call func / do something else / mov eax, 123 / call func. Then you can think of the "function" more like a loop body or something inside a larger function, and the caller more like custom iteration.

Sometimes it makes sense to repeat a block of code using a macro, but if the sequence is somewhat long that will bloat your code. (Macros expand every time you use them; unlike a 5-byte call rel32.)

Just to be clear, is_even is so simple that it would never make sense to put it in its own function. Calling a function instead of just running test $1, %reg / jz or jnz for some register would be completely insane and obfuscated, as well as larger and slower. Or use and $1, %eax to get a 0/1 integer from the reg being odd, which you could use with add to count odd numbers (then compute total - odds at the end to count the even ones). Most programmers would not wrap it in a macro either; understanding binary is standard for assembly language, and a simple comment on the test or jcc instruction to describe the semantic meaning (# if odd) is all that would be needed.

In theory, for a purely hand-written program, you can just use whatever calling convention is most convenient on a case-by-case basis for every function, documenting it with comments. But normally the benefit is small vs. following a standard calling convention, and keeping track of which function clobbers which registers and wants its args where would quickly become a maintenance nightmare for general-purpose functions with multiple different callers that have little in common other than the function being called.

Of course, for the same reason, we write applications in high-level languages and only rarely actually write any asm by hand. The fact that you're proposing to write functions by hand in asm means it's worth considering whether "thinking like a compiler" is too constraining. That's the point of my codegolf answer: if it's worth squeezing every last byte or cycle out of a function, the whole program (or at least its caller) is probably written similarly.

The only good reason for writing whole programs in asm these days is to optimize the crap out of their machine-code size, e.g. the demo scene. https://en.wikipedia.org/wiki/Demoscene. (Or if the "program" is really a bootloader that runs without / before an OS.)

At that point, don't let ABIs and calling conventions constrain your optimization. And your program will generally be small enough that it's possible to keep track of the different functions and their calling conventions, especially if they make some logical sense (or mostly match the registers where their callers happen to be keeping the right variables anyway).
