Doing something in assembly vs. having the assembler do it

Which of the following two methods is preferred to get 2^n - 1? Why is one preferred over the other?

    # (2a) -- instructions
    mov $1, %eax
    shl $X, %eax
    dec %eax

    # (2b) -- assembler
    mov $((1 << X) - 1), %eax

I find the first more readable personally, but I'm pretty sure readability isn't the point of asm.

Solution

Always do as much as possible at assemble time (once per build), not at run time, where it costs code size and costs time every time this block executes.

They both start with the same mov $imm32, %eax form of mov, but then the first version wastes 2 extra instructions, so it's total garbage with zero advantages, and looks super ugly and insane to anyone used to thinking about efficiency.

There is no compiler to optimize your code into something the CPU can run efficiently; it's up to you to make that happen. If you don't care about performance, there's basically no reason to be messing around with asm in the first place, either writing it by hand or thinking about compiler output (see Footnote 1).

The fact that you even have to ask this is a sign you're either missing the point of assembly language (usually performance), or you're mistakenly thinking of asm the same way as you would a compiled language like C++. You need to adjust your mental model to think about the machine code you're creating, which the CPU will execute, and how to make that as efficient as possible.

You need to think like a compiler: "how can I do this in as few uops as possible for the front-end?" (https://agner.org/optimize), with minimum code-size in bytes as a tie-breaker.
Or, depending on your goals, maybe optimizing for code-size over speed. But anyway, compilers aggressively evaluate expressions and do constant-propagation as much as possible, to turn constants in the source code into compile-time work instead of run-time work.

Footnote 1: In that case, write in a language that has a nice optimizing compiler, e.g. C or Rust, and let it create machine code for you. (Although to be fair, a few things are easier in asm than C if you know both equally well, such as extended-precision math. Very few high-level languages make it easy to use the carry output from other operations.)

Readability:

You are 100% correct that readability is usually not the top priority in asm; it always takes a back seat to code-size and/or performance in any case where it's worth writing asm by hand in the first place. But within those constraints, we can certainly aim for as much readability as possible.

Your runtime computation way is extremely surprising to experienced asm users reading your code, and not idiomatic at all. If I came across that in otherwise-sane code, it would take me some time to double-check and make sure I was understanding it properly (e.g. maybe there's some non-constant input to this after all, or maybe this sets FLAGS a certain way that's also needed later).

The only reason to do work at run time is when it couldn't have been done at assemble time (because it's not constant), so it would be very surprising to see a shift whose input came from an immediate constant. If I saw that sequence of 3 instructions in production code (not beginner questions on Stack Overflow), I'd be shocked at the incompetence of whoever wrote it, after figuring out it was just creating a 32-bit constant.

Apart from that, the runtime version is 2 more instructions to read, if this appears as part of a larger block of code.
Code density (in terms of amount done per source line) is already low in asm, so minimizing instruction count is generally good for the overall readability of a function.

(As well as usually being good for efficiency, except for cases like replacing a slow instruction like div with a multiplicative inverse + shift. But that's bad enough for readability that it's not too weird for hand-written asm to mov an immediate to a register and then div by it, if performance wasn't the top priority of that one function or block of code, e.g. because it doesn't run often. Unless the divisor is a power of 2; then it's just a really stupid, less convenient alternative to a right shift.)

(1<<n) - 1 is a pretty common idiom that most experienced asm programmers are familiar with. See also https://catonmat.net/low-level-bit-hacks (Many people will also be familiar with binary tricks like this from low-level experience in other languages; it's definitely not unique to asm.)

So for this case specifically, I'd really say just get used to seeing stuff like and $(1<<X) - 1, %eax. Or and $-16, %eax as a convenient way to write an AND mask that zeros the low 4 bits, rounding EAX down to a multiple of 16. (Taking advantage of 2's complement.)

Macros

However, you can avoid repeating that expression everywhere you use it by defining an assemble-time constant like XMASK = (1<<X) - 1 that you can use instead.

Or you can do something like

    #define SHIFT2MASK(x_) ((1<<x_)-1)
    ...
    X=3
    mov $SHIFT2MASK(X), %eax
    and $SHIFT2MASK(4), %ecx

and compile with gcc -c foo.S to run your asm source through the C preprocessor.

(GAS native macros work like instructions, not for single operands to other instructions, so a macro language like the C preprocessor is more convenient for this.)

The hard part with this approach is choosing a clear macro name that unambiguously conveys the fact that it turns a shift count into a mask with set bits up to that position. Not 0xfffffff0 or something, and not just 1<<4 either.
For testing a bitmap, you would be doing stuff like test $1<<3, %al, and "mask" could just as easily describe a value with 1 bit set at the appropriate position.

To be clear, SHIFT2MASK is not a fully unambiguous name, except hopefully from the context of how it's used. Ideally a name is self-explanatory enough that the comments can be higher-level, describing the algorithm, not the nuts and bolts that the reader can already see in the code itself.
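One way to sanity-check what these constant expressions evaluate to is to compile them as C, since the SHIFT2MASK macro from the answer is plain C preprocessor text. The following is only a sketch under stated assumptions: X is fixed at 3 as in the answer's example, and the helper names low_mask and round_down_16 are hypothetical, introduced here for illustration. The static_assert lines are checked once at build time, which mirrors the assemble-time evaluation the answer recommends.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Same macro text as in the answer; it expands the same way in a .S file. */
#define SHIFT2MASK(x_) ((1<<x_)-1)

/* Assumed shift count for illustration (the answer uses X=3). */
#define X 3

/* Evaluated at build time: no instructions are emitted for these checks. */
static_assert(SHIFT2MASK(X) == 0x7, "low 3 bits set");
static_assert(SHIFT2MASK(4) == 0xF, "low 4 bits set, not 0xfffffff0");
static_assert((1 << 3) == 0x8, "single bit 3, as in test $1<<3, %al");
static_assert((uint32_t)-16 == 0xFFFFFFF0u, "-16 is a mask clearing the low 4 bits");

/* Hypothetical helper: returns the constant mask, like mov $SHIFT2MASK(X), %eax. */
uint32_t low_mask(void) {
    return SHIFT2MASK(X);
}

/* Hypothetical helper: rounds v down to a multiple of 16, like and $-16, %eax. */
uint32_t round_down_16(uint32_t v) {
    return v & (uint32_t)-16;
}
```

With optimization enabled, a compiler would be expected to turn low_mask into a single move of an immediate, which is the same result as writing mov $((1 << X) - 1), %eax by hand.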