Problem description
Consider a simple C program that writes to a register:
int main()
{
    int *a = (int*)111111;
    *a = 0x1000;
    return 0;
}
When I compile this code for ARM (arm-none-eabi-gcc) with level 1 optimization, the assembly looks something like this:
mov r2, #4096
mov r3, #110592
str r2, [r3, #519]
mov r0, #0
bx lr
It looks like the address 111111 was rounded down to the nearest 4K boundary (110592) and moved into r3, and the value 4096 (0x1000) was then stored at offset 519 from that base (110592 + 519 = 111111). Why does this happen?
On x86, the assembly is straightforward:
movl $4096, 111111
movl $0, %eax
ret
Recommended answer
The reason for this difference is that x86 has variable-sized instructions, from 1 byte up to 15 bytes (prefixes included).
ARM instructions are 32 bits wide (not counting the Thumb modes), which means it is simply not possible to encode every 32-bit constant (immediate) in a single instruction: the opcode and register fields need some of those bits too.
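The output in the question can be checked against this limit. The following is a minimal sketch, assuming the classic ARM (A32) data-processing immediate form (an 8-bit value rotated right by an even amount) and the 12-bit offset field of `str`. The helper names `rotl32`, `arm_encodable_imm` and `split_address` are made up for illustration and are not part of any toolchain.

```c
#include <stdint.h>

/* Rotate left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Returns 1 if v can be written as an 8-bit value rotated right by an
   even amount (assumed A32 data-processing immediate form), else 0. */
static int arm_encodable_imm(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot; the result must fit in 8 bits. */
        if (rotl32(v, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}

/* Split an address into a 4K-aligned base and a 12-bit offset,
   mirroring the mov/str pair in the question. */
static void split_address(uint32_t addr, uint32_t *base, uint32_t *offset)
{
    *base = addr & ~0xFFFu;
    *offset = addr & 0xFFFu;
}
```

Under these assumptions, 4096 and 110592 (0x1B << 12) both pass the check, while 111111 (0x1B207) does not, which is consistent with the compiler loading the 4K-aligned base with a single `mov` and folding the remaining 519 into the store's offset.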
Fixed-size architectures typically use a few methods to load large constants:
1) movi #r1, Imm8 ; // Here Imm8 or ImmX is simply the X least significant bits
2) movhi #r1, Imm16 ; // Here Imm16 loads the 16 MSBs of the register
3) load #r1, (PC + ImmX) ; // use a PC-relative address to fetch a constant placed in the code
4) movn #r1, Imm8 ; // load the inverse of Imm8 (for signed constants)
5) mov(i/n) #r1, Imm8 << N ; // where N=0,8,16,24
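These methods can be modeled in plain C to see how the pieces recombine. This is a rough sketch that is not tied to any real instruction set: `load_lo_hi` mimics methods 1 and 2 with X = 16 and assumes the "high" instruction leaves the low half untouched, and `load_shifted_bytes` mimics method 5 by OR-ing together shifted 8-bit immediates. Both function names are hypothetical.

```c
#include <stdint.h>

/* Methods 1 and 2: one instruction sets the low 16 bits,
   a second one fills in the high 16 bits (assumed to keep the low half). */
static uint32_t load_lo_hi(uint16_t lo, uint16_t hi)
{
    uint32_t r = lo;                /* movi  r, lo */
    r |= (uint32_t)hi << 16;        /* movhi r, hi */
    return r;
}

/* Method 5: build the value from 8-bit immediates shifted by 0, 8, 16, 24. */
static uint32_t load_shifted_bytes(uint32_t value)
{
    uint32_t r = 0;
    for (unsigned shift = 0; shift < 32; shift += 8) {
        uint32_t imm8 = (value >> shift) & 0xFFu;
        if (imm8 != 0)
            r |= imm8 << shift;     /* one instruction per non-zero byte */
    }
    return r;
}
```

For the address in the question, 111111 = 0x0001B207, the halves are `lo = 0xB207` and `hi = 0x0001`, and the byte-wise variant needs three steps because one of the four bytes is zero.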
Variable-size architectures, on the other hand, can put all the constants in a single instruction:
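xx xx xx 00 10 00 00 11 11 11 00 ; // assuming that it takes 3 bytes to encode
                                 ; // the instruction and the addressing mode,
                                 ; // followed by 4 bytes to encode 4096 (0x00001000)
                                 ; // and 4 bytes to encode the address, written here
                                 ; // as 0x00111111 (the question's decimal 111111
                                 ; // would be 0x0001B207); both are little-endian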
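For a concrete byte sequence, here is a small sketch that assembles `movl $imm32, disp32` by hand. It assumes 32-bit mode, where this form is opcode `C7`, ModRM byte `05` (absolute 32-bit displacement), then the address, then the immediate, both little-endian. Under that assumption the instruction and addressing mode take 2 bytes rather than the 3 assumed above, and the address comes before the immediate. `put_le32` and `encode_mov_imm32_to_abs` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *out, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* Encode "movl $imm, addr" for 32-bit x86; writes 10 bytes to out
   and returns the instruction length. */
static size_t encode_mov_imm32_to_abs(uint8_t out[10], uint32_t addr, uint32_t imm)
{
    out[0] = 0xC7;              /* MOV r/m32, imm32 */
    out[1] = 0x05;              /* ModRM: mod=00, reg=/0, r/m=101 -> disp32 */
    put_le32(out + 2, addr);    /* 4-byte absolute address */
    put_le32(out + 6, imm);     /* 4-byte immediate */
    return 10;
}
```

With the question's values (address 111111 = 0x0001B207, immediate 4096 = 0x00001000) this produces `C7 05 07 B2 01 00 00 10 00 00`, a single 10-byte instruction carrying both full 32-bit constants.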