


I have to implement a 64-bit stack. To get comfortable with malloc, I managed to write two 32-bit integers into memory and read them back:


But when I try to do this with 64 bits:



The first snippet of code works perfectly fine. As Jester suggested, you are writing a 64-bit value in two separate (32-bit) halves. This is the way you have to do it on a 32-bit architecture. You don't have 64-bit registers available, and you can't write 64-bit chunks of memory at once. But you already seemed to know that, so I won't belabor it.


In the second snippet of code, you tried to target a 64-bit architecture (x86-64). Now, you no longer have to write 64-bit values in two 32-bit halves, since 64-bit architectures natively support 64-bit integers. You have 64-bit wide registers available, and you can write a 64-bit chunk to memory directly. Take advantage of that to simplify (and speed up) the code.


The 64-bit registers are Rxx instead of Exx. When you use QWORD PTR, you will want to use Rxx; when you use DWORD PTR, you will want to use Exx. Both are legal in 64-bit code, but only 32-bit DWORDs are legal in 32-bit code.
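As a minimal sketch of that pairing (MASM syntax, 64-bit code), assume that RAX already holds a valid pointer to at least 8 writable bytes and that RCX holds the value to store. Neither assumption comes from your code; they are only there to make the example concrete.

```asm
; Assumption: RAX = pointer to at least 8 writable bytes, RCX = value to store.
mov  qword ptr [rax], rcx      ; 64-bit store: QWORD PTR pairs with an Rxx register
mov  dword ptr [rax], ecx      ; 32-bit store: DWORD PTR pairs with an Exx register
```

In both lines the address still lives in the full 64-bit RAX; only the size of the data being stored changes. With a register source the assembler can infer the size, so the override is optional there. It becomes necessary when storing an immediate value such as 1, because a bare constant has no size of its own.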



  1. Although it is perfectly valid to clear a register using MOV xxx, 0, it is smaller and faster to use XOR eax, eax, so this is generally what you should write. It is a very old trick, something that any assembly-language programmer should know, and if you ever try to read other people's assembly programs, you'll need to be familiar with this idiom. (But actually, in the code you're writing, you don't need to do this at all. For the reason why, see point #2.)


  2. In 64-bit mode, any instruction that writes a 32-bit register implicitly zeros the upper 32 bits of the corresponding 64-bit register, so you can simply write XOR eax, eax instead of XOR rax, rax. This is, again, smaller and faster.


  3. The calling convention for 64-bit programs is different from the one used in 32-bit programs. The exact specification of the calling convention varies depending on which operating system you're using. As Peter Cordes commented, there is information on this in the x86 tag wiki. Both the Windows and Linux x64 calling conventions pass at least the first 4 integer parameters in registers (rather than on the stack, like the x86-32 calling conventions), but which registers are actually used differs. The 64-bit calling conventions also have different requirements from the 32-bit ones for how you must set up the stack before calling functions.
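To make the register-clearing idiom from the first two points concrete, here is a small side-by-side sketch. The byte counts in the comments are the lengths of the usual encodings; treat the snippet as an illustration, not as something you need in your own code.

```asm
; Three ways to end up with RAX = 0 in 64-bit code.
mov  eax, 0        ; 5 bytes: opcode plus a 32-bit immediate
xor  eax, eax      ; 2 bytes: the usual idiom; the 32-bit write also zeros bits 32-63 of RAX
xor  rax, rax      ; 3 bytes: same result, but the REX.W prefix costs an extra byte
```

One difference worth keeping in mind: XOR modifies the flags, while MOV leaves them alone, so MOV is still the right choice if you need to preserve flags across the clearing.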


(Since your screenshot says something about "MASM", I'll assume that you're using Windows in the sample code below.)

; Set up the stack, as required by the Windows x64 calling convention.
; (Note that we use the 64-bit form of the instruction, with the RSP register,
; to support stack pointers larger than 32 bits.)
sub  rsp, 40

; Dynamically allocate 8 bytes of memory by calling malloc().
; (Note that the x64 calling convention passes the parameter in a register, rather
; than via the stack. On Windows, the first parameter is passed in RCX.)
; (Also note that we use the 32-bit form of the instruction here, storing the
; value into ECX, which is safe because it implicitly zeros the upper 32 bits.)
mov  ecx, 8
call malloc

; Write a single 64-bit value into memory.
; (The pointer to the memory block allocated by malloc() is returned in RAX.)
mov  qword ptr [rax], 1

; ... do whatever

; Clean up the stack space that we allocated at the top of the function.
add  rsp, 40
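For context, here is a sketch of how that fragment could sit inside a complete procedure. It assumes ml64 (64-bit MASM) and linking against the C runtime, so that malloc and free resolve and the runtime's startup code calls main. The NULL check and the call to free are additions for illustration and are not part of the fragment above. The comment on the first instruction spells out where the 40 comes from.

```asm
; Hypothetical complete source file; assumes ml64 and the C runtime library.
extern malloc:proc
extern free:proc

.code
main proc
    ; On entry RSP is 8 bytes off 16-byte alignment (the return address was pushed).
    ; 40 = 32 bytes of shadow space for callees + 8 bytes to restore alignment.
    sub  rsp, 40

    mov  ecx, 8                ; first parameter (size) goes in RCX
    call malloc                ; pointer comes back in RAX
    test rax, rax              ; did the allocation fail?
    jz   cleanup               ; yes: skip the write

    mov  qword ptr [rax], 1    ; write one 64-bit value

    mov  rcx, rax              ; first parameter (pointer) goes in RCX
    call free                  ; release the block

cleanup:
    xor  eax, eax              ; return 0 from main
    add  rsp, 40
    ret
main endp
end
```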


If you wanted to do this in 32-bit halves, even on a 64-bit architecture, you certainly could. That would look like the following:

sub  rsp, 40                   ; set up stack

mov  ecx, 8                    ; request 8 bytes
call malloc                    ; allocate memory

mov  dword ptr [rax],   1      ; write "1" into low 32 bits
mov  dword ptr [rax+4], 2      ; write "2" into high 32 bits

; ... do whatever

add  rsp, 40                   ; clean up stack


Note that these last two MOV instructions are the same as what you wrote in the 32-bit version of the code, except that the pointer is now addressed through the full 64-bit RAX register instead of EAX (malloc returns a 64-bit pointer, and using EAX would truncate it). That makes sense, because you're doing exactly the same thing.
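Because x86 is little-endian, those two 32-bit stores build exactly the same 8 bytes that a single 64-bit store would. A short sketch, assuming RAX still holds the pointer returned by malloc and that RDX is free to use as a scratch register:

```asm
mov  dword ptr [rax],   1      ; bytes 0-3: low half
mov  dword ptr [rax+4], 2      ; bytes 4-7: high half
mov  rdx, qword ptr [rax]      ; read all 8 bytes back: RDX = 0000000200000001h
```

So the "1" lands in the low 32 bits and the "2" in the high 32 bits of the combined value, which is 8589934593 in decimal.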


The reason the code you originally wrote didn't work is that EAX doesn't contain a QWORD PTR; it contains a DWORD PTR. Because of that mismatch, the assembler generated the "invalid instruction operands" error. This is also why you don't offset by 8: a DWORD PTR is only 4 bytes. A QWORD PTR is indeed 8 bytes, but you don't have one of those in EAX.


Or, if you wanted to write 16 bytes:

sub  rsp, 40                   ; set up stack

mov  ecx, 16                   ; request 16 bytes
call malloc                    ; allocate memory

mov  qword ptr [rax],   1      ; write "1" into low 64 bits
mov  qword ptr [rax+8], 2      ; write "2" into high 64 bits

; ... do whatever

add  rsp, 40                   ; clean up stack


Compare these three snippets of code, and make sure you understand the differences and why they need to be written as they are!


