Problem description
Intel's 32-bit processors, such as the Pentium, have a 64-bit wide data bus and therefore fetch 8 bytes per access. Based on this, I'm assuming that the physical addresses these processors emit on the address bus are always multiples of 8.
Firstly, is this conclusion correct?
Secondly, if it is correct, then one should align data structure members on an 8-byte boundary. But I've seen people using a 4-byte alignment instead on these processors.
How can they be justified in doing so?
Recommended answer
The usual rule of thumb (straight from Intel's and AMD's optimization manuals) is that every data type should be aligned by its own size. An int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere.
Another rule of thumb is, of course, "the compiler has been told about alignment requirements". You don't need to worry about it because the compiler knows to add the right padding and offsets to allow efficient access to data.
The only exception is when working with SIMD instructions, where you have to manually ensure alignment on most compilers.
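"Secondly, if it is correct, then one should align data structure members on an 8-byte boundary. But I've seen people using a 4-byte alignment instead on these processors."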
I don't see how that makes a difference. The CPU can simply issue a read for the 64-bit block that contains those 4 bytes. That means it either gets 4 extra bytes before the requested data, or after it. But in both cases, it only takes a single read. 32-bit alignment of 32-bit-wide data ensures that it won't cross a 64-bit boundary.