Problem description
Consider the following example on a 32 bit x86 machine:
Due to alignment constraints, the following struct
struct s1 {
    char a;
    int b;
    char c;
    char d;
    char e;
};
could be represented more memory-efficiently (12 vs. 8 bytes) if the members were reordered as in
struct s2 {
    int b;
    char a;
    char c;
    char d;
    char e;
};
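I know that C/C++ compilers are not allowed to do this. My question is why the language was designed this way. After all, we may end up wasting vast amounts of memory, and references such as struct_ref->b would not care about the difference.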
EDIT: Thank you all for your extremely useful answers. You explain very well why rearranging doesn't work because of the way the language was designed. However, it makes me think: would these arguments still hold if rearrangement were part of the language? Let's say that there was some specified rearrangement rule, from which we required at least that
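1. We only reorganize the struct layout if it is really necessary (if the struct is already "tight", nothing is done).
2. The rule only looks at the definition of the struct, not inside inner structs. This ensures that a struct type has the same layout whether or not it is nested in another struct.
3. The compiled memory layout of a given struct is predictable from its definition (that is, the rule is fixed).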
Addressing your arguments one by one, I reason:
Low-level data mapping, "element of least surprise": Just write your structs in a tight style yourself (like in @Perry's answer) and nothing has changed (requirement 1). If, for some weird reason, you want internal padding to be there, you could insert it manually using dummy variables, and/or there could be keywords/directives.
Compiler differences: Requirement 3 eliminates this concern. Actually, from @David Heffernan's comments, it seems that we have this problem today because different compilers pad differently?
Optimization: The whole point of reordering is (memory) optimization. I see lots of potential here. We may not be able to remove padding altogether, but I don't see how reordering could limit optimization in any way.
Type casting: It seems to me that this is the biggest problem. Still, there should be ways around this. Since the rules are fixed in the language, the compiler is able to figure out how the members were reordered, and react accordingly. As mentioned above, it will always be possible to prevent reordering in the cases where you want complete control. Also, requirement 2 ensures that type-safe code will never break.
The reason I think such a rule could make sense is because I find it more natural to group struct members by their contents than by their types. Also it is easier for the compiler to choose the best ordering than it is for me when I have a lot of inner structs. The optimal layout may even be one that I cannot express in a type-safe way. On the other hand, it would appear to make the language more complicated, which is of course a drawback.
Note that I am not talking about changing the language - only if it could(/should) have been designed differently.
I know my question is hypothetical, but I think the discussion provides deeper insight into the lower levels of the machine and language design.
I'm quite new here, so I don't know if I should spawn a new question for this. Please tell me if this is the case.
Recommended answer
There are multiple reasons why the C compiler cannot automatically reorder the fields:
The C compiler doesn't know whether the struct represents the memory structure of objects beyond the current compilation unit (for example: a foreign library, a file on disk, network data, CPU page tables, ...). In such a case the binary structure of the data is also defined in a place inaccessible to the compiler, so reordering the struct fields would create a data type that is inconsistent with the other definitions. For example, the header of a file in a ZIP file contains multiple misaligned 32-bit fields. Reordering the fields would make it impossible for C code to directly read or write the header (assuming the ZIP implementation would like to access the data directly):
#include <stdint.h>

/* __attribute__((__packed__)) is a GCC/Clang extension, not standard C. */
struct __attribute__((__packed__)) LocalFileHeader {
    uint32_t signature;
    uint16_t minVersion, flag, method, modTime, modDate;
    uint32_t crc32, compressedSize, uncompressedSize;
    uint16_t nameLength, extraLength;
};
The packed attribute prevents the compiler from aligning the fields according to their natural alignment, and it has no relation to the problem of field ordering. It would be possible to reorder the fields of LocalFileHeader so that the structure has both minimal size and all fields aligned to their natural alignment. However, the compiler cannot choose to reorder the fields because it does not know that the struct is actually defined by the ZIP file specification.
C is an unsafe language. The C compiler doesn't know whether the data will be accessed via a different type than the one seen by the compiler, for example:
struct S {
    char a;
    int b;
    char c;
};

struct S_head {
    char a;
};

struct S_ext {
    char a;
    int b;
    char c;
    int d;
    char e;
};

struct S s;
struct S_head *head = (struct S_head*)&s;
fn1(head);

struct S_ext ext;
struct S *sp = (struct S*)&ext;
fn2(sp);
This is a widely used low-level programming pattern, especially if the header contains the type ID of data located just beyond the header.
If a struct type is embedded in another struct type, it is impossible to inline the inner struct:
struct S {
    char a;
    int b;
    char c, d, e;
};

struct T {
    char a;
    struct S s; // Cannot inline S into T, 's' has to be compact in memory
    char b;
};
This also means that moving some fields from S to a separate struct disables some optimizations:
// Cannot fully optimize S
struct BC { int b; char c; };

struct S {
    char a;
    struct BC bc;
    char d, e;
};
Because most C compilers are optimizing compilers, reordering struct fields would require new optimizations to be implemented. It is questionable whether those optimizations would be able to do better than what programmers are able to write. Designing data structures by hand is much less time consuming than other compiler tasks such as register allocation, function inlining, constant folding, transformation of a switch statement into binary search, etc. Thus the benefits to be gained by allowing the compiler to optimize data structures appear to be less tangible than traditional compiler optimizations.