



Let us suppose I would like to read/write a tar file header. Considering standard C (C89, C99, or C11), do char arrays have any special treatment in structs regarding padding? Can the compiler add padding to such a struct:

struct header {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char chksum[8];
    char typeflag;
    char linkname[100];
    char tail[255];
};

I've seen this used in code on the web as well: just fread-ing or fwrite-ing this struct to the file in one chunk, assuming there will not be any padding, and of course also assuming CHAR_BIT == 8. I'm thinking such C code is so common that the standard would deal with this case, but I just can't find it there; maybe I would not be a good lawyer.


The accepted answer would give a strict, or the strictest possible, portable implementation according to one of the C standards that lets me treat these fields with standard library string functions, considering CHAR_BIT and all. I'm thinking one needs to read an array of 512 uint8_t for this, and after that maybe convert them to chars one by one. Is there an easier way?


C11 (the latest freely available draft) says only "There may be unnamed padding within a structure object, but not at its beginning" (§ ¶15) and "There may be unnamed padding at the end of a structure or union" (§ ¶17). It gives no further restriction on padding within a structure.

The platform ABI may have more stringent requirements on padding, but relying on them is platform-specific, as other platforms may have other padding requirements. The x86-64 ABI for Unix/Linux gives char 1-byte alignment, and specifies:

This seems to imply that on this platform there will be no padding within the struct. However, there are cases in which array variables have stricter alignment restrictions so that they can be used with vector instructions; other platforms may impose such restrictions on array structure members as well.
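Since the standard leaves padding up to the implementation, one option is to have the compiler verify the layout you are relying on. The following is a minimal sketch, assuming a C11 compiler (for static_assert) and the struct from the question; the expected offsets are simply the running sums of the field sizes.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

struct header {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char chksum[8];
    char typeflag;
    char linkname[100];
    char tail[255];
};

/* Compilation fails if the compiler inserted padding before any of
   these members or at the end of the struct. */
static_assert(offsetof(struct header, mode) == 100, "padding before mode");
static_assert(offsetof(struct header, typeflag) == 156, "padding before typeflag");
static_assert(offsetof(struct header, linkname) == 157, "padding before linkname");
static_assert(offsetof(struct header, tail) == 257, "padding before tail");
static_assert(sizeof(struct header) == 512, "struct header is not 512 bytes");
```

This does not make a single fread of the whole struct portable; it only turns a silent layout mismatch into a build error on platforms where the assumption does not hold.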

If you would like to be portable while reading the structure in a single call, you might want to look at readv. This is a vectored, or scatter/gather, I/O operation, which lets you specify an array of buffers and their lengths to read into. For instance, for this case you might write:

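#include <sys/uio.h>   /* POSIX: struct iovec, readv */

struct header h;
struct iovec iov[10];   /* one entry per field of struct header */
iov[0].iov_base = h.name;
iov[0].iov_len = sizeof(h.name);
iov[1].iov_base = h.mode;
iov[1].iov_len = sizeof(h.mode);
/* ... etc ... */
ssize_t bytes_read = readv(fd, iov, 10);   /* fd is an open file descriptor */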

Note that readv is defined in POSIX/Single Unix Specification, not in the C standard. In standard C, the easiest thing to do is just read each of these elements individually (and even with vectored I/O available, reading and writing each element individually will probably be clearer unless you absolutely need to use a single call for the whole I/O operation).
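Reading each element individually in standard C might look like the sketch below. The helper names read_field and read_header are hypothetical, chosen only for illustration, and the stream is assumed to have been opened in binary mode (for example with "rb").

```c
#include <stdio.h>

struct header {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char chksum[8];
    char typeflag;
    char linkname[100];
    char tail[255];
};

/* Read exactly n bytes into field; returns 1 on success, 0 on a short read. */
static int read_field(FILE *fp, char *field, size_t n)
{
    return fread(field, 1, n, fp) == n;
}

/* Hypothetical helper: fills *h one member at a time, so any padding the
   compiler put between members never touches the file data.
   Returns 0 on success, -1 on a short read or I/O error. */
int read_header(FILE *fp, struct header *h)
{
    int ok = read_field(fp, h->name, sizeof h->name)
          && read_field(fp, h->mode, sizeof h->mode)
          && read_field(fp, h->uid, sizeof h->uid)
          && read_field(fp, h->gid, sizeof h->gid)
          && read_field(fp, h->size, sizeof h->size)
          && read_field(fp, h->mtime, sizeof h->mtime)
          && read_field(fp, h->chksum, sizeof h->chksum)
          && read_field(fp, &h->typeflag, sizeof h->typeflag)
          && read_field(fp, h->linkname, sizeof h->linkname)
          && read_field(fp, h->tail, sizeof h->tail);
    return ok ? 0 : -1;
}
```

Because stdio streams are buffered, the ten fread calls normally do not translate into ten system calls, so the cost compared with a single read is usually small.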

In your edit, you write:

The C specification does not guarantee that uint8_t is available: "The typedef name uintN_t designates an unsigned integer type with width N and no padding bits.... These types are optional." (C11 draft, §, ¶2–3). However, if 8 bit values are available, then char is guaranteed to be an 8 bit value, as it is guaranteed to be at least 8 bits and is guaranteed to be the smallest object that is not a bit-field (§ ¶1):

So, if you don't have 8-bit bytes available, you won't be able to read these fields in directly and access octets from them as individual array elements; you would have to manually split out individual bytes using bit shifting and masking. However, there are no modern architectures that I know of which lack 8-bit bytes (for general purpose computing, where file I/O is at all a concern; some DSPs might, but they probably won't have standard C file I/O).

If you do have 8-bit bytes, then char is guaranteed to be 8 bits, so there's not much benefit other than clarity in using uint8_t vs char. If you're really concerned, I would just ensure that you have a check somewhere in your build process that CHAR_BIT is 8 and call it good.
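One way to put that check into the build is a preprocessor test, which works in C89 as well as later standards. This is a minimal sketch; the error message text is only an example.

```c
#include <limits.h>   /* CHAR_BIT */

/* Refuse to compile on platforms where a byte is not exactly 8 bits. */
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes (CHAR_BIT == 8)"
#endif
```

Placing this in a header that every translation unit touching the tar code includes keeps the assumption documented in one place.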

