Problem description
Let's assume we opened a file using fopen() and, from the file pointer received, fetched the file descriptor using fileno(). Then we do lots (>10^8) of random read()s of relatively small chunks, between 4 bytes and 10 KBytes in size, from this file.
Could read() return fewer bytes than requested if the filesystem is one of the following?
1. ext3
2. NFS
3. OCFS2
4. a combination of 2 and 3 (OCFS2 via NFS)
My reading led me to conclude that this should not be possible for 1. (as long as the file does not have O_NONBLOCK set, if it is even possible to have it set on ext3), but for the other three (2., 3., 4.) I'm uncertain.
(Btw: Can I assume that, by default, O_NONBLOCK is not set in any case?)
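Rather than assuming the default, the flag can be queried on the descriptor itself. The following is a minimal sketch, assuming a POSIX system; `is_nonblocking` is a hypothetical helper name introduced here for illustration, not something from the question. It uses `fcntl()` with `F_GETFL` to read the file status flags of the descriptor behind a `FILE *`.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper (name chosen for this example):
 * returns 1 if O_NONBLOCK is set on the descriptor behind fp,
 * 0 if it is not set, and -1 if the flags could not be queried. */
int is_nonblocking(FILE *fp)
{
    int fd = fileno(fp);             /* descriptor underlying the stream */
    if (fd == -1)
        return -1;

    int flags = fcntl(fd, F_GETFL);  /* file status flags of that descriptor */
    if (flags == -1)
        return -1;

    return (flags & O_NONBLOCK) != 0;
}
```

Calling `is_nonblocking(fp)` right after `fopen()` shows whether the flag is set for that particular stream, which settles the question for the setup at hand without relying on a general rule.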
This question arose because I observed read()s returning fewer bytes than requested, without errno set, in case 4.
The problem with drilling down into this by testing is that such behaviour happens in <1/1000000000 cases ... which is still too often :-}
Update: The average file size is between around 1 GByte and some TBytes.
Recommended answer
You should not assume that read() will never return fewer bytes than requested, on any filesystem. This is particularly true for large reads, as POSIX.1 indicates that read() behavior for sizes larger than SSIZE_MAX is implementation-defined. On this mainstream Unix box I'm using right now, SSIZE_MAX is 32767 bytes. The fact that read() always returns the full amount today does not mean that it will in the future.
One possible reason might be that I/O priorities are more fully fleshed out in the kernel in the future. For example, you're trying to read from the same device as another, higher-priority process, and the other process would get better throughput if your process wasn't causing head movement away from the sectors the other process wants. The kernel might choose to give your read() a short count to get you out of the way for a while, instead of continuing to do inefficient interleaved block reads. Stranger things have been done for the sake of I/O efficiency. What is not prohibited often becomes compulsory.
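Since a short count has to be treated as a normal outcome, a common defensive pattern is to loop until the requested amount has arrived. The following is a minimal sketch of that pattern, not a definitive implementation; `read_full` is a hypothetical helper name, and the sketch assumes a blocking descriptor and a `count` no larger than SSIZE_MAX. Note that errno is only meaningful when read() returns -1, so a short count with errno untouched is not an error by itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for this example):
 * keeps calling read() until `count` bytes have arrived, end of file
 * is reached, or a real error occurs.
 * Assumes count <= SSIZE_MAX and a blocking descriptor.
 * Returns the number of bytes read (less than `count` only at EOF),
 * or -1 with errno set on error; bytes read before the error are
 * not reported. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    unsigned char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n > 0) {
            done += (size_t)n;   /* possibly a short count: ask for the rest */
        } else if (n == 0) {
            break;               /* end of file */
        } else if (errno != EINTR) {
            return -1;           /* real error, errno says which one */
        }
        /* n == -1 with errno == EINTR: interrupted by a signal, retry */
    }
    return (ssize_t)done;
}
```

For the random-access pattern in the question, the caller would position the descriptor with lseek() before each call. With this wrapper, a return value smaller than `count` means end of file, so an unexpected short result can be told apart from a transient short read().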