Problem description
I am currently trying to get a script to write the output of other commands it starts correctly into a log file. The script writes its own messages to the log file using echo, and there is a function to which I can pipe the lines from the other program.
The main problem is that the program which produces the output is started in the background, so my function that does the read may write to the log file concurrently with the script itself. Could this be a problem? Echo always writes only a single line, so it should not be too hard to ensure atomicity. However, I have searched Google and found no way to make sure it actually is atomic.
Here is the current script:
LOG_FILE=/path/to/logfile

# Append one timestamped line to the log file.
write_log() {
    echo "$(date +%Y%m%d%H%M%S);$1" >> "${LOG_FILE}"
}

# Read lines from stdin and log each one.
write_output() {
    while IFS= read -r data; do
        write_log "Message from SUB process: [ $data ]"
    done
}

write_log "Script started"
# do some stuff
call_complicated_program 2>&1 | write_output &
SUB_PID=$!
# do some more stuff
write_log "Script exiting"
wait $SUB_PID
As you can see, the script might write to the log both on its own and because of the redirected output. Could this cause havoc in the file?
Recommended answer
echo is just a simple wrapper around write (this is a simplification; see the edit below for the gory details), so to determine whether echo is atomic, it's useful to look up write. From the Single UNIX Specification:
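Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically. This maximum is called {PIPE_BUF}. This volume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF} or fewer bytes shall be atomic.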
You can check PIPE_BUF on your system with a simple C program. If you're just printing a single line of output that is not ridiculously long, it should be atomic.
Here is a simple program to check the value of PIPE_BUF:
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* PIPE_BUF is defined in <limits.h> */
    printf("%d\n", PIPE_BUF);
    return 0;
}
On Mac OS X, that gives me 512 (the minimum allowed value for PIPE_BUF). On Linux, I get 4096. So if your lines are fairly long, make sure you check the value on the system in question.
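If compiling a C program is inconvenient, the same limit can usually be queried from the shell. The following is a minimal sketch, assuming the getconf utility is available; the path argument "/" is only a placeholder, and you would normally pass the directory that holds your pipe or log file.

```bash
#!/bin/sh
# Ask the system for PIPE_BUF as it applies to the given path.
# "/" is an assumption for this demo; substitute the directory you care about.
pipe_buf=$(getconf PIPE_BUF /)
echo "PIPE_BUF is ${pipe_buf} bytes"
```

The value can differ between systems and possibly between filesystems, so it is worth running this on the machine where the script will actually log.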
Edit to add: I decided to check the implementation of echo in Bash to confirm that it prints atomically. It turns out that echo uses putchar or printf, depending on whether you use the -e option. These are buffered stdio operations, which means that they fill up a buffer and actually write it out only when a newline is reached (in line-buffered mode), the buffer is filled (in block-buffered mode), or you explicitly flush the output with fflush. By default, a stream is line buffered if it is an interactive terminal and block buffered if it is any other file. Bash never sets the buffering type, so for your log file it should default to block-buffered mode. At the end of the echo builtin, Bash calls fflush to flush the output stream. Thus, the output will always be flushed at the end of echo, but it may be flushed earlier if it doesn't fit into the buffer.
The size of the buffer used may be BUFSIZ, though it may be different. BUFSIZ is the size used if you set the buffer explicitly with setbuf, but there's no portable way to determine the actual size of your buffer. There are also no portable guidelines for what BUFSIZ is, but when I tested it on Mac OS X and Linux, it was twice the size of PIPE_BUF.
What does this all mean? Since the output of echo is all buffered, it won't actually call write until the buffer is filled or fflush is called. At that point, the output should be written, and the atomicity guarantee I mentioned above should apply. If the stdout buffer size is larger than PIPE_BUF, then PIPE_BUF is the largest unit that is guaranteed to be written atomically. If PIPE_BUF is larger than the stdout buffer size, then the stream will write the buffer out as soon as it fills up, so the buffer size becomes the limit.
So, echo is only guaranteed to atomically write sequences shorter than the smaller of PIPE_BUF and the size of the stdout buffer, which is most likely BUFSIZ. On most systems, BUFSIZ is larger than PIPE_BUF.
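One way to act on this limit is to keep every log line below it. The following is a minimal sketch, not a definitive implementation: it is a variant of the write_log function from the question that truncates over-long lines. MAX_BYTES and the mktemp-based log path are assumptions made for the demo, messages are assumed to be single lines, and truncating by bytes may cut a multibyte character in half.

```bash
#!/bin/sh
LOG_FILE=$(mktemp)   # hypothetical log path for this demo
MAX_BYTES=512        # conservative limit (an assumption, not a guarantee)

write_log() {
    line="$(date +%Y%m%d%H%M%S);$1"
    # Count bytes, including the trailing newline that will be written.
    bytes=$(( $(printf '%s\n' "$line" | wc -c) ))
    if [ "$bytes" -gt "$MAX_BYTES" ]; then
        # Keep MAX_BYTES - 1 bytes so the newline still fits within the limit.
        line=$(printf '%s\n' "$line" | cut -b "1-$((MAX_BYTES - 1))")
    fi
    printf '%s\n' "$line" >> "$LOG_FILE"
}

write_log "short message"
write_log "$(printf '%600s' x)"   # 600-character message, over the limit
```

Truncation loses data, so depending on the log's purpose it may be preferable to split long messages into several lines instead.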
tl;dr: echo will atomically output lines, as long as those lines are short enough. On modern systems, you're probably safe up to 512 bytes, but it's not possible to determine the limit portably.
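To see how this behaves on a particular system, a small stress test can help. The sketch below starts three background writers that append short lines to the same file, then counts how many lines arrived intact. The writer function, the line format and the temporary file are all made up for the demo. Note also that the guarantee quoted above is worded in terms of PIPE_BUF, which POSIX specifies for pipes and FIFOs, whereas the log here is a regular file opened in append mode. A clean run therefore shows only that no interleaving was observed, not that none can happen.

```bash
#!/bin/sh
log=$(mktemp)   # hypothetical shared log file for this demo

# Append 200 short, recognizable lines to the shared log.
writer() {
    i=0
    while [ "$i" -lt 200 ]; do
        echo "writer $1 line $i end" >> "$log"
        i=$((i + 1))
    done
}

# Run three writers concurrently, then wait for all of them.
writer A &
writer B &
writer C &
wait

# Every intact line matches the pattern; interleaved output would not.
total=$(( $(wc -l < "$log") ))
intact=$(grep -c -E '^writer [ABC] line [0-9]+ end$' "$log")
echo "total=$total intact=$intact"
```

If the two counts differ, some lines were interleaved, and explicit locking around the write would be worth considering.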