Question
Right now I have a central module in a framework that spawns multiple processes using the Python 2.6 `multiprocessing` module. Because it uses `multiprocessing`, there is a module-level multiprocessing-aware logger, `LOG = multiprocessing.get_logger()`. Per the docs, this logger has process-shared locks so that you don't garble things up in `sys.stderr` (or whatever filehandle) by having multiple processes writing to it simultaneously.
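For context, a minimal sketch of that setup might look like the following (the `worker` function and the handler wiring are illustrative, not part of the actual framework):

```python
import logging
import multiprocessing

# Module-level, multiprocessing-aware logger; per the docs it uses
# process-shared locks so simultaneous writes don't get garbled.
LOG = multiprocessing.get_logger()
LOG.setLevel(logging.INFO)
LOG.addHandler(logging.StreamHandler())  # defaults to sys.stderr

def worker():
    LOG.info("hello from pid %s", multiprocessing.current_process().pid)

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```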
The issue I have now is that the other modules in the framework are not multiprocessing-aware. The way I see it, I need to make all dependencies on this central module use multiprocessing-aware logging. That's annoying within the framework, let alone for all clients of the framework. Are there alternatives I'm not thinking of?
Answer
The only way to deal with this non-intrusively is to:
- Spawn each worker process such that its log goes to a different file descriptor (to disk or to a pipe). Ideally, all log entries should be timestamped.
- Your controller process can then do one of the following:
  - If using disk files: coalesce the log files at the end of the run, sorted by timestamp (see the first sketch below).
  - If using pipes (recommended): coalesce log entries on-the-fly from all pipes into a central log file. (E.g., periodically `select` on the pipes' file descriptors, perform a merge-sort of the available log entries, and flush to the centralized log. Repeat. See the second sketch below.)
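A rough illustration of the disk-file variant (the file-name pattern and the leading-float-timestamp line format are assumptions, not something specified above): each worker writes lines prefixed with a timestamp to its own file, and the controller merges them once the run is over.

```python
import glob

def coalesce_logs(pattern="worker-*.log", output="central.log"):
    """Merge per-worker log files into one file, ordered by timestamp."""
    entries = []
    for path in glob.glob(pattern):
        with open(path) as f:
            entries.extend(f)
    # Each line is assumed to start with a float timestamp.
    entries.sort(key=lambda line: float(line.split(None, 1)[0]))
    with open(output, "w") as out:
        out.writelines(entries)
```

And a sketch of the pipe-based variant. It assumes a fork-based start method (e.g. on Linux), since the raw pipe file descriptors have to be inherited by the workers; the `worker`/`controller` functions and the message format are made up for illustration.

```python
import os
import select
import time
import multiprocessing

def worker(write_fd, worker_id):
    # Child process: write timestamped log lines into its private pipe.
    out = os.fdopen(write_fd, "w", 1)          # line-buffered
    for i in range(3):
        out.write("%f worker-%d message %d\n" % (time.time(), worker_id, i))
        time.sleep(0.01)
    out.close()

def controller(num_workers=3, logfile="central.log"):
    readers, procs = {}, []
    for wid in range(num_workers):
        r, w = os.pipe()
        p = multiprocessing.Process(target=worker, args=(w, wid))
        p.start()
        os.close(w)                            # parent keeps only the read end
        readers[r] = os.fdopen(r)
        procs.append(p)

    log = open(logfile, "w")
    while readers:
        ready, _, _ = select.select(list(readers), [], [], 1.0)
        batch = []
        for fd in ready:
            line = readers[fd].readline()      # simplified: workers write whole lines
            if line:
                batch.append(line)
            else:                              # EOF: that worker closed its pipe
                readers[fd].close()
                del readers[fd]
        # Merge whatever is currently available, ordered by timestamp, and flush.
        batch.sort(key=lambda line: float(line.split(None, 1)[0]))
        log.writelines(batch)
        log.flush()
    log.close()

    for p in procs:
        p.join()

if __name__ == "__main__":
    controller()
```

Note that each pass only sorts the entries that happen to be available at that moment, so lines from different workers can still land slightly out of order across passes; that matches the "periodically select, merge-sort, flush, repeat" loop described above.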