Problem Description
What is the correct solution to be sure that a file will never be corrupted while many threads and processes are using it?
A version for threads, which takes care of errors when opening the file:
import threading

lock = threading.RLock()

with lock:
    try:
        f = open(file, 'a')   # 'file' holds the path to append to
        try:
            f.write('sth')
        finally:
            f.close()  # close in any circumstances, if open passed
    except:
        pass  # when open failed
For processes, I guess I must use multiprocessing.Lock.
But what if I want 2 processes, and the first process owns 2 threads (each one using the file)?
This is just theory, but I want to know how to mix synchronization between threads and processes. Do threads "inherit" it from the process, so that only synchronization between processes is required?
And 2. I'm not sure whether the code above needs the nested try in case the write fails and we want to close the opened file (what if it remains open after the lock is released?).
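For comparison, a minimal sketch of the same thread-side pattern written with a with open(...) block (write_sth and path are illustrative names, not from the question); the with-block closes the file even if the write fails, so the nested try/finally is not needed:

import threading

lock = threading.RLock()

def write_sth(path):
    with lock:
        try:
            # the with-block closes the file whether or not write() raises
            with open(path, 'a') as f:
                f.write('sth')
        except Exception:
            pass  # open (or write) failed; swallowed as in the snippet above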
Recommended Answer
While this isn't entirely clear from the docs, multiprocessing synchronization primitives do in fact synchronize threads as well.
For example, if you run this code:
import multiprocessing
import sys
import threading
import time

lock = multiprocessing.Lock()

def f(i):
    with lock:
        for _ in range(10):
            sys.stderr.write(i)
            time.sleep(1)

t1 = threading.Thread(target=f, args=['1'])
t2 = threading.Thread(target=f, args=['2'])
t1.start()
t2.start()
t1.join()
t2.join()
… the output will always be 11111111112222222222 or 22222222221111111111, not a mixture of the two.
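Applied to the layout from the question (2 processes, each running 2 threads that append to the same file), a minimal sketch where a single multiprocessing.Lock, shared by everyone, is sufficient (writer, run_two_threads, and shared.log are illustrative names):

import multiprocessing
import threading

def writer(lock, path, tag):
    # the multiprocessing.Lock serializes writers across threads and processes
    with lock:
        with open(path, 'a') as f:
            f.write('line from %s\n' % tag)

def run_two_threads(lock, path, proc_name):
    threads = [threading.Thread(target=writer,
                                args=(lock, path, '%s/thread-%d' % (proc_name, i)))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    path = 'shared.log'  # placeholder file name
    procs = [multiprocessing.Process(target=run_two_threads,
                                     args=(lock, path, 'proc-%d' % i))
             for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()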
The locks are implemented on top of Win32 kernel sync objects on Windows, on semaphores on POSIX platforms that support them, and not implemented at all on other platforms. (You can test this with import multiprocessing.synchronize, which will raise an ImportError on those other platforms, as explained in the docs.)
That being said, it's certainly safe to have two levels of locks, as long as you always use them in the right order: that is, never grab the threading.Lock unless you can guarantee that your process has the multiprocessing.Lock.
If you do this cleverly enough, it can have performance benefits. (Cross-process locks on Windows, and on some POSIX platforms, can be orders of magnitude slower than intra-process locks.)
If you just do it in the obvious way (only do with threadlock: inside with processlock: blocks), it obviously won't help performance, and in fact will slow things down a bit (although quite possibly not enough to measure), and it won't add any direct benefits. Of course your readers will know that your code is correct even if they don't know that multiprocessing locks work between threads, and in some cases debugging intraprocess deadlocks can be a lot easier than debugging interprocess deadlocks… but I don't think either of those is a good enough reason for the extra complexity in most cases.
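For concreteness, a minimal sketch of that "obvious" nesting, using the lock names from the paragraph above (path and append_line are placeholders): it is correct as long as the process lock is always taken first, but it adds nothing over using the multiprocessing.Lock alone.

import multiprocessing
import threading

processlock = multiprocessing.Lock()  # cross-process lock, always taken first
threadlock = threading.Lock()         # intra-process lock, only taken while holding processlock

def append_line(path, text):
    with processlock:
        with threadlock:
            with open(path, 'a') as f:
                f.write(text + '\n')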