Problem description
I have a simple task:
import multiprocessing
import queue as Queue  # the Empty exception lives in the stdlib queue module (named Queue on Python 2)

def worker(queue):
    # drain the shared queue until it is empty
    while True:
        try:
            _ = queue.get_nowait()
        except Queue.Empty:
            break

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    # queue = multiprocessing.Queue()
    queue = manager.Queue()
    for i in range(5):
        queue.put(i)
    processes = []
    for i in range(2):
        proc = multiprocessing.Process(target=worker, args=(queue,))
        processes.append(proc)
        proc.start()
    for proc in processes:
        proc.join()
It seems that multiprocessing.Queue can do all the work I need, but on the other hand I see many examples of Manager().Queue() and can't understand what I really need. It looks like Manager().Queue() uses some sort of proxy objects, but I don't understand their purpose, because multiprocessing.Queue() does the same work without any proxy objects.
So, my questions are:
1) What is the real difference between multiprocessing.Queue and the object returned by multiprocessing.Manager().Queue()?
2) Which one should I use?
Recommended answer
Though my understanding of this subject is limited, from what I have done I can tell there is one main difference between multiprocessing.Queue() and multiprocessing.Manager().Queue():
- multiprocessing.Queue() is an object, whereas multiprocessing.Manager().Queue() is an address (proxy) pointing to a shared queue managed by the multiprocessing.Manager() object.
- Therefore you can't pass a normal multiprocessing.Queue() object to Pool methods, because it can't be pickled (a short sketch after the quoted warning below illustrates this).
- Moreover, the Python documentation tells us to pay particular attention when using multiprocessing.Queue(), because it can have undesired effects:
Warning As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children. Note that a queue created using a manager does not have this issue.
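To make the pickling point concrete, here is a minimal sketch that is not from the original answer: the variable names are illustrative and the exact error message may vary across Python versions.

import pickle
import multiprocessing

if __name__ == '__main__':
    manager = multiprocessing.Manager()

    # The manager queue is only a proxy, so it can be pickled and sent to pool workers.
    proxy_queue = manager.Queue()
    pickle.dumps(proxy_queue)  # succeeds

    # A plain multiprocessing.Queue() refuses to be pickled outside of process creation.
    plain_queue = multiprocessing.Queue()
    try:
        pickle.dumps(plain_queue)
    except RuntimeError as exc:
        # e.g. "Queue objects should only be shared between processes through inheritance"
        print(exc)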
There is a workaround to use multiprocessing.Queue() with Pool, by setting the queue as a global variable and setting it for all processes at initialization:
import multiprocessing

queue = multiprocessing.Queue()

def initialize_shared(q):
    # runs once in each pool worker; stores the queue in a module-level global
    global queue
    queue = q

pool = multiprocessing.Pool(nb_process, initializer=initialize_shared, initargs=(queue,))
This will create pool processes with correctly shared queues, but we can argue that multiprocessing.Queue() objects were not created for this use.
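For completeness, a hedged, self-contained sketch of that workaround: the drain_one helper and the five-item example are illustrative, not from the original answer. The pool workers read from the global queue instead of receiving it as an argument.

import multiprocessing
from queue import Empty

queue = None  # set in each pool worker by the initializer

def initialize_shared(q):
    global queue
    queue = q

def drain_one(_):
    # each task pulls one item from the module-level queue
    try:
        return queue.get_nowait()
    except Empty:
        return None

if __name__ == '__main__':
    shared = multiprocessing.Queue()
    for i in range(5):
        shared.put(i)
    with multiprocessing.Pool(2, initializer=initialize_shared, initargs=(shared,)) as pool:
        print(pool.map(drain_one, range(5)))  # prints the five items, order may vary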
On the other hand, a manager.Queue() can be shared between pool subprocesses by passing it as a normal argument of a function.
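A minimal sketch of that, assuming a hypothetical consume helper and using Pool.apply purely for demonstration:

import multiprocessing

def consume(q):
    # the proxy queue arrives here through normal argument pickling
    return q.get()

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    shared = manager.Queue()
    shared.put('hello')
    with multiprocessing.Pool(2) as pool:
        print(pool.apply(consume, (shared,)))  # prints 'hello'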
In my opinion, using multiprocessing.Manager().Queue() is fine in every case and less troublesome. There might be some drawbacks to using a manager, but I'm not aware of them.