Problem description
In py2.6+, the multiprocessing module offers a Pool class, so one can do:

class Volatile(object):
    def do_stuff(self, ...):
        pool = multiprocessing.Pool()
        return pool.imap(...)
However, with the standard Python implementation at 2.7.2, this approach soon leads to "IOError: [Errno 24] Too many open files". Apparently the pool object never gets garbage collected, so its processes never terminate, accumulating whatever descriptors are opened internally. I think this because the following works:
class Volatile(object):
    def do_stuff(self, ...):
        pool = multiprocessing.Pool()
        result = pool.map(...)
        pool.terminate()
        return result
I would like to keep the "lazy" iterator approach of imap; how does the garbage collector work in that case? How can I fix the code?
In the end, I ended up passing the pool reference around and terminating it manually once the pool.imap iterator was finished:
class Volatile(object):
    def do_stuff(self, ...):
        pool = multiprocessing.Pool()
        return pool, pool.imap(...)

    def call_stuff(self):
        pool, results = self.do_stuff()
        for result in results:
            # lazy evaluation of the imap
            ...
        pool.terminate()
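A self-contained sketch of this pattern; the module-level `square` worker and the input range are illustrative placeholders for the arguments elided above:

```python
import multiprocessing

def square(x):
    # placeholder worker function; must be module-level so it can be pickled
    return x * x

class Volatile(object):
    def do_stuff(self, items):
        pool = multiprocessing.Pool()
        # hand the pool back alongside the iterator, so the caller
        # can terminate it once iteration is finished
        return pool, pool.imap(square, items)

    def call_stuff(self, items):
        pool, results = self.do_stuff(items)
        out = [r for r in results]  # lazy evaluation of the imap
        pool.terminate()            # release the worker processes explicitly
        return out

if __name__ == "__main__":
    print(Volatile().call_stuff(range(5)))
```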
In case anyone stumbles upon this solution in the future: the chunksize parameter is very important in Pool.imap (as opposed to plain Pool.map, where it didn't matter). I set it manually so that each process receives 1 + len(input) / len(pool) jobs. Leaving it at the default chunksize=1 gave me the same performance as if I didn't use parallel processing at all... bad.
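A sketch of that manual chunksize computation; the `slow_op` worker, the input size, and deriving the pool size from `cpu_count()` are illustrative assumptions, not details from the original post:

```python
import multiprocessing

def slow_op(x):
    # stand-in for real per-item work
    return x + 1

if __name__ == "__main__":
    inputs = list(range(1000))
    n_workers = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(n_workers)
    # one large chunk per worker instead of the default chunksize=1,
    # so per-item IPC overhead does not dominate the computation
    chunk = 1 + len(inputs) // n_workers
    results = list(pool.imap(slow_op, inputs, chunksize=chunk))
    pool.terminate()
```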
I guess there's no real benefit to using ordered imap over ordered map; I just personally like iterators better.