BeautifulSoup object will not pickle, causes interpreter to silently crash

Question

I have a soup from BeautifulSoup that I cannot pickle. When I try to pickle the object, the Python interpreter silently crashes, so the failure cannot be handled as an exception. I need to pickle the object so that I can return it using the multiprocessing package (which pickles objects to pass them between processes). How can I troubleshoot or work around the problem? Unfortunately, I cannot post the HTML for the page (it is not publicly available), and I have been unable to find a reproducible example of the problem. I tried to isolate the problem by looping over the soup and pickling individual components; the smallest thing that produces the error is <class 'BeautifulSoup.NavigableString'>. When I print the object, it prints u''.

Recommended answer

The NavigableString class is not serializable with pickle or cPickle, which is what multiprocessing uses. You should be able to serialize it with dill, however. dill provides a superset of the pickle interface and can serialize most Python objects. multiprocessing will still fail unless you use pathos.multiprocessing, a fork of multiprocessing that uses dill.
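As a rough illustration of the suggestion above, here is a minimal sketch in Python 3 syntax. It assumes dill and pathos are installed from PyPI. The helper names roundtrip and parallel_map are hypothetical, and plain data stands in for the soup, since the page from the question is not available. Whether dill handles a particular NavigableString still needs to be checked against the real object.

```python
import pickle

try:
    import dill  # third-party; assumed installed, e.g. via `pip install dill`
except ImportError:
    dill = None  # fall back to pickle so the sketch still runs


def roundtrip(obj):
    """Serialize obj to bytes and load it back, preferring dill over pickle.

    Hypothetical helper: useful for checking whether an object survives
    serialization before handing it to a worker pool.
    """
    serializer = dill if dill is not None else pickle
    payload = serializer.dumps(obj)
    return serializer.loads(payload)


def parallel_map(func, items, nodes=2):
    """Map func over items in worker processes using pathos.

    Hypothetical helper: pathos.multiprocessing serializes with dill
    instead of pickle. Imported lazily because pathos is third-party.
    """
    from pathos.multiprocessing import ProcessingPool

    pool = ProcessingPool(nodes=nodes)
    return pool.map(func, items)


if __name__ == "__main__":
    # Plain data stands in for the soup, which the question could not share.
    print(roundtrip({"text": "", "children": []}))
```

If roundtrip succeeds on the object that crashed under pickle, passing a worker function and its inputs to parallel_map is the next thing to try in place of the standard multiprocessing pool.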

Get the code here: https://github.com/uqfoundation

For more information, see: What can multiprocessing and dill do together?

http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/

http://nbviewer.ipython.org/gist/minrk/5241793
