Problem description
I have a soup from BeautifulSoup that I cannot pickle. When I try to pickle the object, the Python interpreter silently crashes, so the failure cannot be handled as an exception. I need to pickle the object in order to return it using the multiprocessing package, which pickles objects to pass them between processes. How can I troubleshoot or work around the problem? Unfortunately, I cannot post the HTML for the page (it is not publicly available), and I have been unable to find a reproducible example of the problem. I have tried to isolate the problem by looping over the soup and pickling individual components; the smallest thing that produces the error is <class 'BeautifulSoup.NavigableString'>. When I print the object, it prints u''.
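The isolation step described above (looping over components and pickling each one) might look roughly like the following. This is only a sketch: find_unpicklable is a hypothetical helper name, the sample list stands in for the real soup contents, and a try/except can only report failures that surface as Python exceptions, not a hard interpreter crash like the one described.

```python
import pickle


def find_unpicklable(items):
    """Try to pickle each item; return (index, type name, error text) for failures.

    Only failures raised as Python exceptions are caught here. A hard
    interpreter crash would still terminate the process.
    """
    failures = []
    for index, item in enumerate(items):
        try:
            pickle.dumps(item)
        except Exception as exc:  # PicklingError, RecursionError, etc.
            failures.append((index, type(item).__name__, str(exc)))
    return failures


# Stand-in data: a lambda cannot be pickled by the standard pickle module.
sample = [1, "text", lambda x: x]
failures = find_unpicklable(sample)
for index, type_name, message in failures:
    print(index, type_name, message)
```

Printing the index and type of each failing item narrows down which component to inspect next.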
Recommended answer
The class NavigableString is not serializable with pickle or cPickle, which multiprocessing uses. You should be able to serialize this class with dill, however. dill has a superset of the pickle interface and can serialize most of Python. multiprocessing will still fail unless you use a fork of multiprocessing that uses dill, called pathos.multiprocessing.
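As a rough illustration of the pickle-style interface mentioned above, the sketch below round-trips an object through dumps and loads. Several things here are assumptions: dill is a third-party package that must be installed separately, the code falls back to the standard pickle module if dill is missing so that it still runs, and FakeNavigableString is a hypothetical stand-in for BeautifulSoup.NavigableString (a string subclass), not the real class. It has not been checked against the page from the question.

```python
import pickle

try:
    import dill as serializer  # third-party; assumed to be installed
except ImportError:
    serializer = pickle  # fallback so the sketch runs without dill


class FakeNavigableString(str):
    """Hypothetical stand-in for BeautifulSoup.NavigableString."""


def roundtrip(obj):
    """Serialize and then deserialize obj with the pickle-style interface."""
    return serializer.loads(serializer.dumps(obj))


node = FakeNavigableString("some text")
restored = roundtrip(node)
print(repr(restored), type(restored).__name__)
```

Because dill mirrors the dumps/loads calls of pickle, swapping it in usually means changing only the import. Passing such objects between worker processes would additionally need pathos.multiprocessing, as noted in the answer.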
Get the code here: https://github.com/uqfoundation.
For more information, see: What can multiprocessing and dill do together?
http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/
http://nbviewer.ipython.org/gist/minrk/5241793