Problem description
I am using SciPy's KDTree implementation on points read from a large (300 MB) file. Is there a way to save the data structure to disk and load it again, or am I stuck with reading the raw points from the file and constructing the data structure each time I start my program? I am constructing the KDTree as follows:
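import numpy
from scipy.spatial import KDTree

def buildKDTree(self):
    # Read whitespace-separated coordinates and reshape to (n_points, NDIM)
    self.kdpoints = numpy.fromfile("All", sep=' ')
    self.kdpoints.shape = self.kdpoints.size // self.NDIM, self.NDIM
    self.kdtree = KDTree(self.kdpoints, leafsize=self.kdpoints.shape[0] + 1)
    print "Preparing KDTree... Ready!"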
Any suggestions?
Recommended answer
KDTree uses nested classes to define its node types (innernode, leafnode). Pickle stores a class by its module and name and looks it up as a module-level attribute, so a nested class trips it up:
import cPickle

class Foo(object):
    class Bar(object):
        pass

obj = Foo.Bar()
print obj.__class__
cPickle.dumps(obj)

Output:
<class '__main__.Bar'>
cPickle.PicklingError: Can't pickle <class '__main__.Bar'>: attribute lookup __main__.Bar failed
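The lookup failure above can be reproduced without SciPy. The following is a minimal Python 3 sketch, not the library's own code: `demo_tree`, `Reachable`, `Hidden`, and `can_pickle` are hypothetical names introduced for illustration. Python 3 resolves nested classes through `__qualname__`, so instead of nesting, the sketch hides one class from the module it claims to live in, which triggers the same kind of attribute lookup error.

```python
import pickle
import sys
import types

# Hypothetical stand-in module, registered so pickle can import it by name.
demo = types.ModuleType("demo_tree")
sys.modules["demo_tree"] = demo


class Reachable(object):
    pass


class Hidden(object):
    pass


# Pretend both classes live in demo_tree, but expose only one of them there.
for cls in (Reachable, Hidden):
    cls.__module__ = "demo_tree"
    cls.__qualname__ = cls.__name__
demo.Reachable = Reachable


def can_pickle(obj):
    """Return True if pickle can serialize obj, False if the class lookup fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


reachable_ok = can_pickle(Reachable())  # demo_tree.Reachable resolves
hidden_ok = can_pickle(Hidden())        # demo_tree.Hidden does not exist
print(reachable_ok, hidden_ok)
```

Only the class that is reachable as an attribute of its module can be pickled, which is the situation the KDTree node classes are in under Python 2.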
However, there is a (hacky) workaround: monkey-patch the node class definitions into scipy.spatial.kdtree at module scope so the pickler can find them. As long as all of your code that reads and writes pickled KDTree objects installs these patches, this hack should work fine:
import cPickle
import numpy
from scipy.spatial import kdtree
# patch module-level attribute to enable pickle to work
kdtree.node = kdtree.KDTree.node
kdtree.leafnode = kdtree.KDTree.leafnode
kdtree.innernode = kdtree.KDTree.innernode
x, y = numpy.mgrid[0:5, 2:8]
t1 = kdtree.KDTree(zip(x.ravel(), y.ravel()))
r1 = t1.query([3.4, 4.1])
raw = cPickle.dumps(t1)
# read in the pickled tree
t2 = cPickle.loads(raw)
r2 = t2.query([3.4, 4.1])
print t1.tree.__class__
print repr(raw)[:70]
print t1.data[r1[1]], t2.data[r2[1]]
Output:
<class 'scipy.spatial.kdtree.innernode'>
"ccopy_reg\n_reconstructor\np1\n(cscipy.spatial.kdtree\nKDTree\np2\nc_
[3 4] [3 4]
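The example above keeps the pickle in memory. To avoid rebuilding on every program start, as the question asks, the pickled bytes would be written to a file and read back later. Below is a minimal Python 3 sketch of that pattern using only the standard library. The helper `load_or_build`, the function `build_points`, and the file name `kdtree.pkl` are assumptions made for illustration, and a plain list of points stands in for the tree. With a KDTree in its place, the patches described above would have to be installed before both dumping and loading.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build):
    """Return the object cached at cache_path, or build it and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    obj = build()
    with open(cache_path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return obj


calls = []


def build_points():
    # Stand-in for the expensive step (parsing the file, building the tree).
    calls.append(1)
    return [(x, y) for x in range(5) for y in range(2, 8)]


# Hypothetical cache location inside a fresh temporary directory.
cache_path = os.path.join(tempfile.mkdtemp(), "kdtree.pkl")

first = load_or_build(cache_path, build_points)   # builds and writes the cache
second = load_or_build(cache_path, build_points)  # loads from disk instead
print(first == second, len(calls))
```

The builder runs only on the first call; the second call returns an equal object read from disk. Whether loading a pickled tree is actually faster than rebuilding it from the raw points depends on the data and is worth measuring.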