I have a tree structure of widgets, e.g. a collection contains models and each model contains widgets. I want to copy the whole collection. copy.deepcopy
is faster than pickling and unpickling the object with pickle, but cPickle, being written in C, is much faster still, so:
- Why shouldn't I (we) always use cPickle instead of deepcopy?
- Is there any other way to copy? pickle is slower than deepcopy but cPickle is faster, so maybe a C implementation of deepcopy would be the winner.
import copy
import pickle
import cPickle

class A(object): pass

d = {}
for i in range(1000):
    d[i] = A()

def copy1():
    return copy.deepcopy(d)

def copy2():
    return pickle.loads(pickle.dumps(d, -1))

def copy3():
    return cPickle.loads(cPickle.dumps(d, -1))
>python -m timeit -s "import c" "c.copy1()"
10 loops, best of 3: 46.3 msec per loop
>python -m timeit -s "import c" "c.copy2()"
10 loops, best of 3: 93.3 msec per loop
>python -m timeit -s "import c" "c.copy3()"
100 loops, best of 3: 17.1 msec per loop
The problem is that pickle+unpickle can be faster (in the C implementation) because it's less general than deepcopy: many objects can be deepcopied but not pickled. Suppose, for example, that your class A
were changed to:
class A(object):
    class B(object): pass
    def __init__(self): self.b = self.B()
Now copy1 still works fine (A's complexity slows it down but absolutely doesn't stop it); copy2 and copy3 break, and the end of the stack trace says:
  File "./c.py", line 20, in copy3
    return cPickle.loads(cPickle.dumps(d, -1))
PicklingError: Can't pickle <class 'c.B'>: attribute lookup c.B failed
I.e., pickling always assumes that classes and functions are top-level entities in their modules, and so pickles them "by name" -- deepcopying makes absolutely no such assumptions.
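To make the "by name" point concrete, here is a minimal sketch, assuming Python 3 (where cPickle no longer exists as a separate module and plain pickle uses the C implementation when it is available). Newer pickle protocols can handle a nested class like A.B through its qualified name, so this sketch uses a class created inside a function instead, which still cannot be looked up by name. make_widget_class and Widget are hypothetical names used only for illustration.

```python
import copy
import pickle


def make_widget_class():
    # Hypothetical helper: the class is created inside a function, so it
    # cannot be found again by looking up a name in its module.
    class Widget(object):
        def __init__(self):
            self.children = [1, 2, 3]
    return Widget


Widget = make_widget_class()
original = {"w": Widget()}

# deepcopy works from the live class object and needs no importable name.
clone = copy.deepcopy(original)

# pickle records the class by name, so the lookup fails for a local class.
# The exact exception type differs between Python versions, hence the tuple.
try:
    pickle.dumps(original, -1)
    pickled_ok = True
except (pickle.PicklingError, AttributeError, TypeError):
    pickled_ok = False
```

Under those assumptions, clone is an independent copy while pickled_ok ends up False.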
So if you have a situation where the speed of "somewhat deep-copying" is absolutely crucial, every millisecond matters, AND you want to take advantage of special limitations that you KNOW apply to the objects you're duplicating, such as those that make pickling applicable, or ones favoring yet other forms of serialization and other shortcuts, by all means go ahead. But if you do, you MUST be aware that you're constraining your system to live by those limitations forevermore, and you should document that design decision very clearly and explicitly for the benefit of future maintainers.
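As a rough illustration of what "document that design decision" could look like, here is a sketch of a pickle-based copy helper, assuming Python 3 and a tree made only of plain dicts and lists. fast_copy and the sample collection are hypothetical, not part of any library.

```python
import pickle


def fast_copy(obj):
    """Copy obj through a pickle round-trip (hypothetical helper).

    DESIGN CONSTRAINT: everything reachable from obj must be picklable,
    which means classes and functions must be importable by name.
    Anything that breaks this rule raises an error here instead of
    being copied. Use copy.deepcopy if that cannot be guaranteed.
    """
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


# A small collection -> models -> widgets tree built from plain containers.
collection = {
    "models": [
        {"widgets": [{"id": i} for i in range(3)]}
        for _ in range(2)
    ]
}

clone = fast_copy(collection)

# Changing the clone must not affect the original tree.
clone["models"][0]["widgets"][0]["id"] = 99
```

The docstring is the important part: it records the limitation at the exact place where a future maintainer would otherwise be surprised by it.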
For the NORMAL case, where you want generality, use deepcopy.
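For the widget tree from the question, that could look like the sketch below. Collection, Model and Widget are hypothetical stand-ins for the real classes, which the question does not show.

```python
import copy


class Widget(object):
    def __init__(self, name):
        self.name = name


class Model(object):
    def __init__(self, widgets):
        self.widgets = widgets


class Collection(object):
    def __init__(self, models):
        self.models = models


original = Collection([
    Model([Widget("a"), Widget("b")]),
    Model([Widget("c")]),
])

# deepcopy walks the whole tree and duplicates every node it reaches.
clone = copy.deepcopy(original)

# The clone is independent: renaming one of its widgets leaves the
# original tree untouched.
clone.models[0].widgets[0].name = "changed"
```

No special cooperation from the classes is needed here, which is exactly the generality the pickle route gives up.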