High memory usage when using Python's pickle in Sage

Question

I am using the Python-based Sage Mathematics software to create a very long list of vectors. The list contains roughly 100,000,000 elements, and sys.getsizeof() tells me that its size is a little less than 1GB.
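A note on the figure above: sys.getsizeof() reports only the memory taken directly by the list object (its header and one pointer per slot), not the objects the list refers to. The following plain-Python sketch illustrates the gap; the nested lists are only a stand-in for Sage vectors, and the variable names are made up for this illustration.

```python
import sys

# A list of distinct small objects, standing in for the list of Sage vectors.
items = [[k, k + 1, k + 2] for k in range(100000)]

# Shallow size: the list header plus one pointer per allocated slot.
shallow = sys.getsizeof(items)

# Rough deeper estimate: add the size of each element container.
# This still ignores the integers stored inside each element.
deep = shallow + sum(sys.getsizeof(item) for item in items)

print("shallow bytes:", shallow)
print("with elements:", deep)
```

So the size reported for the outer list says little about how much memory the whole structure occupies once every element is a separate object.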

I pickle this list into a file (which already takes a long time, but fair enough). The annoying part comes only when I unpickle the list: RAM usage increases from 1.15GB to 4.3GB, and I am wondering what is going on.

How can I find out in Sage what all the memory is being used for? And do you have any ideas on how to optimize this, perhaps by applying Python tricks?

This is a reply to kcrisman's comment.

I cannot post the exact code because it would be too long, but here is a simple example in which the phenomenon can be observed. I am working on Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64 GNU/Linux.

Start Sage and execute:

import pickle
L = [vector([1,2,3]) for k in range(1000000)]
# Pickles are byte streams: open in binary mode and close the file when done
with open("mylist", 'wb') as f:
    pickle.dump(L, f)

On my system the list is 8697472 bytes in size, and the file I pickled into is roughly 130MB. Now close Sage and watch your memory usage (with htop, for example). Then start Sage again and execute the following lines:

import pickle
# Read the pickle back in binary mode and keep a reference to the result
with open("mylist", 'rb') as f:
    L = pickle.load(f)

Without Sage running, my Linux system uses 1035MB of memory; with Sage running, usage increases to 1131MB. After I unpickle the file, it uses 2535MB, which I find odd.
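For the measuring part of the question, one option in a Python 3 environment is the standard tracemalloc module. This is only a sketch under assumptions: it uses plain nested lists instead of Sage vectors, and tracemalloc sees only allocations made through Python's memory allocator, so memory taken by C libraries underneath Sage may not show up.

```python
import pickle
import tracemalloc

# Stand-in data; in Sage this would be the list of vectors.
data = [[k, k + 1, k + 2] for k in range(100000)]
blob = pickle.dumps(data, protocol=2)
del data

# Trace only the allocations made while unpickling.
tracemalloc.start()
restored = pickle.loads(blob)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("pickle size (bytes): ", len(blob))
print("allocated after load:", current)
print("peak during load:    ", peak)
```

Comparing the pickle size with the traced allocation gives a first idea of how much larger the in-memory objects are than their serialized form.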

Recommended answer

It is probably better not to use Python's pickle module directly. cPickle is already a bit better, but a lot of pickling in Sage assumes protocol 2, which (c)Pickle does not use by default. You can use Sage's own wrappers around pickle instead. If I run your example with

sage: open("mylist",'wb').write(dumps(L))

and then read it back with

sage: L = loads(open("mylist",'rb').read())

I don't see any problems.
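The remark about protocols can be illustrated in plain Python. The sketch below pickles the same stand-in data (nested lists, not Sage vectors) with the text-based protocol 0, which was the Python 2 default, and with the binary protocol 2; the variable names are made up for this illustration.

```python
import pickle

# Distinct small lists, so pickle cannot collapse them into one memo entry.
data = [[k, k + 1, k + 2] for k in range(1000)]

text_pickle = pickle.dumps(data, protocol=0)    # ASCII-based format
binary_pickle = pickle.dumps(data, protocol=2)  # compact binary format

print("protocol 0 bytes:", len(text_pickle))
print("protocol 2 bytes:", len(binary_pickle))

# Both protocols restore the same data.
restored_from_text = pickle.loads(text_pickle)
restored_from_binary = pickle.loads(binary_pickle)
```

Passing the protocol explicitly keeps the behavior independent of the interpreter's default, which differs between Python 2 and Python 3.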

Note that the interface above is not the best way to pickle and unpickle to a file in Sage; you would be better off using save/load. I only did it this way to stay as close as possible to your example.

