Problem description
I am using Python, and indexing documents (for a search engine) takes a lot of RAM; after I stop the indexing process the memory is still full (around 8 GB of RAM). This is bad because I need my search engine to work all the time, not to reboot the OS whenever I finish indexing. Is there an efficient way to manage huge arrays, dictionaries and lists, and to free them? Any ideas?
I also saw some questions about this on Stack Overflow, but they are old.
Info:
free -t
             total       used       free     shared    buffers     cached
Mem:          5839       5724        114          0         15       1011
-/+ buffers/cache:        4698       1141
Swap:         1021        186        835
Total:        6861       5910        950
top | grep python
3164 root 20 0 68748 31m 1404 R 17 0.5 53:43.89 python
6716 baddc0re 20 0 84788 30m 1692 S 0 0.5 0:06.81 python
ps aux | grep python
root 3164 57.1 0.4 64876 29824 pts/0 R+ May27 54:23 python SE_doc_parse.py
baddc0re 6693 0.0 0.2 53240 16224 pts/1 S+ 00:46 0:00 python index.py
uptime
01:02:40 up 1:43, 3 users, load average: 1.22, 1.46, 1.39
sysctl vm.min_free_kbytes
vm.min_free_kbytes = 67584
The real problem is that when I start the script the indexing is fast, but as the memory usage increases it gets slower and slower:
Document wikidoc_18784 added on 2012-05-28 01:03:46 "fast"
wikidoc_18784
-----------------------------------
Document wikidoc_21934 added on 2012-05-28 01:04:00 "slower"
wikidoc_21934
-----------------------------------
Document wikidoc_22903 added on 2012-05-28 01:04:01 "slower"
wikidoc_22903
-----------------------------------
Document wikidoc_20274 added on 2012-05-28 01:04:10 "slower"
wikidoc_20274
-----------------------------------
Document wikidoc_23013 added on 2012-05-28 01:04:53 "even more slower"
wikidoc_23013
The documents are one or two pages of text each at most. Indexing 10 pages takes about 2-3 seconds.
Thanks for any help :)
Recommended answer
From the discussion it seems you are storing the data in nothing but a giant huge dict (not often I get to say that with a straight face ;) ). Offloading the data onto a proper database such as redis would probably reduce Python's memory usage, since the index would live in a separate process rather than in your script's heap. It might also make your data more efficient and faster to work with.
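A minimal sketch of what that change could look like, assuming a redis server on localhost and the redis-py client; the key naming scheme ("term:<word>"), the whitespace tokenisation and the sample document id are made up for illustration:

import redis

# Connect to a local redis server (assumption: default port 6379).
r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)

def index_document(doc_id, text):
    # Each term maps to a redis set of document ids, so the postings
    # live inside the redis process instead of the Python heap.
    for term in text.lower().split():
        r.sadd("term:" + term, doc_id)

def lookup(term):
    # Return the set of document ids that contain the term.
    return r.smembers("term:" + term.lower())

index_document("wikidoc_18784", "some page of wiki text")
print(lookup("wiki"))

Because redis keeps the data in its own process (and can persist it to disk), the indexing script can exit and release all of its memory without losing the index, and the search process only reads the keys it needs instead of holding a multi-gigabyte dict.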