Problem description
I've discovered a memory leak in my Rails code. That is to say, I've found which code leaks, but not why it leaks. I've reduced it to a test case that doesn't require Rails:
require 'csspool'
require 'ruby-mass'
def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
  Mass.print
end
report
# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
report
ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.
Output:
Memory 9264KB
==================================================
Objects within [] namespace
==================================================
String: 7261
RubyVM::InstructionSequence: 1151
Array: 562
Class: 313
Regexp: 181
Proc: 111
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 29
Gem::Requirement: 25
RubyVM::Env: 11
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
Memory 258860KB
==================================================
Objects within [] namespace
==================================================
String: 7456
RubyVM::InstructionSequence: 1151
Array: 564
Class: 313
Regexp: 181
Proc: 113
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 30
Gem::Requirement: 25
RubyVM::Env: 13
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
You can see the memory going way up. Some of the counters go up, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references, like so:
Mass.index.each do |k,v|
  v.each do |id|
    refs = Mass.references(Mass[id])
    puts refs if !refs.empty?
  end
end
But again, this doesn't give me anything related to CSSPool, just gem info and such.
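For comparison, a per-class count similar to what ruby-mass prints can be approximated with ObjectSpace alone. This is only a rough sketch: class_counts and count_diff are hypothetical helper names, not part of ruby-mass, and the throwaway Object instances stand in for whatever the parser would allocate.

```ruby
# Count live objects per class using only ObjectSpace (no extra gems).
def class_counts
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Return only the classes whose counts changed between two snapshots.
def count_diff(before, after)
  (before.keys | after.keys).each_with_object({}) do |klass, diff|
    delta = after[klass] - before[klass]
    diff[klass] = delta unless delta.zero?
  end
end

before = class_counts
kept = Array.new(1_000) { Object.new } # stand-in for the code under test
after = class_counts

diff = count_diff(before, after)
puts "holding #{kept.size} objects"
diff.sort_by { |_, delta| -delta }.first(5).each do |klass, delta|
  puts "#{klass}: #{delta}"
end
```

If a snapshot diff like this shows nothing for the library under suspicion, the growth is probably not in live Ruby objects.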
I've also tried outputting "GC.stat"...
puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat
Results:
{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}
As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.
I've also read about C-level memory leaks, and since CSSPool uses Racc, which uses C code, I think this is a possibility. I've run my code through Valgrind:
valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt
Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.
Versions used:
- Ruby 2.0.0-p247 (this is what my Rails app runs on)
- Ruby 1.9.3-p392-ref (for testing with ruby-mass)
- ruby-mass 0.1.3
- CSSPool 4.0.0 from here
- CentOS 6.4 and Ubuntu 13.10
Recommended answer
It looks like you are entering The Lost World here. I don't think the problem is with the C bindings in racc either.
Ruby memory management is both elegant and cumbersome. It stores objects (as RVALUEs) in so-called heaps of approximately 16KB each. At a low level, an RVALUE is a C struct containing a union of the different standard Ruby object representations.
So the heaps store RVALUE objects, each no larger than 40 bytes. For objects such as String, Array and Hash, this means that small objects fit in the heap slot itself, but once they grow past a threshold, extra memory is allocated outside the Ruby heaps.
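The difference between a small and a large object can be observed roughly with ObjectSpace.memsize_of from the objspace standard library. This is a minimal sketch: the string lengths are arbitrary choices for illustration, and the exact byte counts (including whether the slot itself is counted) vary between Ruby versions, so only the relative difference is meaningful.

```ruby
require 'objspace'

# A short string is expected to fit in the object slot itself;
# a long one needs a separately allocated buffer outside the Ruby heap.
small = 'a' * 10
large = 'a' * 10_000

small_size = ObjectSpace.memsize_of(small)
large_size = ObjectSpace.memsize_of(large)

puts "small string (#{small.length} chars): #{small_size} bytes"
puts "large string (#{large.length} chars): #{large_size} bytes"
```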
This extra memory is flexible; it is freed as soon as the object is GC'ed. That's why your test case with big_string shows the memory going up and then down:
def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
    .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB
But the heaps themselves (see GC.stat[:heap_length]) are not released back to the OS once acquired. Look, I'll make a humdrum change to your test case:
- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)
And, voilà:
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB
The memory is no longer released back to the OS, because each element of the array I introduced fits within the RVALUE size and is stored in the Ruby heap.
If you examine the output of GC.stat after the GC has run, you'll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for reuse.
To sum up: I don't think the C code leaks. I think the problem lies in the base64 representation of a huge image in your CSS. I have no clue what's happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.
Hope it helps.