I'm running a Python process on an Amazon EC2 Ubuntu instance to work through large data files. Initially everything is fine, and I don't see any sustained growth in RAM or CPU usage. Then, after processing part of the input, the process runs out of memory and dies. dmesg -T
produces the following, which doesn't tell me much:
[Thu Jan 3 17:47:27 2013] python invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
[Thu Jan 3 17:47:27 2013] python cpuset=/ mems_allowed=0
[Thu Jan 3 17:47:27 2013] Pid: 1108, comm: python Not tainted 3.2.0-25-virtual #40-Ubuntu
[Thu Jan 3 17:47:27 2013] Call Trace:
[Thu Jan 3 17:47:27 2013] [<ffffffff810bdb9d>] ? cpuset_print_task_mems_allowed+0x9d/0xb0
[Thu Jan 3 17:47:27 2013] [<ffffffff81118231>] dump_header+0x91/0xe0
[Thu Jan 3 17:47:27 2013] [<ffffffff811185b5>] oom_kill_process+0x85/0xb0
[Thu Jan 3 17:47:27 2013] [<ffffffff8111895a>] out_of_memory+0xfa/0x220
[Thu Jan 3 17:47:27 2013] [<ffffffff8111e38a>] __alloc_pages_nodemask+0x7ea/0x800
[Thu Jan 3 17:47:27 2013] [<ffffffff810063dd>] ? pte_mfn_to_pfn+0x8d/0x110
[Thu Jan 3 17:47:27 2013] [<ffffffff811569fa>] alloc_pages_vma+0x9a/0x150
[Thu Jan 3 17:47:27 2013] [<ffffffff8113705c>] do_anonymous_page.isra.38+0x7c/0x2f0
[Thu Jan 3 17:47:27 2013] [<ffffffff8113acc1>] handle_pte_fault+0x1e1/0x200
[Thu Jan 3 17:47:27 2013] [<ffffffff8100647e>] ? xen_pmd_val+0xe/0x10
[Thu Jan 3 17:47:27 2013] [<ffffffff810052d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[Thu Jan 3 17:47:27 2013] [<ffffffff8113b098>] handle_mm_fault+0x1f8/0x350
[Thu Jan 3 17:47:27 2013] [<ffffffff81659f9b>] do_page_fault+0x14b/0x520
[Thu Jan 3 17:47:27 2013] [<ffffffff811425fd>] ? mprotect_fixup+0x17d/0x2b0
[Thu Jan 3 17:47:27 2013] [<ffffffff81142920>] ? sys_mprotect+0x1f0/0x250
[Thu Jan 3 17:47:27 2013] [<ffffffff81656bf5>] page_fault+0x25/0x30
[Thu Jan 3 17:47:27 2013] Mem-Info:
[Thu Jan 3 17:47:27 2013] Node 0 DMA per-cpu:
[Thu Jan 3 17:47:27 2013] CPU 0: hi: 0, btch: 1 usd: 0
[Thu Jan 3 17:47:27 2013] Node 0 DMA32 per-cpu:
[Thu Jan 3 17:47:27 2013] CPU 0: hi: 186, btch: 31 usd: 0
[Thu Jan 3 17:47:27 2013] active_anon:142435 inactive_anon:14 isolated_anon:0
[Thu Jan 3 17:47:27 2013] active_file:0 inactive_file:11 isolated_file:0
[Thu Jan 3 17:47:27 2013] unevictable:0 dirty:0 writeback:0 unstable:0
[Thu Jan 3 17:47:27 2013] free:1389 slab_reclaimable:1528 slab_unreclaimable:1686
[Thu Jan 3 17:47:27 2013] mapped:2 shmem:45 pagetables:793 bounce:0
[Thu Jan 3 17:47:27 2013] Node 0 DMA free:2460kB min:72kB low:88kB high:108kB active_anon:12296kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14524kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:16kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Thu Jan 3 17:47:27 2013] lowmem_reserve[]: 0 597 597 597
[Thu Jan 3 17:47:27 2013] Node 0 DMA32 free:3096kB min:3088kB low:3860kB high:4632kB active_anon:557444kB inactive_anon:56kB active_file:0kB inactive_file:44kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:611856kB mlocked:0kB dirty:0kB writeback:0kB mapped:8kB shmem:180kB slab_reclaimable:6104kB slab_unreclaimable:6744kB kernel_stack:1024kB pagetables:3156kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:27445 all_unreclaimable? yes
[Thu Jan 3 17:47:27 2013] lowmem_reserve[]: 0 0 0 0
[Thu Jan 3 17:47:27 2013] Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2468kB
[Thu Jan 3 17:47:27 2013] Node 0 DMA32: 151*4kB 10*8kB 23*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 3100kB
[Thu Jan 3 17:47:27 2013] 55 total pagecache pages
[Thu Jan 3 17:47:27 2013] 0 pages in swap cache
[Thu Jan 3 17:47:27 2013] Swap cache stats: add 0, delete 0, find 0/0
[Thu Jan 3 17:47:27 2013] Free swap = 0kB
[Thu Jan 3 17:47:27 2013] Total swap = 0kB
[Thu Jan 3 17:47:27 2013] 159472 pages RAM
[Thu Jan 3 17:47:27 2013] 8383 pages reserved
[Thu Jan 3 17:47:27 2013] 261 pages shared
[Thu Jan 3 17:47:27 2013] 149349 pages non-shared
[Thu Jan 3 17:47:27 2013] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
[Thu Jan 3 17:47:27 2013] [ 238] 0 238 4306 47 0 0 0 upstart-udev-br
[Thu Jan 3 17:47:27 2013] [ 242] 0 242 5396 119 0 -17 -1000 udevd
[Thu Jan 3 17:47:27 2013] [ 287] 0 287 5362 99 0 -17 -1000 udevd
[Thu Jan 3 17:47:27 2013] [ 288] 0 288 5362 99 0 -17 -1000 udevd
[Thu Jan 3 17:47:27 2013] [ 361] 0 361 3795 48 0 0 0 upstart-socket-
[Thu Jan 3 17:47:27 2013] [ 419] 0 419 1814 123 0 0 0 dhclient3
[Thu Jan 3 17:47:27 2013] [ 643] 0 643 12487 151 0 -17 -1000 sshd
[Thu Jan 3 17:47:27 2013] [ 657] 101 657 63427 102 0 0 0 rsyslogd
[Thu Jan 3 17:47:27 2013] [ 663] 102 663 5981 89 0 0 0 dbus-daemon
[Thu Jan 3 17:47:27 2013] [ 725] 0 725 3624 42 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 732] 0 732 3624 41 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 741] 0 741 3624 42 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 743] 0 743 3624 41 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 747] 0 747 3624 41 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 755] 0 755 1080 37 0 0 0 acpid
[Thu Jan 3 17:47:27 2013] [ 756] 0 756 4776 50 0 0 0 cron
[Thu Jan 3 17:47:27 2013] [ 757] 0 757 4225 39 0 0 0 atd
[Thu Jan 3 17:47:27 2013] [ 787] 0 787 3624 41 0 0 0 getty
[Thu Jan 3 17:47:27 2013] [ 790] 103 790 46895 300 0 0 0 whoopsie
[Thu Jan 3 17:47:27 2013] [ 797] 0 797 20467 216 0 0 0 sshd
[Thu Jan 3 17:47:27 2013] [ 800] 0 800 146074 260 0 0 0 console-kit-dae
[Thu Jan 3 17:47:27 2013] [ 867] 0 867 46645 154 0 0 0 polkitd
[Thu Jan 3 17:47:27 2013] [ 983] 1000 983 20467 213 0 0 0 sshd
[Thu Jan 3 17:47:27 2013] [ 984] 1000 984 6557 1766 0 0 0 bash
[Thu Jan 3 17:47:27 2013] [ 1108] 1000 1108 163815 138085 0 0 0 python
[Thu Jan 3 17:47:27 2013] Out of memory: Kill process 1108 (python) score 915 or sacrifice child
[Thu Jan 3 17:47:27 2013] Killed process 1108 (python) total-vm:655260kB, anon-rss:552336kB, file-rss:4kB
Is there a way to profile the process to find out what's happening and what is causing the sudden spike in RAM usage? Thanks.
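One low-overhead way to narrow down when the spike happens is to log the process's peak resident set size at checkpoints in the processing loop, using the standard-library resource module (a minimal sketch; the log file name and checkpoint placement are up to you):

```python
import resource
import time

def log_memory(path="memory.log"):
    """Append a timestamped peak-RSS reading (in MB) to a log file.

    On Linux, ru_maxrss is reported in kilobytes.
    """
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    with open(path, "a") as f:
        f.write("%s peak_rss_mb=%.1f\n"
                % (time.strftime("%H:%M:%S"), peak_kb / 1024.0))

# Call log_memory() periodically in the processing loop, for example:
# for i, chunk in enumerate(chunks):
#     process(chunk)
#     if i % 100 == 0:
#         log_memory()
```

Comparing the last few log lines before the OOM kill against the input being processed at that moment often points directly at the offending data or code path.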
Best answer
I used Dowser to help track down memory usage in one of my projects. It runs as a simple web interface and produces a wealth of information that helps you locate the problem.
The Dowser blog gives a worked example, and the Dowser wiki has more details.
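If you are on (or can move to) Python 3.4+, the standard-library tracemalloc module offers a similar capability with no extra dependency: start tracing, run a chunk of the workload, then snapshot and rank allocations by source line. A minimal sketch, where the list comprehension stands in for the real processing step:

```python
import tracemalloc

# Begin recording the traceback of every new allocation
tracemalloc.start()

# ... run one chunk of the real workload here; this list stands in for it
data = [str(i) * 10 for i in range(10000)]

# Snapshot current allocations and group them by source line
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)
```

Taking a snapshot before and after the suspect phase and calling `snapshot2.compare_to(snapshot1, "lineno")` shows exactly which lines grew between the two points.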