init never reaps zombies



Problem description

On my Fedora Core 9 webserver with kernel 2.6.18, init isn't reaping zombie processes. This would be bearable if the process table didn't eventually reach its upper limit, at which point no new processes can be allocated.

Sample output of ps -el | grep 'Z':

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
5 Z     0  2648     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z    51  2656     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z     0  2670     1  0  75   0 -     0 exit   ?        00:00:02 crond <defunct>
4 Z     0  2874     1  0  82   0 -     0 exit   ?        00:00:00 mysqld_safe <defunct>
5 Z     0 28104     1  0  76   0 -     0 exit   ?        00:00:00 httpd <defunct>
5 Z     0 28716     1  0  76   0 -     0 exit   ?        00:00:06 lfd <defunct>
5 Z    74 10172     1  0  75   0 -     0 exit   ?        00:00:00 sshd <defunct>
5 Z     0 11199     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11202     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11205     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11208     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11211     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11240     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11246     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11249     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
5 Z     0 11252     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>
1 Z     0 14106     1  0  80   0 -     0 exit   ?        00:00:00 anacron <defunct>
5 Z     0 14631     1  0  75   0 -     0 exit   ?        00:00:00 sendmail <defunct>

Is this an OS bug? A misconfiguration? I'm looking for inspiration as to the source of this problem. Thanks.

Recommended answer

This has hit me on Ubuntu in 2 ways:

  1. Something wrong with the kernel. In my case a kernel driver had crashed and process internals went bonkers. The best way to test this is to check /var/log/syslog (and dmesg) for anything that looks awry, for example "BUG: unable to handle kernel NULL pointer dereference at 0000000000000028".
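The log check described above could be automated along these lines. This is only a rough sketch: the marker list in KERNEL_TROUBLE is my own assumption about what kernel trouble usually looks like in a log, and find_kernel_trouble and read_dmesg are hypothetical helper names, not part of any existing tool. It assumes Python 3.7 or later and a dmesg binary on the PATH.

```python
import re
import subprocess

# Strings that often show up in kernel bug/oops reports.
# Illustrative assumption only; this is not an exhaustive list.
KERNEL_TROUBLE = re.compile(
    r"\bBUG:|\bOops\b|Kernel panic|general protection fault|Call Trace:"
)


def find_kernel_trouble(log_text):
    """Return the lines of log_text that contain a known trouble marker."""
    return [line for line in log_text.splitlines() if KERNEL_TROUBLE.search(line)]


def read_dmesg():
    """Return the kernel ring buffer as text, or '' if dmesg cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout
```

Printing each line of find_kernel_trouble(read_dmesg()) gives a quick first look. The same function can be fed the contents of /var/log/syslog, read with whatever privileges that file requires on your system.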

The other time I've seen this is when init is not the "parent of the child process for most purposes" (actual manpage quote). This can happen when you use the ptrace syscall (which the strace program uses internally) to attach to a process. For instance, I've gotten into a situation where I attached strace to child process B. Eventually, process B terminated, as did its parent (not sure in what order). Process B then looked like a zombie owned by init. However, its "most purposes" parent was actually the strace program. After killing the strace, process B was reaped.
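One way to look for the tracer scenario above is to read /proc/<pid>/status, which on Linux reports both PPid and TracerPid (0 when the process is not being traced). The sketch below is an assumption-laden illustration, not something from the original answer: parse_status, describe_zombie and scan_proc are hypothetical names, and I am assuming the status file of a zombie still carries its TracerPid field on your kernel.

```python
import os


def parse_status(status_text):
    """Parse the 'Key: value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_zombie(status_text):
    """Return (name, ppid, tracer_pid) if the process is a zombie, else None."""
    fields = parse_status(status_text)
    if not fields.get("State", "").startswith("Z"):
        return None
    return (
        fields.get("Name", "?"),
        int(fields.get("PPid", "0")),
        int(fields.get("TracerPid", "0")),
    )


def scan_proc(proc_root="/proc"):
    """Yield (pid, name, ppid, tracer_pid) for every zombie under proc_root."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                info = describe_zombie(handle.read())
        except OSError:
            continue  # the process vanished or its status file is unreadable
        if info is not None:
            yield (int(entry),) + info
```

A zombie that shows a PPid of 1 together with a nonzero tracer PID would be consistent with the strace situation described above. A tracer PID of 0 points back toward init itself or the kernel.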


08-23 05:33