

I know that, given enough context, one could hope to handle a segfault condition constructively (i.e. recover from it).


But is the effort worth it? If so, in what situation(s)?



You can't really hope to recover from a segfault. You can detect that it happened, and dump out relevant application-specific state if possible, but you can't continue the process. This is because (among other reasons):

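  • The thread which failed cannot be continued, so your only options are longjmp or terminating the thread. Neither is safe in most cases.
  • Either way, you may leave a mutex / lock in a locked state, which causes other threads to wait forever.
  • Even if that doesn't happen, you may leak resources.
  • Even if you don't do any of the above, the thread which segfaulted may have left the application's internal state inconsistent when it failed. An inconsistent internal state could cause data errors or further bad behaviour later on, which causes more problems than simply quitting.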


So in general, there is no point in trapping it and doing anything EXCEPT terminating the process in a fairly abrupt fashion. There's no point in attempting to write (important) data back to disk, or in continuing to do other useful work. There is some point in dumping out state to logs, which many applications do, and then quitting.

A possibly useful thing to do might be to exec() your own process, or to have a watchdog process which restarts it in the case of a crash. (NB: exec does not always have well-defined behaviour if your process has more than one thread.)
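The watchdog idea could look something like the sketch below, assuming a POSIX system. The function name `supervise` and the `max_restarts` limit are my own inventions for illustration. The watchdog forks, execs the real program in the child, and starts it again only when the child was killed by a signal (such as SIGSEGV).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv as a child process and restarts it each time it is killed by a
 * signal, up to max_restarts times. Returns the number of restarts that
 * were performed, or -1 if fork() or waitpid() fails. */
int supervise(char *const argv[], int max_restarts)
{
    int restarts = 0;
    for (;;) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127); /* Only reached if exec itself failed. */
        }

        int status;
        pid_t waited;
        do {
            waited = waitpid(pid, &status, 0);
        } while (waited < 0 && errno == EINTR);
        if (waited < 0)
            return -1;

        if (!WIFSIGNALED(status))
            return restarts; /* Normal exit: nothing to restart. */

        fprintf(stderr, "child %ld killed by signal %d\n",
                (long)pid, WTERMSIG(status));
        if (restarts >= max_restarts)
            return restarts; /* Give up instead of looping forever. */
        restarts++;
        /* A real watchdog would probably back off here before restarting. */
    }
}
```

Because the watchdog is a separate process, it shares none of the crashed process's possibly corrupted state, which is what makes restarting safer than trying to carry on inside the process that faulted.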

