I have an event-driven application written in Python. After a while (usually more than a week) it just stops responding to events. When this happens, I hit Ctrl-C and re-run it, and all is well again. However, it's annoying that this keeps happening, and I have no idea what's causing it. Is there a way to run my application so that, when this occurs and the application is no longer accepting connections, I can drop into a debugger and see what it's doing and why it's not taking connections?
I've used pdb before, but the way I've used it (if condition: pdb.set_trace()) doesn't really apply here, because I have no idea where in the code it is when it fails. Ideally, instead of Ctrl-C I would hit Ctrl-somethingelse, and that would make it stop and drop into the debugger. Is such a thing easily done?
Triggering pdb in your case is probably not simple. However, whenever I need to debug such hangs, I inspect a "snapshot" of the tracebacks of all the threads in the process, using a helper function, dumpstacks().
You can either use a timer to call it periodically and write the output to a log file, which you then consult when you notice the hang, or harness some RPC mechanism (e.g. signals) to trigger the function call in your process on demand. I usually do the latter, because the processes in my system already listen for such RPC requests (using rpyc).
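For illustration, here is a minimal sketch of what such a dumpstacks() helper might look like, together with the signal-based trigger. It is not necessarily the exact function I use: the body is an assumption built on sys._current_frames() and traceback.format_stack(), and install_dump_handler() and the choice of SIGUSR1 are hypothetical names and choices made up for this example (SIGUSR1 exists on Unix only).

```python
import signal
import sys
import threading
import traceback


def dumpstacks(signum=None, frame=None):
    """Print the current stack of every thread to stderr and return it as text.

    The (signum, frame) parameters let this double as a signal handler;
    both are ignored, so it can also be called directly or from a timer.
    """
    # Map thread ids to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, stack in sys._current_frames().items():
        lines.append("# Thread: %s (%d)" % (names.get(thread_id, "?"), thread_id))
        # format_stack() returns entries with embedded newlines; trim the last one.
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(stack))
    report = "\n".join(lines)
    print(report, file=sys.stderr, flush=True)
    return report


def install_dump_handler():
    """Hypothetical helper: dump all stacks when the process receives SIGUSR1.

    Must be called from the main thread. Does nothing on platforms
    without SIGUSR1 (e.g. Windows) and returns False there.
    """
    signum = getattr(signal, "SIGUSR1", None)
    if signum is None:
        return False
    signal.signal(signum, dumpstacks)
    return True
```

With this in place you would call install_dump_handler() once at startup, and when the application hangs, run kill -USR1 <pid> from another terminal to get the tracebacks on stderr without stopping the process. Note that Python runs signal handlers in the main thread between bytecode instructions, so if the main thread is stuck inside a blocking C call the dump may be delayed until that call returns.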