Problem description
While using multiprocessing in Python on Windows, you are expected to protect the entry point of the program. The documentation says: "Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process)". Can anyone explain what exactly this means?
Recommended answer
Expanding a bit on the good answer you already got, it helps if you understand what Linux-y systems do. They spawn new processes using fork(), which has two good consequences:
- All data structures existing in the main program are visible to the child processes. They actually work on copies of the data.
- The child processes start executing at the instruction immediately following the fork() in the main program, so any module-level code already executed in the module will not be executed again.
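The two consequences above can be observed with a small sketch. It assumes a POSIX system where the "fork" start method is available; the names STATE, child, and run_fork_demo are hypothetical and chosen only for illustration.

```python
import multiprocessing as mp
import os

# Module-level code: under fork this runs once, in the parent only.
print(f"module-level code running in pid {os.getpid()}")

STATE = {"items": []}  # hypothetical module-level data structure


def child(queue):
    # The forked child starts with a copy of the parent's memory, so it
    # sees the items the parent stored at run time.
    STATE["items"].append("child-only")  # changes the child's copy only
    queue.put(list(STATE["items"]))


def run_fork_demo():
    ctx = mp.get_context("fork")  # POSIX only; raises ValueError on Windows
    STATE["items"] = ["set", "at", "run time"]
    queue = ctx.Queue()
    proc = ctx.Process(target=child, args=(queue,))
    proc.start()
    seen_by_child = queue.get()  # read the result before joining the child
    proc.join()
    return seen_by_child, list(STATE["items"])


if __name__ == "__main__":
    seen_by_child, parent_view = run_fork_demo()
    print("child saw: ", seen_by_child)
    print("parent has:", parent_view)
```

The child reports the run-time items plus its own addition, while the parent's list is unchanged, which is the "copies of the data" point. The module-level print appears only once because the forked child does not re-import the module.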
fork() isn't possible in Windows, so on Windows each module is imported anew by each child process. So:
- On Windows, no data structures existing in the main program are visible to the child processes; and,
- All module-level code is executed in each child process.
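A rough way to see the re-import on any platform is to request the "spawn" start method, which is what Windows uses. The sketch below assumes it is saved and run as a script file; the describe helper is hypothetical, and the builtin print is used as the child's target only to keep the example short.

```python
import multiprocessing as mp
import os


def describe(module_name):
    """Label who is executing this module's top-level code."""
    if module_name == "__main__":
        return "main program"
    if module_name == "__mp_main__":
        # multiprocessing re-imports the main script under this name
        return "spawned child re-importing the main module"
    return "ordinary import"


# Module-level code: with spawn, this line runs again in every child.
print(f"[pid {os.getpid()}] module-level code run by: {describe(__name__)}")

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # the start method Windows uses
    proc = ctx.Process(target=print, args=("child target ran",))
    proc.start()
    proc.join()
```

Run as a script, the module-level line should be printed twice with different pids: once by the main program and once by the child. Only the guarded block is skipped in the child.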
So you need to think a bit about which code you want executed only in the main program. The most obvious example is that you want code that creates child processes to run only in the main program, so that should be protected by __name__ == '__main__'. For a subtler example, consider code that builds a gigantic list, which you intend to pass out to worker processes to crawl over. You probably want to protect that too, because there's no point in making each worker process waste RAM and time building its own useless copy of the gigantic list.
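A minimal sketch of both protections follows, assuming a list small enough to run quickly. The function names build_chunks and main are hypothetical, and the builtin sum stands in for whatever crawling the workers would really do.

```python
import multiprocessing as mp


def build_chunks(n, size):
    """Build the 'gigantic' list (kept small here) and split it into chunks."""
    data = list(range(n))
    return [data[i:i + size] for i in range(0, n, size)]


def main():
    # Built only in the main program. A spawned child re-imports this
    # module but never calls main(), so it does not rebuild the list.
    chunks = build_chunks(100_000, 25_000)
    ctx = mp.get_context("spawn")  # mimic the Windows start method on any OS
    with ctx.Pool(processes=2) as pool:
        partial_sums = pool.map(sum, chunks)  # each worker gets only its chunk
    return sum(partial_sums)


if __name__ == "__main__":
    # Process creation and the expensive setup both live behind the guard.
    print(main())
```

If the call to build_chunks were moved to module level instead, every worker would rebuild the whole list during its import of the main module, which is the waste described above.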
Note that it's a Good Idea to use __name__ == "__main__" appropriately even on Linux-y systems, because it makes the intended division of work clearer. Parallel programs can be confusing - every little bit helps ;-)