Problem description
I have a Python application that grabs a collection of data and performs a task for each piece of data in that collection. The task takes some time to complete because there is a delay involved. Because of this delay, I don't want the tasks to run one after another; I want them all to happen in parallel. Should I be using multiprocessing or threading for this?
I attempted to use threading but had some trouble: often some of the tasks would never actually fire.
Recommended answer
If you are truly compute bound, using the multiprocessing module is probably the lightest-weight solution (in terms of both memory consumption and implementation difficulty).
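As a rough illustration of the compute-bound case, here is a minimal sketch using multiprocessing.Pool. The count_primes task, the run_parallel helper, and the input sizes are hypothetical stand-ins, not part of the original question; substitute your own CPU-heavy function.

```python
import multiprocessing


def count_primes(limit):
    """Hypothetical CPU-heavy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_parallel(limits, processes=None):
    """Run count_primes on every item, spread across worker processes.

    processes=None lets the pool pick a worker count based on the CPU count.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        # map() blocks until every item is processed and keeps input order.
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing this
    # module, unguarded pool creation would run again in every child process.
    print(run_parallel([10_000, 20_000, 30_000]))
```

The task function is defined at module level because the pool has to pickle a reference to it in order to send work to the child processes; lambdas and nested functions generally will not work here.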
If you are I/O bound, using the threading module will usually give you good results. Make sure that you use thread-safe storage (such as Queue) to hand data to your threads. Alternatively, hand each thread a single piece of data that is unique to it when it is spawned.
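The queue-based handoff described above could look something like the following sketch. The slow_task function (which just sleeps to simulate a delay), the run_threaded helper, and the worker count are assumptions made for illustration. Joining every thread before reading the results is one way to make sure no task is skipped because the program moved on too early; whether that was the cause of the problem in the question is only a guess.

```python
import queue
import threading
import time


def slow_task(item):
    """Hypothetical I/O-bound task: the sleep stands in for a network or disk delay."""
    time.sleep(0.05)
    return item * 2


def worker(tasks, results):
    """Pull items from the task queue until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put((item, slow_task(item)))


def run_threaded(items, num_workers=4):
    """Process all items on a fixed set of worker threads; return {item: result}."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits its loop
    for thread in threads:
        thread.join()  # wait until every worker has finished its items
    collected = {}
    # Safe to drain here: all producers have exited, so the queue cannot grow.
    while not results.empty():
        item, value = results.get()
        collected[item] = value
    return collected


if __name__ == "__main__":
    print(run_threaded([1, 2, 3, 4, 5]))
```

Because the workers spend most of their time waiting, the total run time is roughly the per-item delay multiplied by the number of items divided by the number of workers, rather than the sum of all delays.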
PyPy is focused on performance. It has a number of features that can help with compute-bound processing. It also has support for Software Transactional Memory, although that is not yet production quality. The promise is that you can use simpler parallel or concurrent mechanisms than multiprocessing (which has some awkward requirements).
Stackless Python is also a nice idea, although Stackless has portability issues. Unladen Swallow was promising, but is now defunct. Pyston is another (unfinished) Python implementation focusing on speed. It takes a different approach from PyPy, which may yield better (or just different) speedups.