Question
I tried my hand at multiprocessing in Python: I ran a simple map over a function both sequentially and through a pool, and benchmarked the two. Strangely, the version that uses a pool of 4 workers actually took more time. Here is my baseline code:
from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool

def square(x):
    return x*x

# Python 2: xrange is lazy, and map() builds the full result list right away
l = xrange(10000000)
map(square, l)
Executing this code took about 1.5 seconds.
Then I created a pool of 4 workers using the following code:
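from datetime import datetime
from multiprocessing.dummy import Pool as ThreadPool

def square(x):
    return x*x

l = xrange(10000000)
# multiprocessing.dummy provides a thread-backed Pool with the same API
pool = ThreadPool(4)
results = pool.map(square, l)
pool.close()   # no more tasks will be submitted
pool.join()    # wait for the workers to finish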
When I benchmarked it, the pooled code actually took more time (around 2.5 seconds). Since this is a CPU-bound task, I am confused about why it took longer when it should have taken less time. Any views on what I am doing wrong?
Edit: I replaced multiprocessing.dummy with multiprocessing and it was still slower. In fact, it was even slower.
Recommended answer
This is not surprising. Your test is a very poor test. Threads are meant for long-running tasks, but what you are testing is a function that returns almost instantly. The dominant cost here is the overhead of setting up the threads and handing work to them, and that far outweighs any benefit you could possibly get from threading.
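As a rough illustration of the answer above, here is a minimal sketch written for Python 3 (the question's code is Python 2). It assumes process-based workers from multiprocessing rather than the thread-backed multiprocessing.dummy, because threads in CPython share the global interpreter lock and so do not run CPU-bound Python code in parallel. The helpers make_chunks, sum_of_squares and timed are hypothetical names introduced only for this example. Each task covers a large slice of the range, so the per-task overhead is paid 4 times instead of once per number. No timings are claimed here; the results depend on the machine.

```python
import time
from multiprocessing import Pool


def square(x):
    return x * x


def sum_of_squares(bounds):
    """One task = one big slice of work, so dispatch overhead is amortized."""
    start, stop = bounds
    return sum(square(i) for i in range(start, stop))


def make_chunks(n, parts):
    """Split range(n) into up to `parts` contiguous (start, stop) pairs."""
    step = max(1, (n + parts - 1) // parts)
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def timed(label, fn):
    """Run fn(), print how long it took, and return its result."""
    t0 = time.perf_counter()
    result = fn()
    print("%s: %.3f s" % (label, time.perf_counter() - t0))
    return result


if __name__ == "__main__":
    n = 2000000  # smaller than the question's 10,000,000 to keep the run short

    serial = timed("serial", lambda: sum(square(i) for i in range(n)))

    with Pool(4) as pool:
        parallel = timed(
            "4 processes, 4 large tasks",
            lambda: sum(pool.map(sum_of_squares, make_chunks(n, 4))),
        )

    # Both approaches must compute the same value
    assert serial == parallel
```

Whether the pooled version wins depends on how much work each task carries relative to the cost of starting workers and moving data between processes. With a function as cheap as square, submitting one number per task leaves almost nothing to parallelize.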