Question
I suspect this is a silly question, but I still can't find an answer to it. It is actually better split into two questions:
1) Am I right that we can have many threads, but because of the GIL only one thread is executing at any given moment?
2) If so, why do we still need locks? We use locks to avoid the case where two threads try to read/write some shared object, but because of the GIL two threads can't execute at the same moment, can they?
Recommended answer
The GIL protects the Python internals. That means:
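- you don't have to worry about something inside the interpreter going wrong because of multithreading
- most things do not really run in parallel, because Python code is executed sequentially due to the GIL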
But the GIL does not protect your own code. For example, if you have this code:
self.some_number += 1
That reads the value of self.some_number, calculates some_number+1, and then writes the result back to self.some_number.
If you do that in two threads, the operations (read, add, write) of one thread may be interleaved with those of the other, so that the result is wrong.
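One way to see that the statement is not a single step is to disassemble it. This is a small sketch assuming CPython; the class `Holder` and the method `bump` are hypothetical names used only for illustration, and the exact instruction names differ between interpreter versions. The point is only that a separate load and store appear, and the interpreter can switch threads between bytecode instructions (where it may do so depends on the version).

```python
import dis


class Holder:
    """Hypothetical stand-in for the object that owns some_number."""

    def __init__(self):
        self.some_number = 0

    def bump(self):
        self.some_number += 1


# Collect the bytecode instruction names for bump(); the exact list
# varies between CPython versions, but it is always more than one step.
opnames = [ins.opname for ins in dis.get_instructions(Holder.bump)]
print(opnames)
```

The printed list contains an attribute load, an add, and an attribute store as distinct instructions, which matches the read, add, write description above.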
This could be the order of execution:
- thread1 reads self.some_number (0)
- thread2 reads self.some_number (0)
- thread1 calculates some_number+1 (1)
- thread2 calculates some_number+1 (1)
- thread1 writes 1 to self.some_number
- thread2 writes 1 to self.some_number
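The interleaving above can be replayed by hand in a single thread, which makes the lost update easy to see. This is only an illustrative sketch: the variable names are made up, and no real threads are involved.

```python
from types import SimpleNamespace

# Hypothetical shared object with the counter from the example.
obj = SimpleNamespace(some_number=0)

seen_by_thread1 = obj.some_number      # thread1 reads 0
seen_by_thread2 = obj.some_number      # thread2 reads 0
result_thread1 = seen_by_thread1 + 1   # thread1 calculates 1
result_thread2 = seen_by_thread2 + 1   # thread2 calculates 1
obj.some_number = result_thread1       # thread1 writes 1
obj.some_number = result_thread2       # thread2 writes 1, overwriting thread1's update

print(obj.some_number)  # prints 1, although two increments were performed
```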
You use locks to enforce this order of execution:
- thread1 reads self.some_number (0)
- thread1 calculates some_number+1 (1)
- thread1 writes 1 to self.some_number
- thread2 reads self.some_number (1)
- thread2 calculates some_number+1 (2)
- thread2 writes 2 to self.some_number
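A minimal sketch of how a lock enforces that order for the self.some_number example. The `Counter` class and the `run_increments` helper are hypothetical names introduced here for illustration; only `threading.Lock` and `threading.Thread` come from the standard library.

```python
import threading


class Counter:
    """Hypothetical counter whose increment is guarded by a lock."""

    def __init__(self):
        self.some_number = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time can be inside this block, so the
        # read, add, and write of one thread cannot interleave with another's.
        with self._lock:
            self.some_number += 1


def run_increments(counter, n_threads, n_per_thread):
    """Increment the counter from several threads and return the final value."""

    def worker():
        for _ in range(n_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.some_number


print(run_increments(Counter(), 8, 10000))
```

Using `with self._lock:` releases the lock even if the guarded code raises, which is why it is usually preferred over separate acquire() and release() calls.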
Let's complete this answer with some code that demonstrates this behaviour:
import threading
import time

total = 0
lock = threading.Lock()


def increment_n_times(n):
    # Unsafe: the read-add-write on total is not protected.
    global total
    for i in range(n):
        total += 1


def safe_increment_n_times(n):
    # Safe: the lock makes each read-add-write exclusive.
    global total
    for i in range(n):
        with lock:
            total += 1


def increment_in_x_threads(x, func, n):
    # Run func(n) in x threads and report how many increments were lost.
    threads = [threading.Thread(target=func, args=(n,)) for i in range(x)]
    global total
    total = 0
    begin = time.time()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print('finished in {}s.\ntotal: {}\nexpected: {}\ndifference: {} ({} %)'
          .format(time.time()-begin, total, n*x, n*x-total, 100-total/n/x*100))
There are two functions that implement the increment. One uses a lock and the other does not.
The function increment_in_x_threads runs the given incrementing function concurrently in many threads.
Running this with a big enough number of threads makes it almost certain that an error will occur:
print('unsafe:')
increment_in_x_threads(70, increment_n_times, 100000)
print('\nwith locks:')
increment_in_x_threads(70, safe_increment_n_times, 100000)
In my case, it printed:
unsafe:
finished in 0.9840562343597412s.
total: 4654584
expected: 7000000
difference: 2345416 (33.505942857142855 %)
with locks:
finished in 20.564176082611084s.
total: 7000000
expected: 7000000
difference: 0 (0.0 %)
So without locks, there were many errors (about 33% of the increments were lost). With locks, on the other hand, it was about 20 times slower.
Of course, both numbers are exaggerated because I used 70 threads, but this shows the general idea.
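Much of the slowdown comes from taking the lock once per increment. One possible way to reduce it, sketched here under the assumption that other threads do not need to see intermediate values, is to count in a local variable and take the lock only once per thread. The names `batched_increment_n_times` and `run_batched` are hypothetical and not part of the code above.

```python
import threading

total = 0
lock = threading.Lock()


def batched_increment_n_times(n):
    # Count in a local variable, which no other thread can touch,
    # then add the result to the shared total under the lock once.
    global total
    local_count = 0
    for _ in range(n):
        local_count += 1
    with lock:
        total += local_count


def run_batched(x, n):
    """Run the batched increment in x threads and return the shared total."""
    global total
    total = 0
    threads = [threading.Thread(target=batched_increment_n_times, args=(n,))
               for _ in range(x)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


print(run_batched(8, 10000))
```

The shared variable is still only modified while the lock is held, so the result stays correct, but the lock is acquired x times instead of n*x times.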