How is thread synchronization implemented at the assembly language level?
Question
While I'm familiar with concurrent programming concepts such as mutexes and semaphores, I have never understood how they are implemented at the assembly language level.
I imagine there is a set of memory flags saying:
- lock A is held by thread 1
- lock B is held by thread 3
- lock C is not held by any thread
- etc
But how is access to these flags synchronized between threads? Something like this naive example would only create a race condition:
mov edx, [myThreadId]
wait:
cmp [lock], 0
jne wait
mov [lock], edx
; I wanted an exclusive lock but the above
; three instructions are not an atomic operation :(
Recommended answer
- In practice, these tend to be implemented with CAS (compare-and-swap) and LL/SC (load-linked/store-conditional), plus some spinning before giving up the thread's time slice, usually by calling into a kernel function that switches context.
- If you only need a spinlock, Wikipedia gives an example that trades CAS for a lock-prefixed xchg on x86/x64. So, strictly speaking, CAS is not needed to build a spinlock, but some kind of atomicity is still required. Here it comes from an atomic operation that writes a register to memory and returns the previous contents of that memory slot in a single step. (To clarify a bit more: the lock prefix asserts the #LOCK signal, which ensures that the current CPU has exclusive access to the memory. On today's CPUs it is not necessarily carried out this way, but the effect is the same. On x86, xchg with a memory operand is locked implicitly, so the prefix is redundant there. By using xchg we make sure that we will not get preempted somewhere between reading and writing, since an instruction is not interrupted halfway. If we had an imaginary lock mov reg0, mem / lock mov mem, reg1 pair (which we don't), that would not be quite the same: it could be preempted between the two movs.)
- On current architectures, as pointed out in the comments, you mostly end up using the atomic primitives of the CPU and the coherency protocols provided by the memory subsystem.
- For this reason, you not only have to use these primitives, but also account for the cache/memory coherency guaranteed by the architecture.
- There may be implementation nuances as well. Considering e.g. a spinlock:
  - instead of a naive implementation, you should probably use e.g. a TTAS (test-and-test-and-set) spinlock with some exponential backoff,
  - on a Hyper-Threaded CPU, you should probably issue pause instructions as hints that you are spinning, so that the core you are running on can do something useful in the meantime,
  - you should really give up on spinning and yield control to other threads after a while,
  - etc.
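The points above (an atomic exchange instead of separate cmp/mov, TTAS, exponential backoff, pause, and eventually yielding) can be sketched in C11 rather than raw assembly. This is only a minimal illustration under stated assumptions, not a production lock: it assumes a POSIX system with pthreads, and the names `demo_spinlock`, `tas_lock`, `ttas_lock`, `spin_unlock`, `run_counter_demo` and the constants are hypothetical. On x86, compilers typically lower `atomic_exchange` to `xchg` and `_mm_pause()` to `pause`.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#if defined(__x86_64__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()  /* emits the x86 pause instruction */
#else
#define cpu_relax() ((void)0)    /* assumption: no spin hint on other targets */
#endif

/* Hypothetical names and sizes, chosen for this sketch only. */
enum { THREADS = 4, ITERS = 50000, MAX_BACKOFF = 1024 };

typedef struct { atomic_int held; } demo_spinlock;  /* 0 = free, 1 = held */

/* Plain test-and-set: every attempt is an atomic exchange. */
static void tas_lock(demo_spinlock *l) {
    while (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) != 0)
        cpu_relax();
}

/* TTAS: after a failed exchange, spin on a plain load and retry the
 * exchange only once the lock looks free. */
static void ttas_lock(demo_spinlock *l) {
    unsigned backoff = 1;
    for (;;) {
        if (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0)
            return;  /* previous value was 0, so we now own the lock */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0) {
            for (unsigned i = 0; i < backoff; i++)
                cpu_relax();
            if (backoff < MAX_BACKOFF)
                backoff *= 2;   /* exponential backoff */
            else
                sched_yield();  /* stop burning CPU, let other threads run */
        }
    }
}

/* Release store; on x86 this is an ordinary mov. */
static void spin_unlock(demo_spinlock *l) {
    atomic_store_explicit(&l->held, 0, memory_order_release);
}

static demo_spinlock demo_lock;
static long counter;   /* deliberately non-atomic: protected by demo_lock */
static int use_ttas;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        if (use_ttas) ttas_lock(&demo_lock); else tas_lock(&demo_lock);
        counter++;
        spin_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs THREADS workers and returns the final counter value. */
long run_counter_demo(int ttas) {
    pthread_t t[THREADS];
    counter = 0;
    use_ttas = ttas;
    atomic_init(&demo_lock.held, 0);
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Calling `run_counter_demo(0)` or `run_counter_demo(1)` from a `main` function should return `THREADS * ITERS` in both cases, because each increment happens while the lock is held. Replacing the exchange with a separate load and store, as in the naive example from the question, would reintroduce the race.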