Problem description
I'm using a spin lock to protect a very small critical section. Contention happens very rarely so a spin lock is more appropriate than a regular mutex.
My current code is as follows, and assumes x86 and GCC:
    volatile int exclusion = 0;

    void lock() {
        while (__sync_lock_test_and_set(&exclusion, 1)) {
            // Do nothing. This GCC builtin instruction
            // ensures memory barrier.
        }
    }

    void unlock() {
        __sync_synchronize(); // Memory barrier.
        exclusion = 0;
    }
So I'm wondering:
- Is this code correct? Does it correctly ensure mutual exclusion?
- Does it work on all x86 operating systems?
- Does it work on x86_64 too? On all operating systems?
- Is it optimal?
  - I've seen spin lock implementations using compare-and-swap but I'm not sure which is better.
  - According to the GCC atomic builtins documentation (http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html) there's also __sync_lock_release. I'm not an expert on memory barriers so I'm not sure whether it's okay for me to use this instead of __sync_synchronize.
  - I'm optimizing for the case in which there's no contention.
I do not care at all about contention. There may be one, maybe two, other threads trying to lock the spin lock once every few days.
Recommended answer
So I'm wondering:
* Is it correct?
In the context mentioned, I would say yes.
* Is it optimal?
That's a loaded question. By reinventing the wheel you are also reinventing a lot of problems that have been solved by other implementations:
- I'd expect a waste loop on failure in which you aren't trying to access the lock word.
- The unlock uses a full barrier, but it only needs release semantics (that's why you'd use __sync_lock_release, so that you'd get st1.rel on Itanium instead of mf, or an lwsync on PowerPC, ...). If you really only care about x86 or x86_64, the type of barrier used here doesn't matter as much (but if you were to make the jump to Intel's Itanium for an HP-IPF port, then you wouldn't want this).
- You don't have the pause() instruction that you'd normally put before your waste loop.
- When there is contention you want something: semop, or even a dumb sleep in desperation. If you really need the performance that this buys you, then the futex suggestion is probably a good one. If you need the performance this buys you badly enough to maintain this code, you have a lot of research to do.
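As a rough illustration of the first three points, here is a minimal sketch, assuming GCC (or a compatible compiler) and primarily x86 or x86_64. The names spin_try_lock, spin_lock, spin_unlock and cpu_relax are hypothetical and not part of the question's code. It waits by only reading the lock word, issues the x86 pause hint inside that wait loop (one common placement, chosen here as an assumption), and releases with __sync_lock_release instead of a full barrier. It deliberately has no fallback for sustained contention.

```c
#include <assert.h>

/* Sketch only: assumes GCC or a compatible compiler. */
static volatile int exclusion = 0;

/* Hypothetical helper: spin-wait hint for the CPU. */
static inline void cpu_relax(void) {
#if defined(__i386__) || defined(__x86_64__)
    __builtin_ia32_pause();                 /* x86 PAUSE instruction */
#else
    __asm__ __volatile__("" ::: "memory");  /* compiler barrier only */
#endif
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static inline int spin_try_lock(void) {
    /* Atomic exchange with acquire semantics; yields the previous value. */
    return __sync_lock_test_and_set(&exclusion, 1) == 0;
}

static inline void spin_lock(void) {
    while (!spin_try_lock()) {
        /* Waste loop: only read the lock word until it looks free,
           so waiters do not keep issuing atomic exchanges. */
        while (exclusion) {
            cpu_relax();
        }
    }
}

static inline void spin_unlock(void) {
    /* Debug check: unlocking a lock that is not held is a bug.
       Compiled out when NDEBUG is defined. */
    assert(exclusion != 0);
    /* Release barrier followed by a store of 0. */
    __sync_lock_release(&exclusion);
}
```

A caller would wrap the critical section as spin_lock(); ...; spin_unlock();. In the uncontended case this sketch performs one atomic exchange to lock and one plain store to unlock.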
Note that there was a comment saying that the release barrier wasn't required. That isn't true even on x86, because the release barrier also serves as an instruction to the compiler not to shuffle other memory accesses around the "barrier". It is very much like what you'd get if you used asm ("" ::: "memory").
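To make the compiler-ordering point concrete, here is a minimal sketch, assuming GCC on x86 or x86_64 only. The names compiler_barrier, unlock_x86_only, shared_counter and increment_under_lock are hypothetical. The empty asm statement emits no instruction; its "memory" clobber only stops the compiler from moving memory accesses across it. This unlock relies on x86 not reordering a store ahead of earlier loads and stores in hardware, so it should not be taken as portable.

```c
#include <assert.h>

static volatile int exclusion = 0;
static int shared_counter = 0;  /* hypothetical data guarded by the lock */

/* Emits no machine instruction; only constrains the compiler. */
static inline void compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

static inline void lock(void) {
    while (__sync_lock_test_and_set(&exclusion, 1)) {
        /* spin */
    }
}

/* Assumption: x86/x86_64 only, where hardware keeps this store
   ordered after the earlier accesses of the critical section. */
static inline void unlock_x86_only(void) {
    compiler_barrier();  /* keep the critical section above this line */
    exclusion = 0;
}

static int increment_under_lock(void) {
    int value;
    lock();
    assert(exclusion == 1);    /* we hold the lock here */
    value = ++shared_counter;  /* must not be moved below the unlock */
    unlock_x86_only();
    return value;
}
```

Without the barrier, the compiler would be free to move the update of shared_counter past the store to exclusion, because shared_counter itself is not volatile.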
* on compare and swap
On x86, __sync_lock_test_and_set will map to an xchg instruction, which has an implied lock prefix. It is definitely the most compact generated code (especially if you use a byte for the "lock word" instead of an int), but it is no more correct than using LOCK CMPXCHG. Compare and swap can be used for fancier algorithms (like putting a non-zero pointer to metadata for the first "waiter" into the lock word on failure).
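For comparison, here is a minimal compare-and-swap sketch, assuming GCC's __sync_bool_compare_and_swap builtin. The names lock_word, cas_try_lock, cas_lock, cas_unlock, cas_owner and the owner_id scheme are hypothetical. It is a much simpler cousin of the "metadata in the lock word" idea: the word holds a non-zero owner id rather than just 1, which a plain exchange could not do safely because a failed attempt would overwrite the current owner's id.

```c
#include <assert.h>

/* 0 means free; any other value is the hypothetical id of the owner. */
static volatile int lock_word = 0;

/* Returns 1 on success, 0 if the lock is already held. */
static inline int cas_try_lock(int owner_id) {
    assert(owner_id != 0);  /* 0 is reserved for "free" */
    /* Writes owner_id only if the word is still 0 (full barrier). */
    return __sync_bool_compare_and_swap(&lock_word, 0, owner_id);
}

static inline void cas_lock(int owner_id) {
    while (!cas_try_lock(owner_id)) {
        /* Read-only wait until the word looks free again. */
        while (lock_word) {
            /* spin */
        }
    }
}

static inline void cas_unlock(void) {
    /* Release barrier followed by a store of 0. */
    __sync_lock_release(&lock_word);
}

/* Current owner id, or 0 if the lock is free. */
static inline int cas_owner(void) {
    return lock_word;
}
```

A failed cas_try_lock leaves the lock word untouched, so cas_owner still reports the thread that actually holds the lock.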