Question
In an earlier linked question, "achieve GCC cas function for version 4.1.2 and earlier", I asked how to use a compare_and_swap function to implement the built-in function __sync_fetch_and_add. Here is my final code. It runs well on x86 and x64 (tested on CentOS 5.0 32-bit and CentOS 7 64-bit).
Here is my code:
    #include <string.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <pthread.h>

    static unsigned long count = 0;

    /* Compare-and-swap: if *reg == oldval, store oldval + incre into *reg.
       Returns 1 if the store happened, 0 if *reg had changed. */
    int sync_add_and_fetch(int *reg, int oldval, int incre)
    {
        register char result;
    #ifdef __i386__
        __asm__ volatile ("lock; cmpxchgl %3, %0; setz %1"
                          : "=m" (*reg), "=q" (result)
                          : "m" (*reg), "r" (oldval + incre), "a" (oldval)
                          : "memory");
        return result;
    #elif defined(__x86_64__)
        /* int is still 32 bits on x86-64, so the operand size is "l", not "q" */
        __asm__ volatile ("lock; cmpxchgl %3, %0; setz %1"
                          : "=m" (*reg), "=q" (result)
                          : "m" (*reg), "r" (oldval + incre), "a" (oldval)
                          : "memory");
        return result;
    #else
    #error "architecture not supported and gcc too old"
    #endif
    }

    void *test_func(void *arg)
    {
        int i = 0;
        int result = 0;
        for (i = 0; i < 2000; ++i)
        {
            /* retry until the compare-and-swap succeeds */
            result = 0;
            while (0 == result)
            {
                result = sync_add_and_fetch((int *)&count, count, 1);
            }
        }
        return NULL;
    }

    int main(int argc, const char *argv[])
    {
        pthread_t id[10];
        int i = 0;
        for (i = 0; i < 10; ++i) {
            pthread_create(&id[i], NULL, test_func, NULL);
        }
        for (i = 0; i < 10; ++i) {
            pthread_join(id[i], NULL);
        }
        // 10 threads * 2000 increments = 20000
        printf("%lu\n", count);
        return 0;
    }
Now I have another question: how do I implement the Windows function InterlockedExchange on Linux? Like the code above, it should have an __i386__ version and an __x86_64__ version. I can't simply reuse the code above because the parameter types don't match, and the assembly code may need to be rewritten.
Recommended answer
Exchange is easy; you don't have to return any status or anything. xchg always swaps, so you only need to return the old value, not a success status as you would for cmpxchg.
    static inline
    int sync_xchg(int *p, int newval)
    {
        // xchg mem,reg has an implicit lock prefix
        asm volatile("xchg %[newval], %[mem]"
            : [mem] "+m" (*p), [newval] "+r" (newval)  // read/write operands
            :
            : "memory" // compiler memory barrier, giving us seq-cst memory ordering
        );
        return newval;
    }
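As a usage illustration, the old value returned by an exchange is enough to build a simple test-and-set spinlock. This is only a minimal sketch and not part of the original answer: the names spin_trylock, spin_lock and spin_unlock are hypothetical, and it assumes an x86 or x86-64 target with GCC-style inline asm.

```c
/* Same exchange primitive as above, repeated so this sketch is self-contained. */
static inline int sync_xchg(int *p, int newval)
{
    __asm__ volatile("xchg %[newval], %[mem]"
                     : [mem] "+m" (*p), [newval] "+r" (newval)
                     :
                     : "memory");
    return newval;                  /* old contents of *p */
}

/* Hypothetical helpers. Convention assumed here: 0 = unlocked, 1 = locked. */
static inline int spin_trylock(int *lock)
{
    /* We own the lock only if the value we swapped out was 0. */
    return sync_xchg(lock, 1) == 0;
}

static inline void spin_lock(int *lock)
{
    while (!spin_trylock(lock)) {
        /* Spin-wait hint for the CPU while another thread holds the lock. */
        __asm__ volatile("pause" : : : "memory");
    }
}

static inline void spin_unlock(int *lock)
{
    /* xchg is a full barrier, so earlier stores are visible before the release. */
    sync_xchg(lock, 0);
}
```

A caller would wrap its critical section in spin_lock(&lock) and spin_unlock(&lock), with lock initialised to 0.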
It might be better to use int32_t in sync_xchg, but int is 32 bits in all relevant ABIs. Also, this asm works for any size of integer: GCC will pick a 16-bit, 32-bit, or 64-bit register, and that implies the operand size for xchg. (You only need a suffix such as the l in addl when there is no register operand, e.g. lock; addl $1, (%0).)
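To illustrate the point about operand size, here is a sketch of the same asm template applied to two different integer types. The function names interlocked_exchange and xchg_long are hypothetical, not from the original answer. Windows declares InterlockedExchange with a 32-bit LONG, so int32_t is used for that one; long is 32 bits on i386 and 64 bits on x86-64 Linux, so the second function should assemble to a 32-bit or 64-bit xchg depending on the target.

```c
#include <stdint.h>

/* Hypothetical 32-bit exchange in the style of InterlockedExchange:
   stores value into *target and returns the previous contents. */
static inline int32_t interlocked_exchange(volatile int32_t *target, int32_t value)
{
    __asm__ volatile("xchg %[value], %[mem]"
                     : [mem] "+m" (*target), [value] "+r" (value)
                     :
                     : "memory");
    return value;
}

/* Identical template with long operands. No size suffix is needed:
   the register GCC picks for %[value] determines the operand size. */
static inline long xchg_long(volatile long *target, long value)
{
    __asm__ volatile("xchg %[value], %[mem]"
                     : [mem] "+m" (*target), [value] "+r" (value)
                     :
                     : "memory");
    return value;
}
```

Note that a 64-bit exchange on a 32-bit i386 target cannot be done with a single xchg, so this sketch deliberately sticks to register-width types.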
Anyway, I tested this and it compiles correctly with gcc4.1.2 on the Godbolt compiler explorer. I don't have access to gcc4.1.1, but hopefully "+m" and [named] asm memory operands weren't new in that release.
"+m" makes it much easier to write a bulletproof swap function: for example, there is no risk of gcc choosing two different memory locations for the same variable as an input vs. as an output.
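The same idea can be applied to a compare-and-swap. The following is a hedged sketch, not code from the original answer: sync_cas and fetch_and_add are hypothetical names. It uses a single "+m" operand for the memory location, and "+a" for the expected value because cmpxchg overwrites EAX when the comparison fails.

```c
/* Hypothetical CAS: if *p == expected, store desired and return 1; else return 0. */
static inline int sync_cas(int *p, int expected, int desired)
{
    char ok;
    __asm__ volatile("lock; cmpxchg %[desired], %[mem]\n\t"
                     "sete %[ok]"
                     : [mem] "+m" (*p),             /* one operand, read and written */
                       [expected] "+a" (expected),  /* EAX is modified on failure */
                       [ok] "=q" (ok)               /* byte register for sete */
                     : [desired] "r" (desired)
                     : "memory", "cc");
    return ok;
}

/* Hypothetical fetch-and-add built as a CAS retry loop; returns the old value. */
static inline int fetch_and_add(int *p, int incr)
{
    int old;
    do {
        old = *(volatile int *)p;   /* reload the current value on every attempt */
    } while (!sync_cas(p, old, old + incr));
    return old;
}
```

Declaring EAX as read/write tells the compiler not to assume the register still holds the expected value after the asm statement.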
gcc -O3 output:
    # 32-bit: -m32 -fomit-frame-pointer
    sync_xchg(int*, int):
        movl    4(%esp), %edx
        movl    8(%esp), %eax
        xchg    %eax, (%edx)
        ret                     # returns in EAX

    # 64-bit: just -O3
    sync_xchg(int*, int):
        xchg    %esi, (%rdi)
        movl    %esi, %eax
        ret                     # returns in EAX
I also checked that this asm actually assembles, using a newer gcc and clicking the 11010 binary button on Godbolt. Fun fact: because GAS accepts either xchg mem,reg or xchg reg,mem, you can compile this function with AT&T or Intel syntax (-masm=intel). Your other functions won't work with -masm=intel, though.
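For instructions where operand order does matter, GCC's extended asm offers dialect alternatives written as {att|intel} inside the template. The sketch below is an assumption-laden illustration rather than part of the original answer: sync_cas_any_syntax is a hypothetical name, and the construct has not been checked against gcc 4.1.2 specifically.

```c
/* Hypothetical CAS whose template provides both operand orders.
   GCC emits the text before '|' for AT&T syntax and the text after it
   when compiling with -masm=intel. */
static inline int sync_cas_any_syntax(int *p, int expected, int desired)
{
    char ok;
    __asm__ volatile("lock; {cmpxchg %[desired], %[mem]|cmpxchg %[mem], %[desired]}\n\t"
                     "sete %[ok]"
                     : [mem] "+m" (*p),
                       [expected] "+a" (expected),
                       [ok] "=q" (ok)
                     : [desired] "r" (desired)
                     : "memory", "cc");
    return ok;      /* 1 if the swap happened, 0 otherwise */
}
```

With the default AT&T output only the first alternative is used, so the function behaves like an ordinary lock cmpxchg wrapper.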