Problem description
According to https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html, a release store is implemented as MOV
(into memory) on x86 (including x86-64).
From the description of memory_order_release at http://en.cppreference.com/w/cpp/atomic/memory_order,
I understand that when memory_order_release is used, all memory stores done previously should finish before this one.
int a;
a = 10;
std::atomic<int> b;
b.store(50, std::memory_order_release); // I can be sure that 'a' is already 10, so the processor can't reorder the stores to 'a' and 'b'
QUESTION: how is it possible that a bare MOV
instruction (without an explicit memory fence) is sufficient for this behaviour? How does MOV
tell the processor to finish all previous stores?
Recommended answer
That does appear to be the mapping, at least in code compiled with the Intel compiler, where I see:
0000000000401100 <_Z5storeRSt6atomicIiE>:
401100: 48 89 fa mov %rdi,%rdx
401103: b8 32 00 00 00 mov $0x32,%eax
401108: 89 02 mov %eax,(%rdx)
40110a: c3 retq
40110b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
0000000000401110 <_Z4loadRSt6atomicIiE>:
401110: 48 89 f8 mov %rdi,%rax
401113: 8b 00 mov (%rax),%eax
401115: c3 retq
401116: 0f 1f 00 nopl (%rax)
401119: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
for the code:
#include <atomic>
#include <stdio.h>

void store( std::atomic<int> & b ) ;
int load( std::atomic<int> & b ) ;

int main()
{
    std::atomic<int> b ;
    store( b ) ;
    printf("%d\n", load( b ) ) ;
    return 0 ;
}

void store( std::atomic<int> & b )
{
    b.store(50, std::memory_order_release ) ;
}

int load( std::atomic<int> & b )
{
    int v = b.load( std::memory_order_acquire ) ;
    return v ;
}
The current Intel architecture documents, Volume 3 (System Programming Guide), do a nice job of explaining this. See:
8.2.2 Memory Ordering in P6 and More Recent Processor Families
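- Reads are not reordered with other reads.
- Writes are not reordered with older reads.
- Writes to memory are not reordered with other writes, with the following exceptions: ...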
The full memory model is explained there. I'd assume that Intel and the C++ standard folks have worked together in detail to nail down the best mapping for each of the memory order operations that conforms to the memory model described in Volume 3, and plain stores and loads have been determined to be sufficient in those cases.
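To make the connection to the question concrete, here is a minimal message-passing sketch. It is only an illustration: the names payload, ready and run_message_passing are made up for this example and are not from the question. Because x86 does not reorder writes with other writes (outside the listed exceptions) or reads with other reads, the hardware needs no fence here. The release/acquire orderings still matter, because they also stop the compiler from reordering the accesses.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: publish a plain int through an atomic flag.
int run_message_passing()
{
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // flag written with release semantics
    int seen = -1;

    std::thread producer([&] {
        payload = 10;                                  // ordinary store
        ready.store(true, std::memory_order_release);  // release store
    });

    std::thread consumer([&] {
        // Spin until the acquire load observes the release store.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // the earlier store to payload is visible here
    });

    producer.join();
    consumer.join();
    assert(seen == 10);
    return seen;
}
```

Calling run_message_passing() from main should always yield 10. On some toolchains the program has to be built with -pthread.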
Note that just because no special instructions are required for this ordered store on x86-64 doesn't mean that will be universally true. For PowerPC I'd expect to see something like an lwsync instruction along with the store, and on HP-UX (ia64) the compiler should be using a st4.rel instruction.
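Even on x86-64, a bare MOV does not cover every memory order. The mapping table linked in the question lists a sequentially consistent store as (LOCK) XCHG, or alternatively MOV followed by MFENCE, since x86 may reorder a read with an older write to a different location. The sketch below is one way to see the difference: compile it with optimisation and compare the disassembly of the two functions. The function names are made up for this example, and the exact instructions emitted depend on the compiler.

```cpp
#include <atomic>

// Hypothetical helper: release store, expected to be a plain MOV on x86-64.
void store_release(std::atomic<int>& b)
{
    b.store(50, std::memory_order_release);
}

// Hypothetical helper: sequentially consistent store, which needs
// XCHG or MOV + MFENCE on x86-64 according to the linked mapping table.
void store_seq_cst(std::atomic<int>& b)
{
    b.store(50, std::memory_order_seq_cst);
}
```

Both functions leave 50 in b; they differ only in the ordering guarantees they give relative to other memory accesses.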