Question
Does standard C++11 guarantee that memory_order_seq_cst prevents StoreLoad reordering around an atomic operation for non-atomic memory accesses?
As is well known, C++11 has six std::memory_order values, which specify how regular, non-atomic memory accesses are to be ordered around an atomic operation. From the Working Draft, Standard for Programming Language C++, 2016-07-12: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4606.pdf
§29.3/1
The enumeration memory_order specifies the detailed regular (non-atomic) memory synchronization order as defined in 1.10 and may provide for operation ordering. Its enumerated values and their meanings are as follows:
It is also known that these six memory orders prevent some of these reorderings.
But does memory_order_seq_cst prevent StoreLoad reordering around an atomic operation for regular, non-atomic memory accesses, or only for other atomic operations that also use memory_order_seq_cst?
That is, to prevent this StoreLoad reordering, should we use std::memory_order_seq_cst for both the STORE and the LOAD, or for only one of them?
std::atomic<int> a, b;
b.store(1, std::memory_order_seq_cst); // Sequential Consistency
a.load(std::memory_order_seq_cst); // Sequential Consistency
Acquire-release semantics are clear: they specify exactly how non-atomic memory accesses may be reordered across atomic operations: http://en.cppreference.com/w/cpp/atomic/memory_order
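To make the acquire-release case concrete, here is a minimal sketch of publishing a regular, non-atomic variable through an atomic flag. The names payload, ready and publish_and_read are hypothetical and only serve the illustration; this is the part of the model that is not in doubt, not the StoreLoad case being asked about.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                 // regular, non-atomic data
std::atomic<bool> ready{false};  // atomic flag that publishes the payload

// Runs one writer and one reader thread and returns what the reader saw.
int publish_and_read() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread writer([] {
        payload = 42;                                  // non-atomic store
        ready.store(true, std::memory_order_release);  // earlier writes stay before this store
    });
    std::thread reader([&seen] {
        // Spin until the flag is observed; later reads stay after this load.
        while (!ready.load(std::memory_order_acquire)) {
        }
        seen = payload;  // ordered after the writer's store to payload
    });
    writer.join();
    reader.join();
    return seen;
}
```

Once the acquire load observes the release store, the write to payload happens-before the read, so the reader cannot see the old value.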
To prevent StoreLoad reordering we should use std::memory_order_seq_cst.
Two examples:
- std::memory_order_seq_cst for both STORE and LOAD: there is an MFENCE.
StoreLoad can't be reordered - GCC 6.1.0 x86_64: https://godbolt.org/g/mVZJs0
std::atomic<int> a, b;
b.store(1, std::memory_order_seq_cst); // can't be executed after LOAD
a.load(std::memory_order_seq_cst); // can't be executed before STORE
- std::memory_order_seq_cst for LOAD only: there is no MFENCE.
StoreLoad can be reordered - GCC 6.1.0 x86_64: https://godbolt.org/g/2NLy12
std::atomic<int> a, b;
b.store(1, std::memory_order_release); // can be executed after LOAD
a.load(std::memory_order_seq_cst); // can be executed before STORE
Also, a C/C++ compiler could use the alternative mapping of C/C++11 to x86, which flushes the store buffer before the LOAD (MFENCE, MOV (from memory)), so we must use std::memory_order_seq_cst for the LOAD too: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html This example is discussed as approach (3) in another question: "Does it make any sense instruction LFENCE in processors x86/x86_64?"
That is, we should use std::memory_order_seq_cst for both the STORE and the LOAD to guarantee that an MFENCE is generated, which prevents StoreLoad reordering.
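The StoreLoad pattern in question is the classic store-buffering litmus test. The sketch below is an assumption-laden illustration rather than a proof: the function names are hypothetical, and a run count can only fail to find a violation, not demonstrate that none exists. With memory_order_seq_cst on all four operations, the single total order over those operations rules out both loads returning 0.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns true if both threads read 0 in one run,
// i.e. if each load behaved as if it ran before the other thread's store.
bool both_loads_saw_zero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);   // STORE
        r1 = y.load(std::memory_order_seq_cst);  // LOAD
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);   // STORE
        r2 = x.load(std::memory_order_seq_cst);  // LOAD
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the experiment and counts runs in which both loads returned 0.
int count_violations(int runs) {
    int violations = 0;
    for (int i = 0; i < runs; ++i) {
        if (both_loads_saw_zero()) {
            ++violations;
        }
    }
    return violations;
}
```

Weakening either store to memory_order_release removes that guarantee, so the count may then become non-zero on hardware with store buffers such as x86_64.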
Is it true that memory_order_seq_cst for an atomic load or store:
- specifies acquire-release semantics, preventing reorderings other than StoreLoad around an atomic operation for regular, non-atomic memory accesses,
- but prevents StoreLoad reordering around an atomic operation only for other atomic operations that also use memory_order_seq_cst?
Recommended answer
No, standard C++11 doesn't guarantee that memory_order_seq_cst prevents StoreLoad reordering of non-atomic accesses around an atomic(seq_cst) operation.
Standard C++11 doesn't even guarantee that memory_order_seq_cst prevents StoreLoad reordering of atomic(non-seq_cst) operations around an atomic(seq_cst) operation.
Working Draft, Standard for Programming Language C++ 2016-07-12: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4606.pdf
- There shall be a single total order S on all memory_order_seq_cst operations - C++11 Standard:
3 There shall be a single total order S on all memory_order_seq_cst operations, consistent with the "happens before" order and modification orders for all affected locations, such that each memory_order_seq_cst operation B that loads a value from an atomic object M observes one of the following values: ...
- But atomic operations with ordering weaker than memory_order_seq_cst have neither sequential consistency nor a single total order, i.e. non-memory_order_seq_cst operations can be reordered with memory_order_seq_cst operations in the allowed directions - C++11 Standard:
8 [ Note: memory_order_seq_cst ensures sequential consistency only for a program that is free of data races and uses exclusively memory_order_seq_cst operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In particular, memory_order_seq_cst fences ensure a total order only for the fences themselves. Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering specifications. - end note ]
C++ compilers also allow such reorderings:
- On x86_64
Usually, if the compiler implements seq_cst as a barrier after the store, then:
STORE-C(relaxed); LOAD-B(seq_cst); can be reordered to LOAD-B(seq_cst); STORE-C(relaxed);
Screenshot of asm generated by GCC 7.0 x86_64: https://godbolt.org/g/4yyeby
It is also theoretically possible, if the compiler implements seq_cst as a barrier before the load, that:
STORE-A(seq_cst); LOAD-C(acquire); can be reordered to LOAD-C(acquire); STORE-A(seq_cst);
STORE-A(seq_cst); LOAD-C(relaxed); can be reordered to LOAD-C(relaxed); STORE-A(seq_cst);
- On PowerPC
STORE-A(seq_cst); STORE-C(relaxed); can be reordered to STORE-C(relaxed); STORE-A(seq_cst);
If even atomic variables are allowed to be reordered across atomic(seq_cst), then non-atomic variables can also be reordered across atomic(seq_cst).
Screenshot of asm generated by GCC 4.8 PowerPC: https://godbolt.org/g/BTQBr8
More details:
- On x86_64
STORE-C(release); LOAD-B(seq_cst); can be reordered to LOAD-B(seq_cst); STORE-C(release);
That is, the x86_64 code:
STORE-A(seq_cst);
STORE-C(release);
LOAD-B(seq_cst);
can be reordered to:
STORE-A(seq_cst);
LOAD-B(seq_cst);
STORE-C(release);
This can happen because there is no mfence between c.store and b.load:
x86_64 - GCC 7.0: https://godbolt.org/g/dRGTaO
C++ & asm code:
#include <atomic>
// Atomic load-store
void test() {
    std::atomic<int> a, b, c;
    a.store(2, std::memory_order_seq_cst);        // movl 2,[a]; mfence;
    c.store(4, std::memory_order_release);        // movl 4,[c];
    int tmp = b.load(std::memory_order_seq_cst);  // movl [b],[tmp];
}
It can be reordered to:
#include <atomic>
// Atomic load-store
void test() {
    std::atomic<int> a, b, c;
    a.store(2, std::memory_order_seq_cst);        // movl 2,[a]; mfence;
    int tmp = b.load(std::memory_order_seq_cst);  // movl [b],[tmp];
    c.store(4, std::memory_order_release);        // movl 4,[c];
}
Also, sequential consistency on x86/x86_64 can be implemented in four ways: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
1. LOAD (without fence) and STORE + MFENCE
2. LOAD (without fence) and LOCK XCHG
3. MFENCE + LOAD and STORE (without fence)
4. LOCK XADD (0) and STORE (without fence)
- Ways 1 and 2: LOAD and (STORE + MFENCE)/(LOCK XCHG) - reviewed above
- Ways 3 and 4: (MFENCE + LOAD)/LOCK XADD and STORE - allow the following reorderings:
STORE-A(seq_cst); LOAD-C(acquire); can be reordered to LOAD-C(acquire); STORE-A(seq_cst);
STORE-A(seq_cst); LOAD-C(relaxed); can be reordered to LOAD-C(relaxed); STORE-A(seq_cst);
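When a weaker-ordered store really must not be reordered with a following load, one possible approach (not taken from the answer, and relying on every participating thread cooperating) is an explicit std::atomic_thread_fence(std::memory_order_seq_cst) between the two. The sketch below applies this to the store-buffering pattern only; the function name is hypothetical. As the note quoted earlier says, fences do not restore sequential consistency in general, but for this specific pattern the fence rules of the standard forbid both loads returning 0.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: store-buffering pattern with weaker orderings plus
// a seq_cst fence between the store and the load in each thread.
// Returns true if both loads returned 0 in this run.
bool both_loads_saw_zero_with_fences() {
    std::atomic<int> b{0}, c{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        c.store(4, std::memory_order_release);
        // On x86_64 this typically compiles to MFENCE or a locked instruction.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = b.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        b.store(1, std::memory_order_release);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = c.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the experiment and counts runs in which both loads returned 0.
int count_fence_violations(int runs) {
    int violations = 0;
    for (int i = 0; i < runs; ++i) {
        if (both_loads_saw_zero_with_fences()) {
            ++violations;
        }
    }
    return violations;
}
```

The two fences are totally ordered with respect to each other, so whichever thread's fence comes second must observe the other thread's store.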
- On PowerPC
PowerPC allows Store-Load reordering (Table 5 - PowerPC): http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf
That is, the PowerPC code:
STORE-A(seq_cst);
STORE-C(relaxed);
LOAD-C(relaxed);
LOAD-B(seq_cst);
can be reordered to:
LOAD-C(relaxed);
STORE-A(seq_cst);
STORE-C(relaxed);
LOAD-B(seq_cst);
PowerPC - GCC 4.8: https://godbolt.org/g/xowFD3
C++ & asm code:
#include <atomic>
// Atomic load-store
void test() {
std::atomic<int> a, b, c; // addr: 20, 24, 28
a.store(2, std::memory_order_seq_cst); // li r9<-2; sync; stw r9->[a];
c.store(4, std::memory_order_relaxed); // li r9<-4; stw r9->[c];
c.load(std::memory_order_relaxed); // lwz r9<-[c];
int tmp = b.load(std::memory_order_seq_cst); // sync; lwz r9<-[b]; ... isync;
}
By splitting a.store into two parts, it can be reordered to:
#include <atomic>
// Atomic load-store
void test() {
std::atomic<int> a, b, c; // addr: 20, 24, 28
//a.store(2, std::memory_order_seq_cst); // part-1: li r9<-2; sync;
c.load(std::memory_order_relaxed); // lwz r9<-[c];
a.store(2, std::memory_order_seq_cst); // part-2: stw r9->[a];
c.store(4, std::memory_order_relaxed); // li r9<-4; stw r9->[c];
int tmp = b.load(std::memory_order_seq_cst); // sync; lwz r9<-[b]; ... isync;
}
Here the load from memory lwz r9<-[c]; is executed earlier than the store to memory stw r9->[a];.
The following reordering is also possible on PowerPC:
STORE-A(seq_cst); STORE-C(relaxed); can be reordered to STORE-C(relaxed); STORE-A(seq_cst);
This is because PowerPC has a weak memory ordering model that allows Store-Store reordering (Table 5 - PowerPC): http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf
That is, on PowerPC a store can be reordered with another store, so the previous example can be reordered as follows:
#include <atomic>
// Atomic load-store
void test() {
std::atomic<int> a, b, c; // addr: 20, 24, 28
//a.store(2, std::memory_order_seq_cst); // part-1: li r9<-2; sync;
c.load(std::memory_order_relaxed); // lwz r9<-[c];
c.store(4, std::memory_order_relaxed); // li r9<-4; stw r9->[c];
a.store(2, std::memory_order_seq_cst); // part-2: stw r9->[a];
int tmp = b.load(std::memory_order_seq_cst); // sync; lwz r9<-[b]; ... isync;
}
Here the store to memory stw r9->[c]; is executed earlier than the store to memory stw r9->[a];.
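If an observer must never see the store to c without also seeing the earlier store to a, the second store can carry release ordering instead of relaxed, paired with an acquire load on the reading side. The following is a minimal sketch under that assumption; the function name is hypothetical and the reader thread is not part of the original example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns the value of a that the reader observes
// after it has seen the store to c.
int read_a_after_seeing_c() {
    std::atomic<int> a{0}, c{0};
    int seen_a = -1;
    std::thread writer([&] {
        a.store(2, std::memory_order_seq_cst);
        c.store(4, std::memory_order_release);  // release instead of relaxed
    });
    std::thread reader([&] {
        // Spin until the release store to c is observed.
        while (c.load(std::memory_order_acquire) != 4) {
        }
        // The store to a happens-before this load, so it must read 2.
        seen_a = a.load(std::memory_order_relaxed);
    });
    writer.join();
    reader.join();
    return seen_a;
}
```

With memory_order_relaxed on the store to c, nothing in the standard would stop the reader from seeing c == 4 and still reading 0 from a.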