Why the output 11 of this program never happens

Question

This time I use atomic_fetch_add. Here is how I think I could get ra1=1 and ra2=1: both threads execute a.fetch_add(1,memory_order_relaxed); when a=0. The writes go into the store buffer and aren't visible to the other thread, so both threads end up with ra1=1 and ra2=1.

I can reason about how it prints 12, 21 and 22.

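  • 22 happens when both foo and bar have incremented a, so a=2 is visible to both a.load calls.
  • Similarly, 12 happens when thread foo completes and thread bar starts after thread foo's store.
  • 21 happens when bar runs first and then foo.

// g++ -O2 -pthread axbx.cpp ; while [ true ]; do ./a.out | grep "11"; done
// The loop above doesn't print 11 within 5 minutes.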
#include<atomic>
#include<thread>
#include<cstdio>
using namespace std;
atomic<long> a,b;
long ra1,ra2;
void foo(){
        a.fetch_add(1,memory_order_relaxed);
        ra1=a.load(memory_order_relaxed);
}
void bar(){
        a.fetch_add(1,memory_order_relaxed);
        ra2=a.load(memory_order_relaxed);
}
int main(){
  thread t[2]{ thread(foo),thread(bar)};
  t[0].join();t[1].join();
  printf("%ld%ld\n",ra1,ra2); // This doesn't print 11 but it should
}

Recommended answer

a.fetch_add is atomic; that's the whole point. There's no way for two separate fetch_adds to step on each other and result in only a single increment.

An implementation that let the store buffer break that would not be a correct implementation, because ISO C++ requires the entire RMW to be one atomic operation, not an atomic load followed by a separate atomic store.

(For example, on x86, lock add [a], 1 is a full barrier because of how it has to be implemented: it makes sure the updated data is visible in L1d cache as part of executing. See the Stack Overflow question "Can num++ be atomic for 'int num'?".

On some other implementations, e.g. AArch64 before ARMv8.1, it compiles to an LL/SC retry loop (see footnote 1), where the Store-Conditional fails if this core lost exclusive ownership of the cache line between the load and the store.)

Footnote 1: Actually, current GCC will call the libatomic helper function if you omit -march=armv8.1-a or -mcpu=cortex-a76 or whatever, so the code can still benefit, via runtime CPU dispatching, from the new single-instruction atomics like ldadd w2, w0, [x0] instead of a retry loop, in the likely case that it runs on an ARMv8.1 CPU. https://godbolt.org/z/vhePM9h8a


08-24 17:42