Does an x86_64 CPU use the same cache lines to communicate between 2 processes via shared memory?

Question

As is known, all levels of cache (L1/L2/L3) on modern x86_64 are virtually indexed, physically tagged, and all cores communicate via the last-level cache (L3) using a cache-coherence protocol (MOESI/MESIF) over QPI/HyperTransport.

For example, a Sandy Bridge family CPU has a 4- to 16-way L3 cache and a 4 KB page size; this would allow data to be exchanged between concurrent processes running on different cores via shared memory. This would be possible because the L3 cache cannot, at the same time, hold the same physical memory area once as a page of process-1 and once as a page of process-2.
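As a concrete illustration of the setup the question assumes, here is a minimal sketch using POSIX shared memory (the object name "/demo_shm" is hypothetical, error handling is trimmed, and older glibc needs -lrt): one process creates and maps the object, and a second process mapping the same name would see the same physical page at its own virtual address.

```c
/* Minimal sketch of the setup assumed in the question: one process creates a
 * shared memory object and maps it; a second process mapping the same object
 * gets the same physical page at (possibly different) virtual addresses.     */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 4096;                      /* one 4 KB page */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, len) != 0) { perror("ftruncate"); return 1; }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Writer side: a second process that mmaps "/demo_shm" would read this. */
    strcpy(p, "hello from process-1");
    printf("mapped at %p: %s\n", (void *)p, p);

    munmap(p, len);
    close(fd);
    shm_unlink("/demo_shm");
    return 0;
}
```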

Does this mean that every time process-1 requests the same shared memory region, process-2 flushes its cache lines for that page to RAM, and process-1 then reloads the same memory region as cache lines of a page in process-1's virtual address space? Is that really slow, or does the processor use some optimizations?

Or does a modern x86_64 CPU use the same cache lines, without any flushes, to communicate between 2 processes with different virtual address spaces via shared memory?

Sandy Bridge Intel CPU - cache L3:

  • 8 MB - cache size
  • 64 B - cache line size
  • 128 K - lines (128 K = 8 MB / 64 B)
  • 16-way
  • 8 K - number of sets (8 K = 128 K lines / 16 ways)
  • 13 bits [18:6] - of the virtual address (the index) select the current set number
  • 512 K - addresses that are a multiple of 512 K apart map to the same set (512 K = 8 MB / 16 ways)
  • low 19 bits - significant for determining the current set number

4 KB - standard page size

We have 7 missing bits [18:12] - i.e. we would need to check (2^7 × 16 ways) = 2048 cache lines. That is the same as a 2048-way cache - so it would be very slow. Does this mean that the L3 cache is (physically indexed, physically tagged)?
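The arithmetic behind that estimate can be written out as a small sketch (parameters taken from the list above; it assumes all 7 index bits above the 4 KB page offset are unknown, and it uses the GCC/Clang builtin __builtin_ctz):

```c
/* Sketch of the set-index arithmetic for the Sandy Bridge L3 parameters
 * quoted in the question (8 MB, 64 B lines, 16-way), and the number of
 * candidate lines if the 7 index bits above the 4 KB page offset are unknown. */
#include <stdio.h>

int main(void)
{
    const unsigned cache_size = 8u << 20;               /* 8 MB  */
    const unsigned line_size  = 64;                     /* 64 B  */
    const unsigned ways       = 16;
    const unsigned page_size  = 4096;                   /* 4 KB  */

    unsigned lines = cache_size / line_size;            /* 128 K lines */
    unsigned sets  = lines / ways;                      /* 8 K sets    */

    unsigned offset_bits = __builtin_ctz(line_size);    /* 6  */
    unsigned index_bits  = __builtin_ctz(sets);         /* 13 -> index is [18:6] */
    unsigned page_bits   = __builtin_ctz(page_size);    /* 12 */

    /* Index bits above the page offset depend on the physical page frame,
     * not on the virtual address, hence are "missing" for virtual indexing. */
    unsigned missing         = offset_bits + index_bits - page_bits;   /* 7    */
    unsigned candidate_lines = (1u << missing) * ways;                 /* 2048 */

    printf("lines=%u sets=%u index bits [%u:%u] missing=%u candidates=%u\n",
           lines, sets, offset_bits + index_bits - 1, offset_bits,
           missing, candidate_lines);
    return 0;
}
```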

Summary of the index bits missing from the virtual address (page size 4 KB - 12 bits):

  • L3 (8 MB = 64 B x 128 K lines), 16-way, 8 K sets, 13 index bits [18:6] - 7 bits missing
  • L2 (256 KB = 64 B x 4 K lines), 8-way, 512 sets, 9 index bits [14:6] - 3 bits missing
  • L1 (32 KB = 64 B x 512 lines), 8-way, 64 sets, 6 index bits [11:6] - no missing bits
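The per-level numbers above all follow from one formula - missing index bits = log2(cache size / ways / page size) - as this small sketch shows (level parameters as quoted above, 4 KB pages assumed, GCC/Clang builtin used):

```c
/* Sketch: index bits that extend above the 4 KB page offset, per cache level,
 * using the sizes and associativities quoted in the question.                */
#include <stdio.h>

static unsigned log2u(unsigned x) { return (unsigned)__builtin_ctz(x); }

int main(void)
{
    const unsigned page = 4096;
    struct { const char *name; unsigned size, ways; } lvl[] = {
        { "L1", 32u  << 10, 8  },
        { "L2", 256u << 10, 8  },
        { "L3", 8u   << 20, 16 },
    };

    for (unsigned i = 0; i < 3; i++) {
        unsigned way_size = lvl[i].size / lvl[i].ways;  /* bytes covered by index+offset */
        unsigned missing  = way_size > page ? log2u(way_size / page) : 0;
        printf("%s: way size %6u B, index bits above page offset: %u\n",
               lvl[i].name, way_size, missing);         /* prints 0, 3, 7 */
    }
    return 0;
}
```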

It should be:

  • L3 / L2 (physically indexed, physically tagged), used after the TLB lookup
  • L1 (virtually indexed, physically tagged)

Answer

Huh what? If both processes have a page mapped, they can both hit in the cache for the same line of physical memory.

That's part of the benefit of Intel's multicore designs using large inclusive L3 caches. Coherency only requires checking the L3 tags to find cache lines in E or M state in another core's L2 or L1 cache.

Getting data between two cores only requires write-back to L3. I forget where this is documented; maybe http://agner.org/optimize/. CPUs before Nehalem, which had separate caches for each core, I think had to flush to DRAM for coherency. IDK if the data could be sent directly from cache to cache with the same protocol used to detect coherency issues.
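A minimal sketch of the hand-off the answer describes: one core publishes data and another picks it up through ordinary coherent memory, with a C11 release/acquire flag and no explicit cache-flush instructions. Threads are used here merely as a stand-in for two processes sharing a mapping; the names are illustrative (compile with -pthread).

```c
/* Sketch: producer/consumer hand-off through coherent caches; no explicit
 * flush instructions are needed, the hardware coherence protocol (via L3 on
 * Intel) moves the line between cores.                                       */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int payload;                        /* data in the shared line        */
static atomic_int ready = 0;               /* release/acquire hand-off flag  */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;                                         /* write the data  */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                                                 /* spin until seen */
    printf("consumer saw payload = %d\n", payload);
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```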

The same cache line mapped to different virtual addresses will always go in the same set of the L1 cache. See the discussion in the comments: the L2/L3 caches are physically indexed as well as physically tagged, so aliasing is never a problem. (Only L1 could get a speed benefit from virtual indexing. L1 cache misses aren't detected until after address translation has finished, so the physical address is ready in time to probe the higher-level caches.)

Also note that the discussion in the comments incorrectly mentions Skylake lowering the associativity of the L1 cache. In fact, it's the Skylake L2 cache that is less associative than before (4-way, down from 8-way in SnB/Haswell/Broadwell). L1 is still 32 KiB 8-way as always: the maximum size for that associativity that keeps the page-selection address bits out of the index. So there's no mystery after all.
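That "maximum size" remark is just arithmetic: with 8 ways and 64 B lines, 32 KiB means each way spans exactly 4 KiB, so every index bit falls inside the page offset and is identical in the virtual and physical address. A small sketch of that check (parameters assumed from the answer):

```c
/* Sketch: why a 32 KiB, 8-way, 64 B-line L1 can be virtually indexed without
 * aliasing: all index+offset bits fit inside the 4 KiB page offset.          */
#include <assert.h>
#include <stdio.h>

int main(void)
{
    const unsigned size = 32u << 10, ways = 8, line = 64, page = 4096;

    unsigned sets        = size / ways / line;           /* 64 sets */
    unsigned offset_bits = __builtin_ctz(line);          /* 6  */
    unsigned index_bits  = __builtin_ctz(sets);          /* 6  */
    unsigned page_bits   = __builtin_ctz(page);          /* 12 */

    /* index+offset bits [11:0] lie entirely within the page offset [11:0],
     * so virtual and physical indexing pick the same set: no aliasing.       */
    assert(offset_bits + index_bits <= page_bits);
    printf("sets=%u, index+offset bits=%u, page offset bits=%u -> no aliasing\n",
           sets, offset_bits + index_bits, page_bits);
    return 0;
}
```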

Also see another answer to this question about HT threads on the same core communicating through L1. I said more about cache ways and sets there. (And thanks to Voo, I just corrected it to say that the cache index selects a set, not a way. :P)

