What is the role of the L3$ in the MESI protocol?

Question

I would like to know more details about MESI on Intel Broadwell.

Suppose a CPU socket has 6 cores, core 0 to core 5. Each core has its own L1$ and L2$, and all of them share the L3$. There is a variable x in shared memory, located in a cache line called XCacheL. The details of my question follow:

T1: Core 0, core 4, and core 5 all have x = 100, and XCacheL is in the Shared state, since 3 cores hold a copy of XCacheL.

T2: Core 0 needs to modify x, so core 0 broadcasts an invalidate signal. Core 4 and core 5 receive the signal and invalidate their copies of XCacheL. Core 0 modifies x to 200, and its copy of XCacheL is now in the Modified state.

T3: Core 4 needs to read x, but its copy of XCacheL was invalidated in T2, so it triggers a read miss, and the following happens:

● Processor makes bus request to memory
● Snooping cache puts copy value on the bus
● Memory access is abandoned
● Local processor caches value
● Local copy tagged S
● Source (M) value copied back to memory
● Source value M -> S

So after T3, XCacheL is Shared in core 0 and core 4 and Invalid in core 5, and both the L3$ and main memory have the newest valid XCacheL.
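To make T1 through T3 concrete, here is a minimal Python sketch of the textbook bus-snooping MESI model that the steps above assume. The names `State` and `SnoopingBus` are hypothetical and exist only for illustration. The sketch treats the cache line as a single value, ignores L3 completely, and is not meant to describe what Broadwell actually does.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


class SnoopingBus:
    """Toy textbook MESI: private caches snoop one shared bus (no L3)."""

    def __init__(self, n_cores, memory_value):
        self.state = [State.I] * n_cores   # per-core state of the one line
        self.value = [None] * n_cores      # per-core cached value
        self.memory = memory_value         # main-memory copy of the line

    def read(self, core):
        if self.state[core] is not State.I:
            return self.value[core]        # hit in the private cache
        # Read miss: every other cache holding the line responds.
        holders = [c for c, s in enumerate(self.state) if s is not State.I]
        for c in holders:
            if self.state[c] is State.M:
                self.memory = self.value[c]   # Modified copy written back
            self.state[c] = State.S           # M/E/S -> S
        self.value[core] = self.memory
        # The first reader gets Exclusive, later readers get Shared.
        self.state[core] = State.S if holders else State.E
        return self.value[core]

    def write(self, core, new_value):
        # Invalidate every other copy, writing back a Modified one first.
        for c in range(len(self.state)):
            if c != core and self.state[c] is not State.I:
                if self.state[c] is State.M:
                    self.memory = self.value[c]
                self.state[c] = State.I
                self.value[c] = None
        self.value[core] = new_value
        self.state[core] = State.M


bus = SnoopingBus(n_cores=6, memory_value=100)
for c in (0, 4, 5):      # T1: cores 0, 4 and 5 end up Shared with x = 100
    bus.read(c)
bus.write(0, 200)        # T2: core 0 is Modified, cores 4 and 5 are Invalid
bus.read(4)              # T3: core 0 writes back, cores 0 and 4 are Shared
```

After the last call, this model leaves cores 0 and 4 in Shared, core 5 in Invalid, and memory holding 200, which matches the summary above.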

T4: Core 5 needs to read x. Its copy of XCacheL was invalidated in T2, but at this moment the L3$ holds a correct copy of XCacheL. Would core 5 need to trigger a read miss like core 4 did?

My guess is: no. Since the L3$ has a valid XCacheL, core 5 can reach the L3$ and bring the correct XCacheL from L3$ into its L1$, so core 5 won't trigger a read miss.

Recommended answer

You're right: in your T4 step, core #5's load will hit in L3, so no memory access happens. Core #5 gets another copy of the line, in Shared state.

Your sequence of steps makes zero sense for a CPU like Broadwell, where all cores share access to the on-chip DRAM controller(s).

A ring bus connects cores (each of which has a slice of L3 cache) and the System Agent (PCIe links and connection to other cores) and Home Agent (memory controllers). See https://en.wikichip.org/wiki/intel/microarchitectures/broadwell_(client)#Die_Stats for a block diagram showing the ring bus.

Individual cores don't directly drive "the memory bus", or even one of the 2 or 4 DRAM buses. The memory controller arbitrates access to DRAM, and has some buffering to reorder / combine accesses. (Everything that accesses memory goes through it, including DMA, so it can do whatever it likes as long as it gives the appearance of loads/stores happening in some sane order.)

A load request won't be sent to the system agent until after it misses in L3 cache. See https://superuser.com/questions/1226197/x86-address-space-controller/1226198#1226198 for an illustration of a quad-core desktop (which is simpler and just has the memory controller connected to the System Agent, making it exactly like a Northbridge before CPUs integrated the memory controllers.)

Since Broadwell uses an inclusive L3 cache, the L3 tags can tell it which core, if any, has a Modified or Exclusive copy, even if the line in L3 itself isn't shareable. (That is, a line's data can be Invalid in L3 while the tags still track which core has a private copy.) See the question "Which cache mapping technique is used in intel core i7 processor?"

This lets L3 tags act as a snoop filter to reduce broadcasts.
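As a rough illustration of the snoop-filter idea, here is a toy Python model of an inclusive L3 whose tag tracks which cores hold the line. Everything in it (`InclusiveL3`, `sharers`, `owner`, the counters) is a hypothetical name invented for this sketch, not Intel's implementation. It also rests on several simplifying assumptions: there is only one cache line, the Exclusive state is ignored, and a dirty line is written back to L3 only, with DRAM left stale until an eviction that the model never performs.

```python
class InclusiveL3:
    """Toy inclusive L3 for a single line, with per-core tracking in the tag."""

    def __init__(self, dram_value):
        self.dram = dram_value
        self.dram_reads = 0     # requests that had to go to the memory controller
        self.snoops = 0         # snoop messages sent to individual cores
        self.present = False    # is the line allocated in L3?
        self.l3_data = None     # L3's copy (stale while some core owns the line)
        self.sharers = set()    # cores the tag says may hold a private copy
        self.owner = None       # core holding a Modified copy, if any
        self.private = {}       # core -> value in its L1/L2

    def _fill_from_dram(self):
        self.dram_reads += 1
        self.l3_data = self.dram
        self.present = True

    def load(self, core):
        if core in self.private:
            return self.private[core]          # hit in the core's own L1/L2
        if not self.present:
            self._fill_from_dram()             # L3 miss: only now touch DRAM
        elif self.owner is not None:
            # The tag names one core with a newer copy: snoop just that core
            # and write its data back into L3 (assumed: not into DRAM).
            self.snoops += 1
            self.l3_data = self.private[self.owner]
            self.owner = None
        self.sharers.add(core)
        self.private[core] = self.l3_data
        return self.l3_data

    def store(self, core, value):
        if not self.present:
            self._fill_from_dram()
        # Invalidate only the cores the tag lists, instead of broadcasting.
        for other in self.sharers - {core}:
            self.snoops += 1
            del self.private[other]
        self.sharers = {core}
        self.owner = core
        self.private[core] = value


l3 = InclusiveL3(dram_value=100)
for c in (0, 4, 5):      # T1: one DRAM read, then two L3 hits
    l3.load(c)
l3.store(0, 200)         # T2: two targeted invalidations (cores 4 and 5)
l3.load(4)               # T3: one snoop to core 0, data written back to L3
l3.load(5)               # T4: plain L3 hit, no snoop and no DRAM access
```

In this model the whole T1 to T4 sequence costs a single DRAM read (the initial fill in T1), and the T4 load by core 5 is served directly from L3 without contacting any other core.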
