Question
According to Wikipedia: Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to a processor.
But it is not clear whether this applies to any memory, including caches, or to main memory only.
For example, the Xeon Phi processor has the following architecture:
Access to main memory (GDDR) is the same for all cores. Meanwhile, access to the L2 cache differs between cores: a core's native L2 cache is checked first, and then the L2 caches of the other cores are checked via the ring. Is this a NUMA or a UMA architecture?
Answer

Technically, NUMA should probably only be used to describe non-uniform access latency or bandwidth to main memory. (If the NUMA factor [latency far/latency near, or bandwidth far/bandwidth near] is small [e.g., comparable to the dynamic variability caused by DRAM row misses, buffering, etc.], then the system might still be considered UMA.)
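The NUMA-factor definition above can be sketched as a tiny calculation. The latency values and the "effectively UMA" threshold below are illustrative assumptions, not measured data or any standard cutoff:

```python
# Hypothetical near/far main-memory latencies in nanoseconds (assumed values).
latency_near = 80.0   # access to the closest memory controller
latency_far = 92.0    # access to the most distant memory controller

def numa_factor(near: float, far: float) -> float:
    """NUMA factor as defined above: far latency divided by near latency."""
    return far / near

# Threshold chosen to stand in for "comparable to DRAM row-miss variability";
# this number is an assumption for illustration only.
EFFECTIVELY_UMA_THRESHOLD = 1.2

factor = numa_factor(latency_near, latency_far)
print(f"NUMA factor: {factor:.2f}")  # -> NUMA factor: 1.15
print("effectively UMA" if factor < EFFECTIVELY_UMA_THRESHOLD else "NUMA")
```

With these assumed numbers the factor is 1.15, i.e. the kind of small ratio the answer suggests could still be treated as UMA in practice.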
(Technically, the Xeon Phi has a small but non-zero NUMA factor since each hop on the ring interconnect takes time [a core might be only one hop from one memory controller and several hops from the most distant one].)
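The hop-count argument can be made concrete with a toy model of a bidirectional ring. The ring size and the mapping of stops are assumptions for illustration; the point is only that minimum hop counts (and hence latencies, if each hop costs a fixed time) differ across destinations:

```python
def ring_hops(src: int, dst: int, n_stops: int) -> int:
    """Minimum number of hops between two stops on a bidirectional ring."""
    d = abs(src - dst) % n_stops
    return min(d, n_stops - d)

# Toy ring with 8 stops (cores and memory controllers interleaved; assumed layout).
N = 8
hops_from_stop0 = [ring_hops(0, dst, N) for dst in range(N)]
print(hops_from_stop0)  # -> [0, 1, 2, 3, 4, 3, 2, 1]
```

A destination one hop away versus one four hops away gives a small but non-zero spread in access time, matching the "small but non-zero NUMA factor" described above.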
The term NUCA (Non-Uniform Cache Access) has been taken to describe a single cache with different latency of access for different cache blocks. A shared cache level with portions more closely tied to a core or cluster of cores would also fall under NUCA, but separate cache hierarchies would (I believe) not justify the term (even though snooping might find a desired cache block in a 'remote' cache).
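The NUCA idea — one shared cache whose banks have distance-dependent latency from a given core — can be sketched as follows. The base latency and per-hop cost are assumed values, not figures for any real cache:

```python
# Toy NUCA model: a shared cache split into banks laid out in a line; the
# latency a core sees depends on its distance to the bank. Numbers are
# illustrative assumptions only.
BASE_LATENCY_CYCLES = 4   # cycles to the nearest bank (assumption)
CYCLES_PER_HOP = 2        # extra cycles per bank of distance (assumption)

def bank_latency(core_pos: int, bank_pos: int) -> int:
    """Access latency in cycles from a core to one bank of the shared cache."""
    return BASE_LATENCY_CYCLES + CYCLES_PER_HOP * abs(core_pos - bank_pos)

# A core at position 0 sees non-uniform latencies across 4 banks of the
# *same* cache level -- the defining property of NUCA:
print([bank_latency(0, b) for b in range(4)])  # -> [4, 6, 8, 10]
```

Note the contrast with the snooping case in the next paragraph: here the variation is within a single shared cache, not between separate per-core cache hierarchies.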
I do not know of any term being used to describe a system with variable cache latency associated with snooping (i.e., with separate cache hierarchies) and a small/zero NUMA factor.
(Since caches can transparently replicate and migrate cache blocks, the NUMA concept is a little less fitting. [Yes, an OS can migrate and replicate pages transparently to application software in a NUMA system, so this difference is not absolute.])
Perhaps somewhat interestingly, Azul Systems claims UMA across sockets for its Vega systems: