Problem description
I know that write-combining (WC) writes are buffered and do not reach memory directly. But does the programmer have to flush this memory explicitly before others can access it?
This question comes from graphics driver code. For example, the CPU fills a vertex buffer (mapped as WC), but I don't see any flush operation in the code before the GPU accesses it. Does the architecture (x86) already take care of this for us? Is there more detailed documentation on this?
Recommended answer
According to the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1 (August 2012 version, but this should not have changed), Section 11.3.1, the buffer must be flushed:
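"The protocol for evicting the WC buffers is implementation dependent and should not be relied on by software for system memory coherency. When using the WC memory type, software must be sensitive to the fact that the writing of data to system memory is being delayed and must deliberately empty the WC buffers when system memory coherency is required."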
If the graphics driver really does not flush the write-combining buffers, then it is depending on system-specific timing and/or buffer sizes. It is also assuming that subsequent WC writes will be allocated to a buffer, which is not architecturally guaranteed. This may work (or appear to work) on existing systems under ordinary workloads, but it is not architecturally guaranteed to work.
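To make "deliberately empty the WC buffers" concrete, here is a minimal user-space sketch for x86 with SSE2. It makes two assumptions that are not in the original question: the helper name fill_vertices_wc is hypothetical, and ordinary memory written with non-temporal stores stands in for a WC-mapped vertex buffer (such stores go through the write-combining buffers as well). The explicit SFENCE at the end is the architecturally guaranteed way to drain them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_sfence */
#include <emmintrin.h>  /* _mm_stream_si32 (SSE2) */

/* Hypothetical helper: copy `count` 32-bit values into `dst` using
 * non-temporal stores, then drain the write-combining buffers. */
static void fill_vertices_wc(int32_t *dst, const int32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < count; i++) {
        /* Non-temporal store: may linger in a WC buffer. */
        _mm_stream_si32((int *)&dst[i], (int)src[i]);
    }

    /* SFENCE: all earlier stores, including those still held in WC
     * buffers, become globally visible before any later store. */
    _mm_sfence();
}
```

A caller would invoke fill_vertices_wc(buffer, vertices, n) and only afterwards tell the consumer (for example the GPU) that the data is ready.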
Since a broad range of serializing events will flush the write-combining buffers, it is quite possible that a flush operation or event is present in the driver but is not as obvious as an explicit SFENCE would be. From the Intel® 64 and IA-32 Architectures Software Developer's Manual (version 052, September 2014), Volume 3, Section 11.3, "Methods of Caching Available":
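"If the WC buffer is partially filled, the writes may be delayed until the next occurrence of a serializing event; such as, an SFENCE or MFENCE instruction, CPUID execution, a read or write to uncached memory, an interrupt occurrence, or a LOCK instruction execution."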
For example, a write to a GPU register (if it is mapped as uncached memory) would flush the write-combining buffers.
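The submission pattern this describes might look roughly like the sketch below. Everything device-specific here is an assumption for illustration: struct gpu_queue, submit_vertices, and the doorbell register are hypothetical, and in a real driver the two pointers would come from WC and uncached mappings of device memory rather than from ordinary memory. The sketch keeps an explicit SFENCE so that it does not rely on the doorbell write alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* _mm_sfence */

/* Hypothetical view of a device queue (names are assumptions). */
struct gpu_queue {
    uint8_t *vertex_buf;          /* assumed to be mapped write-combining */
    volatile uint32_t *doorbell;  /* assumed to be mapped uncached (MMIO) */
};

/* Copy vertex data, drain the WC buffers, then notify the device. */
static void submit_vertices(struct gpu_queue *q, const void *data,
                            size_t len, uint32_t submit_token)
{
    assert(q != NULL && data != NULL);

    /* With a WC mapping, these stores may sit in the WC buffers. */
    memcpy(q->vertex_buf, data, len);

    /* Explicit drain: does not depend on implementation behavior. */
    _mm_sfence();

    /* With an uncached mapping, this write is itself one of the
     * events the manual lists as flushing the WC buffers. */
    *q->doorbell = submit_token;
}
```

In driver code that omits the SFENCE, a register write like the last line is a plausible candidate for the "hidden" flush, provided the register really is mapped uncached.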