How to load a value from memory without polluting the cache?

Problem description

I want to read a memory location without polluting the cache. I am working on an x86 Linux machine. I tried using the MOVNTDQA assembler instruction:

  asm("movntdqa %[source], %[dest] \n\t"
      : [dest] "=x" (my_var) : [source] "m" (my_mem[0]) : "memory");

my_mem is an int* allocated with new, and my_var is an int.

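I have two problems with this approach:

  1. The code compiles, but I get an "Illegal instruction" error when running it. Any ideas why?
  2. I am not sure what type of memory is allocated with new. I assume it is WB (write-back). According to the documentation, the MOVNTDQA instruction only works with the USWC memory type. How can I know which memory type I am working with?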

To summarize, my question is:

How can I read a memory location without polluting the cache on an X86 machine? Is my approach in the right direction, and can it be fixed to work?

Thanks.

Recommended answer

The problem with the movntdqa instruction with %%xmm as the target (loading from memory) is that it is only available with SSE4.1 and later, which so far means only the newer Core 2 (45 nm) or i7 processors. Running it on a CPU without SSE4.1 would explain the illegal instruction error. The other direction (non-temporal stores to memory) is available in earlier SSE versions.

For this instruction, the processor moves the data into one of a very small number of very small read buffers (Intel doesn't specify the exact size, but assume it is in the range of 16 bytes), where it is readily available but gets evicted after a few other loads.

And it does not pollute the other caches, so if you have streaming data, your approach is viable.

Remember that you need to use an sfence instruction afterwards.
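As a rough sketch of the points above, the load can be written with the SSE4.1 intrinsic instead of inline asm, guarded by a runtime CPU check so that older processors take a normal load rather than faulting. This assumes GCC or Clang on x86 (for __builtin_cpu_supports and the target attribute) and a 16-byte aligned source, which movntdqa requires and which plain new int[] does not guarantee. The helper names cpu_has_sse41, stream_load_16 and read_first_int are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <immintrin.h>

// True when the running CPU reports SSE4.1 (GCC/Clang builtin, x86 only).
inline bool cpu_has_sse41() {
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") != 0;
}

// Loads 16 bytes with movntdqa. The target attribute lets the compiler emit
// SSE4.1 code in this one function without -msse4.1 for the whole file.
__attribute__((target("sse4.1")))
inline __m128i stream_load_16(const void* src) {
    // movntdqa needs a 16-byte aligned memory operand.
    assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);
    __m128i v = _mm_stream_load_si128(
        static_cast<__m128i*>(const_cast<void*>(src)));
    _mm_sfence();  // the fence the answer asks for after the streaming access
    return v;
}

// Hypothetical helper: reads the first int of a 16-byte aligned block and
// falls back to an ordinary load on CPUs without SSE4.1.
inline int read_first_int(const int* aligned_mem) {
    if (!cpu_has_sse41()) {
        return aligned_mem[0];
    }
    return _mm_cvtsi128_si32(stream_load_16(aligned_mem));
}
```

A buffer declared with alignas(16), or obtained from an aligned allocator, can be passed to read_first_int. Whether the hardware actually honors the non-temporal hint still depends on the memory type, as discussed in the question.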

Prefetching exists in two main variants: prefetcht0 (prefetches data into all cache levels) and prefetchnta (prefetches non-temporal data). Usually prefetching into all caches is the right thing to do; for a streaming data loop the latter would be better, provided you make consistent use of the streaming instructions.

You use it with the address of an object you want to use in the near future, usually some iterations ahead if you have a loop. The prefetch instruction doesn't wait or block; it just makes the processor start fetching the data at the specified memory location.
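A minimal sketch of prefetching some iterations ahead in a loop, using the _mm_prefetch intrinsic with the _MM_HINT_NTA hint (which maps to prefetchnta). The function name sum_with_prefetch and the distance kAhead are assumptions for illustration; a useful distance depends on the workload and the hardware and would need measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_NTA

// Sums n ints while prefetching a fixed distance ahead with the
// non-temporal hint. The prefetch is only a hint and never blocks.
long long sum_with_prefetch(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    constexpr std::size_t kAhead = 64;  // illustrative guess, in elements
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + kAhead < n) {
            // Ask the CPU to start fetching an element we will need soon.
            _mm_prefetch(reinterpret_cast<const char*>(data + i + kAhead),
                         _MM_HINT_NTA);
        }
        total += data[i];
    }
    return total;
}
```

Issuing a prefetch per element is more than needed, since one prefetch covers a whole cache line; a tuned loop would typically prefetch once per line.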


08-13 05:10