
Problem description

I want all read and write requests to a PCIe device to be cached by the CPU caches. However, it does not work as I expected.

These are my assumptions about write-back MMIO regions.

  1. Writes to the PCIe device happen only on cache write-back.
  2. The TLP payload size is the cache block size (64B).

However, the captured TLPs do not match my assumptions.

  1. A write to the PCIe device happens on every write to the MMIO region.
  2. The TLP payload size is 1B.

I write 8 bytes of 0xff to the MMIO region with the following user-space program and device driver.

Part of the user program

struct pcie_ioctl ioctl_control;
ioctl_control.bar_select = BAR_ID;
ioctl_control.num_bytes_to_write = atoi(argv[1]);
if (ioctl(fd, IOCTL_WRITE_0xFF, &ioctl_control) < 0) {
    printf("ioctl failed\n");
}

Part of the device driver

case IOCTL_WRITE_0xFF:
{
    char *buff;
    struct pci_cdev_struct *pci_cdev = pci_get_drvdata(fpga_pcie_dev.pci_device);

    /* Copy the request from user space; fail if the pointer is bad. */
    if (copy_from_user(&ioctl_control, (void __user *)arg, sizeof(ioctl_control)))
        return -EFAULT;

    /* Staging buffer filled with 0xff. */
    buff = kmalloc(ioctl_control.num_bytes_to_write, GFP_KERNEL);
    if (!buff)
        return -ENOMEM;
    memset(buff, 0xff, ioctl_control.num_bytes_to_write);

    /* Plain memcpy() into the mapped BAR; this is the write being captured. */
    memcpy(pci_cdev->bar[ioctl_control.bar_select], buff, ioctl_control.num_bytes_to_write);
    kfree(buff);
    break;
}
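
Since the driver relies on a plain memcpy(), the width of the stores it happens to issue may influence the payload size seen on the link when each store becomes its own transaction. The sketch below is a user-space illustration only, not the driver's code: copy_with_wide_stores is a hypothetical helper that issues one 8-byte store per aligned chunk and reports how many stores it made. In a real driver the usual accessors for __iomem pointers are writeq() or memcpy_toio(), and whether one store maps to one TLP still depends on the CPU and the effective memory type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src to dst using one 8-byte
 * store per aligned 8-byte chunk and single-byte stores for any
 * unaligned head or tail. Returns the number of stores issued. */
static size_t copy_with_wide_stores(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;
    size_t stores = 0;

    /* Byte stores until the destination is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 7) != 0) {
        *d++ = *s++;
        n--;
        stores++;
    }

    /* One 64-bit store per full chunk. */
    while (n >= 8) {
        uint64_t v;
        memcpy(&v, s, sizeof v);        /* unaligned-safe load from src */
        *(volatile uint64_t *)d = v;
        d += 8;
        s += 8;
        n -= 8;
        stores++;
    }

    /* Remaining tail bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
        stores++;
    }
    return stores;
}
```

For the 8-byte write in the question, an 8-byte-aligned destination would need a single store with this approach instead of eight byte stores.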

I modified the MTRRs to make the corresponding MMIO region write-back. The MMIO region starts at 0x0c7300000 and its length is 0x100000 (1MB). The following are the cat /proc/mtrr results for the different policies. Note that I made each region exclusive.
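
For reference, Linux accepts MTRR changes as a text command written to /proc/mtrr. The following is a minimal sketch of building such a command for the region above; mtrr_range_ok and mtrr_format_entry are hypothetical helper names, and the power-of-two and alignment check reflects the usual constraint on variable-range MTRRs rather than anything stated in the question.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if size is a non-zero power of two and base is aligned to it. */
static int mtrr_range_ok(uint64_t base, uint64_t size)
{
    if (size == 0 || (size & (size - 1)) != 0)
        return 0;
    return (base & (size - 1)) == 0;
}

/* Formats a /proc/mtrr command such as
 * "base=0xc7300000 size=0x100000 type=write-back\n" into out.
 * Returns the string length, or -1 on a bad range or a short buffer. */
static int mtrr_format_entry(char *out, size_t out_len,
                             uint64_t base, uint64_t size, const char *type)
{
    int n;

    if (!mtrr_range_ok(base, size))
        return -1;
    n = snprintf(out, out_len, "base=0x%llx size=0x%llx type=%s\n",
                 (unsigned long long)base, (unsigned long long)size, type);
    if (n < 0 || (size_t)n >= out_len)
        return -1;
    return n;
}
```

Writing the resulting string to /proc/mtrr requires root. The type names are the ones that appear in the listings: uncachable, write-combining, and write-back.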

Uncacheable

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size=   64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size=   32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size=   16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size=    1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size=    1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size=    1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size=    1MB, count=1: uncachable
reg09: base=0x0c7400000 ( 3188MB), size=    1MB, count=1: uncachable

Write-combining

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size=   64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size=   32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size=   16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size=    1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size=    1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size=    1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size=    1MB, count=1: write-combining
reg09: base=0x0c7400000 ( 3188MB), size=    1MB, count=1: uncachable

Write-back

reg00: base=0x080000000 ( 2048MB), size= 1024MB, count=1: uncachable
reg01: base=0x380000000000 (58720256MB), size=524288MB, count=1: uncachable
reg02: base=0x0c0000000 ( 3072MB), size=   64MB, count=1: uncachable
reg03: base=0x0c4000000 ( 3136MB), size=   32MB, count=1: uncachable
reg04: base=0x0c6000000 ( 3168MB), size=   16MB, count=1: uncachable
reg05: base=0x0c7000000 ( 3184MB), size=    1MB, count=1: uncachable
reg06: base=0x0c7100000 ( 3185MB), size=    1MB, count=1: uncachable
reg07: base=0x0c7200000 ( 3186MB), size=    1MB, count=1: uncachable
reg08: base=0x0c7300000 ( 3187MB), size=    1MB, count=1: write-back
reg09: base=0x0c7400000 ( 3188MB), size=    1MB, count=1: uncachable
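
To check which MTRR entry covers a given physical address, the listing format above can be parsed line by line. This is a rough sketch under the assumption that every line follows the layout shown in these listings; struct mtrr_entry, mtrr_parse_line and mtrr_covers are hypothetical names. Note that the MTRR type is only one input: the page attributes of the mapping (PAT) also take part in the effective memory type.

```c
#include <stdint.h>
#include <stdio.h>

struct mtrr_entry {
    int reg;          /* register index, e.g. 8 for reg08 */
    uint64_t base;    /* base address in bytes */
    uint64_t size;    /* size in bytes */
    char type[32];    /* e.g. "write-back" */
};

/* Parses one /proc/mtrr line; returns 1 on success, 0 otherwise. */
static int mtrr_parse_line(const char *line, struct mtrr_entry *e)
{
    unsigned long long base, size;
    char unit;
    int got = sscanf(line,
                     "reg%d: base=0x%llx (%*[^)]), size=%llu%cB, count=%*d: %31[^\n]",
                     &e->reg, &base, &size, &unit, e->type);

    if (got != 5 || (unit != 'M' && unit != 'K'))
        return 0;
    e->base = base;
    e->size = size << (unit == 'M' ? 20 : 10);
    return 1;
}

/* Returns 1 if addr falls inside the entry's range. */
static int mtrr_covers(const struct mtrr_entry *e, uint64_t addr)
{
    return addr >= e->base && addr - e->base < e->size;
}
```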

The following are waveform captures for an 8B write under the different policies. I used an integrated logic analyzer (ILA) to capture these waveforms. Watch pcie_endpoint_litepcietlpdepacketizer_tlp_req_payload_dat while pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid is set. You can count the packets by counting the assertions of pcie_endpoint_litepcietlpdepacketizer_tlp_req_valid in these waveform examples.

  1. Uncacheable: link -> correct, 1B x 8 packets
  2. Write-combining: link -> correct, 8B x 1 packet
  3. Write-back: link -> unexpected, 1B x 8 packets
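
The "correct" and "unexpected" labels can be read against a simple model of the assumptions stated earlier. The sketch below is only that model, not a description of what the hardware does: enum mem_policy, lines_touched and expected_write_tlps are hypothetical names, the write-combining case assumes the best case where all stores to a line merge, and the write-back case encodes the question's assumption of one 64B write per evicted line.

```c
#include <stddef.h>

enum mem_policy { POLICY_UC, POLICY_WC, POLICY_WB_ASSUMED };

#define CACHE_LINE 64u

/* Number of 64-byte lines touched by the range [offset, offset + n). */
static size_t lines_touched(size_t offset, size_t n)
{
    if (n == 0)
        return 0;
    return (offset + n - 1) / CACHE_LINE - offset / CACHE_LINE + 1;
}

/* Expected number of write TLPs for n bytes written with stores of
 * store_size bytes, under the assumptions described above. */
static size_t expected_write_tlps(enum mem_policy p, size_t offset,
                                  size_t n, size_t store_size)
{
    if (store_size == 0)
        return 0;
    switch (p) {
    case POLICY_UC:
        /* Every store is its own transaction. */
        return (n + store_size - 1) / store_size;
    case POLICY_WC:
    case POLICY_WB_ASSUMED:
        /* One write per cache line touched. */
        return lines_touched(offset, n);
    }
    return 0;
}
```

With 1-byte stores and 8 bytes written at a line-aligned offset, the model gives 8 packets for uncacheable and 1 for write-combining, matching the captures, and 1 for write-back, where 8 were observed.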

The system configuration is as follows.

  • CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
  • OS: Linux kernel 4.15.0-38
  • PCIe Device: Xilinx FPGA KC705 programmed with litepcie

Related links

  1. Generating a 64-byte read PCIe TLP from an x86 CPU
  2. How to Implement a 64B PCIe* Burst Transfer on Intel® Architecture
  3. Write Combining Buffer Out of Order Writes and PCIe
  4. Do Ryzen support write-back caching for Memory Mapped IO (through PCIe interface)?
  5. MTRR (Memory Type Range Register) control
  6. PATting Linux
  7. Down to the TLP: How PCI express devices talk (Part I)

Recommended answer

In short, it seems that mapping an MMIO region as write-back does not work by design.

Please post an answer if anyone finds that it is possible.

I came across John McCalpin's articles and answers. First, mapping an MMIO region as write-back is not possible. Second, a workaround is possible on some processors.

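  1. Mapping an MMIO region as write-back is not possible

Quote from this link

Quote from this link

  2. A workaround is possible on some processors

    Notes on Cached Access to Memory-Mapped IO Regions, John McCalpin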
