


I made a post about page tables and the number of registers needed for a multi-level page table, and found out that every page table, regardless of the number of levels, only needs one register to access the top of the page table. But my second question has not been answered.


How do the caches (L1-L3) in the processor affect the memory references made to the page table? Will the majority of them miss or hit, and why? I am told that the answer may differ depending on the architecture, so general answers would be fine.


I tried to find references for this, but I could not find any. I should say that I am a real beginner in OS.



Because of the TLB, fewer memory references to the page table are needed, so there are more hits. Is that correct? Help please :D


The basic idea (without any caches of any kind) is that when you access memory the CPU:

  • finds the highest level page table (e.g. from the virtual address and a control register) and fetches the highest level page table entry from RAM

  • finds the next level page table (e.g. from the virtual address and highest level page table entry) and fetches the next level page table entry from RAM; and so on (repeated for each level of page tables) until the CPU reaches the lowest level page table entry

  • finds the physical address (e.g. from the virtual address and lowest level page table entry), and fetches the data from that physical address
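
The walk described above can be sketched as a toy model. This is only an illustration under assumed parameters: two levels, 10 index bits per level and 4 KiB pages (the classic 32-bit x86 layout), with Python dicts standing in for page tables in RAM. The names `walk` and `root` and the example mapping are hypothetical.

```python
# Toy model of an uncached page table walk.
# Assumed layout (illustration only): 2 levels, 10 index bits each, 4 KiB pages.
PAGE_SHIFT = 12
INDEX_BITS = 10
LEVELS = 2


def walk(root, vaddr):
    """Translate vaddr; return (physical address, page table entries read)."""
    table = root
    reads = 0
    for level in range(LEVELS):
        # The highest level table is indexed by the highest virtual address bits.
        shift = PAGE_SHIFT + INDEX_BITS * (LEVELS - 1 - level)
        index = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        entry = table[index]  # one page table entry fetched from "RAM"
        reads += 1
        table = entry         # next level table, or the frame number at the end
    frame = table
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return (frame << PAGE_SHIFT) | offset, reads


# Hypothetical mapping: top-level index 1, second-level index 2 -> frame 0x55.
root = {1: {2: 0x55}}
vaddr = (1 << 22) | (2 << 12) | 0x123
paddr, reads = walk(root, vaddr)
print(hex(paddr), reads)
```

With these assumptions one access costs two page table entry reads plus the data read itself, which is why a walk with no caching at all is so expensive.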


This is obviously slow. To speed it up, there are multiple "cache-like things":

a) The caches themselves. E.g. rather than fetching anything from RAM, the CPU may fetch from cache instead (including when the CPU fetches page table entries). Note that there are typically multiple levels of caches (e.g. L1 data cache, L2 unified cache, ...) and this may apply to some caches and not others (e.g. the CPU won't fetch page table entries from the "L1 instruction cache" but probably will fetch them from the "L3 unified cache").

b) The TLBs (Translation Look-aside Buffers), which mostly cache the lowest level page table entries. This allows almost all of the work to be skipped (if there's a "TLB hit").

c) Higher level translation caches. Modern CPUs have additional caches that cache an intermediate level of the page table hierarchy (e.g. maybe the 3rd level page table entry if there are 4 or more levels, and not the highest or lowest level entry). These reduce the cost of a "TLB miss" (if there's a "higher level translation hit") by allowing some of the work to be skipped.
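
To make the effect of (b) and (c) concrete, here is a rough sketch that counts how many page table entries have to be fetched for each access. Everything in it is an assumption for illustration: 4 levels with 9 index bits each, 4 KiB pages, unbounded caches with no eviction, and the hypothetical names `ToyMMU`, `tlb` and `walk_cache`. Real hardware has small, finite structures, and each counted fetch may itself still hit in the ordinary L1/L2/L3 caches from (a), which this sketch does not model.

```python
# Toy model: count page table entry fetches with a TLB and an upper-level cache.
# Assumed layout (illustration only): 4 levels, 9 index bits each, 4 KiB pages.
PAGE_SHIFT = 12
INDEX_BITS = 9
LEVELS = 4
MASK = (1 << INDEX_BITS) - 1


class ToyMMU:
    def __init__(self, root):
        self.root = root
        self.tlb = {}         # virtual page number -> frame (item b)
        self.walk_cache = {}  # upper indexes -> lowest level table (item c)
        self.pte_reads = 0    # page table entries actually fetched

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.tlb:  # TLB hit: no page table entry is fetched
            return (self.tlb[vpn] << PAGE_SHIFT) | offset
        indexes = [(vpn >> (INDEX_BITS * (LEVELS - 1 - i))) & MASK
                   for i in range(LEVELS)]
        upper = tuple(indexes[:-1])
        if upper in self.walk_cache:  # skip the upper levels of the walk
            table = self.walk_cache[upper]
        else:                         # full walk from the top level
            table = self.root
            for idx in upper:
                table = table[idx]
                self.pte_reads += 1
            self.walk_cache[upper] = table
        frame = table[indexes[-1]]    # lowest level entry
        self.pte_reads += 1
        self.tlb[vpn] = frame
        return (frame << PAGE_SHIFT) | offset


# Hypothetical mapping: virtual pages 0 and 1 -> frames 0x10 and 0x11.
mmu = ToyMMU({0: {0: {0: {0: 0x10, 1: 0x11}}}})
first = mmu.translate(0x0123)   # cold: full 4-level walk
after_first = mmu.pte_reads
second = mmu.translate(0x0456)  # same page: TLB hit
after_second = mmu.pte_reads
third = mmu.translate(0x1789)   # neighbouring page: TLB miss, upper-level hit
after_third = mmu.pte_reads
print(after_first, after_second, after_third)
```

In this model the cold access fetches four entries, the repeat access to the same page fetches none, and the access to the neighbouring page fetches only the lowest level entry. Programs with good locality spend most of their time in the second case, which is why page table entries are fetched far less often than the uncached description suggests.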


07-30 16:03