1. Last time
2. x86-64: addresses
3. x86-64: page table structures
4. TLBs

ONE HANDOUT

1. Last time

- Central mechanism: page tables
- Idealized page table: a giant table; the PPN is the contents of the table at the VPN index
- Thus, the pg table implements a map from VPN -> PPN; more precisely, a map from VPN -> PPN $\cup \{\emptyset\}$, because a VPN might not have a valid mapping in the table.
- NOTE: VPN + PPN do not necessarily have the same # of bits.

- Because the table would be gigantic, it's not materialized as a linear table. Instead, the architecture specifies multilevel page tables.
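As a rough illustration of the idealized table, and of why it is never materialized, the sketch below models the table as a Python dict (an assumption made for brevity; the idealized picture is an array indexed by VPN). The entries are made up, and the size estimate assumes 36-bit VPNs and 8-byte entries, matching the x86-64 numbers used in these notes.

```python
from typing import Dict, Optional

# Idealized page table: a map from VPN to PPN. A missing key plays the
# role of "no valid mapping" (the empty element in the notes).
page_table: Dict[int, int] = {0x00042: 0x1A2B, 0x00043: 0x0007}  # made-up entries


def lookup(vpn: int) -> Optional[int]:
    """Return the PPN for vpn, or None if vpn has no valid mapping."""
    return page_table.get(vpn)


# Size of a flat (linear) table, assuming 36-bit VPNs and 8-byte entries.
VPN_BITS = 36
ENTRY_BYTES = 8
flat_table_bytes = (1 << VPN_BITS) * ENTRY_BYTES
flat_table_gib = flat_table_bytes // (1 << 30)

print(hex(lookup(0x00042)), lookup(0x99999), flat_table_gib, "GiB")
```

Under these assumptions a flat table would need one entry per possible VPN for every address space, which is the motivation for the multilevel structure.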

2. x86-64: addresses

- 64 bits = 8 bytes
- VAs: 36 bits for VPN, 12 bits for offset. Layout: bits 63..48 | bits 47..12 (VPN) | bits 11..0 (offset).
- Valid range: $[-2^{47}, 2^{47}-1]$
- Address space has 48 usable bits: $2^{48}$ possible addresses (each addresses a byte). Thus, 256 TB.
- 57 bits: 128 PB
- PAs: 52 bits
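The arithmetic above can be checked with a short sketch. It assumes that a valid ("canonical") 48-bit address is one whose bits 63..47 are all equal, which is what the signed range $[-2^{47}, 2^{47}-1]$ amounts to; the function names are illustrative only.

```python
VA_BITS = 48  # usable virtual-address bits


def is_canonical(va: int, bits: int = VA_BITS) -> bool:
    """True if bits 63..(bits-1) of the 64-bit pattern va are all equal."""
    top = va >> (bits - 1)            # bit (bits-1) and everything above it
    n_top = 64 - (bits - 1)           # how many bits that is
    return top == 0 or top == (1 << n_top) - 1


def to_signed(va: int) -> int:
    """Interpret a 64-bit pattern as a signed (two's complement) integer."""
    return va - (1 << 64) if va >> 63 else va


TB = 1 << 40
PB = 1 << 50
print((1 << 48) // TB, "TB with 48 bits")
print((1 << 57) // PB, "PB with 57 bits")
print(is_canonical(0x00007FFFFFFFFFFF), is_canonical(0x0000800000000000))
```

The lowest valid address in the upper half, `0xFFFF800000000000`, is exactly $-2^{47}$ when read as a signed number.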

- Physical memory can be addressed up to 52 bits. How much physical memory can thus be supported?
- Mapping: going from a 48-bit number (VA) to a 52-bit number (PA) at the granularity of ranges of $2^{12}$.
- So it's really a mapping from 36-bit numbers to 40-bit numbers.

3. Page table structures

[see handout] %cr3
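A minimal sketch of the 36-bit to 40-bit view of the mapping follows. It splits a VA into VPN and offset, rebuilds a PA from a PPN and the same offset, and answers the question about 52-bit physical addresses. The four 9-bit indices are an assumption based on the usual x86-64 four-level layout (the handout is the authoritative picture), and all function names are illustrative.

```python
PAGE_SHIFT = 12                         # pages are 2^12 bytes
OFFSET_MASK = (1 << PAGE_SHIFT) - 1
VPN_MASK = (1 << 36) - 1                # 48 - 12 = 36 bits of VPN


def split_va(va: int):
    """Return (VPN, offset) for a 48-bit virtual address."""
    return (va >> PAGE_SHIFT) & VPN_MASK, va & OFFSET_MASK


def join_pa(ppn: int, offset: int) -> int:
    """Build a physical address from a 40-bit PPN and a 12-bit offset."""
    return (ppn << PAGE_SHIFT) | offset


def level_indices(vpn: int):
    """Split a 36-bit VPN into four 9-bit indices, top level first (assumed layout)."""
    return [(vpn >> shift) & 0x1FF for shift in (27, 18, 9, 0)]


PB = 1 << 50
max_physical_pb = (1 << 52) // PB       # 52-bit PAs

vpn, offset = split_va(0x00007F1234567ABC)
print(hex(vpn), hex(offset), level_indices(vpn), max_physical_pb, "PB")
```

Only the VPN is translated; the offset is copied unchanged into the physical address.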





4. TLB

< VPN, (PPN, perms) > < 0,202, (0,3)

H/w managed: x86, ARM

S/w managed: MIPS

TLB miss -/-> pg fault (a TLB miss does not by itself imply a page fault)

Pg fault -> TLB miss
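The relationship between TLB misses and page faults can be sketched with a toy model. Everything here is hypothetical: the TLB and the page table are plain dicts, the entries are invented, and permission checks (which could fault even on a TLB hit) are deliberately left out.

```python
from typing import Dict, Tuple


class PageFault(Exception):
    """Raised when the page table has no valid mapping for a VPN."""


# Hypothetical contents: VPN -> (PPN, perms).
page_table: Dict[int, Tuple[int, str]] = {0x10: (0x80, "rw"), 0x11: (0x81, "r")}
tlb: Dict[int, Tuple[int, str]] = {}    # starts empty (cold)
stats = {"tlb_hits": 0, "tlb_misses": 0, "page_faults": 0}


def translate(vpn: int) -> Tuple[int, str]:
    """Look up vpn in the TLB first; fall back to the page table on a miss."""
    if vpn in tlb:
        stats["tlb_hits"] += 1
        return tlb[vpn]
    stats["tlb_misses"] += 1
    entry = page_table.get(vpn)         # stands in for the page-table walk
    if entry is None:
        stats["page_faults"] += 1
        raise PageFault(f"no mapping for VPN {vpn:#x}")
    tlb[vpn] = entry                    # fill the TLB so the next access hits
    return entry
```

In this model, translating VPN `0x10` twice gives one miss (with no fault) and then one hit, while translating an unmapped VPN gives a miss followed by a fault. A fault is only ever reached on the miss path here.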

## **Core i7 Page Table Translation**



# **Review of Symbols**

#### Basic Parameters

- N = 2<sup>n</sup>: Number of addresses in virtual address space
- M = 2<sup>m</sup>: Number of addresses in physical address space
- P = 2<sup>p</sup>: Page size (bytes)

#### Components of the virtual address (VA)

- TLBI: TLB index
- TLBT: TLB tag
- VPO: Virtual page offset
- VPN: Virtual page number

#### Components of the physical address (PA)

- PPO: Physical page offset (same as VPO)
- PPN: Physical page number
- CO: Byte offset within cache line
- CI: Cache index
- CT: Cache tag
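The symbols above are all bit fields of an address, which a short sketch can make concrete. The field widths are assumptions: p = 12 (4 KB pages), a TLB with 16 sets (4 index bits), and a cache with 64 sets of 64-byte lines (6 bits each for CI and CO).

```python
P_BITS = 12        # p: page offset bits (4 KB pages)
TLBI_BITS = 4      # assumed: 16 TLB sets
CO_BITS = 6        # assumed: 64-byte cache lines
CI_BITS = 6        # assumed: 64 cache sets


def split_virtual(va: int) -> dict:
    """Break a virtual address into VPN/VPO and the TLB tag/index."""
    vpn = va >> P_BITS
    return {
        "VPN": vpn,
        "VPO": va & ((1 << P_BITS) - 1),
        "TLBT": vpn >> TLBI_BITS,               # high bits of the VPN
        "TLBI": vpn & ((1 << TLBI_BITS) - 1),   # low bits of the VPN
    }


def split_physical(pa: int) -> dict:
    """Break a physical address into PPN/PPO and the cache tag/index/offset."""
    return {
        "PPN": pa >> P_BITS,
        "PPO": pa & ((1 << P_BITS) - 1),
        "CT": pa >> (CO_BITS + CI_BITS),
        "CI": (pa >> CO_BITS) & ((1 << CI_BITS) - 1),
        "CO": pa & ((1 << CO_BITS) - 1),
    }


print({k: hex(v) for k, v in split_virtual(0x12345678).items()})
print({k: hex(v) for k, v in split_physical(0xABCDE678).items()})
```

With these widths CT happens to coincide with PPN, because CI and CO together use exactly the 12 offset bits.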

# **Core i7 Level 1-3 Page Table Entries**



#### Each entry references a 4K child page table. Significant fields:

**P:** Child page table present in physical memory (1) or not (0).

**R/W:** Read-only or read-write access permission for all reachable pages.

**U/S:** user or supervisor (kernel) mode access permission for all reachable pages.

**WT:** Write-through or write-back cache policy for the child page table.

**A:** Reference bit (set by MMU on reads and writes, cleared by software).

**PS:** Page size: if bit set, we have 2 MB or 1 GB pages (bit can be set in Level 2 and 3 PTEs only).

**Page table physical base address:** 40 most significant bits of physical page table address (forces page tables to be 4KB aligned)

**XD:** Disable or enable instruction fetches from all pages reachable from this PTE.

# **Core i7 Level 4 Page Table Entries**

| 63 | 62-52  | 51-12                      | 11-9   | 8 | 7 | 6 | 5 | 4  | 3  | 2   | 1   | 0   |
|----|--------|----------------------------|--------|---|---|---|---|----|----|-----|-----|-----|
| XD | Unused | Page physical base address | Unused | G |   | D | A | CD | WT | U/S | R/W | P=1 |

| 63-1                                                  | 0   |
|-------------------------------------------------------|-----|
| Available for OS (for example, page location on disk) | P=0 |

#### Each entry references a 4K child page. Significant fields:

**P:** Page is present in memory (1) or not (0).

**R/W:** Read-only or read-write access permission for this page.

**U/S:** User or supervisor mode access.

**WT:** Write-through or write-back cache policy for this page.

**A:** Reference bit (set by MMU on reads and writes, cleared by software).

**D:** Dirty bit (set by MMU on writes, cleared by software).

**Page physical base address:** 40 most significant bits of physical page address (forces pages to be 4KB aligned)

**XD:** Disable or enable instruction fetches from this page.
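The field list can be turned into a small decoder. This is a sketch that follows the bit positions in the Level 4 entry layout above (P at bit 0 through XD at bit 63, base address in bits 51..12); the constant and function names are illustrative and do not come from any real kernel header.

```python
PTE_P = 1 << 0      # present
PTE_RW = 1 << 1     # read-write (1) or read-only (0)
PTE_US = 1 << 2     # user (1) or supervisor (0)
PTE_WT = 1 << 3     # write-through (1) or write-back (0)
PTE_CD = 1 << 4     # cache disabled
PTE_A = 1 << 5      # referenced
PTE_D = 1 << 6      # dirty
PTE_G = 1 << 8      # global
PTE_XD = 1 << 63    # execute disable
BASE_MASK = ((1 << 40) - 1) << 12   # bits 51..12


def decode_pte(pte: int) -> dict:
    """Decode a 64-bit Level 4 entry into named fields."""
    if not pte & PTE_P:
        # With P=0 the remaining 63 bits are free for the OS to use.
        return {"present": False, "os_bits": pte >> 1}
    return {
        "present": True,
        "writable": bool(pte & PTE_RW),
        "user": bool(pte & PTE_US),
        "write_through": bool(pte & PTE_WT),
        "cache_disabled": bool(pte & PTE_CD),
        "accessed": bool(pte & PTE_A),
        "dirty": bool(pte & PTE_D),
        "global": bool(pte & PTE_G),
        "no_execute": bool(pte & PTE_XD),
        "base": pte & BASE_MASK,    # low 12 bits are zero: 4 KB aligned
    }


example = (0xABCDE << 12) | PTE_P | PTE_RW | PTE_D | PTE_XD
print(decode_pte(example))
```

Because the low 12 bits of `base` are always zero, the masked value is already the physical address of the page.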

#### **End-to-end Core i7 Address Translation**



## **Cute Trick for Speeding Up L1 Access**



#### Observation

- Bits that determine CI identical in virtual and physical address
- Can index into cache while address translation taking place
- Cache carefully sized to make this possible: 64 sets, 64-byte cache blocks
- Means 6 bits for cache index, 6 for cache offset
- That's 12 bits; matches VPO, PPO $\rightarrow$ one reason pages are $2^{12}$ bytes = 4 KB
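The observation can be checked with a small sketch, assuming the sizes listed above (64 sets, 64-byte blocks, 4 KB pages). The helper names and the sample addresses are made up for illustration.

```python
CO_BITS = 6      # 64-byte blocks
CI_BITS = 6      # 64 sets
PAGE_BITS = 12   # 4 KB pages


def cache_index(addr: int) -> int:
    """Extract the CI field: the bits just above the block offset."""
    return (addr >> CO_BITS) & ((1 << CI_BITS) - 1)


def physical_address(va: int, ppn: int) -> int:
    """Replace the VPN with ppn; the low PAGE_BITS bits pass through unchanged."""
    return (ppn << PAGE_BITS) | (va & ((1 << PAGE_BITS) - 1))


def index_known_before_translation(co_bits: int, ci_bits: int, page_bits: int) -> bool:
    """True if CI and CO fit entirely inside the page offset."""
    return co_bits + ci_bits <= page_bits


va = 0x7F0012345ABC
for ppn in (0x0, 0x1F3, 0xABCDE):
    # Whatever PPN the translation yields, the cache index is the same.
    assert cache_index(va) == cache_index(physical_address(va, ppn))

print(hex(cache_index(va)), index_known_before_translation(CO_BITS, CI_BITS, PAGE_BITS))
```

With one more index bit (128 sets) the check would fail, since the top index bit would then come from the translated part of the address.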

## Virtual Address Space of a Linux Process





Figure 4-12. Page-Fault Error Code