1. Last time
2. Exercises
3. Page faults: intro + mechanics
4. Page faults and paging: uses
5. Page faults: costs
6. Page replacement policies
7. Thrashing

3. Page faults: intro + mechanics

Concept: illegal virtual memory reference: hardware thinks it's illegal (though it might be valid for the process)

OS has to get involved

Mechanics:

- processor constructs a trap frame and transfers execution to an interrupt or trap handler



4. Uses of page faults
- Classic example: overcommitting physical memory

  prog: 64 GB, RAM (H/W): 16 GB



- Copy-on-write fork()

- Accounting



- demand paging

- growing the stack

- BSS page allocation

- Shared text (code)

- Shared libraries

- Shared memory
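To make the overcommitting and demand-paging items above concrete, here is a minimal toy model, not how any real kernel does it. The class `ToyAddressSpace`, its method names, and the sizes (64 virtual pages, 16 frames) are assumptions chosen for illustration. A frame is handed out only when a page is first touched (the "page fault"), and when no frame is free the oldest resident page is evicted.

```python
class ToyAddressSpace:
    """Toy model: physical frames are assigned only on first touch."""

    def __init__(self, num_virtual_pages, num_frames):
        self.num_virtual_pages = num_virtual_pages
        self.free_frames = list(range(num_frames))
        self.page_table = {}   # VPN -> frame, resident pages only
        self.faults = 0

    def touch(self, vpn):
        if not 0 <= vpn < self.num_virtual_pages:
            # Truly illegal reference: a real OS would signal the process.
            raise MemoryError("illegal reference")
        if vpn not in self.page_table:
            # "Page fault": valid for the process, but not resident.
            self.faults += 1
            if not self.free_frames:
                # Evict the oldest resident page (dicts keep insertion order).
                victim = next(iter(self.page_table))
                self.free_frames.append(self.page_table.pop(victim))
            self.page_table[vpn] = self.free_frames.pop()
        return self.page_table[vpn]


space = ToyAddressSpace(num_virtual_pages=64, num_frames=16)
for vpn in range(64):
    space.touch(vpn)
print(space.faults, len(space.page_table))
```

The program "sees" 64 pages even though only 16 frames exist. The cost shows up as faults, which is what the next section quantifies.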

5. Page faults: costs

look at AMAT (avg. memory access time)

AMAT = (1-p) · (mem access time) + p · (page fault time)

$\text{AMAT} = (1-p) \cdot t_{M} + p \cdot t_{D}$

p is the probability (or frequency) of a page fault

mem access time ~ 100 ns = $t_{M}$

disk access time ~ 10 ms = 10<sup>7</sup> ns = $t_{D}$

QUESTION: what is p such that paging hurts performance by less than 10%?

$1.1 \, t_{M} > (1-p) \cdot t_{M} + p \cdot t_{D}$

$\Rightarrow p \cdot t_{M} > -0.1 \, t_{M} + p \cdot t_{D}$

$\Rightarrow p \, (t_{M} - t_{D}) > -0.1 \, t_{M}$

$\Rightarrow p \, (t_{D} - t_{M}) < 0.1 \, t_{M}$

$\Rightarrow p < \frac{0.1 \, t_{M}}{t_{D} - t_{M}} \approx 0.1 \times \frac{10^{2}}{10^{7}} = \frac{1}{10^{6}} = 10^{-6}$
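The arithmetic above can be checked numerically. This is a small sketch using the two timings from these notes (100 ns memory access, 10 ms disk access). The function names are made up for illustration.

```python
def amat_ns(p, t_mem_ns=100.0, t_disk_ns=1e7):
    """Average memory access time (ns) when a fraction p of accesses fault."""
    return (1 - p) * t_mem_ns + p * t_disk_ns


def max_fault_probability(slowdown, t_mem_ns=100.0, t_disk_ns=1e7):
    """Largest p for which AMAT stays within (1 + slowdown) * t_mem."""
    return slowdown * t_mem_ns / (t_disk_ns - t_mem_ns)


p_max = max_fault_probability(0.10)
print(p_max)            # very close to 1e-6
print(amat_ns(p_max))   # very close to 110 ns, i.e. 1.1 * t_mem
```

In other words, roughly one access in a million may fault before paging costs more than 10%.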

6. Page replacement policies

P<sub>x,</sub> vPN 172



- FIFO: eject oldest

- MIN (OPT): eject entry that won't be referenced for the longest time

input: reference string, cache size

output: number of evictions, or more generally misses

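FIFO

reference string: A B C A B D A D B C B
phys. slots: S1, S2, S3

OPTIMAL

reference string: A B C A B D A D B C B
phys. slots: S1, S2, S3

LRU

reference string: A B C A B D A D B C B
phys. slots: S1, S2, S3

reference string: A B C D A B C D A B C D
phys. slots: S1, S2, S3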

back to FIFO

3 entries: A B C D A B E A B C D E
phys. slots: S1, S2, S3

4 entries: A B C D A B E A B C D E
phys. slots: S1, S2, S3, S4
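The worked examples above can be replayed with a short simulator. This is a sketch, with function names chosen for illustration. It takes a reference string and a number of physical slots and counts misses, including the initial cold misses, for FIFO, LRU, and OPT.

```python
from collections import OrderedDict, deque


def fifo_misses(refs, nframes):
    resident, order, misses = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        misses += 1
        if len(resident) == nframes:
            resident.discard(order.popleft())   # eject oldest arrival
        resident.add(page)
        order.append(page)
    return misses


def lru_misses(refs, nframes):
    recency, misses = OrderedDict(), 0          # least recently used first
    for page in refs:
        if page in recency:
            recency.move_to_end(page)
            continue
        misses += 1
        if len(recency) == nframes:
            recency.popitem(last=False)         # eject least recently used
        recency[page] = True
    return misses


def opt_misses(refs, nframes):
    resident, misses = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        misses += 1
        if len(resident) == nframes:
            future = refs[i + 1:]

            def next_use(q):
                # Pages never referenced again sort as "farthest away".
                return future.index(q) if q in future else len(future) + 1

            resident.discard(max(resident, key=next_use))
        resident.add(page)
    return misses


print(fifo_misses("ABCABDADBCB", 3))    # 7
print(opt_misses("ABCABDADBCB", 3))     # 5
print(lru_misses("ABCABDADBCB", 3))     # 5
print(lru_misses("ABCDABCDABCD", 3))    # 12: every reference misses
print(fifo_misses("ABCDABEABCDE", 3))   # 9
print(fifo_misses("ABCDABEABCDE", 4))   # 10: more slots, more misses
```

The last two lines show that under FIFO, adding a slot can increase the number of misses on this string.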

- OPT minimizes misses/swaps/evictions
- but can't be implemented in general.

- LRU: approximates OPT (assuming what?)

- approximate LRU with CLOCK

H/W sets Accessed and Dirty bits

OS consumes these bits and clears them.

- Generalization of CLOCK: Nth Chance (see notes).
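A minimal sketch of CLOCK as a simulation, under stated assumptions. The list `accessed` stands in for the hardware Accessed bit, a newly loaded page starts with its bit set, and the hand advances past the slot it just filled. Real implementations differ in these details, and the function name is made up for illustration.

```python
def clock_misses(refs, nframes):
    frames = [None] * nframes      # page held by each physical slot
    accessed = [False] * nframes   # stand-in for the hardware Accessed bit
    hand, misses = 0, 0
    for page in refs:
        if page in frames:
            # On a hit, the "hardware" sets the Accessed bit.
            accessed[frames.index(page)] = True
            continue
        misses += 1
        # The "OS" sweeps: clear set bits (second chance) until it finds
        # a slot that is empty or whose bit is already clear.
        while frames[hand] is not None and accessed[hand]:
            accessed[hand] = False
            hand = (hand + 1) % nframes
        frames[hand] = page
        accessed[hand] = True
        hand = (hand + 1) % nframes
    return misses


print(clock_misses("ABCABDADBCB", 3))   # 7
```

On this string CLOCK, with these assumptions, takes 7 misses: the same as FIFO and two more than LRU's 5. It is only an approximation of LRU.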

7. Thrashing

ex: program touches 50 pages, equiprobably, but there are only 40 phys. frames (or slots)

Thrashing: processes demand more memory for active use than the system has.

3 reasons:
(a) process has no temporal locality; or
(b) process has temporal locality, but there's not enough memory; or
(c) individually all processes fit, but together they need more memory than the system has.
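A back-of-the-envelope sketch of the 50-pages/40-frames example, assuming references are independent and uniformly random. Whichever 40 pages are resident, the next reference then misses with probability 10/50, regardless of replacement policy. The timings (100 ns, 10 ms) are the ones used in the cost section, and the function names are illustrative.

```python
def steady_state_miss_rate(npages, nframes):
    """Miss probability for uniform random references once frames are full."""
    return max(0, npages - nframes) / npages


def amat_ns(p, t_mem_ns=100.0, t_disk_ns=1e7):
    return (1 - p) * t_mem_ns + p * t_disk_ns


p = steady_state_miss_rate(50, 40)
print(p)                      # 0.2
print(amat_ns(p))             # about 2,000,080 ns, i.e. about 2 ms per access
print(amat_ns(p) / 100.0)     # about 20,000x slower than memory speed
```

Compare p = 0.2 with the 10<sup>-6</sup> budget computed earlier: the machine spends essentially all of its time waiting on the disk.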

# **Core i7 Page Table Translation**



# **Review of Symbols**

#### Basic Parameters

- N = 2<sup>n</sup>: Number of addresses in virtual address space
- M = 2<sup>m</sup>: Number of addresses in physical address space
- **P = 2**<sup>p</sup> : Page size (bytes)

#### Components of the virtual address (VA)

- TLBI: TLB index
- TLBT: TLB tag
- VPO: Virtual page offset
- VPN: Virtual page number

#### Components of the physical address (PA)

- PPO: Physical page offset (same as VPO)
- PPN: Physical page number
- CO: Byte offset within cache line
- CI: Cache index
- CT: Cache tag
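As an illustration of how the virtual-address fields relate, the following sketch splits an address with shifts and masks. The defaults are assumptions: 12 offset bits (4 KB pages, p = 12) and 4 TLB index bits (a 16-set TLB), and the function name is hypothetical.

```python
def split_virtual_address(va, page_bits=12, tlb_index_bits=4):
    """Split a virtual address into VPN/VPO, and the VPN into TLBT/TLBI."""
    vpo = va & ((1 << page_bits) - 1)            # low p bits
    vpn = va >> page_bits                        # remaining high bits
    tlbi = vpn & ((1 << tlb_index_bits) - 1)     # low bits of the VPN
    tlbt = vpn >> tlb_index_bits                 # rest of the VPN
    return {"VPN": vpn, "VPO": vpo, "TLBT": tlbt, "TLBI": tlbi}


fields = split_virtual_address(0x7F12345678)
print({name: hex(value) for name, value in fields.items()})
# {'VPN': '0x7f12345', 'VPO': '0x678', 'TLBT': '0x7f1234', 'TLBI': '0x5'}
```

The physical address splits the same way: PPO is the low p bits (identical to VPO), and CT/CI/CO are carved out of the full physical address.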

# **Core i7 Level 1-3 Page Table Entries**



#### Each entry references a 4K child page table. Significant fields:

**P:** Child page table present in physical memory (1) or not (0).

**R/W:** Read-only or read-write access permission for all reachable pages.

**U/S:** User or supervisor (kernel) mode access permission for all reachable pages.

**WT:** Write-through or write-back cache policy for the child page table.

**A:** Reference bit (set by MMU on reads and writes, cleared by software).

**PS:** Page size: if bit set, we have 2 MB or 1 GB pages (bit can be set in Level 2 and 3 PTEs only).

**Page table physical base address:** 40 most significant bits of physical page table address (forces page tables to be 4KB aligned)

**XD:** Disable or enable instruction fetches from all pages reachable from this PTE.

# **Core i7 Level 4 Page Table Entries**



#### Each entry references a 4K child page. Significant fields:

**P:** Child page is present in memory (1) or not (0)

**R/W:** Read-only or read-write access permission for this page

**U/S:** User or supervisor mode access

**WT:** Write-through or write-back cache policy for this page

**A:** Reference bit (set by MMU on reads and writes, cleared by software)

**D:** Dirty bit (set by MMU on writes, cleared by software)

**Page physical base address:** 40 most significant bits of physical page address (forces pages to be 4KB aligned)

**XD:** Disable or enable instruction fetches from this page.
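A sketch of pulling these fields out of a 64-bit entry. The bit positions are not given on the slides. The ones below are the standard x86-64 layout (P = bit 0, R/W = 1, U/S = 2, WT = 3, A = 5, D = 6, PS = 7, base address in bits 12 to 51, XD = bit 63). The function name and the example value are made up.

```python
FLAG_BITS = {"P": 0, "R/W": 1, "U/S": 2, "WT": 3, "A": 5, "D": 6, "PS": 7, "XD": 63}
BASE_MASK = ((1 << 40) - 1) << 12   # 40 address bits, starting at bit 12


def decode_entry(entry):
    """Return the flag bits and the 4 KB-aligned base address of an entry."""
    fields = {name: (entry >> bit) & 1 for name, bit in FLAG_BITS.items()}
    # D is only meaningful in entries that map a page;
    # PS only in entries that may map a large page.
    fields["base"] = entry & BASE_MASK
    return fields


example = 0x8000000012345067        # made-up entry value
decoded = decode_entry(example)
print(hex(decoded["base"]))         # 0x12345000
print(decoded["P"], decoded["R/W"], decoded["U/S"], decoded["A"], decoded["D"], decoded["XD"])
# 1 1 1 1 1 1
```

Because the low 12 bits of the entry hold flags rather than address bits, the referenced page or page table must be 4 KB aligned, as the slides note.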

# **End-to-end Core i7 Address Translation**



# **Cute Trick for Speeding Up L1 Access**



#### Observation

- Bits that determine CI are identical in virtual and physical address
- Can index into cache while address translation is taking place
- Cache carefully sized to make this possible: 64 sets, 64-byte cache blocks
- Means 6 bits for cache index, 6 for cache offset
- That's 12 bits; matches VPO, PPO $\rightarrow$ one reason pages are $2^{12}$ bytes = 4 KB
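The observation can be checked with a few lines of arithmetic. This sketch uses the slide's parameters (64 sets, 64-byte blocks, 4 KB pages). The addresses and the physical page number are made up.

```python
NUM_SETS, BLOCK_BYTES, PAGE_BYTES = 64, 64, 4096

co_bits = BLOCK_BYTES.bit_length() - 1   # 6 bits of block offset
ci_bits = NUM_SETS.bit_length() - 1      # 6 bits of cache index
page_bits = PAGE_BYTES.bit_length() - 1  # 12 bits of page offset


def cache_index(addr):
    """Cache set selected by an address: the bits just above the block offset."""
    return (addr >> co_bits) & (NUM_SETS - 1)


va = 0x7F12345678                          # made-up virtual address
pa = (0xABCDE << page_bits) | (va & (PAGE_BYTES - 1))  # same offset, made-up PPN

print(co_bits + ci_bits == page_bits)      # True: CI and CO fit inside the offset
print(cache_index(va), cache_index(pa))    # 25 25
```

Since CI lies entirely within the page offset, the cache set can be selected from the virtual address while the TLB translates the VPN. Only the tag comparison has to wait for the PPN.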

# Virtual Address Space of a Linux Process





Figure 4-12. Page-Fault Error Code
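A sketch of how a handler might interpret the low bits of the error code. The bit meanings below follow the x86 definition (bit 0 P, bit 1 W/R, bit 2 U/S, bit 3 RSVD, bit 4 I/D). Higher bits are ignored here, and the function and key names are made up for illustration.

```python
def decode_page_fault_error(code):
    """Interpret the five low bits of an x86 page-fault error code."""
    return {
        # 0: fault on a non-present page; 1: page-level protection violation
        "protection_violation": bool(code & 0x01),
        # 0: the access was a read; 1: the access was a write
        "write": bool(code & 0x02),
        # 0: supervisor-mode access; 1: user-mode access
        "user_mode": bool(code & 0x04),
        # 1: a reserved bit was set in some paging-structure entry
        "reserved_bit": bool(code & 0x08),
        # 1: the fault was caused by an instruction fetch
        "instruction_fetch": bool(code & 0x10),
    }


print(decode_page_fault_error(0x6))
# user-mode write to a non-present page (e.g. demand paging or stack growth)
print(decode_page_fault_error(0x7))
# user-mode write to a present page (e.g. a copy-on-write candidate)
```

The OS combines these bits with the faulting address and its own records of the process's address space to decide whether the reference is one of the legitimate uses listed earlier or a genuine error.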