# CS202 (003): Operating Systems Virtual Memory

Instructor: Jocelyn Chen

Most of the material covered in these slides comes from the lecture notes of Mike Walfish's CS202



### Last Time

Process

What is a process?

How does a process communicate with low-level resources?

How can one process do multiple tasks concurrently?

How does the operating system manage memory for multiple processes efficiently?

### Let's take a step back...



## Virtual Memory

"Each process has its own view of memory"



Does the address space of this program actually sit at physical addresses 0 through 16 KB?

$$y = x + 1$$

| code address | code instruction      |
|--------------|-----------------------|
| 0x500        | `movq 0x200000, %rax` |
| 0x508        | `incq %rax`           |
| 0x510        | `movq %rax, 0x300000` |

### Virtual Memory

How many virtual memory translations happen when the lines above are executed?
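One way to count, as a rough sketch: every instruction fetch needs a translation, and so does every memory operand. The names `PROGRAM` and `count_translations` are illustrative, not from the lecture. The model assumes each instruction and each operand lies within a single page, that `incq` touches only the register, and it ignores any caching of translations.

```python
# Each entry: (code address, instruction text, number of data memory operands).
# 0x200000 and 0x300000 are memory addresses; %rax is a register and needs
# no translation. This is a simplified model, not a hardware description.
PROGRAM = [
    (0x500, "movq 0x200000, %rax", 1),  # load x from memory
    (0x508, "incq %rax", 0),            # register only
    (0x510, "movq %rax, 0x300000", 1),  # store y to memory
]

def count_translations(program):
    """One translation per instruction fetch plus one per memory operand."""
    fetches = len(program)
    data_accesses = sum(operands for _, _, operands in program)
    return fetches + data_accesses

print(count_translations(PROGRAM))  # 3 fetches + 2 data accesses = 5
```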

## Goals/Benefits of Virtual memory

### Programmability

- Program thinks it has a lot of memory, and that it has its own physical memory
- Compiler and linker don't have to worry about physical addresses
- Multiple instances of a program can be loaded and not collide

### Protection

- Programs cannot read/write each other's memory
- Therefore delivers isolation (prevents a bug in one process from corrupting another)

### Efficient use of resources

- Programmers don't have to worry that the sum of the memory consumed by all active processes is larger than physical memory

### How is the translation implemented?



(per-process)

In the traditional x86 (and in our labs), the page size will be 4096 B = 4 KB = $2^{12}$ bytes

$2^{10}$: kilo, $2^{20}$: mega, $2^{30}$: giga, $2^{40}$: tera

How many pages are there on a 32-bit architecture?

$$\frac{2^{32} \text{ bytes}}{2^{12} \text{ bytes/page}} = 2^{20} \text{ pages}$$

## Paging

What if 48 bits are used to address memory?

$$\frac{2^{48} \text{ bytes}}{2^{12} \text{ bytes/page}} = 2^{36} \text{ pages} = 64 \cdot 2^{30} \text{ pages} \approx 69 \text{ billion pages}$$
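The page counts above can be checked with a few lines of arithmetic. This is a minimal sketch; `num_pages` is a helper name made up for this example.

```python
PAGE_SIZE = 4096  # 4 KB = 2**12 bytes, as in the labs

def num_pages(address_bits, page_size=PAGE_SIZE):
    """Number of pages in an address space with the given address width."""
    return (1 << address_bits) // page_size

print(num_pages(32) == 2**20)  # True
print(num_pages(48))           # 68719476736, i.e. 2**36
```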



Each process has a separate mapping

Each page is separately mapped

The OS takes control on certain (invalid) operations:

- If a process tries to write to a page marked as read-only, it triggers a trap
- If a process tries to access a page marked as invalid, it triggers a trap

> After handling a trap, the OS can modify the memory mapping as needed (load a page from disk, change permissions, ...)
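The trap behavior can be mimicked in a toy model. Everything here is hypothetical: `PageFault` stands in for the hardware trap, `page_table` holds made-up `(valid, writable)` flags per page, and `handle_fault_and_retry` plays the role of an OS handler that simply makes the page writable. A real OS decides per fault what to do.

```python
class PageFault(Exception):
    """Stands in for the trap that transfers control to the OS."""

# Hypothetical per-page flags: page number -> (valid, writable)
page_table = {0: (True, False), 1: (True, True)}

def access(vpn, write=False):
    """Check one access against the mapping; raise PageFault if not allowed."""
    valid, writable = page_table.get(vpn, (False, False))
    if not valid:
        raise PageFault(f"page {vpn} is invalid")
    if write and not writable:
        raise PageFault(f"page {vpn} is read-only")
    return "ok"

def handle_fault_and_retry(vpn, write=False):
    """Toy OS handler: on a trap, fix up the mapping, then retry the access."""
    try:
        return access(vpn, write)
    except PageFault:
        page_table[vpn] = (True, True)  # e.g. load the page, change permissions
        return access(vpn, write)
```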

## Paging





### Page Number

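| page number | address range         |
|-------------|-----------------------|
| 0           | $[0, 4095]$           |
| 1           | $[4096, 8191]$        |
| 2           | $[8192, 12287]$       |
| 3           | $[12288, 16383]$      |
| ...         | ...                   |
| $2^{20}-1$  | $[\dots, 2^{32}-1]$   |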

What's the size of the address space for 32-bit virtual addresses?

Size of space = $2^{32}$ bytes = 4 GB
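The relationship between page numbers and address ranges is plain arithmetic. The following sketch uses helper names (`page_range`, `page_number_of`) chosen only for this example and assumes 4 KB pages.

```python
PAGE_SIZE = 4096  # bytes per page (assumed, matching the 4 KB pages above)

def page_range(page_number):
    """First and last byte address covered by a page."""
    start = page_number * PAGE_SIZE
    return start, start + PAGE_SIZE - 1

def page_number_of(address):
    """Page that contains a given byte address."""
    return address // PAGE_SIZE

print(page_range(2))         # (8192, 12287)
print(page_number_of(4096))  # 1
```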

**VPN** (virtual page number) **PPN** (physical page number)



### Another way to look at it



[Figure: virtual address mapped to physical address (20-bit PPN)]

If the OS wants a program to be able to use address 0x00402000 to refer to physical address 0x00003000, then the OS conceptually adds an entry: table[0x00402] = 0x00003 (table[1026] = 3 in decimal)
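The conceptual table lookup can be sketched as follows, assuming 32-bit addresses with a 20-bit page number and a 12-bit offset. The dictionary `table` and the function `translate` are illustrative stand-ins, not the hardware's actual data structure.

```python
PAGE_SHIFT = 12                      # 4 KB pages: low 12 bits are the offset
OFFSET_MASK = (1 << PAGE_SHIFT) - 1  # 0xFFF

# Conceptual per-process table: VPN -> PPN (the entry from the example above)
table = {0x00402: 0x00003}

def translate(vaddr):
    """Split the address, look up the page number, reattach the offset."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & OFFSET_MASK
    ppn = table[vpn]  # a KeyError here stands in for a fault on an unmapped page
    return (ppn << PAGE_SHIFT) | offset

print(hex(translate(0x00402000)))  # 0x3000
print(hex(translate(0x00402ABC)))  # 0x3abc
```

Only the page number is translated; the offset within the page passes through unchanged.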

## Creating the mapping is hard

(assuming 48-bit addresses and 4 KB pages)

36-bit VPN => $2^{36}$ translations from VPN to PPN

Assuming each translation is 8 bytes => $2^{36} \times 8 = 2^{39}$ bytes = 512 GB

Recall that we maintain these mappings per process: 100 processes => 51,200 GB of memory just to store address translations!

Most programs only use a small fraction of the available address space, so this does not seem to be a good use of resources

Page table can get terribly large!
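The size estimate can be reproduced directly. This is a small sketch under the same assumptions (36-bit VPN, 8 bytes per translation); `linear_table_bytes` is a name invented for this example.

```python
GIB = 2**30  # bytes per GB (binary)

def linear_table_bytes(vpn_bits, entry_bytes=8):
    """Size of a flat page table with one entry per possible VPN."""
    return (1 << vpn_bits) * entry_bytes

per_process = linear_table_bytes(36)
print(per_process // GIB)        # 512
print(100 * per_process // GIB)  # 51200
```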



## Multi-level page table

Represent a linear page table as a hierarchy of smaller page tables

Each level uses a portion of the virtual address to index into its table

- a) The system starts with the root page table.
- b) It uses the first part of the address to find an entry in this table.
- c) This entry points to a second-level table.
- d) The next part of the address is used to index into this second table.
- e) This process continues through all levels.
- f) The final level provides the actual physical page number.

This tree is sparse: only fill in the parts that are actually in use!

A virtual address is divided into several parts:

- Multiple segments (often 9 bits each) for indexing each level of tables
- A final segment (often 12 bits) for the offset within the physical page
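As a sketch of this division, the function below splits a virtual address into per-level indices and a page offset. The name `split_vaddr` and its default parameters (four levels, 9 index bits, 12 offset bits) are assumptions chosen to match the 48-bit layout discussed here.

```python
def split_vaddr(vaddr, levels=4, index_bits=9, offset_bits=12):
    """Return ([top-level index, ..., last-level index], page offset)."""
    offset = vaddr & ((1 << offset_bits) - 1)
    vpn = vaddr >> offset_bits
    indices = []
    for _ in range(levels):
        indices.append(vpn & ((1 << index_bits) - 1))  # lowest 9 bits first
        vpn >>= index_bits
    indices.reverse()  # put the root-level index first
    return indices, offset

print(split_vaddr(0x00402000))  # ([0, 0, 2, 2], 0)
```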



## Multi-level page table

Example: map 2 MB of physical memory at virtual addresses $0, \dots, 2^{21} - 1$.

Let's say we have 48 bits, and we divide the VPN into four 9-bit segments. Assuming each physical page is 4 KB, 2 MB is 512 physical pages.

**The Virtual Address Range**: We're mapping addresses from 0 to $2^{21} - 1$ (2 MB).

**48-bit Address Structure**: 9 bits, 9 bits, 9 bits, 9 bits, 12 bits. For the range 0 to $2^{21} - 1$, the binary representation looks like this (X is either 0 or 1):

| Level 1   | Level 2   | Level 3   | Level 4   | Page Offset  |
|-----------|-----------|-----------|-----------|--------------|
| 000000000 | 000000000 | 000000000 | XXXXXXXXX | XXXXXXXXXXXX |

- **Level 1 (Root):** The first 9 bits are always 000000000 for our entire range, so we only need one entry in the root table, pointing to the single Level 2 table we'll use.
- **Level 2:** The next 9 bits are also always 000000000 for our entire range. Again, we only need one entry, pointing to the single Level 3 table.
- **Level 3:** The next 9 bits are also always 000000000 for our entire range. Again, we only need one entry, pointing to the single Level 4 table.
- **Level 4:** The next 9 bits (XXXXXXXXX) can represent any value from 000000000 to 111111111. This gives us $2^9 = 512$ different combinations, which is why we need 512 entries in this level.
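The 2 MB example can be replayed with a toy sparse page table built from nested dictionaries. This is only a sketch: the class `PageTable`, the helper `indices_of`, and the choice of physical page numbers (`0x100 + page`) are all made up for illustration, and real page tables are fixed-size arrays in physical memory rather than Python dicts.

```python
INDEX_BITS = 9    # bits per level
OFFSET_BITS = 12  # 4 KB pages
LEVELS = 4

def indices_of(vaddr):
    """Per-level indices of a virtual address, root level first."""
    mask = (1 << INDEX_BITS) - 1
    return [(vaddr >> (OFFSET_BITS + INDEX_BITS * level)) & mask
            for level in range(LEVELS - 1, -1, -1)]

class PageTable:
    def __init__(self):
        self.root = {}   # each table is a dict: index -> next table (or PPN)
        self.tables = 1  # the root table always exists

    def map(self, vaddr, ppn):
        """Install a VPN -> PPN mapping, creating tables only when needed."""
        node = self.root
        idx = indices_of(vaddr)
        for i in idx[:-1]:
            if i not in node:
                node[i] = {}
                self.tables += 1
            node = node[i]
        node[idx[-1]] = ppn

    def walk(self, vaddr):
        """Follow one entry per level, then attach the page offset."""
        node = self.root
        for i in indices_of(vaddr):
            node = node[i]
        return (node << OFFSET_BITS) | (vaddr & ((1 << OFFSET_BITS) - 1))

pt = PageTable()
for page in range(512):                       # 512 pages * 4 KB = 2 MB
    pt.map(page << OFFSET_BITS, 0x100 + page)  # hypothetical physical pages

level4 = pt.root[0][0][0]
print(pt.tables)             # 4: one table per level
print(len(level4))           # 512 entries in the Level 4 table
print(hex(pt.walk(0x1234)))  # 0x101234
```

Only four small tables exist after the mapping, compared with the $2^{36}$ entries a flat table would need.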

### Alternatives and tradeoffs



- **Large page size:** wastes actual memory
- **Small page size:** lots of page table entries

### Many/few levels of mapping

- **Many levels of mapping:** less space spent on page structures when the address space is sparse, but more costly for hardware to walk the page table
- **Few levels of mapping:** need to allocate larger pages, which costs more space, but the hardware has fewer levels of mapping to walk





### Virtual Address

Bit patterns that are valid addresses are called **canonical addresses**
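For illustration, here is a check for canonical form under the usual x86-64 rule with 48-bit virtual addresses: bits 63 through 47 must all be equal (all zeros or all ones). That rule is an assumption brought in for this sketch, since the slide's figure is not reproduced here, and `is_canonical` is a made-up helper name.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of a 64-bit address are all 0 or all 1."""
    top = addr >> (va_bits - 1)              # the upper 64 - va_bits + 1 bits
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

print(is_canonical(0x00007FFFFFFFFFFF))  # True  (top of the lower half)
print(is_canonical(0xFFFF800000000000))  # True  (bottom of the upper half)
print(is_canonical(0x0000800000000000))  # False (bit 47 set, upper bits clear)
```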







### Physical Address

What happens if we only have 16 GB of memory?

(Roughly) only 34 bits matter! The top 18 bits will (generally) be zero.

We are mapping a 48-bit number to a 52-bit number, at a granularity of ranges of $2^{12}$ bytes.
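The 34-bit and 18-bit figures follow from the memory size. A minimal sketch, where `bits_needed` is a helper name introduced here and 52 is the physical address width stated above:

```python
GIB = 2**30
PHYS_ADDR_BITS = 52  # physical address width used in these slides

def bits_needed(num_bytes):
    """Address bits required to name every byte in num_bytes of memory."""
    return (num_bytes - 1).bit_length()

used = bits_needed(16 * GIB)
print(used)                   # 34
print(PHYS_ADDR_BITS - used)  # 18 top bits that are generally zero
```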