Class 15
CS 202
27 October 2025

On the board
------------

1. Last time

2. Virtual memory reinforcement

3. WeensyOS walkthrough

4. WeensyOS context switches

---------------------------------------------------------------------------

1. Last time

    Page faults

    admin notes
        - RS6

2. Virtual memory reinforcement

    Exam problem

3. WeensyOS walkthrough

    [draw picture of the software stack: two instances of virtualization]

    advice: start if you haven't!!!

    processes: files named p-*
    kernel code: files named k-*

    processes just allocate memory.
        system call: sys_page_alloc(), analogous to brk() or mmap() in
        POSIX systems.

    look at process.h for where the system call happens
    see exception_return() for where the return back into user space happens
        %rax holds the value that the application sees as the return value.

    - figures (the animated gifs) are from the 32-bit version of the lab,
      so you'll see some differences.

    - you'll use the virtual_memory_map() function
        pay attention to the "allocator" argument
        (and make sure your allocator initializes the new page table)

    - how many page tables are allocated for 3MB? what's the structure?

    - 3MB virtual address space, but the L4 page table that handles
      [2MB, 3MB) is allocated only on demand.
        - thus, make sure when calling virtual_memory_map that you're
          passing in a non-NULL allocator when you're supposed to.

    - process control block (PCB): this is the "struct proc" in kernel.h

    - recall:
        register %rax holds the system call return value
        register %rdi contains the system call argument

    - remember: bugs in earlier parts may show up only later

    - pageinfo array:

        typedef struct physical_pageinfo {
            int8_t owner;
            int8_t refcount;
        } physical_pageinfo;

        static physical_pageinfo pageinfo[PAGENUMBER(MEMSIZE_PHYSICAL)];

        one physical_pageinfo struct per _physical_ page.
    - x86_64_pagetable: array of 512 entries (each 8 bytes)

    - note: recall the picture from a prior handout, where in Linux, the
      kernel is mapped at the top of every user-level process's address
      space (which has lately been modified, to address Meltdown and
      Spectre). in lab 4, it doesn't work like that. in lab 4, the kernel
      has its own separate page table.

    ----

    [what's below here are detailed notes from a prior recitation on lab 4.]

    - Kernel virtual address
        Kernel is set up to use an identity mapping
        [0, MEMSIZE_PHYSICAL) -> [0, MEMSIZE_PHYSICAL)

    - Physical pages' metadata are recorded in the pageinfo array (of
      physical_pageinfo structs), whose elements contain refcount and owner
        owner can be kernel, reserved, free, or pid

    - Process control block:
        * Process registers, process state
        * Process page table: a pointer (kernel virtual address, which is
          the identical physical address) to an L1 page table.
          The L1 page table's first entry points to an L2 page table,
          and so on...

    Our job mainly consists of manipulating the page tables and the
    pageinfo array

    - High level evolution of the lab:

        We have five programs: kernel + 4 processes

        Ex1. all five programs share the same page table.
            all virtual addresses are mapped PTE_U | PTE_P | PTE_W
            Job: mark some of the addresses as (PTE_P | PTE_W), i.e. not
            user accessible

        Ex2. Each program uses its own page table.
            kernel already has a page table
            The job is to allocate and populate a page table for each
            process.
            The work breaks down into helper functions:
                1. allocate a new page for process pid, and zero it
                   (important) [use memset to zero-out]
                2. populate the new page table.
                   can memcpy the kernel's, but easier to copy the kernel's
                   mappings individually.
                   In order to achieve the screenshot, after memcpying, we
                   have to mark [prog_addr_start, virtual_addr_size) as
                   not-present.

        Ex3. Physical page allocation
            Motivation: Before this, during sys_page_alloc, when a process
            asks for a specific virtual page, the identity mapping is
            employed to find the physical page.
            But that is too restrictive: with virtual memory, the process
            does not really care which physical page it gets.

            If we have implemented function 1 mentioned in Ex2 (allocate a
            free page), then we are mostly good to go and can just use that
            function.
            We also need to connect virtual to physical by setting the
            corresponding page table entry. Use virtual_memory_map

        Ex4. Overlapping virtual addresses
            Motivation: Every process has its own page table & accessible
            virtual addresses (PTE_P portions), so we don't need to restrict
            processes to use different parts of virtual memory. They can
            overlap, as long as the physical pages backing them do not
            overlap.

            Easy to do: in process_setup, we use
            (MEMSIZE_VIRTUAL - PAGESIZE) instead of the current arithmetic
            to compute the process's stack page.

        Ex5. Fork
            High level goal: Produce a mostly identical process (minus the
            register %rax).

            What does it mean to be an identical process??
                1 same binary
                2 same process registers
                3 AND same memory state / contents

            3 basically covers 1 because the binary is loaded in memory too.
            2 is easy to achieve (copy the registers; can do this with a
            single line of C code)
            The goal here is mainly to achieve 3.

            Fork creates a copy: the memory state has to be a copy!

            Question: What does it mean to make a copy of memory?
                - Virtual pages are backed by physical pages, so we alloc
                  new physical pages and copy the content to the new pages
                  (memcpy)
                - Then connect virtual to physical by setting the page table

            The address space is potentially 256 TB large; do we copy 256 TB?
            How do we know which parts to copy?
                - Iterate over the virtual address space; find pages with
                  (PTE_P | PTE_U | PTE_W)

            Given a page table entry, how do you check if it is user
            RW-able? Fill in the blanks...
                pte_val _ (PTE_P | PTE_W | PTE_U) == ___

            How do you find its corresponding physical page?
                PTE_ADDR

    - Useful functions to implement for said manipulations:
        * find a PO_FREE physical page and assign it to a process
          (Useful for ex2, 3, 4, 5)
        * allocate empty page dir + page table for a process (Ex2, 4)
        * make a copy of existing page table and assign it to a process
          (Ex2, 5)
        * implement your own helper functions as you see fit

        Tip: Zero the allocated page before using it!! (memset)

    - Some useful functions/macros:
        PTE_ADDR   : page table entry -> physical address
        PAGENUMBER : a physical address -> corresponding index into the
                     pageinfo array
        PAGEADDR   : PAGENUMBER^{-1}
        virtual_memory_lookup(pagetable, va)

4. WeensyOS context switches

    - on interrupt, hardware saves "trapframe": %rip/%rsp/%rflags and
      lots of other things

        saves all of that on the *kernel's stack* at a well-known place in
        kernel memory

    - %rdi set equal to stack pointer

    - %rdi shows up, in C code, as the argument to exception()

    - exception() does a brute force struct copy into process-specific
      memory in kernel space

    - now all of the process's registers just live in the PCB / struct proc.

    - kernel does its thing.

    - kernel gets ready to choose another process

    - remember, that process had the same thing happen

    - so all of *the new process's* registers are sitting in the same kind
      of memory mentioned above.

    - now, exception_return(&p->p_registers)

        - note: %rdi holds address of saved registers
        - set the stack pointer equal to that address
        - that means that popq will do the "right" thing
        - pop the saved registers into the CPU's
        - add 16 to the stack pointer to skip past the saved codes (error
          code if pg fault handler and int code in all cases).
        - now stack pointer is pointing to the trapframe at the time of
          the trap
        - that trapframe includes %rip and trap-time %rsp
        - iretq brings us back into user space with the %rip, %rsp at the
          time of the trap