Class 15
CS 439
5 March 2013

On the board
------------
1. Last time
2. Page faults, continued
3. Other page structures
4. Page replacement policies

---------------------------------------------------------------------------

1. Last time

--more paging!
--JOS memory map
--started on page faults

2B. Page faults: uses

--exhibit A for the use of paging is virtual memory:

    --your program thinks it has, say, 512 MB of memory, but your
      hardware has only 4 MB of memory

    --the way this works: the disk is used to store the memory pages
      that don't fit in physical memory

    --advantage: the address space looks huge

    --disadvantage: accesses to "paged" memory (as memory pages that
      live on the disk are known) are sllooooowwwww

    --the implementation of this is roughly:

        --on a page fault, the kernel reads in the faulting page

        --QUESTION: what is listed in the page structures? how does
          the kernel know whether an address is invalid, in memory,
          paged out to disk, etc.?

    --called demand paging, and it's one way to get program code into
      memory "lazily"

    --the kernel may need to send a page to disk (under what
      conditions? answer: two conditions must hold for the kernel to
      HAVE to write to disk):

        (1) the kernel is out of memory, and
        (2) the page that it selects to write out is dirty

    --Many 32-bit machines have 4GB of memory, so it's less common to
      hear the sound of swapping these days. You either need 36-bit
      addressing and memory hogs, or multiple large memory consumers
      running on the same computer.

--many, many other uses for page faults and virtual memory

    --high-level idea: by giving the kernel (or even a user-level
      program) the opportunity to do interesting things on page
      faults, you can build interesting functionality:

    --store memory pages across the network! (Distributed Shared
      Memory)

        --basic idea was that on a page fault, the page fault handler
          went and retrieved the needed page from some other machine

    --copy-on-write

        --when creating a copy of another process, don't copy its
          memory; just copy its page tables, and mark the pages
          read-only

        --QUESTION: do you need to mark the parent's pages as
          read-only as well?

        --program semantics aren't violated when programs do reads

        --when a write happens, a page fault results. at that point,
          the kernel allocates a new page, copies the memory over, and
          restarts the user program so that the write can proceed

        --thus, memory is copied only when a fault results from a
          write

        --this idea is all over the place

    --accounting

        --good way to sample what percentage of memory pages are
          written to in any time slice: mark a fraction of them not
          present, and see how often you get faults

    --if you are interested in this, check out the paper "Virtual
      Memory Primitives for User Programs", by Andrew W. Appel and
      Kai Li, Proc. ASPLOS, 1991.

--Paging in day-to-day use

    --Demand paging
    --Growing the stack
    --BSS page allocation
    --Shared text
    --Shared libraries
    --Shared memory
    --Copy-on-write (fork, mmap, etc.)

2C. Page faults: costs

--What does demand paging (i.e., paging from the disk) cost?

    --let's look at average memory access time (AMAT):

        AMAT = (1-p)*(memory access time) + p*(page fault service time)

      where p is the probability of a page fault, and

        memory access time ~ 100 ns
        disk access time   ~ 10 ms = 10^7 ns

    --QUESTION: what does p need to be to ensure that paging hurts
      performance by less than 10%?

        1.1*t_M = (1-p)*t_M + p*t_D
        p = 0.1*t_M / (t_D - t_M) ~ 10^1 / 10^7 = 10^{-6}

      so only one access out of 1,000,000 can be a page fault!!
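    --to make the arithmetic concrete, here is the same calculation as
      a small Python sketch (the names t_M, t_D, and amat are ours,
      for illustration; the numbers are the rough estimates above, not
      measurements):

        # AMAT back-of-the-envelope; all times in nanoseconds
        t_M = 100          # memory access time: ~100 ns
        t_D = 10 * 10**6   # page fault service (disk) time: ~10 ms

        def amat(p):
            """average memory access time, given page fault probability p"""
            return (1 - p) * t_M + p * t_D

        # largest p for which paging hurts by less than 10%:
        # solve 1.1*t_M = (1-p)*t_M + p*t_D for p
        p_max = 0.1 * t_M / (t_D - t_M)
        print(p_max)               # ~1e-06: one fault per million references
        print(amat(p_max) / t_M)   # ~1.1, i.e., the 10% slowdown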
--basically, page faults are super-expensive (good thing the machine
  can do other things during a page fault)

--Thrashing is even worse: memory is overcommitted, so pages get
  tossed out while they are still needed

  Example:

    --one program touches 50 pages (each equally likely); we have only
      40 physical page frames

    --if we have enough pages: 100 ns/ref

    --if we have too few pages, assume every 5th reference leads to a
      page fault

        --4 refs x 100 ns, plus 1 page fault x 10 ms for the disk I/O

        --this gets us 5 refs per (10 ms + 400 ns), i.e., roughly
          2 ms/ref: a 20,000x slowdown!!!

    --What we wanted: virtual memory the size of the disk, with access
      time the speed of physical memory

    --What we have here: memory with access time roughly that of the
      disk (2 ms/mem_ref, compared to 10 ms/disk_access)

  The concept is much larger than OSes: you need to pay attention to
  the slow case if it's really slow and common enough to matter.

3. Other page structures

A. Very large page sizes (e.g., 4 MB)

    --advantage: small page tables
    --disadvantage: lots of wasted memory
    --PSE (page size extension): set bit 7 in the PDE, and you get 4MB
      pages with no page tables
    --**there is a trade-off between large page sizes and small page
      sizes**. what is the nature of the trade-off?
        --large page sizes mean wasting actual memory
        --small page sizes mean lots of page table entries (which may
          or may not get consumed)

B. Many levels of page table

    --advantage: not much memory is spent on page tables if the
      address space is sparse
    --disadvantage: lots of page table walking

C. What happens when memory gets huge?

    --many levels of page table; or
    --inverted page table
        --works as a hash table: hash the virtual page number to find
          the entry
        --stores one entry per physical frame, rather than one per
          virtual page

---------------------------------------------------------------------------

4. Replacement policies

--this topic is related to the previous one but is also more general
  than the paging context

--the fundamental problem/question:

    --some entity holds a cache of entries and gets a cache miss. The
      entity now needs to decide which entry to throw away. How does
      it decide?

--make sure you understand why page faults that result from
  "page-not-present in memory" are a particular kind of cache miss

    --(the answer is that in the world of virtual memory, the pages
      resident in memory are basically a cache for the backing store
      on the disk; make sure you see why this claim, about virtual
      memory vis-a-vis the disk, is true.)

--the system needs to decide which entry to throw away, which calls
  for a *replacement policy*

--so let's cover some policies [put these on the board in one place]

* FIFO: throw out the oldest page. (This results in every page
  spending the same number of references in memory. Not a good idea:
  pages are not accessed uniformly.)

--optimal:

* MIN (also known as OPT): throw away the entry that won't be used for
  the longest time. Our textbook and other references assert that it
  is optimal, but they do not prove it. It's a good idea to get in the
  habit of convincing yourselves of (or disproving) assertions. Here's
  a proof, under the assumption that the cache is always full:

    Choose any other scheme; call it ALT. Now let's count the number
    of misses under ALT and OPT, inducting over the number of
    references. There are four cases at any given reference:
    {OPT hits, ALT hits}, {OPT hits, ALT misses}, {OPT misses, ALT
    misses}, {OPT misses, ALT hits}. The only interesting case is the
    last one (in the other cases, OPT does as well as or better than
    ALT, so OPT keeps pace with, or beats, the competition at every
    reference). Say that the last case happens at a reference, r.

    By the induction hypothesis, OPT was optimal right up until the
    *last* miss that OPT experienced, at reference, say, r-a. After
    that reference, there has been only one miss (the current one, at
    r). ALT couldn't have done better than OPT up until r-a (by the
    induction hypothesis), and since r-a, OPT has had only that one
    miss. But ALT cannot have had zero misses between r-a and r: if it
    had, that would mean that OPT replaced the wrong entry at r-a
    (another way to say the same thing: OPT is chosen so that a is
    maximal). Thus, OPT is no worse than ALT at r. In the remaining
    cases, OPT is as good as or better than ALT in terms of
    contributing to the number of misses. So by induction, OPT is
    optimal.

--evaluating these policies:

    input:
        --reference string: a sequence of page accesses
        --cache (e.g., physical memory) size
    output:
        --number of cache evictions (e.g., number of swaps)

--examples......

    --time goes left to right
    --cache hit = h

    ------------------------------------
    FIFO

    phys_slot   A   B   C   A   B   D   A   D   B   C   B

    S1          A           h       D       h       C
    S2              B           h       A
    S3                  C                       B       h

    7 swaps, 4 hits

    ------------------------------------
    OPTIMAL

    phys_slot   A   B   C   A   B   D   A   D   B   C   B

    S1          A           h           h           C
    S2              B           h               h       h
    S3                  C           D       h

    5 swaps, 6 hits

    ------------------------------------

* LRU: throw out the least recently used page. (This is often a good
  idea, but it depends on the future looking like the past. What if we
  chuck a page from our cache and then are just about to use it?)

    LRU

    phys_slot   A   B   C   A   B   D   A   D   B   C   B

    S1          A           h           h           C
    S2              B           h               h       h
    S3                  C           D       h

    5 swaps, 6 hits

--LRU looks awesome!
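--(aside: these tables are easy to check mechanically. Here is a small
  Python sketch, with function and variable names of our choosing,
  that replays a reference string under each policy and counts swaps;
  on the string above it prints 7 swaps/4 hits for FIFO and 5 swaps/6
  hits for OPT and LRU:)

    def count_swaps(policy, refs, nslots):
        """replay refs against a cache of nslots slots; return # of misses"""
        cache = []                             # resident pages
        misses = 0
        for i, page in enumerate(refs):
            if page in cache:
                if policy == "LRU":            # a hit updates recency
                    cache.remove(page)
                    cache.append(page)
                continue
            misses += 1
            if len(cache) == nslots:           # cache full: must evict
                if policy in ("FIFO", "LRU"):
                    victim = cache[0]          # oldest load / least recent use
                else:                          # OPT: evict furthest future use
                    future = refs[i+1:]
                    victim = max(cache, key=lambda q: future.index(q)
                                 if q in future else len(future))
                cache.remove(victim)
            cache.append(page)                 # newest at the tail
        return misses

    refs = list("ABCABDADBCB")
    for policy in ("FIFO", "OPT", "LRU"):
        swaps = count_swaps(policy, refs, 3)
        print(policy, swaps, "swaps,", len(refs) - swaps, "hits")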
--but what if our reference string were ABCDABCDABCD?

    LRU

    phys_slot   A   B   C   D   A   B   C   D   A   B   C   D

    S1          A           D           C           B
    S2              B           A           D           C
    S3                  C           B           A           D

    12 swaps, 0 hits. BUMMER.

--same thing happens with FIFO
--what about OPT? [not as much of a bummer at all.]

--other weirdness: Belady's anomaly: what happens if you add memory
  under a FIFO policy?

    phys_slot   A   B   C   D   A   B   E   A   B   C   D   E

    S1          A           D           E                   h
    S2              B           A           h       C
    S3                  C           B           h       D

    9 swaps, 3 hits. not great. let's add some slots; maybe we can do
    better:

    phys_slot   A   B   C   D   A   B   E   A   B   C   D   E

    S1          A               h       E               D
    S2              B               h       A               E
    S3                  C                       B
    S4                      D                       C

    10 swaps, 2 hits. this is worse.

--do these anomalies always happen?

    --answer: no. with policies like LRU, the contents of a memory of
      X pages are a subset of the contents of a memory of X+1 pages,
      so adding memory can never increase the number of misses

--all things considered, LRU is pretty good. let's try to implement
  it......

--implementing LRU

    --reasonable to do in application programs like Web servers that
      cache pages (or in dedicated Web caches). [use a queue to track
      the least recently accessed entry, and use a hash map to
      implement the (k,v) lookup; see the sketch at the end of these
      notes]

    --in the OS, LRU itself does not sound great: it would double
      memory traffic (after every reference, we'd have to move some
      structure to the head of some list)

    --and in hardware, it's way too much work to timestamp each
      reference and keep the list ordered (remember that the TLB may
      also be implementing these solutions)

--how can we approximate LRU?
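--(and, as promised above, here is the queue + hash map idea for
  application-level LRU, sketched in Python. OrderedDict supplies both
  pieces at once: hash-map lookup plus a linked list kept in order,
  which we maintain as recency order. The class name and interface are
  ours, for illustration only:)

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()   # key -> value, least recent first

        def get(self, key):
            if key not in self.entries:
                return None                # miss: caller fetches the entry
            self.entries.move_to_end(key)  # record the use: now most recent
            return self.entries[key]

        def put(self, key, value):
            if key in self.entries:
                self.entries.move_to_end(key)
            elif len(self.entries) == self.capacity:
                self.entries.popitem(last=False)   # evict the LRU entry
            self.entries[key] = value

    # replaying the reference string from the examples above:
    cache = LRUCache(3)
    for page in "ABCABDADBCB":
        if cache.get(page) is None:        # miss (a "page fault")
            cache.put(page, "contents of " + page)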