Class 26
CS 372H
26 April 2012

On the board
------------

1. Last time
2. Finish VMWare discussion
3. Stack smashing
4. Unix security model

---------------------------------------------------------------------------

1. Last time

    --virtual machines; history of VMWare
    --discussion of VMWare ESX paper

2. Finish VMWare discussion

    I. [last time] technique: ballooning

    II. [last time] technique: content-based page sharing

    III. technique: share-based allocation, with a tax

    --Basic idea: give resource rights based on *shares*, S_1, ..., S_n
      [same author as lottery scheduling!]

    --The VM selected to relinquish memory should be the one with the
      fewest shares per allocated page, i.e., the lowest ratio S_i / P_i.
      That's the VM (guest OS) that is "paying" the least per page it
      holds.

    --examples: if A and B each have S=1, reclaim from the larger user
      (whoever holds more pages). If A has twice as many shares as B,
      then A can use twice as much memory.

    --Problem: what if a VM has tons of shares but isn't using its
      memory? We don't want to reclaim pages from the other VMs.

    --Solution: tax the idle pages.

      tax arithmetic: if my income tax rate is T, then when I earn $1,
      I pay T*$1 in taxes. Thus:
        -- $1 gross = $(1-T) take-home
        -- $1/(1-T) gross = $1 take-home
        -- to take home a dollar, I need to gross k = 1/(1-T) dollars

      idea: tax idle memory. Pages that are being used are "tax
      deductible"; if you're not using a page, you pay a fraction T of
      it back to the system (it's not fully "yours"). So each idle page
      costs, in shares, k times the price of a non-idle page.

      Consider the number of shares per post-tax dollar/page:

        rho = S / [(# active) + k*(# idle)] = S / [P * (f + k*(1-f))]

      where k = 1/(1-T) is the "idle page cost" and f is the fraction
      of the VM's pages that are active. Reclaim from the VM with the
      smallest rho. [A small sketch of this computation appears at the
      end of this section.]

    --ASK: how do they measure non-idle memory (f)?

      Statistical sampling: pick n pages at random, invalidate them,
      and see whether they get accessed. If t of the n pages have been
      touched by the end of the period, estimate usage as f = t/n.

      How expensive is this? <= 100 page faults over 30 seconds:
      negligible. Ridiculously easy.

    --ASK: why do they keep three moving averages? What do they keep
      three moving averages of?

        --> a slow exponentially weighted moving average of t/n over
            many periods
        --> a faster weighted average that adapts more quickly
        --> a version of the faster average that incorporates samples
            from the current period

      They use the max of the three. Why?

      Basic idea: respond rapidly to increases in memory usage and only
      gradually to decreases in memory usage.
        --When in doubt, respect priorities (so give credit for having
          had a high estimate of non-idle pages in the past).
        --A spike in usage likely means the VM has "woken up".
        --A small pause in usage doesn't necessarily mean the pause
          will continue.

    --ASK: how do they use the estimate?

    --ASK: how well does this do? [answer: figure 6 (p. 9)]

    big picture:
        estimate (5.3)
          --> share-based allocation, based on the tax, reclaiming from
              the VM with the smallest shares-per-page (5.2)
          --> ballooning (3.2) or paging (3.3) to decide which pages to
              take

    commentary: very nice design, in part because it has very few
    parameters: min, max, S [per VM]; tau [system-wide]
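    [Aside: a minimal sketch, in C, of the reclaim decision above. This
    is not VMware's code; the struct, the numbers, and the tax rate are
    made-up illustrations. The point is just to show how a sampled
    estimate of the active fraction f (the t/n above) plugs into the
    tax-adjusted shares-per-page ratio rho, and how the VM with the
    smallest rho is the one asked to give up memory.]

        /* sketch: picking a victim VM under share-based allocation with
           an idle-memory tax. Illustrative only. */

        #include <stdio.h>

        struct vm {
            const char *name;
            double shares;   /* S: the VM's memory shares                 */
            double pages;    /* P: pages currently allocated to the VM    */
            double f;        /* estimated fraction of active pages, e.g.,
                                from the sampling estimate f = t/n        */
        };

        /* shares per "post-tax" page: rho = S / (P * (f + k*(1-f))),
           where k = 1/(1-T) is the idle page cost for tax rate T. */
        static double rho(const struct vm *v, double tax_rate)
        {
            double k = 1.0 / (1.0 - tax_rate);
            return v->shares / (v->pages * (v->f + k * (1.0 - v->f)));
        }

        int main(void)
        {
            /* made-up numbers: A has more shares than B but is mostly
               idle; B is actively using almost all of its memory.      */
            struct vm vms[] = {
                { "A", 2000.0, 1000.0, 0.10 },
                { "B", 1000.0, 1000.0, 0.90 },
            };
            double tax_rate = 0.75;   /* example value of T */

            /* reclaim from the VM with the smallest shares-per-page rho */
            struct vm *victim = &vms[0];
            for (int i = 0; i < 2; i++) {
                printf("rho(%s) = %f\n", vms[i].name, rho(&vms[i], tax_rate));
                if (rho(&vms[i], tax_rate) < rho(victim, tax_rate))
                    victim = &vms[i];
            }
            printf("reclaim from %s\n", victim->name);
            return 0;
        }

    With these made-up numbers, A loses pages even though it has more
    shares, because most of its pages are idle and therefore taxed. With
    no tax (k = 1), B would have been the victim despite actively using
    its memory, which is exactly the problem the tax is meant to fix.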
3. Stack smashing

    --history

    --('buffer overflow' is one way to conduct a stack smashing attack.)

    --note how the exploit works:
        --a primitive form of linking, performed at exploit time!
        --relies on the fork/exec separation

    --demo

      [NOTE: the fork/exec separation is what allows us to write
      tcpserve: after the fork() but before the exec() of buggy-server,
      the child rearranges its file descriptors to be the socket
      itself. Also, this sample code gives you a chance to see sockets
      in action.]

        --UTCS host runs the server, as Jason
        --my laptop runs an honest client
        --my laptop runs a dishonest client

      [A minimal sketch of the kind of vulnerable code this demo
      exploits appears at the end of these notes.]

    --note: if this server had been running as root, we'd have been
      able to get a root shell

    --and if the user/syscall interface doesn't check its arguments
      properly, that interface can be buffer overflowed too

    --in practice, once you have a user account on a machine, it's
      often possible to get root access (why? because the syscall
      interface is really hard to secure, as a matter of practice)

    --other versions of these attacks:
        --return-to-libc (see Tanenbaum) [DRAW PICTURE]
        --return-oriented programming
        --overwriting function pointers
        --smashing the heap

    --how do people defend against these things?

        --W ^ X (map the stack pages as non-executable, if the hardware
          allows it). But there are some issues....

            --The original 386 did not allow it with page tables.
              However, all x86 chips that support extended page tables
              (which are used to help users get at >4GB of physical
              memory even if the machine is 32 bits) also support an XD
              bit in those page tables, which means "don't execute code
              in this page". We haven't worked with this bit in this
              class, but the architecture on modern 32-bit x86 supports
              it.

            --Even on x86s that don't support extended page tables,
              segmentation would help with do-not-execute (since the
              permissions in the segment descriptor can express this).
              The disadvantage here is that the compiler needs to lay
              out the code and stack to match what the segments would
              require.

            --The bummer with W ^ X, even when it *is* supported, is
              this: some languages not only don't need it but are
              actively harmed by it. The core of the issue is that a
              program written in a safe language (Perl, Python, Java,
              etc.) does not need W ^ X, whereas lots of C programs do.
              Meanwhile, some machines *always* enforce W ^ X, even for
              programs that do not need it. Such enforcement constrains
              certain languages, namely those that need to do runtime
              code generation. This is related to the topic of binary
              translation (recall the guest lecture).

        --Address space randomization. This provides some help but
          obviously doesn't help our vulnerable server, because our
          server tells the client where the buffer is.

        --StackGuard (in gcc).

        --another defense: don't use C! CPUs are so fast that a
          language with bounds checking probably isn't going to pay a
          huge performance penalty relative to one without bounds
          checks.

    --unfortunately, this is an arms race, and each time a new defense
      arises, a new attack arises too. Here's the most advanced current
      technique, and it defeats many of the above defenses:

        --smash the stack with a bunch of return addresses. Each return
          address points to a needed instruction followed by "ret"
          (this requires the attacker to have previously identified
          these instruction sequences in the code). That's not too hard
          in CISC code like x86, where lots of useful byte sequences
          are embedded in the binary, even sequences the programmer
          didn't mean to emit (because instructions are not fixed
          length). Result: the control flow bounces around all of these
          byte sequences in memory, executing exactly what the attacker
          wanted, but never executing off of the stack.

        --this is called "return-oriented programming". Defending
          against it is hard (though if people use only safe languages,
          that is, languages that do bounds checking and other pointer
          checks, such attacks become much, much harder).

    --Question: can we instead confine processes and users so that when
      they're broken into, the damage is limited?
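    [Referenced from the demo above: a minimal sketch of the kind of bug
    that stack smashing exploits. This is not the actual tcpserve /
    buggy-server code from the demo; it just shows the classic pattern:
    a fixed-size buffer on the stack plus an unchecked copy, so a long
    enough input runs past the buffer and overwrites the saved frame
    pointer and return address.]

        /* sketch of a stack-smashable function (NOT the demo's code) */

        #include <stdio.h>
        #include <string.h>

        void handle_request(const char *input)
        {
            char buf[64];        /* fixed-size buffer on the stack      */
            strcpy(buf, input);  /* no length check: input longer than
                                    64 bytes overwrites what follows
                                    buf on the stack, including the
                                    saved return address                */
            printf("got: %s\n", buf);
        }

        int main(int argc, char **argv)
        {
            if (argc > 1)
                handle_request(argv[1]);   /* attacker controls argv[1] */
            return 0;
        }

    In the demo, the attacker's input is crafted so that the overwritten
    return address transfers control to code of the attacker's choosing.
    The defenses above (W ^ X, address space randomization, StackGuard)
    each break a different link in that chain, and return-oriented
    programming is the attackers' answer to W ^ X.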