Class 6 (Make-up)
CS 480-008
12 February 2016

On the board
------------

1. Last time
2. Heap smashing
3. BROP
4. Defenses against BROP

---------------------------------------------------------------------------

1. Last time

    Buffer overflow: defenses and attacks
        canaries
        NX stack, W^X
        ASLR

    Exercise: go through each attack and think through the threat model
    (for example, does the attacker have access to the source code? to
    the binary? is the attacker concerned with making few tries? etc.)

    Canaries can be defeated by:
        --overwriting something other than the return address
            --function pointer
            --global variable
        --heap smashing (see below), as a way of indirectly writing to
          the return address on the stack
        --stack reading, as a way of *directly* fooling the
          canary-checking code (again, see below)

    Context:
        --Defenses are still not perfect, but the skill and resources
          required of an attacker have gone way up.
        --Bugs/vulnerabilities are probably inevitable. So we want to
          structure applications to limit damage. This will be our
          concern in privilege separation and lab 3.

2. Heap-smashing

    --say we have an overflow of heap-allocated buffer. example:

        foo() {
            char *p = malloc(16);
            gets(p);
        }

      Can attacker predict what is after p in memory?

    --simplified version of a real attack:

        free list: array in memory, with doubly-linked list structure

            prev next data .... ....  prev next data .... ....  prev next data

        So: if the attacker overflows a malloc()ed block, the attacker
        can modify the next and prev pointers in the next block.

        Meanwhile, here's an excerpt from malloc()'s logic when it
        allocates a block:

            b = choose a free block
            b->next->prev = b->prev;
            b->prev->next = b->next;

        So: suppose attacker writes x y to start of next block.
        Call next block b. So now:

            b->prev = x
            b->next = y

        Assume b is chosen by next malloc().
        When b->next->prev = b->prev executes, this is effectively:

            *y = x

        [because prev is the first item in the struct, so
            (b->next)->prev <-- foo
         becomes
            *(b->next + 0) <-- foo]

        b->prev->next = b->next becomes:

            *(x + 4) = y

        [the + 4 is a byte offset: next is the second item in the
         struct, assuming 4-byte pointers]

        This means that an attacker-chosen value can be written to any
        memory location!

        If the attacker can guess the address of the saved return PC, and
        can guess the address of the buffer being overflowed, the attacker
        can load instructions into the buffer and cause the PC to point to
        the injected instructions.

        Similarly for *any* function pointer with a predictable address.

        Q: how could the attacker predict such addresses?

        Real attacks have to be more complex.

        For details see:
            http://www.win.tue.nl/~aeb/linux/hh/hh-11.html
            http://phrack.org/issues/57/8.html
            http://phrack.org/issues/57/9.html#article

3. BROP

    Many of the attacks that we have seen require the attacker to have
    access to the source code and, ideally, the binary. Meanwhile, the
    binary varies a lot, depending on compiler, OS, etc.

    What if the attacker doesn't have access to the binary?

    ASK: what's the threat model in this paper?

        (attacker has network access to a server.
         attacker can tell whether a server crashes.
         server restarts on crash but DOES NOT rerandomize its canaries
         and ASLR on crash
         server can be compiled with PIE (which makes the attacker's life
         harder, because it means that ASLR is applied to the entire
         address space)
        )

        targets servers with stack buffer overflow (as opposed to heap
        buffer overflow, as above)

        (Figure 4 is a nice summary of how the authors extend the state
        of the art)

    Game board:

        Attacker does the following:

        (A) exploit the stack buffer overflow vulnerability to mount a
            stack reading attack, to defeat canaries and ASLR
        (B) identify gadgets, without seeing the executable
        (C) exploit that same vulnerability to construct a ROP chain to
            write the binary itself over the socket, possibly in small
            pieces
        (D) on the attacker's computer, analyze the binary to scan for
            more gadgets and get complete information about the
            randomization
        (E) exploit the vulnerability yet again, this time with a ROP
            chain that spawns a shell

        The paper's contribution and focus is (A)-(C); steps (D) and (E)
        rely on known techniques. We will cover (A)-(C) below.

    A. Stack reading

        --How does this work?

            unsigned char canary[8];
            for (int i = 1; i <= 8; i++) {      // For each canary byte . . .
                // c is an int: a char counter could never reach 256,
                // so the loop would not terminate.
                for (int c = 0; c < 256; c++) { // . . . guess the value.
                    canary[i-1] = (unsigned char)c;
                    int server_crashed = try_i_byte_overflow(i, canary);
                    if (!server_crashed) {
                        // We've discovered the i-th byte of
                        // the canary!
                        break;
                    }
                }
            }

          (This works because of the assumption that the server is not
          rerandomizing after crashes.)

          How many tries does it take to guess the canary?

              128 on average for the inner loop.
              8 for the outer loop (on a 64-bit system, the canary is
                8 bytes)
              ----
              1024 on average.

          This isn't many. And it's much faster than "brute force", which
          would mean 2^{27} guesses on average (because there are 28
          "free" bits; see Table I).

        --This technique was known. The authors' contribution in this
          piece is extending it to read out the frame pointer and saved
          return address.
          They do this by writing bytes into the return address until the
          program does not crash. Once they have done that, they know an
          address where code lives, and they have partially defeated ASLR.

          Because the top two bytes of code addresses are always zero and
          because "the third byte is 0x7f for libraries and the stack,"
          they only need to try this many times:

              128 on average for the inner loop
              5 for the outer loop (only 5 vary)
              ---
              640 on average

        --Note that they also want to read the frame pointer. How can they
          read it, and why do they want to?

          The how: Because if the frame pointer isn't set correctly, the
          next stack frame (the higher one) will be set up wrong.

          The why: Because knowing the saved frame pointer gives them
          information, and helps validate that the attack is working

        --If stack reading fails, i.e., if the program crashes for *every*
          byte value tried (or keeps going for more than one value in the
          canary-reading phase), then the attacker gives up: this isn't a
          stack buffer overflow vulnerability.

    B. BROP

        Once they know approximately where code lives, they go in search
        of gadgets...

        Step 1: find a stop gadget

            --A stop gadget is a return address that points to code that
              will hang the program (or more generally, have some
              identifiable effect), but not crash it (which is also
              identifiable, because the remote OS closes the socket)

            --Once the attacker can defeat canaries, the attacker can
              overwrite the function's return address and start guessing
              locations for a stop gadget. If the client network
              connection suddenly closes, the guessed address was not a
              stop gadget. If the connection stays open, the gadget is a
              stop gadget.

            [Q: why doesn't hanging the server stop the attack?
             A: --maybe there is a daemon that checks for liveness and
                  restarts the app if it's not doing anything useful
                --or maybe server blocks on accept, and creates new
                  thread for each newly returned socket from accept (if
                  the attacker suspects this is the case, would want to
                  crash the process in a second connection, to prevent
                  the case that there is a limit on the number of
                  threads)
                --or stop gadget doesn't have to literally "stop"; just
                  has to be something that can be detected (see last
                  paragraph of VIII-B; page 6, and VIII-J)
            ]

        Step 2: find gadgets that pop stack entries

            --start guessing addresses for such gadgets

            --place the guessed address (g1) in the location of the
              return address, followed by one or more stop gadgets

            --if the program crashes, the guessed address was no good; if
              the program stops, the guessed address was a gadget

            --this technique lets them figure out *how many* entries are
              popped by the gadget (by fiddling with the locations of the
              stop/crash gadgets; see VIII-C)

            --Example: say there's a gadget at address 0x400000 (=probe)
              that does

                  pop rdi; ret

              if the attacker notices that results in a stop, then the
              attacker knows that its gadget does not pop the stack.

              if the attacker notices that .... results in a stop, then
              the attacker knows that the gadget pops one element from
              the stack.

              so the attacker knows that probe is the address of a gadget
              with the form:

                  pop REG; ret

              but the attacker does not know the value of REG

        Step 3: Figure out exactly what the gadgets do

            Two options
                (a) "first principles"
                (b) optimized version that looks for the BROP gadget

            For (b), the attacker can locate the gadget based on the
            probe address not leading to a crash if the stack is
            arranged as:

                ... ....
                [7 stops]

            Once the attacker has the address of the BROP gadget, he or
            she can control %rdi and %rsi, which are the first two
            arguments to a system call (see part C below)

            For (a), it's even more devious:

                build mega-gadget that chains together all pop
                instructions, hoping that one of them pops into %rax.
                Give each gadget an "argument" that is the pause syscall
                number. Then place the guessed address of syscall().

                If there is a pause, the attacker now has the address of
                syscall()

                Once the attacker has *that* address, it tries each of
                its pop gadgets one-by-one, loading the number of the
                pause syscall into a (currently unknown) register, and
                then following that on the stack with the address of
                syscall().

                If there is a pause, then the attacker knows that the
                given gadget-under-test is the one that pops into %rax
                (since %rax holds the syscall number).

                There are related tricks for learning what other gadgets
                do.

    C. Invoke write()

        See VIII-A: they need 5 gadgets (4, under the "call write"
        optimization).

        They need:

            pop rdi; ret    (socket)
            pop rsi; ret    (buffer)
            pop rdx; ret    (length)
            pop rax; ret    (write syscall number)
            syscall

        or

            pop rdi; ret    (socket)
            pop rsi; ret    (buffer)
            pop rdx; ret    (length)
            call write

        Part B above told us how to identify the first three gadgets.

        Finding "pop rdx; ret" is difficult, so the authors use yet
        another trick: treat all of strcmp as a gadget (!!!)

        They also need to guess the socket number, but that's fairly easy
        to do, since Linux restricts processes to 1024 simultaneously open
        file descriptors, and new file descriptors have to be the lowest
        one available (so guessing a small file descriptor works well in
        practice).

        To test whether we've guessed the correct file descriptor, simply
        try the write and see if we receive anything!

        Once there's a ROP chain that invokes write() with the correct
        arguments, the attacker starts getting the binary over the socket
        (the 'buffer' is just the address of the program's .text segment,
        which by now the attacker has learned).

    D. Some loose ends:

        What's going on in Figure 13? How does the pointer arithmetic
        allow them to bypass the canary?

        (choose randomLen > RAN_LEN; then the read starts at a negative
        offset...and whacks the return address in the stack frame of
        input.read)

4. Defenses against BROP

    --Rerandomize after crash!

      Note that Windows is less vulnerable to this attack because it has
      no fork() call, and hence rerandomizes after crashes.

      But even Windows rerandomizes *system* libraries only when the
      computer boots. That creates attack surface.

    --Or even better: generate new canary randomly before entering
      functions.

      Still, if the attacker can circumvent the canary (see the attack on
      yaSSL; Figure 13), then randomizing the canary won't be effective.
      (But randomizing the address space still will be helpful.)

    --After a crash, delay the fork.
        +: slows down attacker
        -: attacker can now conduct denial-of-service (DoS)

    --Extreme version of that: after a crash, don't restart at all!
        +: defeats BROP
        -: DoS again

--------------------

More info on ROP and x86 calling conventions:

    http://codearcana.com/posts/2013/05/21/a-brief-introduction-to-x86-calling-conventions.html
    http://codearcana.com/posts/2013/05/28/introduction-to-return-oriented-programming-rop.html
    http://www.slideshare.net/saumilshah/dive-into-rop-a-quick-introduction-to-return-oriented-programming
    https://cseweb.ucsd.edu/~hovav/dist/rop.pdf

--------------------

Acknowledgment: MIT 6.858 staff (for some of these notes and refs)