Class 5
CS 480-008
11 February 2016

On the board
------------

1. Last time

2. Aleph One article

3. admin notes

4. Defenses and attacks
    --canaries
    --NX stack, W^X
    --ASLR

---------------------------------------------------------------------------

1. Last time

    x86 architecture and assembly
    stack frames
    buffer overflows

    What happens if stack grows up, instead of down?

        Stack frame for read_req() has buf[] at highest address, so
        won't overflow onto read_req()'s return address.

        BUT: look at the stack frame for gets(). (gets()'s frame sits
        above buf[], so the overflow can reach gets()'s own return
        address.)

2. Discuss Aleph One article

    Concepts/techniques:

    [You need to use some, but not necessarily all, of these
    concepts/techniques to do the lab]

    Concerning the shellcode itself:

        don't need to know where "/bin/sh" lives (even though the precise
        location has to be passed). use the jump+call+pop trick.
        [see shellcode.S, from handout last time]

        eliminate 0 bytes (by recoding the instructions that would
        contain them)
            (you may also need to avoid 10 and 13 in a byte, since those
            are ASCII '\n' and '\r'; parsing/copying/processing code
            often treats those values as delimiters.)
        [see shellcode.S]

    Concerning the exploit payload (shellcode + other stuff):

        don't need to know exact location of return address: blast the
        stack with repeated copies of the new return address

        don't need to know exact location of buffer: use NOP slide
        technique

    Conducting the exploit:

        Aleph One is running the vulnerable process as a child of the
        exploit process. In this way, the vulnerable process inherits
        the environment. [you will use the network to pass the exploit]

        Aleph One uses this inheritance of the environment to take
        advantage of a buffer that is too small for the shellcode:

            EGG holds NOP slide + shellcode

            RET holds lots of repeated copies of the current process's
            stack pointer (call this address X); the child has a larger
            environment than the parent, and the environment shows up at
            the beginning of the stack. So the hope is that address X
            falls somewhere in the NOP slide in EGG.
            Then, when RET overflows its small buffer, and the return
            address location is overwritten, the program's control flow
            will jump to X, hit the NOPs, and slide into the shellcode.

    Good news: simple buffer overruns like this do not work any more.

    Student questions:

        When we overflow a buffer to point to an environment variable
        that contains our exploit, doesn't this assume that we already
        have access to the OS? Can't we just run arbitrary programs on
        the OS to do something malicious anyways?

        Why is this line a problem instruction? That is definitely not a
        null byte.
            movl $0xb,%eax

        It's not clear why/how the shell code jumps from assembly code to
        the encoded string. Do we have to write the encoded version? Why
        leave the '/bin/sh' as it is at the end?

        Why is the char shellcode[] cast to int?

        In the case execve fails, "The program will continue fetching
        instructions from the stack, which may contain random data! The
        program will most likely core dump." Why is this?

        What can cause execve to fail? If you're trying to smash
        someone's stack, why would you care as to whether or not you
        "exit cleanly"?

        When trying to overflow a character buffer, instead of replacing
        problem instructions with a set of substitute instructions, why
        can't we pad the instructions with NOPs?

        How does padding your overflow buffer with NOP instructions help
        you guess the offset of another program, if you have no idea
        what the offset is?

        I'm still unclear as to why/how the Xt library attack works.

3. admin notes

    make-up class Friday, 12:30 CIWW 101.

        if you cannot make it to the make-up class, that is completely
        fine, but note that watching the video will be assigned as
        homework over the weekend.

        question due for tomorrow's class, even if you can't make it
        tomorrow.

    labs

        heads up: if you fully understand Aleph One's article, and if
        you're comfortable with gdb, the concept of memory as array of
        bytes, the interchangeability of code/data, etc., then the lab
        won't be too bad.
        otherwise, it could take a while.

4. Defenses and attacks

    * Large part of the problem: buggy C code that writes beyond the end
      of buffer/array. So use a language that enforces bounds, like C#,
      Java, Python?

        --huge effort to retrain programmers and rewrite software (but
          not impossible)

        --runtime code uses raw memory. tons and tons of bugs in the Java
          runtime, JavaScript, etc.
            https://www.usenix.org/legacy/event/sec09/tech/full_papers/ratanaworabhan.pdf
            https://www.corelan.be/index.php/2011/12/31/exploit-writing-tutorial-part-11-heap-spraying-demystified/

        --C is used for many valuable applications and libraries, so we
          cannot abandon it, and often can't avoid writing new C code.

          (And the C definition makes it hard (or impossible) to
          precisely check bounds automatically.)

          (And although the perfect programmer would check bounds 100% of
          the time, no programmers are perfect. Still, here are some
          guidelines:
            --Be careful about sizes of buffers, strings, arrays, etc.
            --Use strncpy() instead of strcpy(), strncat() instead of
              strcat(), fgets() instead of gets(), etc. (Note that
              strncpy() does not NUL-terminate when it truncates.)
            --Don't ignore compiler warnings
          )

    * Thus, we need defenses that make buffer overflows harder to
      exploit, for big buggy C programs that we don't understand!

    A. Canaries (StackGuard, or gcc's Stack Smashing Protector)

        --Compiler generates code that pushes a "canary" value on stack
          at function entry, pops and checks value before return.

        --Idea: detect modification of return PC on stack *before* RET
          instruction executes

        --Canary sits between variables and return address, e.g.:

                             |                  |
                             +------------------+
            entry %esp ----> |  return address  |    ^
                             +------------------+    |
            new %ebp ------> |    saved %ebp    |    |
                             +------------------+    |
                             |     CANARY       |    | Overflow goes
                             +------------------+    | this way.
                             |     buf[127]     |    |
                             |       ...        |    |
                             |     buf[0]       |    |
                             +------------------+    |
                             |                  |

        --What value should we use for the canary? (Needs to be hard to
          guess. So make it be a random number, chosen at program start,
          stored somewhere.)
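
The canary idea above can be modeled in a few lines of C. This is only a sketch, not what StackGuard or gcc actually emit: the struct frame layout, the name run_frame(), the 16-byte buffer, and the fixed canary value are all assumptions made for illustration. A real compiler inserts the check into the function prologue and epilogue and uses a random value chosen at program start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the frame in the diagram, lowest address first. */
struct frame {
    char     buf[16];    /* local buffer (128 bytes in the diagram) */
    uint32_t canary;     /* placed by the function "prologue" */
    uint32_t saved_ebp;
    uint32_t ret_addr;
};

/* Assumption: a fixed stand-in value. A real implementation picks a
 * random value at program start. */
static uint32_t g_canary = 0x5ca1ab1eu;

/* Returns 1 if the canary survived the copy, 0 if a smash was detected. */
int run_frame(const char *input, size_t n)
{
    struct frame f;

    /* The model relies on the canary sitting directly above buf[]. */
    assert(offsetof(struct frame, canary) == sizeof f.buf);

    f.canary    = g_canary;       /* "prologue": place the canary */
    f.saved_ebp = 0x11111111u;
    f.ret_addr  = 0x22222222u;

    /* The bug being modeled: n is never compared against sizeof f.buf.
     * Clamp to the whole frame so the simulation itself stays in bounds. */
    if (n > sizeof f)
        n = sizeof f;
    memcpy(&f, input, n);

    /* "epilogue": check the canary before ret_addr would be used */
    return f.canary == g_canary;
}
```

Calling run_frame() with 16 bytes of input leaves the canary intact. With 20 bytes, the copy runs past buf[] into the canary, and the check fails before the (modeled) return address would ever be used.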
        --What kinds of attacks might work despite stack canaries?

            * overwrite function pointer before the canary
            * overflow of one global variable into the next (much like on
              stack).
            * overflow of heap-allocated buffer (heap smashing: discuss
              in next class)
            * In fact, we can even defeat canaries when the exploit is a
              classical return address overwrite (stack reading: discuss
              in next class)

    B. NX stack? W ^ X?

        Modern MMUs have a concept of execute permissions for memory.
        So arrange for OS to use that, and mark the stack non-executable.

        This does not require recompiling programs.

        But it can be defeated. The class of attacks is return-to-libc.

        Version 1 (basic version):

            Assume:
                system(char* cmd_name) is in the process's memory, at
                address 0x50000

            overwrite:

                 increasing mem ------------------------->

                    ret_addr    arg1    arg2

                 <------------------------ direction in which stack grows

            with:

                    0x50000     junk    x       /bin /bas h\0

            where x is the address of the next element on the stack,
            namely "/bin/bash"

            (this is a fake stack frame for system())

        But what if system() isn't in the address space of the process?

        Version 2:

            Assume:
                0x300000 is an area of memory that is writable (and also
                executable) for the process

            overwrite:

                 increasing mem ------------------------->

                    ret_addr    arg1    arg2

                 <------------------------ direction in which stack grows

            with:

                    memcpy_addr  0x300000  0x300000  x  size  [shellcode]

            where x is the address of shellcode on the stack

            Result:
                --exploited function "returns" to memcpy
                --when memcpy runs, it has a stack frame set up for it,
                  with the return address being 0x300000, the destination
                  to copy to also being 0x300000, the source being a
                  stack address, and the size being the size of the
                  shellcode
                --when memcpy returns, control jumps to the copied
                  shellcode at 0x300000

        But what if we have W ^ X permissions for all memory (meaning
        memory is either writable, or executable, or neither)?

            Advantage: the memcpy-to-writable-address attack doesn't work

            Advantage: the program will never execute injected code.
            Disadvantage: harder to dynamically generate code (and JITs
            like Java runtimes and JavaScript engines generate x86 code
            on the fly). Can work around this by first writing and then
            changing to executable.

            Disadvantage: the defense can be defeated. See below.

        Version 3: return-oriented programming (ROP)

            look for *gadgets*: pieces of the code section that end in
            the RET instruction.

            (assumption: attacker has access to the binary, and the
            addresses in the binary do not change.)

            attacker makes the stack look like this:

                 increasing mem ------------------------->

                    g1  g2  a1  a2  g3  b1  g4

                 <------------------------ direction in which stack grows

            the g1, g2, ... are the *addresses* of *existing code*.

            for example, maybe the code at g1 does
                movl $0x10, %eax
                ret
            and maybe the code at g2 does:
                popl %ebx
                popl %ecx
                ret
            (the two pops consume a1 and a2 from the stack, and the ret
            then jumps to g3)
            etc.

        But the attack requires knowledge of the code addresses in the
        vulnerable binary. So, idea: randomize the addresses.

    C. Randomized memory: ASLR (Address space layout randomization)

        Relies on the fact that a lot of code is relocatable.

        Dynamic loader can choose random address for each library and
        each program.

        Now adversary doesn't know the location of gadgets.

        Attacks:

        --on 32-bit machines there isn't much randomness:
            12 bits cannot be randomized because of the page offsets
            4 bits cannot be touched because of existing conventions
            so only 16 bits of entropy (at most 2^16 = 65536
            possibilities), which allows adversary to guess
            (if process does not rerandomize on crash)
            for details, see:
            https://cseweb.ucsd.edu/~hovav/dist/asrandom.pdf

        --Adversary might not care exactly where to jump.
            Ex: "Heap spraying": fill memory with shellcode so that a
            random jump is okay!

        --Adversary might exploit some code that's not randomized (if
          such code exists).

        --Adversary might extract randomness.

            *Programs might generate a stack trace or error message which
             contains a pointer.

            *If adversaries can run some code, they might be able to
             extract real addresses (JIT'd code?).
            *Cute address leak in Flash's Dictionary (hash table):

                1) Get victim to visit your Flash-enabled page (e.g., buy
                   an ad).

                The following then happens in the client-side Flash
                player:

                2) Hash table internally computes hash value of keys.
                3) Hash value of integers is the integer.
                4) Hash value of object is its memory address.
                5) Iterating over a hash table is done from lowest hash
                   key to highest hash key.
                6) So, the attacker creates a Dictionary, inserts a
                   string object which has shellcode, and then inserts a
                   bunch of numbers into the Dictionary.
                7) By iterating through the Dictionary, the attacker can
                   determine where the string object lives by seeing
                   which integers the object reference falls between!
                8) Now, overwrite a code pointer with the shellcode
                   address and bypass ASLR!

        What about on 64-bit machines? (Much more entropy! Harder for
        attacker!)

        And what if the binary isn't even available? Seems like it should
        be really hard for the attacker, but ... even that case is
        exploitable, under certain assumptions. This is our transition to
        "Hacking Blind," to be discussed in the next class.

-------

Acknowledgment: MIT 6.858 staff (for some of these notes and refs)