Class 5
CS 480-008
11 February 2016

On the board
------------

1. Last time
2. Aleph One article
3. admin notes
4. Defenses and attacks
    --canaries
    --NX stack, W^X 
    --ASLR    

---------------------------------------------------------------------------

1. Last time

    x86 architecture and assembly

    stack frames

    buffer overflows

    What happens if stack grows up, instead of down?
        Stack frame for read_req() has buf[] at highest address,
            so won't overflow onto read_req()'s return address.
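        BUT: look at the stack frame for gets().
            gets() is called by read_req(), so with an upward-growing
            stack its frame (including its return address) sits above
            buf[]; the overflow can overwrite gets()'s own return address.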


2. Discuss Aleph One article

    Concepts/techniques:

        [You need to use some, but not necessarily all, of these
          concepts/techniques to do the lab]

        Concerning the shellcode itself:

            don't need to know where "/bin/sh" lives (even though the
            precise location has to be passed). use the jump+call+pop trick.
                [see shellcode.S, from handout last time]

            recode 0 bytes (by avoiding them)
                (you may also need to avoid 10 and 13 in a byte, since
                those are ASCII '\n' and '\r'; parsing/copying/processing code 
                often treats those values as delimiters.)

                [see shellcode.S]
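
                A minimal sketch of checking shellcode for the byte values
                above. The names find_bad_byte, movl_version, and
                movb_version are made up for illustration. The two arrays
                hold the x86 encodings of "movl $0xb,%eax" and of the
                zero-free substitute "xorl %eax,%eax ; movb $0xb,%al".

```c
#include <stddef.h>

/* Returns the index of the first byte in code[0..len) that is 0x00, 0x0a
 * ('\n'), or 0x0d ('\r'), or -1 if there is none. */
long find_bad_byte(const unsigned char *code, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00 || code[i] == 0x0a || code[i] == 0x0d)
            return (long)i;
    }
    return -1;
}

/* movl $0xb,%eax: opcode b8 followed by a 4-byte immediate, so the upper
 * three bytes of the immediate are zero. */
const unsigned char movl_version[] = { 0xb8, 0x0b, 0x00, 0x00, 0x00 };

/* xorl %eax,%eax ; movb $0xb,%al: same final %eax, no zero bytes. */
const unsigned char movb_version[] = { 0x31, 0xc0, 0xb0, 0x0b };
```

                The immediate in the movl form is 4 bytes wide, which is
                where the zero bytes come from even though 0xb itself is
                not zero.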

        Concerning the exploit payload (shellcode + other stuff):

            don't need to know exact location of return address: blast the
            stack with it

            don't need to know exact location of buffer: use NOP slide
            technique
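
            A rough sketch of the resulting payload layout: NOP slide,
            then shellcode, then repeated copies of a guessed address.
            build_payload and its parameters are hypothetical names. The
            sketch assumes a 32-bit little-endian target and that the
            repeated addresses end up 4-byte aligned with the saved
            return address slot, which a real exploit has to arrange.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fills out[0..out_len) as: NOP slide | shellcode | repeated address.
 * ret_copies is how many 4-byte copies of guess_addr go at the end.
 * Returns 0 on success, -1 if the pieces do not fit. */
int build_payload(unsigned char *out, size_t out_len,
                  const unsigned char *shellcode, size_t sc_len,
                  uint32_t guess_addr, size_t ret_copies)
{
    if (ret_copies > out_len / 4)
        return -1;
    size_t tail = ret_copies * 4;
    if (sc_len > out_len || tail > out_len - sc_len)
        return -1;
    size_t nops = out_len - sc_len - tail;

    memset(out, 0x90, nops);                 /* 0x90 is the x86 NOP */
    memcpy(out + nops, shellcode, sc_len);
    for (size_t i = 0; i < ret_copies; i++) {
        unsigned char *p = out + nops + sc_len + 4 * i;
        /* little-endian byte order, as on x86 */
        p[0] = (unsigned char)(guess_addr);
        p[1] = (unsigned char)(guess_addr >> 8);
        p[2] = (unsigned char)(guess_addr >> 16);
        p[3] = (unsigned char)(guess_addr >> 24);
    }
    return 0;
}
```

            Any guess_addr that lands inside the NOP slide works, which
            is why the slide widens the target.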

        Conducting the exploit:
        
            Aleph One is running the vulnerable process as a child of the
            exploit process. In this way, the vulnerable process inherits
            the environment.
                [you will use the network to pass the exploit]

            Aleph One uses this inheritance of the environment to exploit
            a buffer that is too small to hold the shellcode:

                EGG holds NOP slide + shellcode

                RET holds lots of repeated copies of the current
                process's stack pointer (call this address X); the child
                has a larger environment than the parent, and the
                environment sits at the base of the stack (the highest
                addresses, above all stack frames). So the hope is that
                address X falls somewhere in the NOP slide in EGG.
                
                Then, when RET overflows its small buffer, and the
                return address location is overwritten, the program's
                control flow will jump to X, hit the NOPs, and slide
                into the shellcode.


    Good news: simple buffer overruns like this do not work any more.

    Student questions:

        When we overflow a buffer to point to an environment variable that
        contains our exploit, doesn't this assume that we already have
        access to the OS? Can't we just run arbitrary programs on the OS to
        do something malicious anyways?

        why is this line a problem instruction? That is definitely not a
        null byte.

                   movl   $0xb,%eax

        It's not clear why/how the shell code jumps from assembly code
        to the encoded string. Do we have to write the encoded version?
        Why leave the '/bin/sh' as it is at the end? Why is the char
        shellcode[] cast to int?

        In the case execve fails, "The program will continue fetching
        instructions from the stack, which may contain random data! The
        program will most likely core dump." Why is this?  what can
        cause execve to fail? if you're trying to smash someone's stack, why
        would you care as to whether or not you "exit cleanly"?

        When trying to overflow a character buffer, instead of replacing
        problem instructions with a set of substitute instructions why can't
        we pad the instructions with NOPs?

        Question: How does padding your overflow buffer with NOP
        instructions help you guess the offset of another program, if
        you have no idea what the offset is?

        I'm still unclear as to why/how the Xt library attack works.

3. admin notes

    make-up class Friday, 12:30 CIWW 101. if you cannot make it to the
    make-up class, that is completely fine, but note that the video will
    be assigned homework over the weekend.

    question due for tomorrow's class, even if you can't make it
    tomorrow.

    labs heads up: if you fully understand Aleph One's article, and if
    you're comfortable with gdb, the concept of memory as array of
    bytes, the interchangeability of code/data, etc., then the lab won't be
    too bad. otherwise, it could take a while. 

4. Defenses and attacks

    * Large part of the problem: buggy C code that writes beyond
    the end of buffer/array. So use a language that enforces bounds,
    like C#, Java, Python?

        --huge effort to retrain programmers and rewrite software
        (but not impossible)

        --the language runtime itself is written in C/C++ and uses raw
        memory. tons and tons of bugs in the Java runtime, JavaScript
        engines, etc.

            https://www.usenix.org/legacy/event/sec09/tech/full_papers/ratanaworabhan.pdf
            https://www.corelan.be/index.php/2011/12/31/exploit-writing-tutorial-part-11-heap-spraying-demystified/

        --C is used for many valuable applications and libraries, so we
        cannot abandon it, and often can't avoid writing new C code.

            (And the C definition makes it hard (or impossible) to
            precisely check bounds automatically.)
            
            (And although the perfect programmer would check bounds 100%
            of the time, no programmers are perfect. Still, here are
            some guidelines:
                --Be careful about sizes of buffers, strings, arrays, etc.
                --Use strncpy() instead of strcpy(), strncat() instead of strcat(),
                  fgets() instead of gets(), etc. (Caveat: strncpy() does not
                  NUL-terminate the destination when the source is too long.)
                --Don't ignore compiler warnings
            )
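
            A small sketch of a bounded copy along the lines of these
            guidelines. copy_string is a hypothetical helper, not a
            standard function; it wraps strncpy() and terminates the
            destination explicitly.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst (capacity dst_size), always NUL-terminating.
 * Returns 0 if src fit entirely, -1 if it was truncated or dst_size is 0. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    strncpy(dst, src, dst_size - 1);   /* writes at most dst_size-1 bytes */
    dst[dst_size - 1] = '\0';          /* strncpy may not terminate */
    return strlen(src) < dst_size ? 0 : -1;
}
```

            Returning -1 on truncation lets the caller reject oversized
            input instead of silently processing a cut-off string.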

    * Thus, we need defenses that make buffer overflows harder
    to exploit, for big buggy C programs that we don't understand!

    A. Canaries (StackGuard, or gcc's Stack Smashing Protector)

        --Compiler generates code that pushes a "canary" value on stack
        at function entry, pops and checks value before return.

        --Idea: detect modification of return PC on stack *before* RET
        instruction executes

        --Canary sits between variables and return address, e.g.:       
                         |                  |
                         +------------------+
        entry %esp ----> |  return address  |    ^
                         +------------------+    |
        new %ebp ------> |    saved %ebp    |    |
                         +------------------+    |
                         |     CANARY       |    | Overflow goes
                         +------------------+    | this way.
                         |     buf[127]     |    |
                         |       ...        |    |
                         |      buf[0]      |    |
                         +------------------+
                         |                  |

        --What value should we use for the canary?

        (Needs to be hard to guess. So make it be a random number, chosen
        at program start, stored somewhere.)
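
        A toy simulation of the canary check, not what the compiler
        actually emits. struct frame and copy_and_check are made-up
        names; the struct mimics the diagram (buf lowest, then canary,
        saved %ebp, return address), and the copy is clamped to the
        struct so the demo itself stays in bounds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A simulated frame, laid out from low address to high. */
struct frame {
    char     buf[16];
    uint32_t canary;
    uint32_t saved_ebp;
    uint32_t ret_addr;
};

/* Models an unchecked copy (like gets or strcpy) into frame.buf: writes n
 * bytes starting at buf, clamped to the simulated frame.
 * Returns 1 if the canary survived, 0 if it was smashed. */
int copy_and_check(struct frame *f, uint32_t canary,
                   const void *input, size_t n)
{
    f->canary = canary;                       /* done at "function entry" */
    if (n > sizeof *f)
        n = sizeof *f;
    memcpy((unsigned char *)f, input, n);     /* overflow runs upward */
    return f->canary == canary;               /* checked before "return" */
}
```

        An overflow long enough to reach ret_addr has to pass through
        the canary first, so the check fails before the corrupted
        return address would be used.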

  
        --What kinds of attacks might work despite stack canaries?
            
         * overwrite a function pointer (or other local variable) that sits
         between the buffer and the canary and is used before the
         function returns

         * overflow of one global variable into the next (much like on stack).

         * overflow of heap-allocated buffer (heap smashing: discuss in
         next class)

         * In fact, we can even defeat canaries when the exploit is a
         classical return address overwrite (stack reading: discuss in
         next class)

 
    B. NX stack? W ^ X?

        Modern MMUs have a concept of execute permissions for memory.
        so arrange for OS to use that, and mark the stack
        non-executable. This does not require recompiling programs.

        But it can be defeated. The class of attacks is return-to-libc. 

           
        Version 1 (basic version):

            Assume:
                system(char* cmd_name) is in the process's memory, at 
                address 0x50000
 
           overwrite: 
                            increasing mem
               ------------------------->
               ret_addr   arg1   arg2
               <------------------------
                            direction in which stack grows

           with:

               0x50000    junk     x      /bin  /bas  h\0

            where x is the address of the next element on the stack,
            namely "/bin/bash"

            (this is a fake stack frame for system())
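
            A sketch of laying out those overwrite bytes, assuming a
            32-bit little-endian target. build_ret2libc, put_le32, and
            slot_addr (the address of the return-address slot in the
            victim, assumed known) are hypothetical. Note that an
            address like 0x50000 contains zero bytes, so a
            strcpy()-style overflow could not deliver it as is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)v;
    p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16);
    p[3] = (unsigned char)(v >> 24);
}

/* Writes the bytes that overwrite the stack from the return-address slot
 * upward. Returns the number of bytes written, or 0 if out is too small. */
size_t build_ret2libc(unsigned char *out, size_t out_len,
                      uint32_t system_addr, uint32_t slot_addr,
                      const char *cmd)
{
    size_t cmd_len = strlen(cmd) + 1;          /* include the trailing \0 */
    if (out_len < 12 || out_len - 12 < cmd_len)
        return 0;
    put_le32(out, system_addr);                /* ret_addr: enter system() */
    put_le32(out + 4, 0x41414141u);            /* junk: system()'s return addr */
    put_le32(out + 8, slot_addr + 12);         /* x: points at the string */
    memcpy(out + 12, cmd, cmd_len);            /* the command string itself */
    return 12 + cmd_len;
}
```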

        But what if system() isn't in the address space of the process?

        Version 2:

            Assume:
                0x300000 is an area of memory that is writable (and
                executable) by the process

            overwrite:
                            increasing mem
               ------------------------->
              ret_addr    arg1      arg2
               <------------------------
                            direction in which stack grows

  
            with
              memcpy_addr 0x300000  0x300000  x  size   [shellcode]

            where x is the address of shellcode on the stack

            Result:
            
                --exploited function "returns" to memcpy

                --when memcpy runs, it has a stack frame set up for it,
                with the return address being 0x300000, the destination
                to copy to also being 0x300000, the source being a stack
                address, and the size being the size of the shellcode


       But what if we have W ^ X permissions for all memory (meaning
       memory is writable, or executable, or neither, but never both)?
       
          Advantage: the memcpy-to-writable-address attack doesn't work
          
          Advantage: the program will never execute injected code.

          Disadvantage: harder to dynamically generate code (and JITs
          like Java runtimes and JavaScript engines generate x86 code on
          the fly). Can work around this by first writing and then
          changing to executable.

          Disadvantage: the defense can be defeated. See below.

       Version 3: return-oriented programming (ROP)
       
          look for *gadgets*: pieces of the code section that end in
          the RET instruction. (assumption: attacker has access to the
          binary, and the addresses in the binary do not change.)

          attacker makes the stack look like this:

                             increasing mem
               ------------------------->
              g1  g2 a1 a2   g3 b1   g4 
               <------------------------
                            direction in which stack grows

         
            the g1, g2, ... are the *addresses* of *existing code*.

            for example, maybe the code at g1 does
                movl $0x10, %eax
                ret

            and maybe the code at g2 does:
                popl %ebx
                popl %ecx
                ret

            etc.
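
            A toy interpreter for the chain above, to show how RET
            drives control flow from one gadget to the next. Everything
            here is illustrative: struct machine, run_chain, and the
            "addresses" G1 and G2 are made up, and the two gadget
            bodies are modeled in C rather than executed as x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy machine state: three registers and a stack of 32-bit words. */
struct machine {
    uint32_t eax, ebx, ecx;
    const uint32_t *stack;   /* attacker-controlled words */
    size_t sp, len;          /* sp indexes the next word to pop */
};

/* Made-up "addresses" standing in for g1 and g2. */
enum { G1 = 0x1000, G2 = 0x2000 };

static int pop(struct machine *m, uint32_t *v)
{
    if (m->sp >= m->len)
        return -1;
    *v = m->stack[m->sp++];
    return 0;
}

/* Each loop iteration models one RET: pop the next word and "jump" to it.
 * Returns the number of gadgets executed; stops at the end of the stack
 * or at a word that is not a known gadget. */
int run_chain(struct machine *m)
{
    int executed = 0;
    uint32_t target;
    while (pop(m, &target) == 0) {
        if (target == G1) {                 /* movl $0x10, %eax ; ret */
            m->eax = 0x10;
        } else if (target == G2) {          /* popl %ebx ; popl %ecx ; ret */
            if (pop(m, &m->ebx) != 0 || pop(m, &m->ecx) != 0)
                break;
        } else {
            break;                          /* a real CPU would jump here */
        }
        executed++;
    }
    return executed;
}
```

            With the stack { G1, G2, a1, a2 }, the chain leaves 0x10 in
            eax, a1 in ebx, and a2 in ecx: the attacker never injects
            code, only addresses and data.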

        But the attack requires knowledge of the code addresses in the vulnerable
        binary. So, idea: randomize the addresses.
          
    C. Randomized memory: ASLR (Address space layout randomization)

        Relies on the fact that a lot of code is relocatable. Dynamic
        loader can choose random address for each library and each
        program. Now adversary doesn't know the location of gadgets.

        Attacks: 

             --on 32-bit machines there isn't much randomness:
                12 bits cannot be randomized because of the page offsets
                 4 bits cannot be touched because of existing conventions

                so only 16 bits of entropy, which allows adversary to guess
                (if process does not rerandomize on crash)

                for details, see:
                https://cseweb.ucsd.edu/~hovav/dist/asrandom.pdf
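
                The arithmetic above, as a small sketch. aslr_layouts
                and expected_guesses are hypothetical helpers, and the
                expected-guess formula assumes a uniformly random
                layout that stays fixed across crashes, so the attacker
                can try candidates without repeats.

```c
#include <stdint.h>

/* Number of equally likely layouts, given the address width and the bits
 * that cannot be randomized. */
uint64_t aslr_layouts(unsigned addr_bits, unsigned page_offset_bits,
                      unsigned reserved_bits)
{
    unsigned fixed = page_offset_bits + reserved_bits;
    if (fixed >= addr_bits)
        return 1;
    unsigned entropy = addr_bits - fixed;
    return entropy >= 64 ? UINT64_MAX : (uint64_t)1 << entropy;
}

/* Expected number of tries to hit one of n equally likely layouts when
 * guessing without repeats: (n + 1) / 2. */
double expected_guesses(uint64_t n)
{
    return (double)n / 2.0 + 0.5;
}
```

                For 32 - 12 - 4 = 16 bits this gives 65536 layouts,
                i.e. about 32768 tries on average.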
        
             --Adversary might not care exactly where to jump.
                Ex: "Heap spraying": fill memory with shellcode so that a
                random jump is okay!

            --Adversary might exploit some code that's not randomized
            (if such code exists).
             
            --Adversary might extract randomness.
               *Programs might generate a stack trace or error message
                which contains a pointer.
               *If adversaries can run some code, they might be able to
                extract real addresses (JIT'd code?).
               *Cute address leak in Flash's Dictionary (hash table):
                    1) Get victim to visit your Flash-enabled page (e.g., buy
                       an ad). The following is then happening in the
                       client-side Flash player:
                    2) Hash table internally computes hash value of keys.
                    3) Hash value of integers is the integer.
                    4) Hash value of object is its memory address.
                    5) Iterating over a hash table is done from lowest
                       hash key to highest hash key.
                    6) So, the attacker creates a Dictionary, inserts a
                       string object which has shellcode, and then inserts
                       a bunch of numbers into the Dictionary.
                    7) By iterating through the Dictionary, the attacker
                       can determine where the string object lives by seeing
                       which integers the object reference falls between!
                    8) Now, overwrite a code pointer with the shellcode
                       address and bypass ASLR!
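
                Steps 5 to 7 can be sketched as a simulation. This is
                not Flash's API: struct entry and bracket_address are
                made up, and the input array stands in for what the
                attacker observes while iterating the Dictionary in
                ascending hash order.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry seen while iterating: either an integer key (whose hash is
 * the integer itself) or the object key (whose hash, its address, is not
 * directly visible to the attacker). */
struct entry {
    int      is_object;
    uint32_t int_key;      /* meaningful only when is_object == 0 */
};

/* Given entries in iteration order, reports the integer keys immediately
 * before and after the object; these bracket the object's address.
 * Returns 0 on success, -1 if the object is missing or not bracketed. */
int bracket_address(const struct entry *order, size_t n,
                    uint32_t *lo, uint32_t *hi)
{
    for (size_t i = 0; i < n; i++) {
        if (!order[i].is_object)
            continue;
        if (i == 0 || i + 1 >= n)
            return -1;
        if (order[i - 1].is_object || order[i + 1].is_object)
            return -1;
        *lo = order[i - 1].int_key;
        *hi = order[i + 1].int_key;
        return 0;
    }
    return -1;
}
```

                Inserting more integers between lo and hi and iterating
                again narrows the bracket, in the style of a binary
                search.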

        What about on 64-bit machines? (Much more entropy! Harder for
        attacker!) And what if the binary isn't even available? Seems
        like it should be really hard for the attacker, but ... even
        that case is exploitable, under certain assumptions. This is our
        transition to "Hacking Blind," to be discussed in the next
        class.

-------
Acknowledgment: MIT 6.858 staff (for some of these notes and refs)