Class 7 CS 480-008 16 February 2016

On the board
------------
1. Last time
2. Finish Hacking Blind
3. Finish buffer overflows
4. Admin notes
5. User authentication
   --the overall concern
   --passwords

---------------------------------------------------------------------------

1. Last time

Heap smashing and Hacking Blind

Context: note the pieces the attacker must assemble for heap smashing:

  * Find a buffer overflow bug in the application or a library.
  * Find a way to get the program to execute the buggy code so that the
    attacker's bytes overflow the buffer.
  * Understand the malloc() implementation.
  * Guess the address of a pointer to code (that is, guess where in memory
    the program is storing a value that it will later treat as the address
    of code).
  * Guess the address of the buffer, i.e., of the attacker's injected
    instructions.

Bad news for the attacker: for heap smashing, the attacker must understand
corner cases in the application logic, malloc(), and the compiler's output.

Good news for the attacker: there are attacks (like Hacking Blind) where
the attacker does not need prior knowledge of the target binary.

Bad news for the attacker: even Hacking Blind can be defeated (by
re-randomizing, etc.).

Bad news for the defender: if there's a buffer overflow bug, a clever
enough attacker can probably exploit it. More generally, many bugs that
look harmless can be turned to an attacker's advantage, perhaps in
combination with other flaws. But! We are now at the point where
significant effort (which often translates into dollar cost in the gray
market / underworld) has to be expended to identify flaws and new attack
vectors.

2. Finish Hacking Blind

Recall: canaries are defeated by stack reading (assumption: the canary does
not change across server crashes). But the authors identify vulnerabilities
that circumvent canaries, and hence could be exploited *even if canaries
are re-randomized*.

--Example: Figure 13. They say pointer arithmetic allows them to bypass
  the canary. Why?
  (Choose randomLen > RAN_LEN; then the read starts at a negative
  offset...and whacks the return address in the stack frame of
  input.read.)

Student questions:

Q: [asked in class] Where does the "/bin/sh" come from? Here's Andrea
Bittau's response:

---
In BROP, I read "/bin/sh" from the attacker's socket into any (fixed)
writable address. So I basically call read(socket, some_writeable_address)
and then send over /bin/sh to the socket.

In plain ROP, a lot of people use "/bin/sh" from libc - it's in there for
system().

In pre-ROP days, people would push "/sh\x00" and push "/bin" onto the
stack.
---

Q: Are the canary and ASLR randomization determined at compile time? (I.e.,
is that why the values of the canary and the layout remain the same across
restarts of the server?)

[No. They remain the same because the server's logic re-fork()s from some
parent process, and the canary and layout are determined at exec(). So
what the server could do is re-exec(), and then re-fork().]

Q: Additional question(s): Will there always be enough gadgets on any
given server? What does the text mean that when we find one gadget we will
find two useful ones? How is the second one found? What is the
significance of R15?

[The answer has to do with the parse of the BROP gadget.]

Q: How are we actually dumping the server binary on the socket? Is it just
a write call with the .text section as the buffer? Is the .text section
null-terminated? Why don't we just try to send all of the binary in one
pass?

Q: I don't understand why filling the stack with zeros is related to the
instruction/frame pointer. How would an instruction pointer crash a
program in the first place? I assume this is related to how the value at
the frame pointer is stored in the instruction pointer. Why exactly does
sending a zero-word frame pointer to a different worker process prevent
crashes?

[Andrea hilariously writes, "just reread the paragraph in the paper -
wow... it's badly written!"
and then goes on to say:

---
When you're stack reading, you'll first read the frame pointer, and then
the return address (all of this after the canary, if present). What you
really care about is the return address, to defeat ASLR, and so it's OK if
you don't leak the exact frame pointer (or none at all).

The assumption in stack reading is that the stack looks the same each time
around and you can incrementally read it. However, the frame pointer may
be slightly different due to different upper-layer code paths, or due to
using different workers in different processes with slightly different
stack layouts. In this case you might learn incorrect information: e.g.,
you might read a least significant byte of 1, but in the other worker the
byte might have been 0, and you might get stuck.

The problem is that by corrupting the frame pointer only, the return
address that follows (which you are not yet changing) could always crash
if the frame pointer is bad, and therefore you get no signal. The idea is
to place a 0 word in the entire frame pointer and start changing the
return address so that you find a return address that works (doesn't
crash) irrespective of the frame pointer (i.e., one that ignores it,
because otherwise it'd be a null dereference).

So, when you get into a situation where stack reading "fails" - e.g.,
you've tried all 256 byte values and not one worked - you assume it's the
frame pointer, you stick a zero word in there, and you start stack reading
the saved return address (the next word). You'll now be searching for a
saved return address that "ignores" the frame pointer.

The canary somewhat helps here, because once you find a "very random
looking" word, and the next word fails to read, you can assume you've hit
the frame pointer.

Of course there are cases where reading the frame pointer works - that's
when the stack / higher-level branches are all very deterministic and
don't change the stack layout on each attempt.
---
]

Q: Here is my question after the reading: how exactly does ASLR work? By
what mechanism is the .text section randomized? How specifically would
attacks from before ASLR fail when used on an ASLR'd system? Is the stack
randomized? If not, why not?

Q: I don't understand why brute-forcing just one canary value to leak one
return address is enough to defeat all of ASLR. Is it that this one brute
force leaks an address in the .text section, and we only need to find this
section for the rest of the first phase of the exploit? And if so, why
does the return address after the canary always point into .text instead
of at another stack frame?

[Return addresses are the locations of *code*; the frame pointer is the
location of a stack frame. Stack memory and code memory are separate. We
will clarify in more detail in class.]

3. Finish buffer overflows

Other defenses and attacks:

* Defense: alter the compiler to add bounds checks.

  C by default of course doesn't do this, because C is pretty much macros
  for assembly language with nicer syntax...and the hardware instruction
  set doesn't perform bounds checking and does expose memory pointers....

  But can we modify the compiler to add bounds checks? Yes. There is a big
  literature on this (Baggy Bounds Checking, etc.), but:

    (a) There are performance overheads.
    (b) Programs need to be recompiled.
    (c) There are false alarms.

  Common theme in security tools: false alarms prevent adoption of tools!
  Often, zero false alarms with some misses is better than zero misses
  with some false alarms.

* What defenses are used in practice?

  --gcc and MSVC enable stack canaries by default (in gcc, turn it off
    with -fno-stack-protector).
  --Linux and Windows include ASLR and NX by default.
  --Bounds checking is not as common, because of the disadvantages above.

* Other errors/vulnerabilities besides buffer overflows:

  --Format string attacks (a special case of buffer overflow).
  --Using memory after it has been deallocated (use-after-free).
    If writing, might overwrite a new data structure, e.g.
    a function ptr. If reading, might call a corrupted function pointer.
  --Freeing the same memory twice (double-free). Might cause malloc() to
    later return the same memory twice.
  --Decrementing the stack pointer past the end of the stack, into some
    other memory.
    [ http://www.invisiblethingslab.com/resources/misc-2010/xorg-large-memory-attacks.pdf ]
  --A one-byte stray write can lead to compromise.
    [ http://www.openwall.com/lists/oss-security/2014/08/26/2 ]

4. Admin notes

How are people doing with the workload? Should we move to cold-calling? If
we stick with questions, we'll drop the lowest few responses.

5. Passwords

A. The problem: user authentication
B. Passwords
C. Criteria in the paper
D. Schemes/alternatives
E. Discussion

A. The overall concern is _user authentication_

--Underpinning of many security policies
--Some interesting technical issues
--Easy to get wrong on technical grounds
--Also remains challenging on non-technical grounds, because security
  isn't just a technical problem

Authentication: who is the user? Challenging to know for sure.

The user registers some secret --- but *who* registers it?
  At the scale of a university, we can probably check the identity of the
  user when registering.
Typically we settle for a weaker guarantee:
  Establish that the user who logs in has the secret used when
  registering. If so, then assume it is the same user.
But we have no guarantee that we know the true identity of the user.
  For many usages that is fine.
  E.g., Amazon doesn't really care who you really are, as long as you pay.

Problem: how to authenticate users?

Setting: user <-> computer <-> verifier server.

Potential extra components might help authentication:
  A trusted third party.
  The user's portable device (either dedicated, or an app on a mobile
  phone).
  A proxy server.

This paper proposes a number of criteria to evaluate authentication
schemes. The proposed criteria are reasonable, though sometimes
non-orthogonal, and not complete. Useful as a starting point for thinking
about a new authentication scheme.

B.
Passwords

--Need some secret between the user and the verifier; call this set of
  bits a "password".
--The user types in a username and password; the server checks whether the
  password is correct for that username.
--A password is a valuable secret, so we want to avoid repetitive use and
  exposure:
    Use it just for user authentication.
    Once authenticated, use crypto keys between server and clients
    (client certificates, cookies, etc.).
    Even for user authentication, corner passwords by composing them with
    other ideas:
      Password managers, single sign-on, two-factor, etc.
      Progressive authentication.
      Biometrics (e.g., Apple's fingerprint button).
    Be careful in combining!

--How to _store_ passwords?

  The server must be able to verify passwords.

  Naive plan: store plaintext passwords.
  Problem: if an adversary compromises the server, it gets the full list
  of passwords.

  Hashing: store a table of (username, H(password)) pairs.
  Can still check a password: hash the supplied string and compare with
  the table. But now if the adversary gets the table, it doesn't get the
  passwords (because the hash function is assumed to be hard to invert).

  Problem 1: the password space is quite small.
    The top 5000 password values account for 20% of users: the
    distribution is heavily skewed towards common passwords chosen by
    many users.
    Yahoo password study: rule-of-thumb passwords have 10-20 bits of
    entropy. Roughly, a password is equivalent to 10 random bits, so an
    attacker needs to try ~2^10 combinations to find the password.
    [See refs at the end for some interesting case studies.]

  Problem 2: hash functions are optimized for performance -- this *helps*
  the adversary!
    E.g., a laptop can do ~2M SHA-1 operations per second. Even with a
    reasonable password (20 bits of entropy), that cracks one account per
    second.

  Response: expensive key derivation (e.g., PBKDF2 or bcrypt): replace the
  hash with a much more expensive hash function.
    Key-derivation functions have an adjustable cost: they can be made
    arbitrarily slow. E.g., can make one hash cost 1 second -- O(1M) times
    slower than SHA-1.
    Internally, they often perform repeated hashing using a slow hash.
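The scheme above can be sketched with Python's standard library, which
ships PBKDF2 as hashlib.pbkdf2_hmac. This is a minimal illustration, not a
prescription from the paper: the function names, the 200,000-iteration
count, and the 16-byte salt are assumptions chosen for the example (PBKDF2
requires a salt as input; salting is discussed below). The iteration count
is the adjustable cost knob: raising it slows down each of the attacker's
guesses by the same factor.

```python
import hashlib
import hmac
import os

# Illustrative cost parameter: tune upward until one hash takes roughly
# your time budget (e.g., ~1 second per hash on the server's hardware).
ITERATIONS = 200_000

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store alongside the username."""
    salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt,
                                 ITERATIONS)
    return salt, digest    # the salt is stored in plaintext on the server

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the slow hash with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt,
                                    ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Note that the server stores (username, salt, digest), never the password
itself, and that two users with the same password get different digests
because each gets a different salt.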
  Problem: the adversary can build "rainbow tables".
    A table of password-to-hash mappings: expensive to compute, but helps
    efficiently invert hashes afterwards.
    Only need to build this rainbow table for a dictionary of common
    passwords.
    Roughly: a 1-second expensive hash -> 1M seconds ~ 10 days to hash the
    one million most common passwords. After that, can very quickly crack
    common passwords in any password DB.

  Better response: _salting_.
    Input some additional randomness into the password hash: H(salt, pw).
    Where does the salt value come from? It is stored on the server in
    plaintext.
    Why is this better if the adversary compromises the salt too?
      The adversary cannot build rainbow tables ahead of time.
    Choose a long random salt.
    Choose a fresh salt each time the user changes their password.

References:

Full tech report:
  http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-817.pdf
http://www.cl.cam.ac.uk/~jcb82/doc/B12-IEEESP-analyzing_70M_anonymized_passwords.pdf
http://arstechnica.com/security/2013/10/how-the-bible-and-youtube-are-fueling-the-next-frontier-of-password-cracking/
http://cynosureprime.blogspot.com/2015/09/how-we-cracked-millions-of-ashley.html

---------------------------------------------------------------------------

Acknowledgment: MIT 6.858 staff