Class 11
CS 480-008
3 March 2016

On the board
------------

1. Last time
2. setuid, contd.
3. Confused Deputy
4. SFI

---------------------------------------------------------------------------

1. Last time

--OKWS
--setuid
--escaping from a chroot jail: assume a jailed root process. if the
  process then does:

        mkdir foo
        chroot(foo)      # replaces the original chroot
        cd ../../../../

  [key point: chroots don't "stack". once the current working directory
  is outside the jail, the "../../../../" refers to the real top-level
  "/". if this is done from within the original jail, the "../../../"
  eventually "ends" at the *jail's* "/".]

2. setuid, continued

--Have to be EXTREMELY careful when writing setuid code

    [Imagine that you are writing or installing a setuid program. In
    other words, you are willing to let that program run with *your*
    privileges.]

--Fundamental reason you need to be careful: very difficult to
  anticipate exactly how and in what environment the code will be
  run....yet when it runs, it runs with *your* privileges (where "your"
  equals "root" or "whoever set the setuid bit on some code they wrote")

--NOTE: Attackers can run setuid programs any time (no need to wait for
  root to run a vulnerable job)

--FURTHER NOTE: Attacker controls many aspects of the program's
  environment

--Example attacks (a small defensive sketch follows this list):

    --Close fd 2 before execing the program
        --now, the setuid program opens a file, for example the
          password file.... (normally it would get fd=3, but because
          fd 2 was closed, the file will be given fd 2).
        --then, the program later encounters an error and does
          fprintf(stderr, "some error msg").
        --result: the error message goes into the password file!
        --fix: for setuid programs, the kernel will open dummy fds for
          0, 1, 2 if they are not already open

    --Set the maximum file size to zero (if, say, the setuid program
      changes a password and then rebuilds some password database)

    --IFS hackery
        --say there's a program called "preserve" installed as setuid
          root; this setup was used by old editors to make a backup of
          files in a root-accessible directory
        --preserve runs system("/bin/mail")  [to mail error messages or
          whatever]
        --"system" uses the shell to parse its argument
        --now if IFS (the shell's internal field separator) is set to
          "/" before running vi, then we get the following:
            --vi forks and execs /usr/lib/preserve (IFS is still set to
              '/', but the exec() call doesn't care)
            --preserve invokes system("/bin/mail"), but this causes the
              shell to parse the argument as:

                    bin mail

            --which means that if the attacker locally had a malicious
              binary called 'bin', then that binary could do:

                    cd /homes/mydir/bin
                    cp /bin/sh ./sh
                    chown root sh   # succeeds because 'bin' runs as root
                    chmod 4755 sh   # succeeds because 'bin' runs as root

            --result: there is now a copy of the shell that is owned by
              root and setuid root
            --anyone who runs this shell has a root shell on the machine
        --fix: the shell has to ignore IFS if the shell is running as
          root or if EUID != UID.
            --also, modern shells refuse to run scripts that are setuid.
              (the issue there is a bit different, but it is related.)
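--Both environment attacks above (the closed-fd-2 trick and the IFS
  trick) come down to the same discipline: a setuid program should
  sanitize whatever it inherits from its caller before doing anything
  privileged. Here is a minimal, hedged C sketch of that idea; the
  sanitize() helper and its exact policy are invented for illustration,
  not taken from any particular program.

        /* Hedged sketch (illustrative only): a setuid program
           sanitizing the environment inherited from a possibly
           hostile caller, before any privileged work. */
        #include <fcntl.h>
        #include <stdlib.h>
        #include <unistd.h>

        static void sanitize(void) {
            int fd;
            /* Make sure fds 0,1,2 exist, so a later open() of a
               sensitive file cannot be handed one of the standard
               descriptors (the fprintf(stderr, ...) trick above). */
            while ((fd = open("/dev/null", O_RDWR)) >= 0 && fd <= 2)
                ;               /* keep going until we are past fd 2 */
            if (fd < 0)
                abort();        /* can't even open /dev/null: give up */
            close(fd);          /* fd > 2: only the side effect mattered */

            /* Drop attacker-controlled variables such as IFS rather
               than trusting every shell to ignore them.  (A real
               program might clear the whole environment.) */
            unsetenv("IFS");
        }

        int main(void) {
            sanitize();
            /* ... privileged work follows ... */
            return 0;
        }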
--ptrace() examples

    [we covered these in cs202...placing them in the notes for
    reference...please ask the course staff if you have questions.]

    What is ptrace()?
        It's a system call, used by debuggers
        Lets one process modify another's memory

    Attack 1:
        --attacker ptraces a setuid program P
        --P runs with root's privileges
        --attacker now manipulates P's memory and gets arbitrary
          privilege on the machine. this is bad.
        --fix: don't let a process ptrace a more privileged process or
          another user's process
            for example, require the tracer to match the real and
            effective UID of the target

    Attack 2:
        --attacker owns two unprivileged processes A and B.
        --A ptraces B. so far, so good. no violation of the rule above.
        --then B execs a setuid program (for example, "su whatever"),
          which causes B's privilege to be raised. (recall that the "su"
          program is setuid root. "su pat" becomes user "pat" if someone
          types pat's password.)
        --now A is connected to a process that is running with root's
          privileges. A can use B's elevated privileges. this is bad.
        --fix: disable/ignore the setuid bit on the binary if the
          ptraced target calls exec()
            --> but let root ptrace anyone

    Attack 3:
        --now, say that A and B are unprivileged processes owned by the
          attacker
        --say A ptraces B. so far, so good. no violation of the prior
          two rules.
        --say A executes "su attacker", i.e., it's su'ing to its own
          identity
        --while su is superuser, B execs "su root"
            --remember, the attacker programmed B, and can arrange for
              it to exec the command just above.
            --BUT! remembering the ptrace rules above, the ptrace
              succeeds. the reason is that at this moment A is the
              superuser, so there is no problem with B's exec() honoring
              the setuid bit.
        --attacker types the password into A, gets a shell, and now this
          (unprivileged) shell is attached to "su root".
        --the attacker can now manipulate B's memory (disable password
          checks, etc.) so that the "su root" succeeds, at which point A
          is connected to a root shell

    See the Linux Yama module for a partial defense:
        https://www.kernel.org/doc/Documentation/security/Yama.txt

--Another issue:
    --consider a setuid process that does a bunch of privileged things
      and then drops privileges to become the user again
    --should be okay, right?
    *****--NO. once the process has seen something privileged and then
    become the user again, it can be ptrace()d, and the confidential
    things it has seen (or the privileged resources that it holds) can
    be manipulated by an unprivileged user.*****
    --fix? privilege separation

--Another class of attacks: TOCTTOU (time-of-check-to-time-of-use)
    --very common attack
    --say there's a setuid program that needs to log events to a file:

        fd = open(logfile, O_CREAT|O_WRONLY|O_TRUNC, 0666);

    --what's the problem?
    --the setuid program shouldn't be able to write to a file that the
      user can't. thus:

        if (access(logfile, W_OK) < 0)
            return ERROR;
        fd = open(logfile, ....)

      should fix it, right? NO!
    --here's the attack:

        setuid program                      attacker
        --------------                      --------
                                            creat("/tmp/X");
        check access("/tmp/X") --> OK
                                            unlink("/tmp/X");
                                            symlink("/etc/passwd", "/tmp/X");
        open("/tmp/X")

    --from the BSD man pages:
        "access() is a potential security hole and should never be
        used."
    --the issue is that the access check and the open are non-atomic
      (one alternative fix is sketched at the end of this section)
    --to fix this, you have to jump through hoops: manually traverse
      paths. check at each point that the dir you're in is the one you
      expected to be in (i.e., that you didn't accidentally follow a
      symbolic link). maybe check that the path hasn't been modified.
      also need to use APIs that are relative to an opened directory fd:
        -- openat, renameat, unlinkat, symlinkat, faccessat
        -- fchown, fchownat, fchmod, fchmodat, fstat, fstatat
    --or wrap groups of operations in OS transactions
        --Microsoft supports transactions on Windows Vista and newer
          https://msdn.microsoft.com/en-us/library/windows/desktop/bb986748%28v=vs.85%29.aspx
        --research papers:
          http://www.fsl.cs.sunysb.edu/docs/valor/valor_fast2009.pdf
          http://www.sigops.org/sosp/sosp09/papers/porter-sosp09.pdf
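--The *at() APIs and transactions aside, a common partial fix is to
  make the kernel itself do the permission check as the real user,
  atomically with the open. Here is a hedged sketch, assuming a setuid
  program with standard POSIX saved-set-user-ID semantics; the helper
  name is invented for illustration.

        /* Hedged sketch: instead of the racy access()+open() pair,
           temporarily drop the effective uid to the real uid so the
           kernel performs the permission check as the invoking user,
           atomically with the open(). */
        #include <fcntl.h>
        #include <unistd.h>

        int open_log_as_user(const char *logfile) { /* hypothetical */
            uid_t euid = geteuid();
            int fd;

            if (seteuid(getuid()) < 0)      /* become the real user   */
                return -1;
            fd = open(logfile, O_CREAT | O_WRONLY | O_TRUNC, 0666);
            if (seteuid(euid) < 0) {        /* restore privileges     */
                if (fd >= 0)
                    close(fd);
                return -1;
            }
            return fd;
        }

  This works because the open() itself runs with the user's effective
  uid, so there is no separate "check" step for the attacker to race
  against.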
3. Confused Deputy

What is going on?
    Their system had a Fortran compiler, /sysx/fort
    They wanted the compiler to record usage statistics, but where?
        --created a special statistics file, /sysx/stat
        --gave /sysx/fort the "home files license". what's this?
            --sounds like all files are named (dir)name
            --the home files license allows (dir)prog to write (dir)xxx
              for any xxx
            --vaguely analogous to Unix setuid

What goes wrong?
    --/sysx/fort /mw/code.f -o /sysx/bill
    --attacker overwrote the billing file /sysx/bill ((SYSX)BILL in the
      paper's notation) with compiler output

Is this a bug in the compiler?
    The original compiler code to output a file was correct
    It didn't anticipate running with extra authority

What's the confused deputy?
    The compiler is running on behalf of two principals:
        - the user principal (to open the user's files, which lets the
          user run it and write output and debugging files anywhere the
          user can write)
        - the compiler principal (to open the compiler's files, via the
          home files license, which lets it write to /sysx/stat)
    Not clear which principal's privileges should be used at any given
    time
        every file access by the compiler was now more privileged, even
        though only one of them should have been (opening /sysx/stat)
        correct operation would require looking at all places where the
        compiler opens files *and* checking permissions separately
    Claim: it's hard to add permissions without creating security
    holes, or to later restrict them without breaking legitimate
    programs

Issue shows up in many places:
    --a Web server has the privilege to read, say, config files; the
      Web server also acts on client requests; so we have to be careful
      to prevent the server from responding to a client request for a
      config file!
    --a Web browser: running untrusted JavaScript vs. saving PDF files
      for the user.

Two ways to think about the confused deputy problem:

    1. Ambient authority: privileges that are automatically used by a
       process are the problem here. It's risky for privilege to be
       used implicitly!

    2. Complex permission checks: it's hard for programs to implement
       their own permission checks correctly.

    Capabilities (and, more generally, better access control schemes)
    can help here. We defer discussion of such schemes to next week or
    after spring break.
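    As a preview of the capability framing, here is a hedged, purely
    illustrative C sketch of how the billing-file attack could be
    avoided by handing the deputy an already-open file descriptor
    instead of a file name. Everything here (the compile() stand-in,
    the argument layout) is invented for this example; the point is
    only that the output file is opened with the caller's authority, so
    the deputy's extra privilege never comes into play.

        /* Hedged sketch (hypothetical API): a capability-style deputy.
           The caller opens the output file with its *own* authority
           and hands the deputy an open descriptor; the deputy never
           turns a name into a file itself, so its extra privilege
           cannot be borrowed by the caller. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        /* Stand-in for the compiler: writes output only through the
           fd it was given (a capability), never by pathname. */
        static int compile(const char *src_path, int out_fd) {
            dprintf(out_fd, "compiled output of %s\n", src_path);
            return 0;
        }

        int main(int argc, char **argv) {
            if (argc != 3) {
                fprintf(stderr, "usage: %s <src> <out>\n", argv[0]);
                return 1;
            }
            /* The *user* opens the output file; this fails unless the
               user is actually allowed to write it, so naming the
               billing file here would simply be refused. */
            int out_fd = open(argv[2], O_WRONLY|O_CREAT|O_TRUNC, 0644);
            if (out_fd < 0) {
                perror("open output");
                return 1;
            }
            return compile(argv[1], out_fd);
        }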
4. SFI

How did we get here? Overall goal: isolate (but still make use of)
untrusted code. [Example motivation: Web browser and plugins]

    --Approach 1: run the code on another machine. Impractical in many
      cases, particularly if the untrusted code needs to interact
      closely with the trusted code (this is a coarse-grained isolation
      technique, often with high overhead).

    --Approach 2: run the untrusted code inside a virtual machine. This
      is easier than finding a separate piece of hardware, and it's a
      good isolation technique, but we still have the issue that it is
      a coarse-grained isolation barrier, often with high overhead.

    --Approach 3: run the untrusted code in a separate process.
        --But, even though processes cannot share memory, they can
          still share files (and therefore are not isolated). So:

    --Approach 4: use process isolation together with OS access control
      techniques (as in OKWS, where there was a reliance on chroot
      jails; we'll see other such techniques later when we discuss
      capabilities). these techniques are sometimes known as "OS
      sandboxing".
        --But, these are often awkward, often require root to even
          invoke (you don't want the user's Web browser running as
          root), and differ from platform to platform

    --Approach 5 (today): another form of isolation and containment

A. Intro

Problem: how to use untrusted code (an "extension") in a trusted
program?

Intellectual challenge:
    --need to let the code run but somehow control it, without using
      the normal approach to such control, namely the protections
      enforced by hardware (specifically page tables, which create the
      isolated memory view).

Examples
    --Use an untrusted, legacy JPEG codec in a Web browser
        [draw picture of JPEG decoder in browser memory]
    --Use an untrusted driver in the kernel (e.g., a loadable kernel
      module)

Now a classic paper
    --lots of people doing variants of this
    --most obvious example: the Web, and plugins
    --here's some context:
        --SFI (this paper) --> PittSFIeld (SFI for x86) --> Google
          NativeClient

        PittSFIeld reference:
            [http://people.csail.mit.edu/smcc/projects/pittsfield/]
            Evaluating SFI for a CISC Architecture. Stephen McCamant
            and Greg Morrisett. In 15th USENIX Security Symposium
            (Vancouver, BC, Canada), August 2-4, 2006.

        NativeClient reference:
            [http://research.google.com/pubs/archive/34913.pdf]
            Native Client: A Sandbox for Portable, Untrusted x86 Native
            Code. Bennet Yee, David Sehr, Gregory Dardyk, Brad Chen,
            Robert Muth, Tavis Ormandy, Shiki Okasaka, Neha Narula,
            Nicholas Fullagar. 30th IEEE Symposium on Security &
            Privacy, May 17-20, 2009.

    --other related work
        --Xax (by Jon Howell et al.) and NativeClient have the
          identical motivation but different realizations
        --vx32: a different approach to sandboxing but similar
          motivation to the works above
            http://pdos.csail.mit.edu/papers/vx32:usenix08.pdf

    --The paper we're discussing interestingly missed the Web....
      ...but is still a classic paper
        --at the time, the audience may have been more worried about
          performance...
        --but now, everyone thinks, "yeah, of course we want that", and
          performance may be secondary. (maybe.)

[defn: the "trusted" part of a system is the part assumed to be
correct.]

What bad things can the extension do?
    --Write trusted data or code
    --Read private data from trusted code's memory
    --Execute privileged instructions
    --Call trusted functions with bad arguments
    --Jump to an unexpected trusted location (e.g., not the start of a
      function)
    --Contain exploitable security flaws that allow others to do the
      above

What is it probably OK for an extension to do?
    --Read/write its own memory
    --Execute its own code
    --Call *particular* functions in trusted code

Possible solutions/approaches:
    --Run the extension in its own address space with minimal
      privileges. Rely on hardware and operating system protection
      mechanisms: page tables (for process boundaries) + OS jails (for
      file access).
    --Restrict the language in which the extension is written:
        --Packet filter language. The language is limited in its
          capabilities, and it is easy to guarantee "safe" execution.
        --Type-safe language. The language runtime and compiler
          guarantee "safe" execution.
    --What are the disadvantages of the above?
        --own address space: expensive context switches
        --safe language: restricts the language that people can use, so
          it doesn't work for lots of common and legacy code
    --Software-based sandboxing
        --the big idea: isolate code *within* the same address space,
          thereby achieving isolation without context switches
        --these ideas are now everywhere.

Elements:
    --Sandboxer. A compiler or binary-rewriter sandboxes all unsafe
      instructions in an extension by inserting additional
      instructions. For example, every indirect store is preceded by a
      few instructions that compute and check the target of the store
      at runtime.
    --Verifier.
        --When the extension is loaded into the trusted program, the
          verifier checks whether the extension is appropriately
          sandboxed (e.g., all direct stores/calls refer to the
          extension's memory, all indirect stores/calls are sandboxed,
          no privileged instructions).
        --If not, the extension is rejected.
        --If yes, the extension is loaded, and can run.
        --If the extension runs, the sandboxing of unsafe instructions
          ensures that those instructions are used in a safe way.
        --The verifier must be trusted, but the sandboxer doesn't have
          to be. Meaning: the compiler can screw up, and as long as the
          verifier is correct, it doesn't matter.
        --We can do without the verifier if the host can establish that
          the extension has been sandboxed by a trusted sandboxer.

--You can think of sandboxing as a software version of the memory
  protection you get with page tables or segments.

B. Details of SFI

--Implemented for RISC processors
    --this simplifies SFI. why? (two reasons)
        --because every instruction is 32 bits wide, and because one
          can only jump/call to 32-bit-aligned targets, one can
          investigate every possible entry point
        --big register set; makes it easy to use "dedicated registers".

--Approach:

        0x101f..........f   code
        0x1010..........0
        0x100f..........f   data
        0x100000000000000
                            Firefox/Chrome/etc.

        Code Seg ID = 0x101
        Data Seg ID = 0x100

    --[draw the picture above.] the key point is that because the
      verifier enforces that the sandboxed code always uses particular
      upper bits, the code is confined to a "sandboxed" region of
      memory.

    --why are there two segments, one for code and the other for data,
      heap, and stack?
        --answer: to prevent the extension from modifying its own code

    --verifier can check:
        --that direct calls/jumps and stores refer to addresses inside
          the segment (since such instructions have the address
          embedded within them)
        --PC-relative branches
        --no privileged instructions
        --The verifier probably has a table of legal call targets that
          lie in trusted code.

    --hard part: indirect jumps/calls and stores (i.e., jump to the
      contents of this register, or store to the address given by this
      register) [on x86, this is an instruction like "jmp *%ecx"]

    --first cut: verifier enforces segment matching:

        Suppose the original unsafe instruction is:

            STORE R1, R0        (i.e., write R1 to Mem[R0])

        Here's how we could sandbox the STORE:

            Ra <- R0
            Rb <- Ra >> Rc      // Rb = segment ID of target
            CMP Rb, Rd          // Rd holds extension's data segment ID
            BNE fault           // Rd != Rb, branch to error handling
            STORE R1, Ra

        --uh-oh. what if the extension jumps directly to the STORE,
          bypassing the check instructions? solution:
            --Ra, Rc, and Rd are _dedicated_ (they cannot be used by
              the extension code.)
            --now the verifier must check that the extension doesn't
              use the dedicated registers.
            --the extension CAN jump to the store, but (1) it can't set
              Ra and (2) the sandboxing code always leaves a legal
              segment address in Ra.
            --thus, the extension can store only to its own memory.
        --how many registers and check instructions does this cost?
            --4 instructions
            --5 registers (though the paper says 4)
                --Rc (shift amount)
                --Rd (segment ID for data)
                --Rx (segment ID for code)
                --Ra (address in data segment)
                --Ry (address in code segment)

    --second cut: verifier enforces sandboxing:

            Ra <- R0 & Re       // zero out segment ID in Ra
            Ra <- Ra | Rf       // replace with the valid segment ID
            STORE R1, Ra

        --This code forces the segment part of the address bits to be
          correct. It doesn't catch illegal addresses; it just ensures
          that illegal addresses stay within the segment, harming the
          extension but no other code.
        --how many registers and check instructions?
            --2 instructions
            --5 registers again (and this time the paper says 5)
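        --to make the and/or arithmetic concrete, here is a hedged C
          sketch of the second-cut address sandboxing. the constants
          are chosen to match the picture above (segment ID in the
          upper bits of a 64-bit address, data segment ID 0x100); the
          actual paper targets MIPS/Alpha and picks its own layout, so
          treat this purely as an illustration.

            /* Hedged illustration of the "second cut": force the
               segment bits of any address into the extension's data
               segment.  SEG_SHIFT and the IDs are invented to match
               the picture above. */
            #include <assert.h>
            #include <stdint.h>

            #define SEG_SHIFT   48                               /* illustrative  */
            #define DATA_SEG    ((uint64_t)0x100)                /* from picture  */
            #define OFFSET_MASK (((uint64_t)1 << SEG_SHIFT) - 1) /* plays Re      */
            #define DATA_BITS   (DATA_SEG << SEG_SHIFT)          /* plays Rf      */

            /* Wrong addresses are not detected; they are merely
               confined to the sandbox, so only the extension itself
               can be harmed. */
            static inline uint64_t sandbox_addr(uint64_t addr) {
                return (addr & OFFSET_MASK) | DATA_BITS;         /* and, then or  */
            }

            int main(void) {
                /* an address already in the data segment is unchanged... */
                assert(sandbox_addr(0x100000000001234ULL) == 0x100000000001234ULL);
                /* ...and an address aimed at the host is pulled back inside */
                assert(sandbox_addr(0x7fffdeadbeefULL) == 0x1007fffdeadbeefULL);
                return 0;
            }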
    --optimizations

        --save a sandboxing instruction for instructions of the form:

                STORE value, offset(R3)

            naive way:
                Ra <- offset + R3
                Ra <- Ra & Re
                Ra <- Ra | Rf
                STORE value, Ra

            optimization:
                Ra <- R3 & Re
                Ra <- Ra | Rf
                STORE value, offset(Ra)

            works because offset is limited to [-32KB, 32KB], so no
            matter the value of Ra, Ra+offset is guaranteed to live in
            [segment_beg - 32KB, segment_end + 32KB]. to prevent code
            from writing just before or after the segment, create guard
            zones.

        --stack pointer
            --sandbox the stack pointer (SP) only when it is explicitly
              set, not when it is used to form an address, so there is
              no need to sandbox

                STORE value, offset(SP)

            --the optimization works because it's far more common to
              *read* the SP to form an address (as above) than to *set*
              it.

    --what do they do about system calls?
        --answer: rewrite them to be an "RPC" (really a "call" into the
          trusted portion of the code) into arbitration code that
          decides whether the requested system call is acceptable

    --how can we verify that the thing has been sandboxed properly?
      (a toy sketch of this check appears at the end of these notes)
        --any time there's a modification to a dedicated register, read
          linearly downward and make sure that the register becomes
          valid again before the code branches or before another region
          starts
        --question: what does "valid" mean?
            --answer: its upper bits remain in the segment
        --basically, the algorithm just makes sure that the code blocks
          above are in effect

    --summary of properties
        --Prevents writes and calls/jumps outside the extension's
          memory.
        --Can allow direct calls to specific functions in trusted code.
        --Prevents privileged instructions.
        --Allows any write or call/jump within the extension's memory,
          so an extension can wreck itself (or be wrecked by a buffer
          overrun).

C. Discussion

    --performance
        --what types of programs will tend to have higher overheads?
          (answer: those that write to memory and jump around a lot.
          tight inner loops are not likely to cause much or any
          overhead)
        --why do they say that sandboxing increases the available
          instruction-level parallelism? (answer: the paper is not
          clear on this, but it might be that there are fewer context
          switches, so the processor can make better predictions about
          which code will execute)
        --overall, what do you think of their results? high overhead?
          low overhead?
        --all section 5.4 is saying is that, even though encapsulation
          has an overhead, there's a trade-off from avoiding context
          switches (which is abstractly called "crossing fault
          domains"). their analysis captures this trade-off and states
          the break-even point for various constants.

    --At a high level, this thing is doing in software what is really
      hardware's job

    --Can the guest read host memory?
        answer: yes
        why? (because loads aren't protected without paying a high
        price)
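Appendix: toy verifier sketch

    The "read linearly downward" check above can be written out as a
    few lines of code. The following is a hedged, heavily simplified C
    sketch over an invented instruction encoding (the real verifier
    inspects actual machine instructions and tracks several dedicated
    registers at once); it only illustrates the shape of the algorithm.

        /* Hedged sketch (big simplification): over a toy instruction
           stream, check that whenever the dedicated address register
           is written with an arbitrary value, it is re-sandboxed (its
           segment bits forced back to the data segment ID) before any
           branch occurs or the region ends. */
        #include <stdbool.h>
        #include <stddef.h>

        enum kind {
            WRITES_DEDICATED,  /* puts an unchecked value in the reg   */
            RESANDBOXES,       /* the and/or pair restoring the seg ID */
            BRANCH,            /* any jump/call: control may leave     */
            OTHER              /* everything else (doesn't touch reg)  */
        };

        struct insn { enum kind k; };

        bool verify_region(const struct insn *code, size_t n) {
            bool dedicated_valid = true;       /* valid on region entry */
            for (size_t i = 0; i < n; i++) {
                switch (code[i].k) {
                case WRITES_DEDICATED: dedicated_valid = false; break;
                case RESANDBOXES:      dedicated_valid = true;  break;
                case BRANCH:
                    if (!dedicated_valid)
                        return false;  /* could escape with a raw
                                          address in the register */
                    break;
                case OTHER:            break;
                }
            }
            return dedicated_valid;    /* must also hold at region end */
        }

    The point is that the check is purely local and linear: the
    verifier never has to understand what the extension computes, only
    that the dedicated register can never reach a branch (or the end of
    the region) while holding an unsandboxed address.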