Class 28
CS 372H
4 May 2010

1. Trusting trust
2. Unix security
   --access control model
   --consequences

---------------------------------------------------------------------------

0. quick clarifications/corrections

    --Two-phase commit. Correction: if acknowledgments go back from
    workers to coordinator at the end of phase 2, then the coordinator
    does not have to keep the log of that entry forever. 
    
    --x86 does have a no-execute bit even on 32-bit architectures, but you
    have to use the PAE (Physical Address Extension) page table format
    to get it

    --I probably was unfair to Java last time. JVM way less likely to have
    bugs than a hand-coded networked server in C.

1. "Reflections on Trusting Trust"

    --first of all, the word "trust" is a bad thing in computer security
    (this is an unfortunate linguistic fact). to "trust" something means
    to "assume it correct", which in turn means "to be in trouble if the
    assumption is false". so "removing trust" is a good thing. so is
    making things "trust*worthy*" (that is, worthy of being assumed
    correct), but it is in general hard to make any given component
    truly trustworthy.

	--you'll notice that the "trusted computing" initiatives from
	various powerful interests subvert this word. who exactly is
	being trusted? "trusted computing" sounds great linguistically,
	but "trusted computing platforms" do not actually mean what they
	sound like (per Keith's lecture)

   
    A. background on this paper by Thompson:

	Thompson gave this lecture/paper after winning the Turing Award,
	which is considered by many to be the Nobel prize of Computer
	Science. The paper is stunning but takes patience and a few
	readings to understand. We're going to reproduce most of what
	Thompson did but will follow the ideas in an order different
	from the one in the paper. 

    B. adding a feature to a language

	What if we wanted to add a feature to Java? Say that the Java
	compiler is written in C, in a file called java.c. So we modify
	java.c and rerun the C compiler on java.c, producing a new Java
	compiler that understands the new feature of Java.

	Now what if we wanted to add a feature to the C programming
	language? Well, for all practical purposes, the C compiler is
	also written in C, and let's assume that the entire C compiler is
	implemented in a file called "cc.c". To add a feature to the C
	programming language, we need to modify cc.c, and run the old C
	compiler on the new file. At this point, we have a new C
	compiler that understands a new feature of the language.

    C. Context

	As sometimes happens today, earlier versions of Unix were distributed with
	a full set of binaries and source for those binaries. This source included
	source for the compiler, the OS, the program 'login', etc.

	Because the system was quite small, it was common for people to make a
	change in one source file and then to recompile all of their programs. So
	program recompilation happened a lot.

    D. In this environment, how could someone as clever as Thompson add
    a bug to the login program without leaving a trace in the source
    files?

	**GOAL**: no source file hints at the bug, and yet the bug
	persists across all recompilations

	[DRAW PICTURES]

    E. How can we write a self-reproducing program in pseudocode?

	X = "Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X."
	Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X.

	Run that, and you get itself.

	Here is a simpler version:

	    Print this followed by its quotation: "Print this followed
	    by its quotation".

	    [BTW, the GNU General Public License (GPL) works like this.
	    It's a self-replicating license! The license specifies that to
	    distribute a copy of the code, you have to release the source
	    **with the license itself included**. The license talks
	    about itself, just as a self-replicating program must.]

	Here's a self-replicating program in Scheme:

	    ((lambda (x) `(,x ',x))
	    '(lambda (x) `(,x ',x)))
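
	The same trick in Python, as a rough sketch. The string s plays
	the role of X above, and %r does the job of "Output quote mark.
	Output X. Output quote mark." The names QUINE_SOURCE and
	run_and_capture are made up for this illustration; the quine
	itself is just the two lines stored in QUINE_SOURCE.

```python
import contextlib
import io

# The two-line self-reproducing program, kept as text so we can run it
# and compare its output with its own source.
QUINE_SOURCE = (
    "s = 's = %r\\nprint(s %% s)'\n"
    "print(s % s)\n"
)


def run_and_capture(source: str) -> str:
    """Execute Python source text and return whatever it prints."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})
    return buf.getvalue()


output = run_and_capture(QUINE_SOURCE)
print(output, end="")                                 # the program's own text
print("reproduces itself:", output == QUINE_SOURCE)   # True
```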

    F. Result:

	some well-known string in the C compiler source now compiles to
	binary that does the following:

	    <<
	    (1) if compiling "login", insert a bug
	    
	    (2) if you see the well-known string in the C compiler
	    itself, replace it with everything between << >>
	    >>
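
	A toy model of this result, as a sketch under heavy assumptions:
	here a "binary" is just Python source text, "running" a binary
	means exec-ing it, and the honest compiler copies its input
	unchanged. None of this is Thompson's actual code; every name
	(load, compile_source, check_password, the "letmein" password)
	is hypothetical. The point is only to show steps (1) and (2)
	working together with the self-reproduction trick from part E.

```python
def load(binary: str) -> dict:
    """'Run' a toy binary and return the names it defines."""
    namespace: dict = {}
    exec(binary, namespace)
    return namespace


# Clean sources: nothing in them hints at a bug.
CLEAN_COMPILER_SRC = '''def compile_source(src):
    return src  # honest toy compiler: output equals input
'''
CLEAN_LOGIN_SRC = '''def check_password(user, password):
    return password == "s3cret"
'''

# The trojaned compiler binary. SELF holds the binary's own text,
# using the same quotation trick as a self-reproducing program.
TROJAN_TEMPLATE = '''SELF = %r
BACKDOOR = "\\n_real = check_password\\ncheck_password = lambda u, p: p == 'letmein' or _real(u, p)\\n"
def compile_source(src):
    if "def check_password" in src:   # (1) compiling login: insert a bug
        return src + BACKDOOR
    if "def compile_source" in src:   # (2) compiling the compiler: reinsert all of this
        return SELF %% SELF
    return src                        # everything else compiles honestly
'''
EVIL_COMPILER_BINARY = TROJAN_TEMPLATE % TROJAN_TEMPLATE

# Recompile the compiler twice from *clean* source; the trojan survives.
gen1 = load(EVIL_COMPILER_BINARY)["compile_source"](CLEAN_COMPILER_SRC)
gen2 = load(gen1)["compile_source"](CLEAN_COMPILER_SRC)

# Compile clean login source with the latest compiler binary.
login_binary = load(gen2)["compile_source"](CLEAN_LOGIN_SRC)
check_password = load(login_binary)["check_password"]

print(gen2 == EVIL_COMPILER_BINARY)        # True: trojan persisted
print(check_password("alice", "letmein"))  # True: backdoor password accepted
print(check_password("alice", "wrong"))    # False: other bad passwords rejected
```

	Note that neither CLEAN_COMPILER_SRC nor CLEAN_LOGIN_SRC mentions
	the backdoor; it lives only in the compiler binary.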

    G. What's the moral of the story?

2. Protection and security in Unix

    A. Intro

	--why security in the OS?
	    managing resources for different applications
	    must protect different users from one another
		file system
		memory
		processes

	--access control matrix (conceptual construct)

		       File 1       File 2       File 3 ....
	    User 1      r/w
	    User 2                    r
	    User 3                                  w

	--don't maintain matrix manually or entirely

	    --use tools such as groups or role-based access control

	    individuals		roles		    resources
		x                     r1                    a
		y                     r2                    b
		z                                           c

		    [lots of diagonal lines between but not across columns]
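
	Both constructs can be sketched as small lookup tables. This is
	only an illustration: the matrix entries mirror the table above,
	but the user-to-role and role-to-resource edges are invented
	(the notes only say there are "lots of diagonal lines"), and the
	function names are hypothetical.

```python
# Access control matrix as a sparse nested dict: user -> resource -> ops.
ACCESS_MATRIX = {
    "user1": {"file1": {"r", "w"}},
    "user2": {"file2": {"r"}},
    "user3": {"file3": {"w"}},
}


def matrix_allows(user: str, resource: str, op: str) -> bool:
    """Look up one cell of the matrix; missing cells mean 'no access'."""
    return op in ACCESS_MATRIX.get(user, {}).get(resource, set())


# Role-based version: two small maps replace one big matrix.
# These particular edges are invented for illustration.
USER_ROLES = {"x": {"r1"}, "y": {"r1", "r2"}, "z": {"r2"}}
ROLE_PERMS = {
    "r1": {"a": {"r"}, "b": {"r", "w"}},
    "r2": {"c": {"r"}},
}


def role_allows(user: str, resource: str, op: str) -> bool:
    """A user may do op if any of the user's roles grants it."""
    return any(
        op in ROLE_PERMS.get(role, {}).get(resource, set())
        for role in USER_ROLES.get(user, set())
    )


print(matrix_allows("user1", "file1", "w"))  # True
print(role_allows("x", "c", "r"))            # False: x holds only r1
```

	With roles, granting a new hire access means adding one entry to
	USER_ROLES rather than filling in a whole matrix row.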
     
    B. Example: Unix

	--designed for specific purpose: multiple users time-sharing a
	Unix system 
	
	(i) process has a user ID and one or more group IDs

	--system stores with each file
	    --user who owns the file and group that file is in
	    --permissions for user, anyone in the file's group, and other
	    --can see this by doing "ls -l":
		    rw- rw- r--  <owner>  <group>   ....  <fname>

		    basic operations: read, write, execute [rwx] 

	    --which permissions apply?
		--if process's UID matches <owner>, then user permissions
		--if process has GID matching <group>, then group permissions
		--otherwise, 'other'.
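
	    The three-way rule above can be sketched as follows. The
	    function names are hypothetical, and the sketch ignores the
	    special treatment of uid 0 described below. Note that the
	    first matching class wins: an owner whose user bits are empty
	    gets nothing, even if 'other' would be allowed.

```python
import stat


def applicable_bits(proc_uid, proc_gids, owner, group, mode):
    """Pick the one rwx triple (as a number 0-7) that applies."""
    if proc_uid == owner:
        return (mode >> 6) & 0o7   # user permissions
    if group in proc_gids:
        return (mode >> 3) & 0o7   # group permissions
    return mode & 0o7              # 'other'


def may(op, proc_uid, proc_gids, owner, group, mode):
    """op is one of 'r', 'w', 'x'."""
    needed = {"r": 4, "w": 2, "x": 1}[op]
    return bool(applicable_bits(proc_uid, proc_gids, owner, group, mode) & needed)


mode = 0o664                                     # rw- rw- r--
print(stat.filemode(stat.S_IFREG | mode))        # -rw-rw-r--
print(may("w", 1000, {1000}, 1000, 1000, mode))  # True: process is the owner
print(may("w", 1001, {2000}, 1000, 1000, mode))  # False: falls through to 'other'
```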

	--directory has permissions too

	    --"read" means, roughly, "can list files in this directory"
	    --"execute" means, roughly, "can use pathnames in this
	    directory"

	(ii) uid 0, called root, treated specially by the kernel as administrator

	    --has all permissions

		--how do uids get set?
			setuid() call
			uid=0 can change to any other uid
			other uids cannot change to a different uid with setuid(), to a
			first approximation
		
		--Unix login
		    runs as root
		    checks username, password against /etc/shadow
		    calls setuid(user), runs user's shell
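
		A toy model of the uid rule and the login sequence, as a
		sketch only. ToyProcess, toy_login, digest, and the sample
		account are all hypothetical; a real login reads salted
		hashes from /etc/shadow and calls the kernel's setuid(),
		whereas this model just tracks a number in memory.

```python
import hashlib
import hmac


class ToyProcess:
    """Models only the uid rule: uid 0 may become anyone, others may not."""

    def __init__(self, uid: int):
        self.uid = uid

    def setuid(self, new_uid: int) -> None:
        if self.uid != 0 and new_uid != self.uid:
            raise PermissionError("only uid 0 may switch to another uid")
        self.uid = new_uid


def digest(password: str) -> str:
    # Stand-in for the salted hashes a real /etc/shadow stores.
    return hashlib.sha256(password.encode()).hexdigest()


def toy_login(proc, username, password, shadow, uids) -> bool:
    """Check the password, then drop from root to the user's uid."""
    expected = shadow.get(username)
    if expected is None or not hmac.compare_digest(expected, digest(password)):
        return False
    proc.setuid(uids[username])
    return True   # a real login would now run the user's shell


shadow = {"alice": digest("wonderland")}
uids = {"alice": 1000}
proc = ToyProcess(uid=0)                       # login starts as root
print(toy_login(proc, "alice", "wonderland", shadow, uids), proc.uid)  # True 1000
```

		Once the process is uid 1000, a later proc.setuid(0) raises
		PermissionError, which is why the shell cannot climb back up.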

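		--(not a great model for login: way too much privileged
		code)
		    example bug: rlogind runs "login [-f] username", passing -f
			if the user is already authenticated
		    attacker asks to log in as user "-froot"; login parses that
			as the flag -f plus username root, skips the password
			check, and the attacker gets root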
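
		The "-froot" problem is an argument-injection bug, and a
		toy version fits in a few lines. This is a sketch, not the
		real rlogind or login code: the function names and the
		parsing rule (getopt-style, where "-froot" means "-f root")
		are assumptions made for illustration, as is the fix shown.

```python
def toy_login_parse(argv):
    """Parse 'login [-f] username' -> (username, skip_password_check)."""
    args = argv[1:]
    if args[0].startswith("-f"):
        # Like getopt, accept both "-f root" and "-froot".
        username = args[0][2:] or args[1]
        return username, True
    return args[0], False


def vulnerable_rlogind(requested_user):
    # Passes the remote user's string straight onto the command line.
    return ["login", requested_user]


def safer_rlogind(requested_user):
    # One possible fix: refuse names that could be read as options.
    if requested_user.startswith("-"):
        raise ValueError("username may not start with '-'")
    return ["login", requested_user]


print(toy_login_parse(vulnerable_rlogind("alice")))   # ('alice', False)
print(toy_login_parse(vulnerable_rlogind("-froot")))  # ('root', True)
```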


	(iii) there are certain operations that only root can do

	    Examples:
		--binding to ports less than 1024
		--changing the current process's user or group ID (which is why
		the login program needs to run as root)
		--mounting or unmounting file systems
		--opening raw sockets (so you can do something like ping remote machines,
		for example)
		--setting the clock
		--halting or rebooting the machine

	    [Problem: you need to have all of root's permissions to do
	    *any* of these things (yes, a process can drop privileges, but
	    we'll see that's easier said than done). That is a *lot* of
	    privilege for any one action, which is problematic for
	    reasons we'll see next time.]

	(iv) some implicit privileges (next time)