Class 28
CS 372H
4 May 2010

1. Trusting trust
2. Unix security
    --access control model
    --consequences

---------------------------------------------------------------------------

0. quick clarifications/corrections

    --Two-phase commit. Correction: if acknowledgments go back from
    workers to coordinator at the end of phase 2, then the coordinator
    does not have to keep the log of that entry forever.

    --x86 does have a no-execute bit even on 32-bit architectures, but
    you have to use the extended (PAE) page-table format.

    --I probably was unfair to Java last time. The JVM is way less likely
    to have bugs than a hand-coded networked server in C.

1. "Reflections on Trusting Trust"

    --first of all, the word "trust" is a bad thing in computer security
    (this is an unfortunate linguistic fact). to "trust" something means
    to "assume it correct", which in turn means "to be in trouble if the
    assumption is false". so "removing trust" is a good thing. so is
    making things "trust*worthy*" (that is, worthy of being assumed
    correct), but it is in general hard to make any given component
    truly trustworthy.

    --you'll notice that the "trusted computing" initiatives from various
    powerful interests subvert this word. who exactly is being trusted?
    "trusted computing" sounds great linguistically, but "trusted
    computing platforms" do not actually mean what they sound like (per
    Keith's lecture)

    A. background on this paper by Thompson:

        Thompson gave this lecture/paper after winning the Turing Award,
        which is considered by many to be the Nobel prize of Computer
        Science.

        The paper is stunning but takes patience and a few readings to
        understand.

        We're going to reproduce most of what Thompson did but will
        follow the ideas in an order different from the one in the paper.

    B. adding a feature to a language

        What if we wanted to add a feature to Java? say that the java
        compiler is written in C, in a file called java.c.
        So we modify java.c, and rerun the C compiler on java.c,
        producing a new Java compiler that understands the new feature
        of Java.

        Now what if we wanted to add a feature to the C programming
        language? Well, for all practical purposes, the C compiler is
        also written in C, and let's assume that the entire C compiler
        is implemented in a file called "cc.c". To add a feature to the
        C programming language, we need to modify cc.c, and run the old
        C compiler on the new file. At this point, we have a new C
        compiler that understands a new feature of the language.

    C. Context

        As sometimes happens today, earlier versions of Unix were
        distributed with a full set of binaries and source for those
        binaries. This source included source for the compiler, the OS,
        the program 'login', etc. Because the system was quite small, it
        was common for people to make a change in one source file and
        then to recompile all of their programs. So program
        recompilation happened a lot.

    D. In this environment, how could someone as clever as Thompson add
    a bug to the login program without leaving a trace in the source
    files?

        **GOAL: have no source files hint at the bug, and meanwhile, the
        bug will persist across all recompilations

        [DRAW PICTURES]

    E. How can we write a self-reproducing program in pseudocode?

        X = "Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X."
        Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X.

        Run that, and you get itself.

        Here is a simpler version:

        Print this followed by its quotation: "Print this followed by
        its quotation".

        [BTW, the GNU General Public License works like this. It's a
        self-replicating license! the license specifies that to make a
        copy of the code, you have to release the source **with the
        license itself included**. the license talks about itself, just
        as a self-replicating program must.]

        Here's a self-replicating program in Scheme:

        ((lambda (x) `(,x ',x)) '(lambda (x) `(,x ',x)))

    F.
        Result: some well-known string in the C compiler source now
        compiles to binary that does the following:

        <<
            (1) if compiling "login", insert a bug
            (2) if you see the well-known string in the C compiler
                itself, replace it with everything between << >>
        >>

    G. What's the moral of the story?

2. Protection and security in Unix

    A. Intro

        --why security in the OS?
            managing resources for different applications
            must protect different users from one another
                file system
                memory
                processes

        --access control matrix (conceptual construct)

                      File 1    File 2    File 3   ....
            User 1    r/w
            User 2    r
            User 3    w

        --don't maintain matrix manually or entirely

        --use tools such as groups or role-based access control

            individuals    roles    resources
                 x           r1         a
                 y           r2         b
                 z                      c

            [lots of diagonal lines between but not across columns]

    B. Example: Unix

        --designed for specific purpose: multiple users time-sharing a
        Unix system

        (i) process has a user ID and one or more group IDs

        --system stores with each file
            --user who owns the file and group that file is in
            --permissions for user, anyone in the file's group, and other

        --can see this by doing "ls -l":
            rw- rw- r-- ....
            basic operations: read, write, execute [rwx]

        --which permissions apply?
            --if process's UID matches the file's owner, then user
              permissions
            --if process has a GID matching the file's group, then group
              permissions
            --otherwise, 'other'.

        --directory has permissions too
            --"read" means, roughly, "can list files in this directory"
            --"execute" means, roughly, "can use pathnames in this
              directory"

        (ii) uid 0, called root, treated specially by the kernel as
        administrator

        --has all permissions

        --how do uid's get set?
            setuid() call
            uid=0 can change to any other uid
            other uid's cannot invoke setuid(), to a first approximation

        --Unix login runs as root
            checks username, password against /etc/shadow
            calls setuid(user), runs user's shell

        --(not a great model for login: way too much privileged code)
            rlogind runs "login [-f] username", -f if already
            authenticated
            user asks to log in as the user named "-froot"; login parses
            this as the -f flag plus username "root", so it skips the
            password check and the user gets root

        (iii) there are certain operations that only root can do

        Examples:
            --binding to ports less than 1024
            --change current process's user or group ID
            --mount or unmount file systems
            --opening raw sockets (so you can do something like ping
              remote machines, for example)
            --set clock
            --halt or reboot machine
            --change UIDs (so login program needs to run as root)

        [Problem: you need to have all of root's permissions to do *any*
        of these things (yes, can drop privileges, but we'll see that's
        easier said than done). That is a *lot* of privilege to do any
        one action. That is problematic for reasons we'll see next
        time.]

        (iv) some implicit privileges (next time)
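
Appendix sketch for 2.B(i)-(ii): the "which permissions apply?" rule can be written out as a few lines of Python. This is only an illustrative model of the rule as stated in these notes, not kernel code. The function names (applicable_class, allowed) and the representation of permissions as strings like "rw-" are assumptions made up for this sketch. The point it tries to show is that exactly one class (user, group, or other) is chosen, and only that class's bits are consulted.

```python
def applicable_class(proc_uid, proc_gids, file_uid, file_gid):
    """Pick which permission triple applies: 'user', 'group', or 'other'.

    Hypothetical helper for illustration; the first matching rule wins.
    """
    if proc_uid == file_uid:
        return "user"
    if file_gid in proc_gids:
        return "group"
    return "other"


def allowed(op, proc_uid, proc_gids, file_uid, file_gid, perms):
    """Return True if the process may perform op ('r', 'w', or 'x').

    perms maps each class to a 3-character string such as 'rw-'.
    """
    # The notes say uid 0 "has all permissions"; this models that
    # to a first approximation (real kernels have a few exceptions).
    if proc_uid == 0:
        return True
    cls = applicable_class(proc_uid, proc_gids, file_uid, file_gid)
    return op in perms[cls]


# The "rw- rw- r--" example from the ls -l line in the notes,
# for a file owned by uid 1001 and group 100 (made-up IDs).
perms = {"user": "rw-", "group": "rw-", "other": "r--"}
print(allowed("w", 1001, {100}, 1001, 100, perms))  # owner: True
print(allowed("w", 1003, {200}, 1001, 100, perms))  # other: False
```

One consequence of the first-match rule: if the owner's bits are "---" but the group's are "rw-", the owner is denied even when the owner also belongs to the file's group, because only the user class is checked.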