Class 16
CS 480-008
31 March 2016

On the board
------------
1. Last time

2. Haven
    Intro
    Drawbridge
    Haven design
    Discussion

3. Some crypto concepts
    intro
    public key crypto
    Diffie-Hellman key exchange

---------------------------------------------------------------------------

1. Last time

    --defending against untrusted OSes
    --SGX

    Note that ARM processors also have security extensions: "TrustZone"
    technology. Supposedly, iOS (the OS on Apple iPhones) makes use of this.
    The protection offered looks more coarse-grained than what SGX provides:

        Processor is either in trusted mode, or not.
        Memory isn't encrypted by the processor.
        The "normal" OS has less work to do, in terms of managing resources.

    --review memory: does the division of labor between OS, enclave, and
      processor leak info? (yes: the OS sees faulting virtual page numbers)

2. Haven

    A. Intro

    Goal: "Our objective is to run existing server applications in the
    cloud with a level of trust and security roughly equivalent to a user
    operating their own hardware in a locked cage at a colocation facility."

    More specifically, the authors' goal is to execute *unmodified Windows
    applications* on a cloud platform, under the following

    Threat model:
        --System admins control cloud software
        --Remote attackers may control cloud software
        --OS may launch "Iago" attacks
            May pass arbitrary values to Haven
            May interrupt execution of Haven
        --Intel and the hardware are trusted:
            SGX design and fab is correct
            Intel's private key isn't compromised

    Unmodified Windows applications means that applications need an
    environment:
        send packets
        store files
        ...
        all the services of an operating system

    What's the challenge?
        --running binaries unmodified (so there need to be OS-like services
          somewhere), but
        --OS is untrusted (so the OS has to be assumed to misbehave)

    Response: SGX + Drawbridge (2x)

    B. Drawbridge

    What is this? An old idea with a new take on it: OS services in user
    space.
    High-level goal: isolate apps from each other as if they were running
    on different virtual machines, but without the overhead of virtual
    machines.

    Notice: the goal here is to protect the platform from the untrusted
    application (which is the usual goal of isolation, but the opposite of
    the starting goal for the Haven authors).

    Advantage of this approach versus virtual machines? (lighter weight)

    Advantage of this approach versus the isolation provided by normal
    process boundaries?
        Independent evolution of host OS and libOS
        Ability to migrate application state
        Stronger isolation; for example, different apps see different
        file systems

    How do they do it?

           win32, win8
            \       /
             \libOS/
              \   /
               \ /
               DABI
        --------------------
              Host OS

    libOS is a (very large) library that exposes the entire Windows
    interface to applications, but is implemented in terms of a much
    smaller set of primitives:

        DABI = Drawbridge ABI
        ABI = Application Binary Interface

    Notice that if the DABI equals "the x86 instruction set", then this
    picture becomes a representation of virtual machines, and the
    implementer of the DABI (the "Host OS") is a conventional VMM
    (= Virtual Machine Monitor).

    The small interface protects the host OS from the application.

    C. Haven design

    See figure 2

    Shield module
        implements API inside enclave
        interacts with host OS using a narrow, untrusted API
        untrusted API is a subset of Drawbridge's API (see figure 3)
            [the upcall that they remove is InitializeProcess]

    Untrusted runtime
        tunnels between shield in enclave and host kernel
        also used for bootstrap

    Host kernel contains SGX driver and Drawbridge host
        Drawbridge host implements the narrow API using OS calls
        Untrusted runtime calls SGX driver in host
            this driver is what calls ECREATE

    NOTICE: the Drawbridge ABI shows up *twice*:
        once to provide OS services, inside the enclave, in terms of
          something validated (libOS on shield)
        once to force calls through a narrow interface, where they can be
          validated (shield on top of untrusted runtime on top of host OS)
    NOTE: untrusted API/ABI is a subset of Drawbridge's ABI

    Shield services

    Virtual memory
        Enclave starts at 0 (to handle null pointer dereferences by app,
        libOS)
            otherwise: VA 0 is outside the enclave. The OS could create an
            association VA=0 --> PA=x, where x is a valid physical page.
            This would mean that normal NULL pointer dereferences no longer
            crash, and are now under the control of the OS.
        Tracking memory pages used by application/libOS
        Adding/removing memory pages from enclave
            Verifies that changes have been made correctly
            Never allows host to pick virtual-memory addresses
            Doesn't allow application and libOS to allocate pages outside
            of enclave

    Threads
        user-level scheduling (e.g., so that existing bugs in mutexes
        aren't triggered)
        multiplexes threads on a fixed number of threads created at startup
        Allocate a fixed number of TCSs at start

    *Question*: How do they handle the problem of an untrusted OS mounting
    Iago attacks?

    Two responses; both are arguably unsatisfying:
        (1) libOS is in the process. The assumption is that it was
            validated a priori. Okay.
        (2) The shield validates responses from the untrusted host system.
            We don't really find out how this is done, or get formal
            guarantees.
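    It is at least easy to sketch what such validation might look like for
    the memory rules above. A hypothetical Python sketch (the names
    `Enclave` and `alloc_pages` are invented for illustration; this is not
    Haven's code): the shield picks the virtual address, the host must echo
    it back exactly, and every page must lie inside the enclave.

    ```python
    PAGE = 4096

    class Enclave:
        def __init__(self, base, size):
            self.base = base          # first VA inside the enclave
            self.limit = base + size  # first VA past the enclave
            self.mapped = set()       # page-aligned VAs the shield knows about

        def alloc_pages(self, requested_va, npages, host_reply_va):
            """Validate the host's reply; never trust host-chosen addresses."""
            if requested_va % PAGE != 0:
                raise ValueError("unaligned request")
            # The shield picks the VA; the host must echo it back exactly.
            if host_reply_va != requested_va:
                raise RuntimeError("host tried to choose the address (Iago?)")
            for i in range(npages):
                va = requested_va + i * PAGE
                # Pages must lie entirely inside the enclave...
                if not (self.base <= va and va + PAGE <= self.limit):
                    raise RuntimeError("page outside enclave")
                # ...and must not silently replace pages already in use.
                if va in self.mapped:
                    raise RuntimeError("page already mapped")
            for i in range(npages):
                self.mapped.add(requested_va + i * PAGE)
    ```

    Even this toy version shows why the host is left with no useful
    discretion: every field of its reply is either dictated by the shield
    or rejected.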
    This could be a source of vulnerabilities... "The interface at the
    enclave boundary must allow the shield to verify the correctness of
    all operations.": can such verification be done for all calls in all
    cases? The authors have designed the interface to make validation
    simpler, but without formal guarantees, it's hard to know whether
    there are vulnerabilities.

    Running binaries unmodified
        main approach is to be careful about exceptions
        also have to emulate instructions
        [no magic here, just lots of work]

    Attestation
        more or less what you'd expect

    D. Discussion

    Q. What happens if the host OS cheats?
    A. Shield panics. (Their goal was never availability.)

    Q. Can Haven run unmodified apps?
        No: fork. Maybe a minor problem on Windows?
        No: Cannot map an enclave page at several virtual addresses.
        The authors needed to modify applications.

    Q. How do we know this is secure?

    Q. Should the authors do fuzz testing on the untrusted interface?

    Q. What is the relationship between this mechanism and privilege
       separation?

3. Some crypto concepts

    For millennia, cryptography was *symmetric-key*: two communicating
    parties share a key, and want the content of their messages to be
    hidden from any eavesdroppers.

        msg --> [Enc alg(k)] --> ciphertext --> [Dec alg(k)] --> msg

    The Enc and Dec algorithms are parameterized by a *secret key*, k.
    The key k is known only to the two parties.

    Aside: _Kerckhoffs's principle_: "The cipher method must not be
    required to be secret, and it must be able to fall into the hands of
    the enemy without inconvenience." (The quotation is from the textbook
    _Modern Cryptography_, by Katz and Lindell, 2008.)

    Idea: you **must** assume that your algorithms are public (and in
    fact, you should publicize them). The only secret should be the key
    itself.

    Why?
        --easier to secure a key than an algorithm (details of an
          algorithm can be leaked by the person who wrote the code or
          designed the system; details of the algorithm can also be
          inferred by reverse engineering; etc.)
        --if the key is compromised, it's easier to update a key than to
          come up with a new algorithm
        --easier to standardize (should every pair of people have to use
          different algorithms?!)

    What's the alternative? "Security through obscurity." This is not
    really an option. It's a terrible practice, and often leads to
    embarrassment and difficulty for organizations (like commercial
    companies who try to design their own crypto algorithms in secret):
        --public designs are reviewed; scrutiny leads to strength
        --if there are flaws, it's better if the ethical hackers find them
          first
        --trying to keep the algorithm private is inherently less secure,
          because more people are exposed to it (see above)

    Public key cryptography

    Motivation:
        With symmetric-key crypto, every pair of users needs a key...
        ...and needs to coordinate out of band to share the key.
        This is completely unworkable in open systems, like today's
        Internet, where users do not meet physically, and users are
        constantly in touch with servers they have never associated with
        previously.

    Other motivating points:
        What if an authority holds key pairs, or maybe a key per user,
        and uses the per-user key to set up shared keys on demand?
        Problem: the existence of an authority runs counter to the
        original goal ("only the two endpoints should be able to see the
        original message").

        It's much better if each user has to maintain a *single* secret,
        instead of a number of secrets proportional to the number of
        other users and services in the world.

        [Governments are willing to maintain a per-party secret; anecdote
        from Lindell about how US embassies decrypted communication.]

    So, the question: how can two parties, who have never met, communicate
    over *public channels*, to send each other *private* messages, with
    zero coordination?

    Until 1974, everyone assumed that it was impossible to do encryption
    without the two entities first sharing a secret. Merkle, in 1974,
    proposed public key cryptography. His paper was not understood at the
    time.
    Though his paper was submitted in 1974, it appeared only in 1978.
    Diffie and Hellman's 1976 paper "New Directions in Cryptography" also
    proposed public key cryptography, and was a thunderbolt. For that
    work, Diffie and Hellman were awarded this year's Turing Award.
    However, many (most?) experts believe that an injustice was done, and
    that the award should have included Merkle. (There is a famous picture
    of Diffie, Hellman, and Merkle; when the New York Times reported on
    the Turing Award, they cropped Merkle out of the photo!)

    It can be done! You have two keys: a private key and a public key.
    Everyone knows your public key. Knowing the public key allows others
    to encrypt messages to you but not to decrypt messages. Also, knowing
    the public key allows others to check that you signed a document but
    not to forge your signatures.

    [We're ignoring the issue of how users learn each other's public keys.
    This is THE thorny issue with public key cryptography, and --
    arguably -- it still hasn't been adequately solved. But it may be
    solved yet: Max Krohn's Keybase is promising.]

    We'll look at a few primitives:
        key exchange
        public-key encryption
        digital signatures

    Diffie-Hellman key exchange

    To avoid digressing into math, we are going to make some simplifying
    and wrong assumptions. We'll flag them below.

    Assume that we are working over the positive integers mod a prime p:
    {1,2,...,p-1}. g is selected to be an element that will *generate*
    this group, in the sense that:

        g^1 mod p, g^2 mod p, ..., g^{p-1} mod p

    will be a permutation of the integers {1,2,...,p-1}.

    Example: if p=5, our domain is {1,2,3,4}, and 2 is a generator,
    because:
        2^1 mod 5 = 2
        2^2 mod 5 = 4
        2^3 mod 5 = 3
        2^4 mod 5 = 1
    [And for all k >= 1, 2^k mod 5 = 2^{k mod 4} mod 5]

    p=5 is far too small for security. However, when p is of reasonable
    size (thousands of bits), the discrete log problem is assumed to be
    hard.
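    The p=5 example can be checked mechanically. A small sketch, using
    Python's built-in three-argument `pow` for modular exponentiation
    (the function name `is_generator` is ours):

    ```python
    # g is a generator mod p iff its powers g^1 .. g^{p-1} hit every
    # element of {1, ..., p-1} exactly once (i.e., are a permutation).
    def is_generator(g, p):
        powers = {pow(g, k, p) for k in range(1, p)}
        return powers == set(range(1, p))

    print(is_generator(2, 5))  # True: 2^1..2^4 mod 5 = 2, 4, 3, 1
    print(is_generator(4, 5))  # False: powers of 4 mod 5 only hit {4, 1}
    ```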
    To say that the discrete log problem is hard means, roughly, that if
    you are given g and g^x (mod p), you cannot compute x. In other words,
    you cannot take logs.

    FALSE ASSUMPTION: the protocol below is secure if we are working over
    a group where the discrete log problem is hard.

    In reality, for the protocol below to hold up, we need to be working
    over a different domain. We want a particular kind of subgroup of the
    domain above. (Technically, we want a subgroup where the Decisional
    Diffie-Hellman problem is assumed to be hard. "DDH is hard" is a
    stronger assumption, and hence of slightly lower quality, than
    "discrete log is hard". However, many existing cryptosystems are built
    on the assumption that DDH is hard, and it's considered to be a good
    quality (safe) cryptographic assumption.)

    Now, assume that Alice and Bob have agreed on p and a generator g of
    the integers {1,2,...,p-1} [this is again a simplification because the
    true domain won't be {1,...,p-1}]:

        Alice                       Bob
        -----                       -----
        choose x                    choose y
        public key: g^x             public key: g^y
        private key: x              private key: y

        a=g^x   ------------->
                <-------------      b=g^y

        k1=b^x (mod p)              k2=a^y (mod p)

    Notice:
        k1 = (g^y)^x = g^{yx} mod p
        k2 = (g^x)^y = g^{xy} mod p
        k1 = k2

    So Alice and Bob have a shared secret, based only on knowledge of each
    other's public keys (!!!)
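    The exchange above runs directly as code. A sketch with toy
    parameters (p=23, g=5 is a textbook choice; a real deployment uses a
    prime thousands of bits long from a standardized group):

    ```python
    import secrets

    p = 23   # toy prime; far too small for security
    g = 5    # a generator of {1, ..., 22} mod 23

    # Each party picks a private exponent and publishes g^exponent mod p.
    x = secrets.randbelow(p - 2) + 1   # Alice's private key
    y = secrets.randbelow(p - 2) + 1   # Bob's private key

    a = pow(g, x, p)   # Alice's public key, sent to Bob in the clear
    b = pow(g, y, p)   # Bob's public key, sent to Alice in the clear

    k1 = pow(b, x, p)  # Alice computes (g^y)^x mod p
    k2 = pow(a, y, p)  # Bob computes   (g^x)^y mod p

    assert k1 == k2    # both hold g^{xy} mod p: the shared secret
    ```

    An eavesdropper sees p, g, a, and b; recovering x or y from them is
    exactly the discrete log problem.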
---------------------------------------------------------------------------

References:

SGX overview:
    http://www.pdl.cmu.edu/SDI/2013/slides/rozas-SGX.pdf
SGX instructions overview:
    https://software.intel.com/sites/default/files/article/413936/hasp-2013-innovative-instructions-and-software-model-for-isolated-execution.pdf
SGX hardware:
    https://jbeekman.nl/blog/2015/10/sgx-hardware-first-look/
SGX security discussion:
    https://www.nccgroup.trust/uk/about-us/newsroom-and-events/blogs/2015/january/intel-software-guard-extensions-sgx-a-researchers-primer/
Iago attacks:
    https://cseweb.ucsd.edu/~hovav/dist/iago.pdf
Drawbridge:
    http://research.microsoft.com/pubs/141071/asplos2011-drawbridge.pdf
    http://research.microsoft.com/pubs/180156/bascule_eurosys13.pdf

Acknowledgments:
    MIT's 6.858 staff

Crypto:
    A good text is _Modern Cryptography_, by Jonathan Katz and Yehuda
    Lindell.