Network security: Context
==============
* What is SSL/TLS?
  --> SSL was the predecessor to TLS. SSL is now officially deprecated.
* TLS:
  --> Transport Layer Security.
  --> Send encrypted data over TCP from yourself to a specific recipient while ensuring:
      ** Confidentiality: No one else can see your data.
      ** Integrity: No one can tamper with the data and fool the recipient into accepting wrong data.
      ** Authentication: Ensures you are really talking to a specific recipient (e.g., Google) and not an imposter.
      ** These properties are independent: in theory you can have one without the others; in practice they go together.
* Crypto primitives
  --> PubEncrypt(publickey, cleartext) -> ciphertext
  --> PubDecrypt(privatekey, ciphertext) -> cleartext
  --> SymEncrypt(key, cleartext) -> ciphertext
  --> SymDecrypt(key, ciphertext) -> cleartext
  --> Sign(privatekey, message) -> signature
  --> Verify(publickey, signature) -> message
      (Sign and Verify can also use the same shared key; in that case the "signature" is called a message authentication code.)
  --> Certificate: entity name, public key, signature by an endorser, endorser name.
* TLS:
  --> Runs within TCP.
  --> Port 443.
  --> Authentication: The server presents a certificate signed by a certificate authority. That certificate authority is endorsed by another CA, and so on recursively up to a root CA.
  --> Confidentiality: Could use PubEncrypt and PubDecrypt, but that's slow. Instead, symmetric keys are created during the TLS handshake, and all data transfer uses the symmetric keys.
  --> Integrity: Use Sign with the symmetric key (i.e., a MAC) to add a signature to the encrypted message. Use Verify at the receiver to check that the encrypted message has not been tampered with.

Heartbleed: Context
==============
* Major vulnerability in OpenSSL, which implements the SSL/TLS protocols.
* Why is this important? Almost everyone uses OpenSSL:
  --> The Apache and Nginx web servers. (Apache + Nginx together served 66% of active web sites as of 2014; unclear whether all of them were vulnerable, though.)
  --> MySQL, MongoDB, Redis, etc.
  --> Companies: Google and Akamai. (Microsoft uses IIS, which wasn't affected by this bug.)
  --> Non-web use cases:
      ** Tor clients and relays
      ** Bitcoin clients
      ** Chat servers
      ** Email servers
      ** Android devices
      ** Wireless (Extensible Authentication Protocol)

The bug itself
================
* Read remote memory on either a vulnerable client or a vulnerable server.
* Really simple bug.
  --> Specify a payload length in the heartbeat request that is larger than the amount of data actually sent in the request.
  --> Allows you to read up to ~64 KB of remote memory per request (the payload length field is 16 bits).
  --> The fix was quite simple: check that the amount of data received matches the length field. (A simplified sketch follows this section; https://www.seancassidy.me/diagnosis-of-the-openssl-heartbleed-bug.html explains the bug and the fix quite well using the actual OpenSSL code.)
  --> The heartbeat extension is a very rarely used part of TLS, which is probably what hid the bug for so long.
* But what's the problem with reading remote memory?
  --> You can read *anything*.
  --> Passwords, usernames, private keys, any other sensitive information.
  --> Literally anything that's close to that particular buffer on the heap.
* How bad can it really be?
  --> Cloudflare created a challenge to crowdsource this question:
      https://blog.cloudflare.com/answering-the-critical-question-can-you-get-private-ssl-keys-using-heartbleed/
  --> And learned that their private keys could be extracted by repeatedly scanning the memory dumped by Heartbleed for strings whose length matches the key length.
  --> The attack amounts to random guessing but is quite effective.
      (https://www.lightbluetouchpaper.org/2014/04/25/heartbleed-and-rsa-private-keys/)
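To make the missing length check concrete, here is a minimal C sketch of the bug pattern, assuming a hypothetical parsed heartbeat-request struct. It is illustrative only, not the actual OpenSSL code (see the seancassidy.me post above for the real code and patch):

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical parsed heartbeat request: data_len is the number of
     * payload bytes actually received; payload_length is the length the
     * peer *claims* its payload has (attacker-controlled, 16 bits). */
    struct heartbeat_request {
        unsigned char *payload;        /* points into the received record */
        size_t data_len;               /* bytes actually present */
        unsigned short payload_length; /* length field from the request */
    };

    /* VULNERABLE: copies payload_length bytes even if fewer bytes were
     * received, so memcpy reads past the request buffer into adjacent
     * heap memory, and up to ~64 KB of it is echoed back to the peer. */
    unsigned char *heartbeat_response_vulnerable(const struct heartbeat_request *req)
    {
        unsigned char *resp = malloc(req->payload_length);
        if (resp == NULL)
            return NULL;
        memcpy(resp, req->payload, req->payload_length); /* over-read */
        return resp;
    }

    /* FIXED: discard requests whose claimed length exceeds the data that
     * actually arrived; this mirrors the spirit of the fix of checking
     * the length field against the record's real size before copying. */
    unsigned char *heartbeat_response_fixed(const struct heartbeat_request *req)
    {
        if (req->payload_length > req->data_len)
            return NULL; /* bogus length: silently drop the request */
        unsigned char *resp = malloc(req->payload_length);
        if (resp == NULL)
            return NULL;
        memcpy(resp, req->payload, req->payload_length);
        return resp;
    }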
The paper's scanning methodology
=================
* Started scanning about 48 hours after disclosure.
* Used a tool called ZMap to scan open hosts throughout the Internet.
* The ZMap scan used heartbeat requests with payloads of size 0.
  --> The DTLS RFC states that you should reject such a request.
  --> But Heartbleed-vulnerable hosts respond (with only padding), while patched hosts don't.
  --> In theory, it's possible that this zero-byte-request behavior is symptomatic of some other bug.
  --> In practice, it's quite likely that it's triggering the same Heartbleed bug.
  --> Why do this instead of requesting 65 KB of data? To avoid actually exploiting the vulnerability.
  --> (In general, detecting the presence of such vulnerabilities while staying within legal and ethical boundaries is quite an interesting technical challenge in and of itself.)
* False negatives
  --> Bug in their tool: a timeout was labeled as immune.
  --> In reality, it should have been labeled as unknown, and that data should not have been included in the analysis.
  --> Why couldn't they fix the bug and rerun the experiment?
  --> Because they would have lost the old data, which was the most valuable to the phenomenon being measured.
  --> General problem with measurement-based research:
      ** Need to collect enough redundant data to ensure you can diagnose bugs and salvage data after the fact---instead of throwing it all out.
      ** Unlike building a system and evaluating it, you can't just rerun the evaluation: the phenomenon may have disappeared.
      ** In general, measurement research takes a lot of care to design experiments carefully...
      ** ... and careful analysis after the fact.
  --> Net impact of the false negatives: underestimated the number of vulnerable hosts.

Heartbleed's impact (as per the paper)
=================
* Alexa Top 100
  --> All of the Alexa Top 100 were patched before they began scanning.
  --> Al-Bassam completed a vulnerability scan of the Alexa Top 10K 22 hours after disclosure; it found Yahoo, Stack Overflow, Flickr, OkCupid, and a few more still vulnerable. (Table 3.)
  --> Summary: Combining various sources, they estimate 44 of the Alexa Top 100 were vulnerable before Heartbleed was discovered.
* Alexa Top 1M
  --> Used their own scan to determine the number of vulnerable servers (the arithmetic is sketched after this section):
      ** Upper bound: all Apache/Nginx servers among the HTTPS servers they found (91% of 60% = 55%).
      ** Lower bound: all vulnerable servers supporting TLS 1.1 and 1.2 prior to the Heartbleed disclosure (72.7% of 32.6% = 23.7%).
         (Why TLS 1.1 and 1.2? Support for those versions arrived in the same OpenSSL release that introduced the Heartbleed bug. Assume that no one bothered to rebuild without DTLS because no one really even knew about DTLS before Heartbleed!)
  --> Summary: 24 to 55% of the Top 1M was vulnerable before disclosure.
* Some operators (Google, Akamai) disabled Heartbleed before public disclosure.
* Internet-wide: many vulnerable hosts (most concentrated in 10 ASes: hosting companies or cloud providers).
* Other vulnerable devices: printers, VoIP systems, video conferencing systems, network-attached storage, firewalls.
* Beyond HTTPS:
  --> Tor
  --> Mail servers
  --> Bitcoin
  --> Android
* Overall summary: Pretty bad! A good fraction of the Internet was vulnerable.
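As a sanity check on the ranges above, here is a tiny C snippet that reproduces the upper/lower-bound arithmetic (the variable names are mine; the percentages are the ones quoted above):

    #include <stdio.h>

    int main(void)
    {
        /* Shares quoted above, expressed as fractions of the Alexa Top 1M. */
        double https_share        = 0.60;  /* sites serving HTTPS */
        double apache_nginx_share = 0.91;  /* of those, running Apache/Nginx */
        double tls11_12_share     = 0.326; /* sites supporting TLS 1.1/1.2 */
        double vulnerable_share   = 0.727; /* of those, observed vulnerable */

        /* Upper bound: assume every Apache/Nginx HTTPS site ran a vulnerable
         * OpenSSL. Lower bound: count only sites that supported TLS 1.1/1.2
         * (i.e., ran a recent-enough OpenSSL) and actually tested vulnerable. */
        printf("upper bound: %.1f%%\n", 100 * apache_nginx_share * https_share);  /* prints 54.6 */
        printf("lower bound: %.1f%%\n", 100 * vulnerable_share * tls11_12_share); /* prints 23.7 */
        return 0;
    }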
Heartbleed's aftermath (the patching)
===================
* Again, popular web sites did quite well at patching: only 5 of the Top 100 were vulnerable in Al-Bassam's scan.
* In their own scan, the only web site that remained unpatched was at rank 689.
* Internet-wide: a sharp drop when one AS patched many computers in a short period of time.
* But patching plateaued after two weeks: 3% of the Alexa Top 1 Million were still vulnerable 2 months after disclosure.
  --> Basically, the more popular the web site, the more security-conscious it seems to be.

Certificate revocations
===================
* Private keys could have been leaked (and the Cloudflare challenge shows it can happen).
* Need to replace certificates, revoke the old ones, and change the private keys.
* This is quite concerning:
  --> 2 days after disclosure, they found a certain number of sites vulnerable.
  --> Only 10.1% had replaced their certificates even a month later, even though 73% had patched their systems.
  --> Only 19.1% of the sites that replaced certificates revoked their old one.
  --> Only 14.1% of the sites that replaced certificates changed their private key (replacing a certificate without changing the private key doesn't give you any protection). (Unclear how they determined this 14.1% number.)
* Revocation costs network bandwidth, and many CAs were reluctant to do it immediately.
* Importance of forward secrecy: a private-key compromise can compromise old conversations as well.
  --> Even after Heartbleed, only 44% of connections used forward secrecy.
  --> Heartbleed doesn't seem to have changed things.

Attacks
===================
* Used passive taps at LBNL, ICSI, etc. to see whether someone was scanning for Heartbleed.
* No clear evidence of scanning for Heartbleed before disclosure.
* Quite a bit of activity after disclosure (Figure 8): 5948 connection attempts.
* But no clear temporal pattern.

Notification
===================
* Emailed operators (from WHOIS records).
* Ran a controlled experiment by emailing one set of operators and holding back on emailing the others.
* Notifications seem to have had a positive impact on patching.

Discussion
==================
* "Using HTTPS introduced vulnerabilities that were more severe than using HTTP."
* We don't have support for mass revocation.
* How did the bug even arise in the first place? To me, this is the most interesting follow-up question.
  --> It turns out OpenSSL is massively understaffed and underfunded (http://veridicalsystems.com/blog/of-money-responsibility-and-pride/).
  --> According to Wikipedia: "The OpenSSL project management team consists of four people, and the entire development group consists of 11 members, out of which 10 are volunteers. There is only one full-time employee, Stephen Henson, the lead developer."
  --> This despite half of the Internet (by some measures) relying on it.
  --> Things seem to be changing now. (https://arstechnica.com/information-technology/2014/04/tech-giants-chastened-by-heartbleed-finally-agree-to-fund-openssl/)
  --> Other open-source projects like Linux have had much more support.
* Balancing advance disclosure vs. premature leaks: whom do you notify?

Postscript
==================
* Great example of a network measurement paper. IMC is a prominent venue for such work.
* Very timely (starting a project 2 days after the disclosure is as good as it gets).
* Incidentally, the whole paper was probably initiated, executed, and submitted in a month:
  --> Heartbleed was disclosed on April 6; the IMC deadline was May 7!
* There was an entire session on Heartbleed and SSL at IMC 2014.
* The paper combines measurements from a variety of sources over time.
* It actually follows through with notifications to vulnerable hosts as well.
* And it is careful about assumptions, providing ranges wherever it isn't sure (e.g., the false negative rate, the number of vulnerable hosts in the Alexa Top 1M at the time of disclosure, etc.)