Network security: Context
==============
* What is SSL/TLS?
  --> SSL was the predecessor to TLS. SSL is now officially deprecated.
* TLS:
  --> Transport Layer Security.
  --> Send encrypted data over TCP from yourself to a specific recipient while ensuring:
      ** Confidentiality: No one else can see your data.
      ** Integrity: No one can tamper with the data and fool the recipient into accepting wrong data.
      ** Authentication: Ensures you are really talking to a specific recipient (e.g., Google) and not an imposter.
      ** These properties are independent: in theory you can have one without the others; in practice they go together.
* Crypto primitives
  --> PubEncrypt(publickey, cleartext) -> ciphertext
  --> PubDecrypt(privatekey, ciphertext) -> cleartext
  --> SymEncrypt(key, cleartext) -> ciphertext
  --> SymDecrypt(key, ciphertext) -> cleartext
  --> Sign(privatekey, message) -> signature
  --> Verify(publickey, signature) -> message
      (Sign and Verify can also use the same shared key; in that case the "signature" is called a message authentication code.)
  --> Certificate: entity name, public key, signature by an endorser, endorser name.
* TLS:
  --> Runs within TCP.
  --> Port 443.
  --> Authentication: The server presents a certificate signed by a certificate authority. That certificate authority is endorsed by another CA, and so on recursively up to a root CA.
  --> Confidentiality: Could use PubEncrypt and PubDecrypt, but that's slow. Instead, symmetric keys are created during the TLS handshake, and all data transfer uses the symmetric keys.
  --> Integrity: Use Sign with the symmetric key (i.e., a MAC) to add a signature to the encrypted message. Use Verify at the receiver to check that the encrypted message has not been tampered with.

Heartbleed: Context
==============
* Major vulnerability in OpenSSL, which implements the SSL/TLS protocols.
* Why is this important? Almost everyone uses OpenSSL:
  --> The Apache and Nginx web servers. (Apache + Nginx together served 66% of active web sites as of 2014; unclear whether all of them were vulnerable, though.)
  --> MySQL, MongoDB, Redis, etc.
  --> Companies: Google and Akamai. (Microsoft uses IIS, which wasn't affected by this bug.)
  --> Non-web use cases:
      ** Tor clients and relays
      ** Bitcoin clients
      ** Chat servers
      ** Email servers
      ** Android devices
      ** Wireless (Extensible Authentication Protocol)

The bug itself
================
* Read remote memory on either a vulnerable client or a vulnerable server.
* Really simple bug.
  --> Specify a payload length in the heartbeat request that is larger than the amount of data actually sent in the request.
  --> Allows you to read up to ~64 KB of remote memory per request (the payload length field is 16 bits).
  --> The fix was quite simple: check that the amount of data received matches the length field. (A simplified sketch follows this section; https://www.seancassidy.me/diagnosis-of-the-openssl-heartbleed-bug.html explains the bug and the fix quite well using the actual OpenSSL code.)
  --> The heartbeat extension is a very rarely used part of TLS, which is probably what hid the bug for so long.
* But what's the problem with reading remote memory?
  --> You can read *anything*.
  --> Passwords, usernames, private keys, any other sensitive information.
  --> Literally anything that's close to that particular buffer on the heap.
* How bad can it really be?
  --> Cloudflare created a challenge to crowdsource this question:
      https://blog.cloudflare.com/answering-the-critical-question-can-you-get-private-ssl-keys-using-heartbleed/
  --> And learned that their private keys could be extracted by repeatedly scanning the memory dumped by Heartbleed for strings whose length matches the key length.
  --> The attack amounts to random guessing but is quite effective.
      (https://www.lightbluetouchpaper.org/2014/04/25/heartbleed-and-rsa-private-keys/)
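To make the missing length check concrete, here is a minimal C sketch of the bug pattern, assuming a hypothetical parsed heartbeat-request struct. It is illustrative only, not the actual OpenSSL code (see the seancassidy.me post above for the real code and patch):

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical parsed heartbeat request: data_len is the number of
     * payload bytes actually received; payload_length is the length the
     * peer *claims* its payload has (attacker-controlled, 16 bits). */
    struct heartbeat_request {
        unsigned char *payload;        /* points into the received record */
        size_t data_len;               /* bytes actually present */
        unsigned short payload_length; /* length field from the request */
    };

    /* VULNERABLE: copies payload_length bytes even if fewer bytes were
     * received, so memcpy reads past the request buffer into adjacent
     * heap memory, and up to ~64 KB of it is echoed back to the peer. */
    unsigned char *heartbeat_response_vulnerable(const struct heartbeat_request *req)
    {
        unsigned char *resp = malloc(req->payload_length);
        if (resp == NULL)
            return NULL;
        memcpy(resp, req->payload, req->payload_length); /* over-read */
        return resp;
    }

    /* FIXED: discard requests whose claimed length exceeds the data that
     * actually arrived; this mirrors the spirit of the fix of checking
     * the length field against the record's real size before copying. */
    unsigned char *heartbeat_response_fixed(const struct heartbeat_request *req)
    {
        if (req->payload_length > req->data_len)
            return NULL; /* bogus length: silently drop the request */
        unsigned char *resp = malloc(req->payload_length);
        if (resp == NULL)
            return NULL;
        memcpy(resp, req->payload, req->payload_length);
        return resp;
    }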
The paper's scanning methodology
=================
* Started scanning about 48 hours after disclosure.
* Used a tool called ZMap to scan open hosts throughout the Internet.
* The ZMap scan used heartbeat requests with payloads of size 0.
  --> The DTLS RFC states that you should reject such a request.
  --> But Heartbleed-vulnerable hosts respond (with only padding), while patched hosts don't.
  --> In theory, it's possible that this zero-byte-request behavior is symptomatic of some other bug.
  --> In practice, it's quite likely that it's triggering the same Heartbleed bug.
  --> Why do this instead of requesting 65 KB of data? To avoid actually exploiting the vulnerability.
  --> (In general, detecting the presence of such vulnerabilities while staying within legal and ethical boundaries is quite an interesting technical challenge in and of itself.)
* False negatives
  --> Bug in their tool: a timeout was labeled as immune.
  --> In reality, it should have been labeled as unknown, and that data should not have been included in the analysis.
  --> Why couldn't they fix the bug and rerun the experiment?
  --> Because they would have lost the old data, which was the most valuable to the phenomenon being measured.
  --> General problem with measurement-based research:
      ** Need to collect enough redundant data to ensure you can diagnose bugs and salvage data after the fact---instead of throwing it all out.
      ** Unlike building a system and evaluating it, you can't just rerun the evaluation: the phenomenon may have disappeared.
      ** In general, measurement research takes a lot of care to design experiments carefully...
      ** ... and careful analysis after the fact.
  --> Net impact of the false negatives: underestimated the number of vulnerable hosts.

Heartbleed's impact (as per the paper)
=================
* Alexa Top 100
  --> All of the Alexa Top 100 were patched before they began scanning.
  --> Al-Bassam completed a vulnerability scan of the Alexa Top 10K 22 hours after disclosure; it found Yahoo, Stack Overflow, Flickr, OkCupid, and a few more still vulnerable. (Table 3.)
  --> Summary: Combining various sources, they estimate 44 of the Alexa Top 100 were vulnerable before Heartbleed was discovered.
* Alexa Top 1M
  --> Used their own scan to determine the number of vulnerable servers (the arithmetic is sketched after this section):
      ** Upper bound: all Apache/Nginx servers among the HTTPS servers they found (91% of 60% = 55%).
      ** Lower bound: all vulnerable servers supporting TLS 1.1 and 1.2 prior to the Heartbleed disclosure (72.7% of 32.6% = 23.7%).
         (Why TLS 1.1 and 1.2? Support for those versions arrived in the same OpenSSL release that introduced the Heartbleed bug. Assume that no one bothered to rebuild without DTLS because no one really even knew about DTLS before Heartbleed!)
  --> Summary: 24 to 55% of the Top 1M was vulnerable before disclosure.
* Some operators (Google, Akamai) disabled Heartbleed before public disclosure.
* Internet-wide: many vulnerable hosts (most concentrated in 10 ASes: hosting companies or cloud providers).
* Other vulnerable devices: printers, VoIP systems, video conferencing systems, network-attached storage, firewalls.
* Beyond HTTPS:
  --> Tor
  --> Mail servers
  --> Bitcoin
  --> Android
* Overall summary: Pretty bad! A good fraction of the Internet was vulnerable.
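As a sanity check on the ranges above, here is a tiny C snippet that reproduces the upper/lower-bound arithmetic (the variable names are mine; the percentages are the ones quoted above):

    #include <stdio.h>

    int main(void)
    {
        /* Shares quoted above, expressed as fractions of the Alexa Top 1M. */
        double https_share        = 0.60;  /* sites serving HTTPS */
        double apache_nginx_share = 0.91;  /* of those, running Apache/Nginx */
        double tls11_12_share     = 0.326; /* sites supporting TLS 1.1/1.2 */
        double vulnerable_share   = 0.727; /* of those, observed vulnerable */

        /* Upper bound: assume every Apache/Nginx HTTPS site ran a vulnerable
         * OpenSSL. Lower bound: count only sites that supported TLS 1.1/1.2
         * (i.e., ran a recent-enough OpenSSL) and actually tested vulnerable. */
        printf("upper bound: %.1f%%\n", 100 * apache_nginx_share * https_share);  /* prints 54.6 */
        printf("lower bound: %.1f%%\n", 100 * vulnerable_share * tls11_12_share); /* prints 23.7 */
        return 0;
    }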
Heartbleed's aftermath (the patching)
===================
* Again, popular web sites did quite well at patching: only 5 of the Top 100 were vulnerable in Al-Bassam's scan.
* In their own scan, the only web site that remained unpatched was at rank 689.
* Internet-wide: a sharp drop when one AS patched many computers in a short period of time.
* But patching plateaued after two weeks: 3% of the Alexa Top 1 Million were still vulnerable 2 months after disclosure.
  --> Basically, the more popular the web site, the more security-conscious it seems to be.

Certificate revocations
===================
* Private keys could have been leaked (and the Cloudflare challenge shows it can happen).
* Need to replace certificates, revoke the old ones, and change the private keys.
* This is quite concerning:
  --> 2 days after disclosure, they found a certain number of sites vulnerable.
  --> Only 10.1% had replaced their certificates even a month later, even though 73% had patched their systems.
  --> Only 19.1% of the sites that replaced certificates revoked their old one.
  --> Only 14.1% of the sites that replaced certificates changed their private key (replacing a certificate without changing the private key doesn't give you any protection). (Unclear how they determined this 14.1% number.)
* Revocation costs network bandwidth, and many CAs were reluctant to do it immediately.
* Importance of forward secrecy: a private-key compromise can compromise old conversations as well.
  --> Even after Heartbleed, only 44% of connections used forward secrecy.
  --> Heartbleed doesn't seem to have changed things.

Attacks
===================
* Used passive taps at LBNL, ICSI, etc. to see whether someone was scanning for Heartbleed.
* No clear evidence of scanning for Heartbleed before disclosure.
* Quite a bit of activity after disclosure (Figure 8): 5948 connection attempts.
* But no clear temporal pattern.

Notification
===================
* Emailed operators (from WHOIS records).
* Ran a controlled experiment by emailing one set of operators and holding back on emailing the others.
* Notifications seem to have had a positive impact on patching.

Discussion
==================
* "Using HTTPS introduced vulnerabilities that were more severe than using HTTP."
* We don't have support for mass revocation.
* How did the bug even arise in the first place? To me, this is the most interesting follow-up question.
  --> It turns out OpenSSL is massively understaffed and underfunded (http://veridicalsystems.com/blog/of-money-responsibility-and-pride/).
  --> According to Wikipedia: "The OpenSSL project management team consists of four people, and the entire development group consists of 11 members, out of which 10 are volunteers. There is only one full-time employee, Stephen Henson, the lead developer."
  --> This despite half of the Internet (by some measures) relying on it.
  --> Things seem to be changing now. (https://arstechnica.com/information-technology/2014/04/tech-giants-chastened-by-heartbleed-finally-agree-to-fund-openssl/)
  --> Other open-source projects like Linux have had much more support.
* Balancing advance disclosure vs. premature leaks: whom do you notify?

Postscript
==================
* Great example of a network measurement paper. IMC is a prominent venue for such work.
* Very timely (starting a project 2 days after the disclosure is as good as it gets).
* Incidentally, the whole paper was probably initiated, executed, and submitted in a month:
  --> Heartbleed was disclosed on April 6; the IMC deadline was May 7!
* There was an entire session on Heartbleed and SSL at IMC 2014.
* The paper combines measurements from a variety of sources over time.
* It actually follows through with notifications to vulnerable hosts as well.
* And it is careful about assumptions, providing ranges wherever it isn't sure (e.g., the false negative rate, the number of vulnerable hosts in the Alexa Top 1M at the time of disclosure, etc.)