Class 26
CS 480-008
3 May 2016

On the board
------------

1. Last time
2. Certificates
3. HTTPS
    --plan
    --what goes wrong
4. admin notes
5. ForceHTTPS

---------------------------------------------------------------------------

1. Last time

    --Network security

    --Crypto protection for network communications

    --Last time:

        Recall strawman:
            A generates a random symmetric session key S.
            A encrypts S for PK_B, sends to B.
            Now we have secret key S shared between A and B, can encrypt and
              authenticate messages using symmetric encryption

        Discussed three attacks: replay, later decryption, impersonation.

        Clarify impersonation:

            --This is only an attack if B cared who A was in the first
            place.

            --In the event that B does care who A is, there are solutions:
            client-side certificates, etc.

    --Today: HTTPS

2. Background: what are certificates?

    --certificates are signed statements that assert a mapping between
    an entity's name (such as a legal name and/or a domain name) and its
    public key 

    --often, there is a certificate chain: one entity signs a statement,
    but how do you trust the signature? A: in the Web context, your
    browser is born trusting certain certificate authorities (CAs).
    Those authorities sign statements like, "Web site www.foo.com has
    public key 0xf120.....".

    --with certificates, in general, there can be certificate *chains*
    (one cert attests to the validity of another public key; that public
    key attests to the validity of a separate public key; and so on).

3. HTTPS

    Motivating question: how to incorporate the cryptographic protection
    of network traffic into the Web security model?

    Plan for securing web browsers: HTTPS.
      New protocol: https instead of http (e.g., https://www.paypal.com/).
      Need to protect several things:
        A. Data sent over the network.
        B. Code/data in user's browser.
        C. UI seen by the user.

      A. How to ensure data is not sniffed or tampered with on the network?
        Use TLS (a cryptographic protocol that uses certificates).
        TLS encrypts and authenticates network traffic.
        Negotiate ciphers (and other features: compression, extensions).
        Negotiation is done in clear.
        Include a MAC of all handshake messages to authenticate.

      B. How to protect data and code in the user's browser?
        Goal: connect browser security mechanisms to whatever TLS provides.
        Recall that browser has two main security mechanisms:
          Same-origin policy.
          Cookie policy (slightly different).

        Same-origin policy with HTTPS/TLS.

          TLS certificate name must match hostname in the URL
            In our example, certificate name must be www.paypal.com.
            One level of wildcard is also allowed (*.paypal.com)
            Browsers trust a number of certificate authorities.

          Origin (from the same-origin policy) includes the protocol.
            http://www.paypal.com/ is different from https://www.paypal.com/
            Here, we care about integrity of data (e.g., Javascript code).
            Result: non-HTTPS pages cannot tamper with HTTPS pages.
            Rationale: non-HTTPS pages could have been modified by adversary.

        Cookies with HTTPS/TLS.
          Server certificates help clients differentiate between servers.
          Cookies (common form of user credentials) have a "Secure" flag.
          Secure cookies can only be sent with HTTPS requests.
          Non-Secure cookies can be sent with HTTP and HTTPS requests.

        What happens if adversary tampers with DNS records?
          Good news: security doesn't depend on DNS (at least, that's
            the plan)
          We already assumed adversary can tamper with network packets.
          Wrong server will not know correct private key matching certificate.

      C. Finally, users can enter credentials directly.  How to secure that?
        Lock icon in the browser tells user they're interacting with HTTPS site.
        Browser should indicate to the user the name in the site's certificate.
        User should verify site name they intend to give credentials to.

    How can this plan go wrong?
      As you might expect, every step above can go wrong.
      Not an exhaustive list, but gets at problems that ForceHTTPS wants to solve.

    1 (A). Cryptography.

    There have been some attacks on the cryptographic parts of SSL/TLS.
      Attack by Rizzo and Duong can allow adversary to learn some plaintext by
        issuing many carefully-chosen requests over a single connection.
        [ Ref: http://www.educatedguesswork.org/2011/09/security_impact_of_the_rizzodu.html ]
      Recent attack by same people using compression
        [ Ref: http://en.wikipedia.org/wiki/CRIME ]
      Most recently, more padding oracle attacks.
        [ Ref: https://www.openssl.org/~bodo/ssl-poodle.pdf ]
      Some servers/CAs use weak crypto, e.g. certificates using MD5.
      Some clients choose weak crypto (e.g., SSL/TLS on Android).
        [ Ref: http://op-co.de/blog/posts/android_ssl_downgrade/ ]
      Some implementations use bad randomness to generate keys
        [ Ref: https://wiki.debian.org/SSLkeys#End_User_Summary ]
      But, cryptography is rarely the weakest part of a system.

    2 (B). Authenticating the server.

    Adversary may be able to obtain a certificate for someone else's name.
      Used to require a faxed request on company letterhead (but how to check?)
      Now often requires receiving secret token at root@domain.com or similar.
      Security depends on the policy of least secure certificate authority.
      There are 100's of trusted certificate authorities in most browsers.
      Several CA compromises in 2011 (certs for gmail, facebook, ..)
        [ Ref: DigiNotar ]
        [ Ref: http://dankaminsky.com/2011/08/31/notnotar/ ]
      Servers may be compromised and the corresponding private key stolen.

    How to deal with compromised certificate (e.g., invalid cert or stolen key)?
      Certificates have expiration dates.
      Checking certificate status with CA on every request is hard to scale.
      Certificate Revocation List (CRL) published by some CAs, but relatively
        few certificates in them (spot-checking: most have zero revoked certs).
      CRL must be periodically downloaded by client.
        Could be slow, if many certs are revoked.
        Not a problem if few or zero certs are revoked, but not too useful.
      OCSP: online certificate status protocol.
        Query whether a certificate is valid or not.
        One issue: OCSP protocol didn't require signing "try later" messages.
          [ Ref: http://www.thoughtcrime.org/papers/ocsp-attack.pdf ]
      Various heuristics for guessing whether certificate is OK or not.
        CertPatrol, EFF's SSL Observatory, ..
        Not as easy as "did the cert change?". Websites sometimes test new CAs.
      Problem: online revocation checks are soft-fail.
        An active network attacker can just make the checks unavailable.
        Browsers don't like blocking on a side channel.
          Performance, single point of failure, captive portals, etc.
        [ Ref: https://www.imperialviolet.org/2011/03/18/revocation.html ]
      In practice browsers push updates with blacklist after major breaches.
        [ Ref: https://www.imperialviolet.org/2012/02/05/crlsets.html ]

    Users ignore certificate mismatch errors.
      Despite certificates being easy to obtain, many sites misconfigure them.
      Some don't want to deal with (non-zero) cost of getting certificates.
      Others forget to renew them (certificates have expiration dates).
      End result: browsers allow users to override mismatched certificates.
        Problematic: human is now part of the process in deciding if cert is valid.
        Hard for developers to exactly know what certs will be accepted by browsers.
      Empirically, about 60% of bypass buttons shown by Chrome are clicked through.
        (Though this data might be stale at this point..)

    What's the risk of a user accepting an invalid certificate?
      Might be benign (expired cert, server operator forgot to renew).
      Might be a man-in-the-middle attack, connecting to adversary's server.
      Why is this bad?
        User's browser will send user's cookies to the adversary.
        User might enter sensitive data into adversary's website.
        User might assume data on the page is coming from the right site.

    3 (B). Mixing HTTP and HTTPS content.
      Web page origin is determined by the URL of the page itself.
      Page can have many embedded elements:
        Javascript via <SCRIPT> tags
        CSS style sheets via <STYLE> tags
        Flash code via <EMBED> tags
        Images via <IMG> tags
      If adversary can tamper with these elements, could control the page.
        In particular, Javascript and Flash code give control over page.
        CSS: less control, but still abusable, esp w/ complex attribute selectors.
      Probably the developer wouldn't include Javascript from attacker's site.
      But, if the URL is non-HTTPS, adversary can tamper with HTTP response.

      Alternative approach: explicitly authenticate embedded elements.
      E.g., could include a hash of the Javascript code being loaded.
        Prevents an adversary from tampering with response.
        Does not require full HTTPS.
      Might be deployed in browsers in the near future.
        [ Ref: http://www.w3.org/TR/SRI/ ]
      [We also discussed this in the browser security unit.]

    4 (B). Protecting cookies.
      Web application developer could make a mistake, forgets the Secure flag.
      User visits http://bank.com/ instead of https://bank.com/, leaks cookie.

      Suppose the user only visits https://bank.com/.  Why is this still a problem?
      1. Adversary can cause another HTTP site to redirect to http://bank.com/.
      2. Even if user never visits any HTTP site, application code might be buggy.
        Some sites serve login forms over HTTPS and serve other content over HTTP.
        Be careful when serving over both HTTP and HTTPS.
          E.g., Google's login service creates new cookies on request.
          Login service has its own (Secure) cookie.
          Can request login to a Google site by loading login's HTTPS URL.
          Used to be able to also login via cookie that wasn't Secure.
          ForceHTTPS solves problem by redirecting HTTP URLs to HTTPS.
          [ Ref: http://blog.icir.org/2008/02/sidejacking-forced-sidejacking-and.html ]

      Cookie integrity problems.
        Non-Secure cookies set on http://bank.com still sent to https://bank.com.
        No way to determine who set the cookie.

    5 (C). Users directly entering credentials.
      Phishing attacks.
      Users don't check for lock icon.
      Users don't carefully check domain name, don't know what to look for.
        E.g., typo domains (paypa1.com), unicode
      Web developers put login forms on HTTP pages (target login script is HTTPS).
        Adversary can modify login form to point to another URL.
        Login form not protected from tampering, user has no way to tell.

4. admin notes

    --final will be cumulative

    --open vs closed:
        --closed computer
        --open papers 
        --open handwritten notes
        --maybe open printed class notes

    --grades: we'll get those to you so we can sync, etc.

    --final exam review? can we do this on the 10th?

5. ForceHTTPS

    How does ForceHTTPS (this paper) address some of these problems?
      Server can set a flag for its own hostname in the user's browser.
      - Makes SSL/TLS certificate misconfigurations into a fatal error.
      - Redirects HTTP requests to HTTPS.
      - Prohibits non-HTTPS embedding (+ performs ForceHTTPS for them).

    What problems does ForceHTTPS solve?
      Mostly 2, 3, and to some extent 4.
        Users accepting invalid certificates.
        Developer mistakes: embedding insecure content.
        Developer mistakes: forgetting to flag cookie as Secure.
        Adversary injecting cookies via HTTP.

    Is this really necessary? Can we just only use HTTPS, set Secure cookies, etc?
      Users can still click-through errors, so it still helps for #2.
      Not necessary for #3 assuming the web developer never makes a mistake.
      Still helpful for #4.
        Marking cookies as Secure gives confidentiality, but not integrity.
        Active attacker can serve fake set at http://bank.com, and set cookies
          for https://bank.com. (https://bank.com cannot distinguish)
        ...even if client never visited bank.com! (active attacker could
          have returned some other Web page in response to an HTTP
          request, and that page could have initiated communication with
          "bank.com" (controlled by attacker))
         

    Why not just turn on ForceHTTPS for everyone?
      HTTPS site might not exist.
      If it does, might not be the same site (https://www.nyu.edu may be
         a different page from http://www.nyu.edu)
      HTTPS page may expect users to click through (self-signed certs).

    Implementing ForceHTTPS.
      The ForceHTTPS bit is stored in a cookie.
      Interesting issues:
        State exhaustion (the ForceHTTPS cookie getting evicted).
        Denial of service (force entire domain; force via JS; force via HTTP).
          Why does ForceHTTPS only allow specific hosts, instead of entire domain?
          Why does ForceHTTPS require cookie to be set via headers and not via JS?
             [defend against XSS]
          Why does ForceHTTPS require cookie to be set via HTTPS, not HTTP?
        Bootstrapping (how to get ForceHTTPS bit; how to avoid privacy leaks).
          Possible solution 1: DNSSEC.
          Possible solution 2: embed ForceHTTPS bit in URL name (if possible).
          If there's a way to get some authenticated bits from server owner
            (DNSSEC, URL name, etc), should we just get the public key directly?
          Difficulties: users have unreliable networks. Browsers are unwilling
            to block the handshake on a side-channel request.

    Current status of ForceHTTPS.
      Some ideas from ForceHTTPS have been adopted into standards.
      HTTP Strict-Transport-Security header is similar to a ForceHTTPS cookie.
        [ Ref: http://tools.ietf.org/html/rfc6797 ]
        [ Ref: http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security ]
        Uses header instead of magic cookie:
          Strict-Transport-Security: max-age=7884000; includeSubDomains
        Turns HTTP links into HTTPS links.
        Prohibits user from overriding SSL/TLS errors (e.g., bad certificate).
        Optionally applies to all subdomains.
          Why is this useful?
          non-Secure and forged cookies can be leaked or set on subdomains.
        Optionally provides an interface for users to manually enable it.
        Implemented in Chrome, Firefox, and Opera.
        Bootstrapping largely unsolved.
          Chrome has a hard-coded list of preloads.
      IE9, Firefox 23, and Chrome now block mixed scripting by default.
        [ Ref: http://blog.chromium.org/2012/08/ending-mixed-scripting-vulnerabilities.html ]
        [ Ref: https://blog.mozilla.org/tanvi/2013/04/10/mixed-content-blocking-enabled-in-firefox-23/ ]
        [ Ref: http://blogs.msdn.com/b/ie/archive/2011/06/23/internet-explorer-9-security-part-4-protecting-consumers-from-malicious-mixed-content.aspx ]

    Another recent experiment in this space: HTTPS-Everywhere.
      Focuses on the "power user" aspect of ForceHTTPS.
      Allows users to force the use of HTTPS for some domains.
      Collaboration between Tor and EFF.
      Add-on for Firefox and Chrome.
      Comes with rules to rewrite URLs for popular web sites.

    Other ways to address problems in SSL/TLS
      Better tools / better developers to avoid programming mistakes.
        Mark all sensitive cookies as Secure (#4).
        Avoid any insecure embedding (#3).
        Unfortunately, seems error-prone..
        Does not help end-users (requires developer involvement).
      EV [= extended validation] certificates.
        Trying to address problem 5: users don't know what to look for in cert.
        In addition to URL, embed the company name (e.g., "PayPal, Inc.")
        Typically shows up as a green box next to the URL bar.
        Why would this be more secure?
        When would it actually improve security?
        Might indirectly help solve #2, if users come to expect EV certificates.
      Blacklist weak crypto.
        Browsers are starting to reject MD5 signatures on certificates
          (iOS 5, Chrome 18, Firefox 16)
        and RSA keys with < 1024 bits.
          (Chrome 18, OS X 10.7.4, Windows XP+ after a recent update)
        and even SHA-1 by Chrome.
          [ Ref: http://googleonlinesecurity.blogspot.com/2014/09/gradually-sunsetting-sha-1.html ]
      OCSP stapling.
        OCSP responses are signed by CA.
        Server sends OCSP response in handshake instead of querying online (#2).
        Effectively a short-lived certificate.
        Problems:
        - Not widely deployed.
        - Only possible to staple one OCSP response.
      Key pinning.
        Only accept certificates signed by per-site whitelist of CAs.
        Remove reliance on least secure CA (#2).
        Currently a hard-coded list of sites in Chrome.
        Diginotar compromise caught in 2011 because of key pinning.
        Plans to add mechanism for sites to advertise pins.
          [ Ref: http://tools.ietf.org/html/draft-ietf-websec-key-pinning-21 ]
          [ Ref: http://tack.io/ ]
        Same bootstrapping difficulty as in ForceHTTPS.

    Other references:
      http://www.imperialviolet.org/2012/07/19/hope9talk.html

      Quote from Entangled web (p.64) wrt CAs:

        "In order to allow web servers to prove their identity, every HTTPS-enabled
        web browser ships with a hefty set of public keys belonging to a variety of
        certificate authorities. Certificate authorities are organizations that are
        trusted by browser vendors to cryptographically attest that a particular
        public key belongs to a particular site, hopefully after validating the
        identity of the per- son who requests such attestation and after verifying
        his claim to the domain in question.

        The set of trusted organizations is diverse, arbitrary, and not particularly
        well documented, which often prompts valid criticisms. But in the end, the
        system usually does the job reasonably well. Only a handful of bloopers have
        been documented so far (including a recent high-profile compromise of a
        company named Comodo22), and no cases of widespread abuse of CA privi- leges
        are on the record."

        22. P. Hallam-Baker, “The Recent RA Compromise,” Comodo IT Security (blog)
        (March 23, 2011), http://blogs.comodo.com/it-security/data-security/
        the-recent-ra-compromise/.


---------------------------------------------------------------------------

Acknowledgment: 6.858 staff