Last week: BGP
=====================
* The glue underlying the Internet.
* Goal is to find a policy-compliant path (respecting transit and peer relationships), which is not necessarily, and often is not, the shortest path.
* Forwards packets based on destination IP address.

This week: SDX and Espresso
====================
* Going beyond destination-based forwarding.
* Forward based on application, source address, ingress link, etc.
* Quite hard to express using BGP
  --> Recall the AS_PATH padding (prepending) hack from last week's lecture notes.
* How do we make it easier to express these kinds of policies that go beyond destination-based forwarding?
* This is the focus of both SDX and Espresso
  --> But the two papers reflect very different styles of writing.
  --> SDX is heavy on concepts. Espresso is heavy on details.
  --> Might reflect an academic vs. industrial take on the same (or at least related) problems.

SDX: Context
===================
* Targets SDN switches.
* For this paper, an SDN switch is a switch that can filter packets based on some header fields and carry out some transformations on these packets.
* The filter is called a match rule and the transformation is called an action.
* We'll have much more to say about matches and actions in the Ethane+OpenFlow and RMT+P4 lectures.
* Targets Internet Exchange Points (IXPs)
  --> Centralized facilities that allow different ISPs to peer with each other by connecting into one facility.
  --> Avoids the expense of each ISP laying cable to every other ISP.
  --> But just because two ISPs are physically present at an IXP doesn't necessarily mean they peer with each other.
  --> IXPs facilitate peering but don't remove the need for the business agreements that are at the heart of peering.

SDX: Use cases
==================
* Application-specific peering: Have two ASes/ISPs exchange only certain kinds of traffic (e.g., video traffic).
* Inbound traffic engineering: favor one incoming path over another.
  BGP does this using unintuitive workarounds like path prepending.
* Network-assisted load balancing: Use the network, rather than DNS, to balance load by means of anycast addresses.
* Redirection through middleboxes: Send traffic to a middlebox for further processing before sending it onward.

SDX: Programming abstraction
=================
* Virtual switch abstraction connecting an AS's border router to the border routers of all its peers (i.e., those ASes with which it has a business relationship allowing peering).
* Each AS has such a virtual switch abstraction and can only write policies on top of its own virtual switch.
* SDX uses the Pyretic language to express forwarding policies on top of this virtual switch abstraction.
* Pyretic: A policy is a mapping from a packet to a set of packets.
  --> Empty set corresponds to dropping.
  --> Set with multiple packets corresponds to multicast.
  --> Rewriting the packet's location corresponds to forwarding.
* Boolean predicates match on a subset of packets.
* Predicates and mappings correspond to matches and actions respectively.
* Can use this to express
  --> Application-specific peering (match is on the destination port, action is forwarding)
  --> Inbound traffic engineering (match is on the source IP address, action is forwarding)
  --> Wide-area server load balancing (match is on the destination IP plus the source IP, which is used to balance load; action is rewriting the destination IP)
* Can integrate with BGP
  --> Useful to use BGP by default
  --> Only override when packet headers match specific predicates

SDX: Compilation
===================
* Ensure policies can actually be enforced:
  --> For instance, an AS's outbound policy can be applied only to traffic originating from that AS, not to traffic passing through it.
  --> Similarly, an inbound policy can be applied only to traffic that the AS receives from outside.
* Ensure policies are consistent with BGP route advertisements.
* Reduce number of data-plane rules (i.e., match-action rules) to enforce policy.
  --> Group IP prefixes with the same behavior (same action) into an equivalence class.
  --> Effectively compresses that group into a single match-action rule.

SDX: Evaluation
===================
* Version that runs in an emulated environment (Mininet)
* Virtual ASes that all run within Mininet and all connect to an Open vSwitch (a software switch)
  --> Virtual in the sense that they don't have publicly recognized AS numbers and don't carry traffic on behalf of others.
* Demonstrates two use cases: application-specific peering and load balancing.
* Also evaluated number of data-plane rules and compiler run time.
  --> Derived policies and topologies from three large IXPs: AMS-IX, LINX, and DE-CIX
  --> Evaluation is notable for its use of real IXP data, as opposed to randomly generated synthetic data
  --> Shows how the number of prefix groups (a prefix group is a set of prefixes with the same forwarding behavior) increases with the number of prefixes
  --> Similarly, shows how the number of forwarding table rules increases with the number of prefix groups.
  --> Shows the benefits of incremental recompilation.

Espresso: Google's peering edge
===================
* Background:
  --> Google B4: inter-datacenter WAN
  --> Google B2: connectivity from datacenters to peering edge
  --> Peering edge: A ring of peering metros.
* What is a peering metro?
  --> The paper doesn't define it precisely, but based on paragraph 1 it is probably a collection of Google servers that provides:
      * TCP termination proxies
      * Content caching (in the style of Akamai or Cloudflare)
      * DDoS and DoS prevention
      * Peering with other edge ISPs (like Comcast, Verizon, or TWC)
      * Load shedding from the main datacenters
  --> Peering metros are also often called points of presence. (Essentially, they increase the footprint of a service like Google and bring it closer to the user.)
Espresso: The problem
===================
* A management system for routers sitting within the peering edge (more precisely, the border routers of different peering metros)
* Favor a few scale-up routers to reduce the number of routers that need to be managed
  --> We will see in the datacenter lecture a setting where the opposite (many small routers, which is called scale-out) is preferred.
* On the incoming path to these routers:
  --> Complex and large-scale access control lists for firewalling and DoS attack prevention.
  --> Why support firewalling on routers?
      * The alternative of deploying specialized firewalls in peering metros is too expensive and power-hungry.
* Their experience:
  --> Flexibility of enforcing fine-grained rules (over and above basic BGP) was limited by traditional Internet routers.
  --> Could not leverage a global view of traffic due to the distributed nature of these platforms (will see in the Ethane lecture how this gets fixed).
  --> Hard to override BGP behavior in a fine-grained manner.

Espresso: Requirements for a large web-scale company
==================
* Efficiency: Make good use of Google's network and compute infrastructure (i.e., reduce load on links back to Google datacenters, utilize compute within peering edges, etc.)
* Interoperability with the rest of the Internet, especially residential ISPs, who are still running BGP.
* Reliability: Peering edge outages are costly and affect user experience and hence revenue.
* Incremental deployment: The network is constantly running. Can't just pause it to upgrade.
* Velocity: Iterate quickly.

Espresso: The idea
===================
* Remove most software from the peering metro's border routers.
* Move it to an external control platform (builds on SDN ideas, which we'll discuss as part of Ethane in lecture 7).
* Makes the router much simpler: can use cheaper commodity "merchant silicon" switches.
* Move routing to more capable server infrastructure already present in the peering metro (e.g., TCP termination proxies, caching servers)
* Implement BGP/Traffic Engineering/etc. (many use cases similar to SDX) on their own servers
  --> Without relying on proprietary BGP implementations from router vendors such as Cisco/Arista/Juniper/etc.
* Net effect: 13% increase in global traffic delivered using peering edge
* Improved end-to-end user experience: Mean time between rebuffers goes up.
* Perhaps most importantly, reduced the time required to make policy changes from a year to two weeks.
  --> Can make changes in house; don't have to speak to a vendor and get the change into the vendor's software.
* Essentially, the takeaway is that if you put in the effort to build software on your own (as opposed to using off-the-shelf software), it pays off in the long run because you are the one who understands your requirements best.
* Use cases similar to SDX: Application-aware routing, traffic engineering, etc.

Comparing Espresso and SDX:
==================
* Espresso is firmly rooted in practice.
  --> Maybe a bit too firmly. Makes it hard in some cases to extract principles from the detail-oriented description.
* SDX is all about ideas: abstractions (e.g., virtual switch) and clever algorithms (e.g., incremental compilation).
* Espresso has a much more extensive evaluation that SDX could never have done because few entities have Google's scale.
* Ultimately both papers are very valuable:
  --> SDX lets us think about the problem in an idealized setting, and
  --> Espresso teaches us the realities of solving the problem in production.
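
A sketch of the SDX policy model
==================
To make the SDX ideas from earlier in these notes concrete, here is a minimal Python sketch of three of them: a policy as a mapping from a packet to a set of packets, BGP as the default with overrides only on matching predicates, and grouping prefixes with the same action into equivalence classes. This is a toy model, not the real Pyretic API or the SDX compiler. The names `Packet`, `match`, `fwd`, `drop`, `if_`, `bgp_default`, `group_prefixes`, and the routes in `BEST_ROUTE` are all assumptions made for illustration. It also skips the check that an override is consistent with BGP advertisements, and it groups by a single next hop rather than by full forwarding behavior.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_prefix: str  # destination prefix, pre-resolved to keep the sketch short
    dst_port: int
    port: str        # current location on the AS's virtual switch


Policy = Callable[[Packet], FrozenSet[Packet]]  # packet -> set of packets
Predicate = Callable[[Packet], bool]


def match(**fields) -> Predicate:
    """True when every named header field equals the given value."""
    return lambda pkt: all(getattr(pkt, k) == v for k, v in fields.items())


def fwd(out_port: str) -> Policy:
    """Forwarding is rewriting the packet's location."""
    return lambda pkt: frozenset({pkt._replace(port=out_port)})


def drop(pkt: Packet) -> FrozenSet[Packet]:
    """Dropping is returning the empty set."""
    return frozenset()


def if_(pred: Predicate, then: Policy, otherwise: Policy) -> Policy:
    """Apply `then` to packets matching `pred`, `otherwise` to the rest."""
    return lambda pkt: then(pkt) if pred(pkt) else otherwise(pkt)


def bgp_default(best_route: Dict[str, str]) -> Policy:
    """Forward to the BGP-selected next hop; drop if there is no route."""
    def policy(pkt: Packet) -> FrozenSet[Packet]:
        hop = best_route.get(pkt.dst_prefix)
        return fwd(hop)(pkt) if hop is not None else drop(pkt)
    return policy


def group_prefixes(best_route: Dict[str, str]) -> Dict[str, List[str]]:
    """Group prefixes that share the same action (here: the same next hop)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for prefix, hop in sorted(best_route.items()):
        groups[hop].append(prefix)
    return dict(groups)


# Hypothetical BGP best routes seen by AS "A": prefix -> next-hop peer.
BEST_ROUTE = {"10.1.0.0/16": "B", "10.2.0.0/16": "B", "10.3.0.0/16": "C"}

# Application-specific peering: port-80 traffic goes to C, the rest follows BGP.
policy = if_(match(dst_port=80), fwd("C"), bgp_default(BEST_ROUTE))

web = Packet("192.0.2.7", "10.1.0.0/16", 80, "A")
ssh = Packet("192.0.2.7", "10.1.0.0/16", 22, "A")
```

With these definitions, `policy(web)` yields the packet relocated to port "C", `policy(ssh)` follows the BGP route to "B", and `group_prefixes(BEST_ROUTE)` collapses the two prefixes routed via "B" into one group, so a single match-action rule per group would suffice in this simplified setting.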