Ethane: Context
========================================
* Published in 2007; influenced considerably by the 4D work from AT&T in 2005 and Casado et al.'s SANE work in 2006.
--> The biggest difference from 4D is that 4D outlined a research agenda and a high-level vision.
--> Ethane brings this vision to life and concretizes several important details (boot up, registration, policy language, real deployment, etc.).
--> 4D's vision was influenced by the authors' vantage point at an ISP (similar to the Gao-Rexford theorem).
* Ethane and SANE
--> Martin Casado had a background in enterprise networking; he worked for the intelligence community on networking and cybersecurity.
--> This background was important because he had experienced problems with the network infrastructure firsthand.
--> Meta point: Addressing your own pain points is a pretty good way to do research.

Ethane, OpenFlow, and SDN
========================================
* Ethane influenced the creation of OpenFlow in 2008.
* OpenFlow was instrumental in launching the software-defined networking (SDN) movement.
--> The term SDN wasn't coined until an MIT Technology Review article in 2009 (http://www2.technologyreview.com/news/412194/tr10-software-defined-networking/).
--> SDN was a misnomer because SDN as specified in 2009 only dealt with a programmable control plane.
--> It did not deal with the data plane, which started becoming programmable only around 2013 (we'll look at this next lecture).
--> But it caught on as a term, and it spread quickly.
* In a way, SDN would have happened in one form or another because of the rise of cloud computing.
--> Networks were getting larger and harder to manage (recall: 100K servers, 10K switches).
--> Again, this reiterates the importance of timing in research ideas (similar to DCTCP and VL2).
--> E.g., Google had SDN-style ideas within its datacenters before the term SDN was coined.

Ethane's impact:
========================================
* Ethane and OpenFlow led to an SDN startup, Nicira, which was acquired by VMware in 2012.
* Brought VMware to the forefront of networking as a provider of network virtualization in addition to server virtualization.
* The Ethane paper itself was awarded SIGCOMM's Test of Time award last year.

Ethane's (and by implication SDN's) main ideas:
=======================================
* Move the control plane out of the switches and into a centralized server infrastructure (called the controller in Ethane).
* A simple interface between the controller and a barebones "Ethane" switch to add or remove entries from the switch's flow tables (a minimal sketch of such an interface follows below).
--> This interface was later standardized as OpenFlow (as the least common denominator across all switches).
* The details of OpenFlow itself are not too relevant conceptually.
* But the details of OpenFlow _were_ very important from the perspective of getting industry involvement early on.
* The big idea in Ethane/SDN: centralized control.
* Or, abstracting even further: the decoupling of the data and control planes.
--> The control plane doesn't need to be one centralized server.
--> It can be a distributed server farm.
--> The critically important aspect of SDN is the decoupling of the data plane (per-packet forwarding behavior) from the control plane (management of policies).
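To make the controller/switch split concrete, here is a minimal Python sketch of the kind of interface Ethane needs: the switch exposes nothing but flow-table add/remove operations, and all decision-making lives in the controller. The names (Match, FlowEntry, EthaneSwitch) are illustrative assumptions, not the paper's or OpenFlow's actual API.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Match:
        """A bit pattern to match packet headers against; None = wildcard."""
        ip_dst: str | None = None
        tcp_dst_port: int | None = None

    @dataclass
    class FlowEntry:
        match: Match
        action: str        # e.g., "forward:3", "drop", "enqueue:high"

    class EthaneSwitch:
        """Barebones switch: just a flow table plus add/remove operations."""
        def __init__(self):
            self.flow_table: list[FlowEntry] = []

        def add_entry(self, entry: FlowEntry):    # called by the controller
            self.flow_table.append(entry)

        def remove_entry(self, match: Match):     # called by the controller
            self.flow_table = [e for e in self.flow_table
                               if e.match != match]

Note how the switch contains no routing protocols or policy logic of its own; that is exactly the decoupling described above.
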
Ethane's approach:
======================================
* Good example of a paper that articulates a high-level vision and then follows through on several low-level details.
* Did not want to rely on proprietary, unstable interfaces from switch vendors (aside: companies with large networks have dedicated engineers to deal with these ever-changing interfaces).
* Instead, Ethane took a clean-slate approach to the problem: how would we do this if we could change everything?
* But it also figured out a way to get from the status quo to that utopia:
--> OpenFlow was a proposal to extract a common minimum interface from existing proprietary APIs.
--> OpenFlow's capabilities were sufficient to realize Ethane's policies without networking hardware changes.
--> Also, in footnote 4, they discuss how Ethane could work in a network where only some switches are Ethane switches.
* Meta point: Does a good job of articulating a futuristic vision, addressing the details, and figuring out the pathway to get to that future.
* Follows through by dogfooding, i.e., using their own system by deploying it in Stanford's network.
* Is also good at pointing out its limitations and what attacks it can't prevent.

Ethane's details:
======================================
* Registration: Register all switches, users, and hosts at the controller using credentials, so that the controller knows what entities are present in the network.
* Bootstrapping: Figure out a pathway from the switches to the controller using the spanning tree protocol.
* Authentication: When a user enters the network for the first time, the controller checks that the user is allowed on the network and allocates the user an IP address using DHCP.
* When the first packet of a flow shows up, the controller enforces policy and writes the forwarding table entries into the data plane of all relevant switches.
* Assumes a layer-2 network like a LAN, where forwarding is done based on MAC addresses.
* Can be extended to IP networks at layer 3 using tunnels.
* Considerably simpler switches: no OSPF, ACLs, NATs, etc.

What Ethane needs out of its switches (Section 3.2 paragraph on Flow Table and Flow Entries)
======================================
* A flow entry has a header or bit pattern to match packets against,
* and an action to carry out on matched packets.
* For instance, to prioritize SSH traffic by placing it into a separate queue, you could do:
--> Match TCP port 22; action: enqueue into a high-priority queue.
* This match-action paradigm is now ubiquitous within networking research (a sketch of a match-action lookup follows this section).
* Example match fields: TCP src/dst port, IP src/dst address, MAC address, VLAN tag.
* Example actions: Forward to a particular port, drop packet, send to a specific queue, rate limit, etc.
* OpenFlow was a minimum common standard for this that was mostly already supported by switches.
* Because it was a minimum common standard, it wasn't particularly flexible. We'll come back to this.
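Building on the sketch above, here is roughly what the switch's match-action lookup looks like, using the SSH-prioritization example from these notes. The wildcard semantics and first-match-wins behavior are illustrative assumptions, not OpenFlow's exact matching rules.

    def matches(match: Match, pkt: dict) -> bool:
        """A packet matches if every specified (non-None) field agrees."""
        return all(value is None or pkt.get(field) == value
                   for field, value in vars(match).items())

    def process_packet(switch: EthaneSwitch, pkt: dict) -> str:
        for entry in switch.flow_table:
            if matches(entry.match, pkt):
                return entry.action       # hit: apply the stored action
        return "send_to_controller"       # miss: first packet of a new flow

    # Prioritize SSH, as in the example above:
    sw = EthaneSwitch()
    sw.add_entry(FlowEntry(Match(tcp_dst_port=22), action="enqueue:high"))
    assert process_packet(sw, {"ip_dst": "10.0.0.5", "tcp_dst_port": 22}) == "enqueue:high"
    assert process_packet(sw, {"ip_dst": "10.0.0.5", "tcp_dst_port": 80}) == "send_to_controller"

The table-miss case is what ties the two planes together: packets that match no entry go to the controller, which decides the flow's fate.
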
What Ethane can and can't do:
======================================
* Can write rich policies in the policy language by matching on different packet fields and carrying out different actions.
* But the matches and actions are themselves fixed.
--> That's dictated by the switch hardware.
--> And the Ethane authors didn't want to change the switch hardware (not just yet, anyway).
--> For instance, you can't add two header fields and write the result into a third as part of an action in an Ethane switch.
* Later work (next lecture) on RMT and P4 looked at how you could create new matches and actions as a programmer of the network.

Aside
=====================================
* RMT (2013) and P4 (2014) had roughly the same relationship as Ethane (2007) and OpenFlow (2008).
* Similar to Ethane leading to Nicira in 2007, RMT led to Barefoot Networks in 2013.
* RMT and P4 deal with programmable data planes instead of control planes.

Ethane's controller:
=====================================
* The omniscient brain of the network, with full visibility.
* A policy compiler compiles the user's policy into a flow table at the controller, which eventually dictates what flow entries get installed at the switches (a sketch of this reactive flow setup follows below).
* A whole slew of later work looked at richer programming languages and compilers beyond this Pol-Eth language. Look at work by Nate Foster (Cornell), David Walker (Princeton), and Jennifer Rexford (Princeton) on these topics.
* Controller fault tolerance and availability: don't want a single controller to be a single point of failure.
* Also, a single controller couldn't scale to very large networks.
* Again, lots of follow-on work on controller scalability and consistency since 2007.
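Reusing the Match/FlowEntry sketch from earlier, here is a hedged sketch of how the controller might handle the first packet of a flow: check the sender's authentication state, evaluate the policy, and (if allowed) install flow entries along the path. Policies here are plain Python predicates, and the default-deny posture is an assumption in keeping with Ethane's security motivation; the actual Pol-Eth language, bindings, and route computation are richer than this.

    def compute_path(topology, src, dst):
        """Placeholder: the controller knows the full topology and computes
        a route as a list of (switch, out_port) hops; omitted here."""
        return []

    def handle_first_packet(state, pkt, policies):
        user = state["authenticated_users"].get(pkt["src_mac"])
        if user is None:
            return "drop"                  # unauthenticated sender
        for allow, predicate in policies:  # first matching rule wins
            if not predicate(user, pkt):
                continue
            if not allow:
                return "drop"
            # Install entries at every switch on the path, so later
            # packets of this flow never touch the controller.
            for switch, out_port in compute_path(
                    state["topology"], pkt["src_mac"], pkt["dst_mac"]):
                switch.add_entry(FlowEntry(
                    Match(ip_dst=pkt["ip_dst"],
                          tcp_dst_port=pkt.get("tcp_dst_port")),
                    action=f"forward:{out_port}"))
            return "allowed"
        return "drop"                      # default-deny
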
Lots of follow-on work to Ethane:
=====================================
* The Ethane paper was seminal in many respects.
* Every aspect of the paper's design (controller, policy language, deployment, scalability, matches, actions) was improved by subsequent research papers that built on the basic centralized control paradigm.
* The centralized control paradigm (and the control-vs-data plane decoupling) is perhaps the most important legacy of this paper.
* A new workshop sprung up in 2012 (HotSDN) and turned into a conference in 2015 (SOSR).
* In many ways, Ethane led to the creation of a new field of research.
* But it's very important to remember that Ethane built on considerable prior work: nothing germinates in a vacuum.

Ethane evaluation:
====================================
* Deployed on the Stanford campus network.
--> Unusual for an academic paper's evaluation. Most papers settle for a simpler, smaller testbed.
--> The authors instead ran their own system for a while to see what they could learn.
* Main takeaway: A single controller can easily handle 10K new flows per second.
--> Sufficient for their deployment (avg: 30--40 flows per second, max: 750).
--> Extrapolating based on datasets from LBL and Stanford, which had maxima of 1,200 and 9,000 flows per second respectively.
--> Conclusion: the controller isn't a bottleneck for a network of up to 20K hosts.
* Second takeaway: The number of active flows doesn't exceed 500 in their Ethane deployment.
--> Active flows do not exceed 1,200 in the LBL network.
--> Why is the size of the active flow set so important? It bounds the number of flow entries, and hence the amount of flow-table memory, each switch needs.

Ethane shortcomings:
===================================
* Application-layer routing: Overlay networks could circumvent Ethane's policies.
* Inferring user intent is hard without user involvement: a user could run HTTP on port 500 instead of 80.

Ethane postscript and summary:
==================================
* Led to OpenFlow, then SDN, then a whole lot of hype.
* Led to the creation of Nicira.
* Burning need at the time: Network virtualization. What is network virtualization?
* Network virtualization: giving multiple tenants sharing a network the illusion of their own private network.
--> Example use cases: cloud computing, enterprise networks.
* Presented a great use case for SDN: can the policy language express network virtualization policies?
* So that's what Nicira built: a network virtualization solution for enterprises.
* But it turned out network virtualization could be done mostly at the end hosts using VL2-style end-host-based filtering.
--> Recall the VL2 agent being used for enforcing access control.
--> Critically, the end hosts were in software and could be more easily modified.
--> The virtual switch (a software switch connecting different VMs to a NIC) was rising in prominence.
--> Hence, much of network virtualization ended up being implemented as part of the virtual switch (see the sketch at the end of these notes).
--> This turned out to be easier than persuading router vendors to support OpenFlow (even though it would have been simple for them to do so).
--> For more detailed examples of end-host-based network virtualization, look at https://www.usenix.org/system/files/conference/nsdi17/nsdi17-firestone.pdf and https://benpfaff.org/papers/net-virt.pdf
* In some ways, OpenFlow did not find much adoption in the enterprise (the target scenario for Ethane).
--> But it is used in some modified form within Google's intra-datacenter network (Jupiter), inter-datacenter network (B4), and peering edge (Espresso).
--> I don't know of many use cases outside of that.
* It turned out that the common minimum denominator approach of OpenFlow was a little too restrictive.
--> RMT and P4 (two weeks from now) attempt to fix this.
--> Alternatively, you could do everything from the end hosts or virtual switches.
--> Both approaches (RMT+P4 vs. virtual switches) have their pros and cons. Maybe a hybrid will emerge?
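
To make the virtual-switch point concrete, here is a loose sketch of end-host network virtualization: the software switch on each hypervisor filters cross-tenant traffic and encapsulates allowed traffic (VXLAN-style) before it hits the physical network. All names and fields here are illustrative assumptions, not the design of any particular product.

    from dataclasses import dataclass

    @dataclass
    class EncapsulatedPacket:
        outer_dst: str     # physical address of the destination host
        tenant_id: int     # virtual network identifier (cf. a VXLAN VNI)
        inner_pkt: dict    # the tenant's original packet, untouched

    class VirtualSwitch:
        """Software switch on the hypervisor, connecting VMs to the NIC."""
        def __init__(self, tenant_of_vm: dict, host_of_vm: dict):
            self.tenant_of_vm = tenant_of_vm   # vm -> tenant id
            self.host_of_vm = host_of_vm       # vm -> physical host

        def send(self, src_vm: str, dst_vm: str, pkt: dict):
            # Edge-based filtering: traffic may not cross tenants
            # (the VL2-style access control recalled above).
            if self.tenant_of_vm[src_vm] != self.tenant_of_vm[dst_vm]:
                return None                    # dropped at the source host
            return EncapsulatedPacket(
                outer_dst=self.host_of_vm[dst_vm],
                tenant_id=self.tenant_of_vm[src_vm],
                inner_pkt=pkt)

Because all of this runs in host software, it can be deployed and updated without touching the physical switches, which is much of why the virtual switch won out for network virtualization.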