Speaker: Ethan Katz-Bassett, University of Washington
Location: Warren Weaver Hall 1302
Date: April 1, 2011, 11:30 a.m.
Host: Denis Zorin
The Internet is now central to many aspects of modern society, yet it remains remarkably fragile. Partial outages are common, and performance problems are widespread. Operators would like to address these issues, but poor diagnostic tools hamstring their efforts.
I will argue that a more robust Internet -- one with the predictable performance and high availability needed to provide critical services -- requires the development of a new generation of tools. We must move towards a self-healing Internet that fixes problems in seconds, rather than the hours or days that operators often take today. In my research, I have developed practical distributed systems to understand Internet problems and to provide crucial steps towards automated remediation. My systems are deployable today, without requiring modifications to the network.
In the first half of the talk, I will present Reverse Traceroute, my system to measure the routing and performance behavior of reverse paths back to the local host from other networks. While tools have long existed to measure the forward direction, the reverse path has been largely opaque, hindering troubleshooting efforts. I will show how Google and other content providers can use Reverse Traceroute to troubleshoot their clients' performance problems. In the second half of the talk, I will focus on using Reverse Traceroute and related systems to diagnose and automatically repair availability problems, even without the participation of the network containing the failure.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.