Class 3
CS 480-008
02 February 2016

On the board
------------

1. Last time

2. Application layer (continued)

3. Link layer

4. Bootstrapping: ARP, DHCP

5. NAT and private networks

6. Sum up networking unit

---------------------------------------------------------------------------

1. Last time

    DNS

    Network layer, transport layer, application layer (a bit)

    [Draw the layering picture]

2. Application layer continued

    Example: HTTP

        when your Web browser connects to a URL, it makes requests on
        port 80 by default, meaning it stamps "80" as the destination
        port in its packets

        you can direct your Web browser to make requests on any port,
        though, like this:

            http://hostname:port_num

        In that case, the browser addresses its packets to the IP
        address that corresponds to the name of the machine, and to
        destination port port_num instead of destination port 80.

        Messages look like this:

            Browser --> Server:
                "GET /pics/dog.jpg HTTP/1.0\r\n"

            Server --> Browser:
                "HTTP/1.0 404 Not found\r\n"
              or
                "HTTP/1.0 200 OK\r\n
                 header1: value1\r\n
                 header2: value2\r\n
                 \r\n
                 [the bytes in dog.jpg]"

        [Keep in mind that the above is happening inside TCP, and that
        TCP is presenting a reliable byte stream to the layers above
        it.]

    Some points about the API

        --send()/recv(): for TCP sockets, these are synonymous with
          write()/read(). So can redirect to socket, e.g.
    Some points about the underlying implementation (which is the
    transport layer):

        Many issues involved in implementing TCP
            --Wants multiple packets outstanding
            --But wants to react to congestion in the network (wants to
              save the network from congestion collapse)
            --TCP has to "learn" parameters per-connection
            --Connection set-up and tear-down is complicated
                --sender never knows if its last packet was lost
                --so has to keep state around after connection close
            --Tons of hacks for good performance

        Issues directly for OS too
            --Have to track unacknowledged data
                --Keep a copy around until recipient acknowledges it
                --Keep timer around to retransmit if no ack
                --Receiver must keep out-of-order segments and reassemble
            --When to wake process receiving data?
                --E.g., sender calls write (fd, message, 8000);
                --First TCP segment arrives, but is only 512 bytes
                --Could wake recipient, but useless w/o full message
                --TCP sets PUSH bit at the end of the 8000 bytes, to
                  force delivery of the data
            --When to send short segment, vs. wait for more data
                --Usually send only one unacked short segment
                --But bad for some apps, so provide NODELAY option
                  (TCP_NODELAY)
            --Must ack received segments very quickly
                --Otherwise, effectively increases RTT, increasing
                  bandwidth-delay product but without increase in
                  bandwidth --> useful throughput declines

3. Link layer

    Ethernet: classic technology

    History: developed at Xerox PARC, intended to help with the office
    of the future. amazing technology, used constantly. however, not
    used much in its original configuration (a shared medium) because
    many links are now point-to-point.
        --but if you plug your computers into a hub, your hardware is
          still going to use Ethernet's key features.

    originally designed for shared medium (coaxial cable)

    Packets in Ethernet (and most link layers) are called **frames**

    [header: 14 bytes.
    then frame payload, then CRC]

    [preamble (8 bytes)   dst   src   ethertype   CRC]

    (DIX frames...Digital, Intel, Xerox)

    ethertype = 0x0800 (IP), 0x0806 (ARP)

    preamble: helps device recognize start of packet
    CRC: helps device throw away corrupted packets
    payload: up to 1500 bytes (roughly)

    the payload and the other fields are usually set by the OS

    Where do Ethernet addresses, otherwise known as MAC addresses, come
    from?

        [assigned *to* different hardware manufacturers, who then
        install them in their products]

        [but you can reset them, which is one reason why tying access
        to MAC addresses is often easily circumvented: sniff the wire,
        learn someone else's MAC address, and take that one on.]

    Special Ethernet addresses for broadcast and multicast

    Medium Access Control (**MAC**) protocol governs access to coax
        --don't transmit when someone else is
        --CSMA/CD (carrier sense, multiple access, collision detection)
        --if you collide, can detect that, use randomized backoff and
          try again
        --need to transmit for at least RTT (measured from one end of
          extent to other)
        --(above is a bit of a simplification)

    Consequence: Ethernet has a maximum end-to-end extent and a minimum
    frame size (these are specified in standards documents).

    To see why.....

        The 10 Mbps Ethernet standard specified a maximum end-to-end
        extent of 2.5 km
            --> RTT = 5 km / (1.25 x 10^5 km/sec) = 40 microseconds

        10 Mbps * 40 microseconds = 400 bits = 50 bytes

    so what happened with "fast ethernet" of 100 Mbits/sec? and 1 Gbps
    Ethernet?
        --for FastE, they reduced the maximum network diameter to 200
          meters
        --for GigE, minimum packet size is 512 bytes

    ASK: why not just connect all computers to the same link layer?

    Ethernet is awesome, but it cannot scale to the world:
        --limit on number of nodes
        --limit on distance
        --forwarding state doesn't scale

    People address node limits and distance with **bridges** that
    connect two Ethernet networks.
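[The distance limit ties back to the minimum-frame-size arithmetic earlier in this section. Below is a minimal Python sketch of that arithmetic under the notes' simplified model (a sender must stay on the wire for one RTT across the maximum extent). The function name min_frame_bits and its parameters are made up for illustration; the default propagation speed is the 1.25 x 10^5 km/sec figure used above. Real standards include extra margin, so treat the numbers as rough.]

```python
def min_frame_bits(extent_km, bit_rate_bps, speed_km_per_s=1.25e5):
    """Bits a sender must transmit to keep sending for one full RTT.

    Simplified model from the notes: RTT = 2 * extent / propagation speed.
    """
    rtt_s = 2 * extent_km / speed_km_per_s  # out to the far end and back
    return rtt_s * bit_rate_bps


# Illustrative configurations: (label, extent in km, bit rate in bits/sec)
configs = [
    ("10 Mbps over 2.5 km", 2.5, 10e6),
    ("100 Mbps over 2.5 km", 2.5, 100e6),
    ("100 Mbps over 200 m", 0.2, 100e6),
]

for label, extent_km, rate_bps in configs:
    bits = min_frame_bits(extent_km, rate_bps)
    print(f"{label}: {bits:.0f} bits = {bits / 8:.0f} bytes")
```

[In this model, going from 10 Mbps to 100 Mbps at the same 2.5 km extent multiplies the minimum frame by ten (4000 bits), while shrinking the extent to 200 m brings it back down (320 bits). That is the trade-off behind the FastE and GigE choices listed above.]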
    People also use **switches**, which connect lots more Ethernet
    networks
        --bridges/switches learn where all the devices are and avoid
          forwarding useless packets
            [table: dst_ether: link]
        --this technology is widely used in organizations, but it could
          never scale to the Internet (too many addresses)

    ASK: how does IP solve the problem?

4. Bootstrapping

    A. Learning one's own IP address: DHCP, static configuration

        "give me an IP address" (sent to broadcast IP address)

    B. Learning others' MAC addresses: ARP

        "who has this IP address?" (also sent to broadcast Ethernet
        address)

        --Okay, so the OS has some IP packet with some destination IP
          address. How does it know which Ethernet address to stamp in
          the destination field of the Ethernet header?
        --If destination host is physically connected, use its MAC
          address
        --Otherwise, use MAC address of next router (given its IP
          address)
        --Either way, OS must map IP addresses into physical addresses
        --How?
        --ARP! (Address Resolution Protocol)
            --Broadcast request for MAC address of the destination IP
              address ("who-has")
            --Everyone on the medium learns the requesting node's MAC
              address and IP address
            --Target machine responds with its MAC address
        --OS keeps ARP cache with IP-->MAC address mappings
            --Periodically discards entries that have not been refreshed
            --type "arp -a" on a Unix machine to see contents of ARP
              cache.

        --[TRACE THROUGH PICTURE OF HOW PACKETS TRAVEL:
            --arp to get MAC address of router
            --packet goes to router
            --router does whatever
            --eventually gets to destination LAN
            --destination router may need to ARP for MAC address of
              destination, given destination IP address
            --packet is delivered to host]

5. NAT and private networks

    --can think of NAT as something like a router; sits between the
      outside world and the internal computers

        creates an internal network: 10.0.2.0/24

        multiple computers in your home get an IP address behind this
        thing.

        NAT maintains state; uses port fields to mux/demux

        ASK: what happens on outbound packets?
             what happens on inbound packets?

    --another use of NAT: create a way for fake hosts to think they're
      listening on public ports

        VM gets: 10.0.2.15
        fake IP router in VirtualBox gets: 10.0.2.2

        HTTP server on VM thinks it's listening on port 80. But
        VirtualBox, when run as a user, isn't privileged, so how does
        this work?

            VirtualBox listens on an unprivileged port
            VirtualBox rewrites...

        but other games are possible too, including making VMs think
        they're listening on privileged ports.

        example of the lab machines

6. Sum up networking unit

    --If you have seen networking before, maybe this perspective is
      helpful
    --If you haven't seen networking before, don't worry!

    Key systems ideas that show up in networking

    * Layering (protocol architecture, packets, software
      implementations)

    * Switching/forwarding/routing (many ways to solve this problem)

    * Naming (wavelengths, MAC addresses, host addresses, host ports,
      host names, file names, author, authentication)
        [ASK: where does naming show up?]

    * Placement of function (edge vs. middle; hardware vs. software; OS
      vs. app)
        [ASK: where have we seen different possibilities for this?]

    * Performance (need to define the metric [round-trip time? or
      bandwidth? latency depends on both], think through what the best
      possible is, and compare to that.)

    * Resource allocation (huge consideration in networks; we haven't
      really covered it in this unit)
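
[Looking back at section 5: a minimal Python sketch of the state a NAT might keep in order to mux/demux on port fields. Everything here is an illustrative assumption rather than how any particular NAT (or VirtualBox) is implemented: the class name Nat, the packet representation as a (src_ip, src_port, dst_ip, dst_port) tuple, the starting port 40000, and the example addresses. Real NATs also distinguish protocols, expire idle entries, and fix up checksums.]

```python
class Nat:
    """Toy NAT: rewrites source (ip, port) outbound, destination inbound."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}   # (private_ip, private_port) -> public_port
        self.in_map = {}    # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Packet leaving the private network: allocate or reuse a mapping."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        # The outside world only ever sees the NAT's public address and port.
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Packet arriving from outside: demux on the destination port."""
        private = self.in_map.get(dst_port)
        if private is None:
            return None  # no state for this port, so the packet is dropped
        return (src_ip, src_port, private[0], private[1])


nat = Nat("203.0.113.7")
sent = nat.outbound("10.0.2.15", 5555, "198.51.100.1", 80)
reply = nat.inbound("198.51.100.1", 80, "203.0.113.7", sent[1])
print(sent)
print(reply)
```

[The reply reaches 10.0.2.15 only because the outbound packet created the mapping first. An unsolicited inbound packet to an unmapped port finds no state, which is why a host behind a NAT cannot accept connections on a public port without some extra arrangement such as the port rewriting mentioned above.]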