Class 21
CS 372H
10 April 2012

On the board
------------

1. Last time 
2. Words about labs 6 and 7
3. Crash course in networking

---------------------------------------------------------------------------

1. Last time

    --scheduling

    --I/O

    --livelock paper

2. words about lab 6 and lab 7

    lab 6:

    gives you the sense you're programming a real piece of hardware.

	--complete with the confusing and frustrating manual (which is
	better than most)

	--this is actually a part of getting real hardware to work,
	unfortunately

    in fact, if you run JOS on a real EE100-based network interface, your
    driver should work with it.

    this lab is a fair bit of work. you need to understand a bunch of
    things to make progress:

	 --> how all the different environments fit together

	 --> what the hardware expects from software

	 --> how to actually provide that in software

	 --> roughly what the sockets API is (roughly)

	 --> roughly what an HTTP GET message looks like (roughly)

	 --> how Web servers fit into this


    lab 7:

    start thinking about project proposal now!

3. Networking

    A. Intro
    B. Physical layer
    C. Big picture
    D. Link layer
    E. Network layer
    F. What do we mean by layering?
    G. ARP
    H. [next time] Zoom out
    I. [next time] Transport layer
    J. [next time] Application layer
    K. [next time] Interface to the networking stack

    A. Intro

    --What's a network?
	--just a bunch of interconnected channels
	--railroad, highway, plumbing, communication, telephone
	--computer!!!!

    --computer networks are interesting
	--end-points highly programmable, middle kind of boring (only
	kind of).
	    --can program all of the nodes!
	    --extremely easy to innovate and develop new uses of the
	    network
	--contrast: telephone network: end-points ridiculously simple, middle has
	complexity.
	    --worse, can't program most phones, need FCC approval for
	    new devices, no visibility, etc.

    --Going to describe the various layers of a network. Case study will
    be what happens when you gain access to a single Web page or send a
    single RPC in NFS.

    --If you're interested in this stuff, take classes in networking! Or
    program away! Or read the RFCs (short for "Request For Comments";
    despite the name, many of them are standards). Few things are as
    open and well-documented as the various protocols that form, and
    run over, the Internet.

    --Network classically explained as being divided into sharply
    distinguished layers. In reality, things are messier. But still
    incredibly useful to think about layering.

	--So begin by talking about the lowest layer and then we'll come
	back to layers again a bit later 

    B. Physical layer

	--signals in a medium
	    --medium: coaxial cable, twisted pair (Ethernet), fiber, radio
	    --signals: endless innovation. different electrical profiles
	    correspond to different sets of bits

	--some media are point-to-point:
	    --fiber, twisted pair

	--some media are shared transmission medium (coax, radio)
	    --any message can be seen by all nodes
	    --but now there is contention

	--speed of light matters!
	    --300,000 km/sec in a vacuum, slower in fiber
	    --New York to CA: ~3000 miles = ~5000 km
	    --propagation time:
		5000 km / (300,000 km/sec) = ~17 msec
	    --round-trip: ~34 msec, assuming no computation

	--Technology improvements are not going to fix this

	--But what the heck? I keep reading that networks keep getting
	faster....

	    --*propagation delay* is never going to improve as long as
	    the theory of relativity stands

	    --throughput -- bits per second -- improves ridiculously
	    well

	    --so how do we take advantage of this?

		--concept: bandwidth-delay product
		    
		    [DRAW CYLINDER:
			bandwidth is the height, delay is the length]	
   
		--get full network utilization if you've got # bytes in
		flight = bandwidth*delay

		--but what if the network isn't doing bulk transfer?
		    
		    --then you'll get poor throughput. ping/pong (send a
		    packet, wait for a response) has terrible throughput

		    --this is one reason why concurrency is absolutely
		    critical for good network utilization: a bunch of
		    low-throughput flows may add up to good utilization
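
	--A minimal sketch of the arithmetic above, in C. The function
	names are made up for illustration, and the 1 Gbit/sec link and
	1500-byte packet in the note after the code are assumptions, not
	numbers from the lecture.

```c
/* One-way propagation delay, in seconds, over a given distance. */
double propagation_delay_sec(double distance_km, double speed_km_per_sec)
{
    return distance_km / speed_km_per_sec;
}

/* Bytes that must be in flight to keep the link full: bandwidth * delay. */
double bandwidth_delay_bytes(double bandwidth_bits_per_sec, double rtt_sec)
{
    return bandwidth_bits_per_sec * rtt_sec / 8.0;
}

/* Throughput (bits/sec) of ping/pong: one packet per round trip. */
double pingpong_throughput_bps(double packet_bytes, double rtt_sec)
{
    return packet_bytes * 8.0 / rtt_sec;
}
```

	--For the New York to CA example (5000 km at 300,000 km/sec) the
	one-way delay is about 16.7 msec. On an assumed 1 Gbit/sec link
	with that ~33 msec RTT, roughly 4.2 MB must be in flight to fill
	the pipe, while ping/pong with 1500-byte packets manages only
	about 360 kbit/sec.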

    
	Note that direct physical connectivity between endpoints is rare.....

	    --instead, communications usually "hop" through multiple
	    devices

	    --[DRAW PICTURE:
		source --> bunch of switches --> destination ]

	    --Allows links and devices to be shared for multiple purposes

	    --Must determine which bits are part of which messages
	    intended for which destinations

	Two kinds of ways to create this indirect connectivity:

	    --Circuit-switched: provide virtual links. Dump bits in at
	    source, they come out at the destination
		--example: the old telephone network. dialing the number
		set up a virtual circuit (and before that, human
		operators set up an actual circuit)
	    
	    --Packet-switched:
		--Pack a bunch of bytes together intended for same destination
		--Slap a _header_ on packet describing where it should go
		--Most networks today are packet switched


    C. Before we go further, let's look at the big picture, or the
    classic Internet technologies:

	computer  - LAN - router - cloud [lots of routers] - router - LAN - computer

	[ Web browser         <transforms packet>
	 ------
	  TCP
	 -----
	  IP ]


    D. Link layer
	
	Ethernet: classic technology

	History:

	    developed at Xerox PARC, intended to help with the office of
	    the future, amazing technology. used constantly. however,
	    not used much in its original configuration (of shared
	    medium) because many links now point-to-point. 

		--but if you plug your computers into a hub, your hardware
		is still going to use Ethernet's key features.
	
	originally designed for shared medium (coaxial cable)

	Packets in Ethernet (and most link layers) are called **frames**
	    [header: 14 bytes. then frame payload, then CRC]
	    [preamble (8 bytes) dst src ethertype <payload> CRC]
	    
	    (DIX frames...Digital, Intel, Xerox)
	    ethertype = 0x0800 (IPv4), 0x0806 (ARP)

	    preamble: helps device recognize start of packet
	    
	    CRC: helps device throw away corrupted packets

	    payload: up to 1500 bytes (roughly)

	    the payload and the header fields (dst, src, ethertype) are
	    usually set by the OS; the NIC typically generates the preamble
	    and CRC
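
	--To make the 14-byte header layout concrete, here is a rough
	sketch of pulling dst, src, and ethertype out of a received frame.
	The struct and function names are hypothetical, and the sketch
	assumes the preamble and CRC are not part of the buffer handed to
	software.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN   6
#define ETH_HDR_LEN    14      /* dst (6) + src (6) + ethertype (2) */
#define ETHERTYPE_IPV4 0x0800
#define ETHERTYPE_ARP  0x0806

/* Hypothetical in-memory form of the header. */
struct eth_header {
    uint8_t  dst[ETH_ADDR_LEN];
    uint8_t  src[ETH_ADDR_LEN];
    uint16_t ethertype;        /* kept here in host byte order */
};

/* Parse the header at the start of a frame.
 * Returns 0 on success, -1 if the buffer is too short. */
int eth_parse(const uint8_t *frame, size_t len, struct eth_header *out)
{
    if (len < ETH_HDR_LEN)
        return -1;
    memcpy(out->dst, frame, ETH_ADDR_LEN);
    memcpy(out->src, frame + ETH_ADDR_LEN, ETH_ADDR_LEN);
    /* ethertype is big-endian on the wire */
    out->ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

	--A receiver would then switch on ethertype to decide which
	higher layer gets the payload that starts at byte 14.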

	Where do Ethernet addresses, otherwise known as MAC addresses,
	come from?

	    [assigned *to* different hardware manufacturers, who then
	    install them in their products]

	    [but you can reset it, which is one reason why tying access
	    to MAC addresses is often easily circumvented: sniff the
	    wire, learn someone else's MAC address, and take that one
	    on.]

	Special Ethernet addresses for broadcast and multicast

	Medium Access Control (**MAC**) protocol governs access to coax
	    --don't transmit when someone else is
		--CSMA/CD (carrier sense, multiple access, collision
		detection)
	    --if you collide, can detect that, use randomized backoff
	    and try again
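	    --a sender needs to keep transmitting for at least one RTT
	    (measured from one end of the extent to the other), so that it
	    is still transmitting when news of a collision reaches it
	        --(above is a bit of a simplification)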

	    Consequence: Ethernet has a maximum end-to-end extent and a
	    minimum frame size (these are specified in standards
	    documents). To see why.....
	
		The 10 Mbps Ethernet standard specified a maximum
		end-to-end extent of 2.5 km --> 
		    <model speed of light in this medium approx. 1.25 x 10^5 km/sec
			because of delays in repeaters>
		    RTT = 5 km / (1.25 x 10^5 km/sec) = 40 microseconds
		    10 Mbps * 40 microseconds = 400 bits = 50 bytes
	
		note that the smallest *useful* frame is only 19 bytes
		(14-byte header + 1 byte of payload + 4-byte CRC), so
		short frames have to be padded up to the minimum

		so what happened with "fast ethernet" of 100 Mbits/sec?
		and 1Gbps Ethernet?

		    --for FastE, they reduced the maximum network diameter to
		    200 meters

		    --for GigE, minimum packet size is 512 bytes

		    --as Ethernet gets faster, this will get more
		    ridiculous, but increasingly people aren't using
		    Ethernet for its ability to manage a shared medium, so
		    it's okay
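
	--The minimum-frame calculation above, as a small sketch (the
	function name is made up; the model is the same simplified one
	used in the notes: minimum frame = rate * RTT):

```c
/* Minimum frame size, in bits, such that a sender is still transmitting
 * when news of a collision at the far end gets back to it. */
double min_frame_bits(double rate_bits_per_sec, double extent_km,
                      double signal_speed_km_per_sec)
{
    double rtt_sec = 2.0 * extent_km / signal_speed_km_per_sec;
    return rate_bits_per_sec * rtt_sec;
}
```

	--With 10 Mbps, 2.5 km, and 1.25 x 10^5 km/sec this gives the 400
	bits from above. Plugging in 100 Mbps with the same extent gives
	4000 bits (500 bytes), which suggests why the faster standards
	had to shrink the extent or grow the minimum frame.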

	Ethernet is awesome, but it cannot scale to the world:

	    --limit on number of nodes
	    --limit on distance
	    --forwarding state doesn't scale
	    --want a lingua franca

	People address node limits and distance with **bridges** that
	connect two Ethernet networks.

	People also use **switches**, which connect lots more
	Ethernet networks

	--bridges/switches learn where all the devices are and avoid
	forwarding useless packets
	    [table: 
		dst_ether: link]

	--this technology is widely used in organizations, but it
	could never scale to the Internet (too many addresses)

	--moreover, we need a lingua franca, the **network layer** so
	that computers connected to different media (DSL, wireless,
	phone, whatever) can communicate
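
	--The [dst_ether: link] table that bridges/switches learn can be
	sketched roughly as below. This is an illustration only: all
	names, the fixed table size, and the linear search are
	assumptions (real switches use hardware lookup structures and age
	out entries).

```c
#include <stdint.h>
#include <string.h>

#define BRIDGE_TABLE_SIZE 64
#define LINK_UNKNOWN      (-1)   /* caller should flood on all other links */

struct bridge_entry {
    uint8_t mac[6];
    int     link;
    int     in_use;
};

struct bridge {
    struct bridge_entry table[BRIDGE_TABLE_SIZE];
};

void bridge_init(struct bridge *b)
{
    memset(b, 0, sizeof *b);
}

/* Record that a frame from `src` was seen arriving on `link`. */
void bridge_learn(struct bridge *b, const uint8_t src[6], int link)
{
    int free_slot = -1;
    for (int i = 0; i < BRIDGE_TABLE_SIZE; i++) {
        if (b->table[i].in_use && memcmp(b->table[i].mac, src, 6) == 0) {
            b->table[i].link = link;       /* host may have moved */
            return;
        }
        if (!b->table[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot >= 0) {
        memcpy(b->table[free_slot].mac, src, 6);
        b->table[free_slot].link = link;
        b->table[free_slot].in_use = 1;
    }
    /* table full: skip learning; frames to this host get flooded */
}

/* Which link should a frame for `dst` go out on? */
int bridge_lookup(const struct bridge *b, const uint8_t dst[6])
{
    for (int i = 0; i < BRIDGE_TABLE_SIZE; i++)
        if (b->table[i].in_use && memcmp(b->table[i].mac, dst, 6) == 0)
            return b->table[i].link;
    return LINK_UNKNOWN;
}
```

	--The table holds one entry per MAC address seen, which is the
	"forwarding state doesn't scale" problem in miniature.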

    E. Network layer

	Internet Protocol (IP): classic technology

	--IP used to connect multiple networks

	--Runs over a variety of physical networks
    
	--Most computers today speak IP

	Fundamentals

	--Every host has a unique 4-byte IP address
	(Or at least thinks it has, when there are address shortages)

	    --for example: 
		mig.cs.utexas.edu is 128.83.120.150
		www.cs.utexas.edu is 128.83.120.139

	--Based on a destination's IP address, packets are routed

	--Address space structured to make routing practical at global
	scale

	--For example, UT Austin gets:

	    128.62.*.*
	    128.83.*.*
	    146.6.*.*

	    etc.

	    (the top-level assignment is by IANA, which delegates to ARIN
	    (for North America), which assigns to either UT or UT's
	    providers.)
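
	--A block like 128.83.*.* is the 16-bit prefix 128.83.0.0/16.
	Checking whether an address falls inside a block is a mask and a
	compare; a minimal sketch, with hypothetical function names:

```c
#include <stdint.h>
#include <stdio.h>

/* Parse dotted-quad text into a host-byte-order 32-bit value.
 * Returns 0 on success, -1 on malformed input. */
int ipv4_parse(const char *s, uint32_t *out)
{
    unsigned a, b, c, d;
    char extra;
    if (sscanf(s, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    *out = (a << 24) | (b << 16) | (c << 8) | d;
    return 0;
}

/* Does addr fall inside prefix/prefix_len (prefix_len in 0..32)? */
int ipv4_in_prefix(uint32_t addr, uint32_t prefix, int prefix_len)
{
    uint32_t mask = prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
    return (addr & mask) == (prefix & mask);
}
```

	--So www.cs.utexas.edu (128.83.120.139) matches 128.83.0.0/16,
	and a router far from Austin needs one entry for the whole block
	rather than one per host.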

	--NOTE: there is a sharp separation between an entity's IP
	address and its attachment point in the network

	    --*routing* solves the problem of knowing where all of the
	    hosts are attached, and how to reach them

	    --Dijkstra's algorithm, Link state, path vector, etc., etc.
	    
	--[DRAW PICTURE of a network with a bunch of nodes and edges,
	one labeled S, one labeled D, and a packet flowing]

	--Result: number of routing entries across the Internet vastly
	smaller than the number of addresses

	    --this was hugely important for scaling. still is, though
	    becoming less so (as memory gets cheaper)

	Upshot

	--Packets need IP addresses in addition to MAC addresses

	--Refer to picture

    F. Key idea: layering

	--packets inside packets (though different layers packetize
	differently from each other, so the picture below is a
	simplification)

	--[DRAW PICTURE 2x, ONE ON EACH SIDE]

		[app_payload]
	     [TCP header | app_payload]
	  [IP header | TCP header | app_payload]
	[Eth header | IP header | TCP header | app_payload]

	--[MAP THIS ONTO THE DIAGRAM OF THE BIG PICTURE, SHOWING THAT IP
	PIECE TRAVELS MOSTLY UNADULTERATED]

	--An IP router _forwards_ a packet from one Ethernet to another,
	creating a new Ethernet packet containing the same IP packet
    
	--In principle, layers should not depend on each other. In
	practice, there are annoying dependencies (TCP's checksum
	depends on fields in IP header)

	--Different layers have different functions

	    --link layer: framing and media access

	    --network layer:
		--forwarding
		--routing (NOTE: routing != forwarding)
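
	--The "packets inside packets" picture can be sketched as
	repeated wrapping. This is a simplification under assumptions:
	headers are treated as opaque byte arrays, the 20-byte IP and TCP
	sizes are the option-free minimums, and the function name is
	made up. It also copies at every layer, which real stacks try to
	avoid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN 14
#define IP_HDR_LEN  20   /* IPv4 header without options */
#define TCP_HDR_LEN 20   /* TCP header without options */

/* Build out = [hdr | inner]. Returns the new length, or 0 if `out`
 * is too small to hold both. */
size_t wrap(uint8_t *out, size_t out_cap,
            const uint8_t *hdr, size_t hdr_len,
            const uint8_t *inner, size_t inner_len)
{
    if (hdr_len + inner_len > out_cap)
        return 0;
    memcpy(out, hdr, hdr_len);
    memcpy(out + hdr_len, inner, inner_len);
    return hdr_len + inner_len;
}
```

	--Calling wrap() three times (TCP, then IP, then Ethernet) on a
	3-byte payload yields a 57-byte frame with the payload at offset
	54. A router would strip and replace only the outermost 14 bytes.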
	  
    G. ARP
    
	--Okay, so the OS has some IP packet with some destination IP
	address. How does it know which Ethernet address to stamp in the
	destination field of the Ethernet header?

	    --If destination host is on the same link (LAN), use its MAC address

	    --Otherwise, use MAC address of next router (given IP address)

	    --Either way, OS must map IP addresses into physical addresses

	    --How?

	    --ARP! (Address Resolution Protocol)
	

	--Broadcast request for MAC address of the destination IP
	address

	    "who-has"

	    --Everyone on the medium learns the requesting node's MAC
	    address and IP address

	    --Target machine responds with its MAC address

	--OS keeps ARP cache with IP-->MAC address mappings

	    --Periodically discards entries that have not been refreshed

	    --type "arp -a" on a Unix machine to see contents of ARP
	    cache.
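
	--A rough sketch of the two decisions above: which IP address to
	ARP for, and how a cache with expiry might look. Everything here
	is hypothetical (names, the 60-second lifetime, the fixed-size
	table, using the netmask to decide "same link"); real systems
	differ in the details.

```c
#include <stdint.h>
#include <string.h>

#define ARP_CACHE_SIZE 16
#define ARP_TTL_SEC    60   /* assumed lifetime of an unrefreshed entry */

struct arp_entry {
    uint32_t ip;
    uint8_t  mac[6];
    long     learned_at;    /* seconds; the caller supplies the clock */
    int      valid;
};

struct arp_cache {
    struct arp_entry e[ARP_CACHE_SIZE];
};

/* ARP for the destination itself if it is on our subnet, else the router. */
uint32_t arp_target_ip(uint32_t dst, uint32_t my_ip, uint32_t netmask,
                       uint32_t router_ip)
{
    return ((dst & netmask) == (my_ip & netmask)) ? dst : router_ip;
}

/* Insert or refresh a mapping learned from ARP traffic. */
void arp_insert(struct arp_cache *c, uint32_t ip, const uint8_t mac[6], long now)
{
    int slot = -1;
    for (int i = 0; i < ARP_CACHE_SIZE; i++) {
        if (c->e[i].valid && c->e[i].ip == ip) {
            slot = i;                 /* refresh the existing entry */
            break;
        }
        if (slot < 0 &&
            (!c->e[i].valid || now - c->e[i].learned_at > ARP_TTL_SEC))
            slot = i;                 /* remember first free or stale slot */
    }
    if (slot < 0)
        slot = 0;                     /* cache full: crude eviction */
    c->e[slot].ip = ip;
    memcpy(c->e[slot].mac, mac, 6);
    c->e[slot].learned_at = now;
    c->e[slot].valid = 1;
}

/* Returns 1 and fills mac_out on a fresh hit; 0 means "send a who-has". */
int arp_lookup(struct arp_cache *c, uint32_t ip, long now, uint8_t mac_out[6])
{
    for (int i = 0; i < ARP_CACHE_SIZE; i++) {
        if (!c->e[i].valid || c->e[i].ip != ip)
            continue;
        if (now - c->e[i].learned_at > ARP_TTL_SEC) {
            c->e[i].valid = 0;        /* discard entry that was not refreshed */
            return 0;
        }
        memcpy(mac_out, c->e[i].mac, 6);
        return 1;
    }
    return 0;
}
```

	--The cache starts out zeroed (no valid entries). A miss from
	arp_lookup() is the cue to broadcast a request and queue the
	packet until the reply arrives.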

	--[TRACE THROUGH PICTURE OF HOW PACKETS TRAVEL:
	    --arp to get MAC address of router

	    --packet goes to router

	    --router does whatever

	    --eventually gets to destination LAN

	    --destination router may need to ARP for MAC address of
	    destination, given destination IP address

	    --packet is delivered to host]

---------------------------------------------------------------------------
Encourage you to poke around:

    --"arp -a" (Unix)
    --"ifconfig -a" (Unix)
    --"netstat -arn" (Unix)
    --"ipconfig /all" (Windows)
    --"route print" (Windows)

---------------------------------------------------------------------------

NEXT TIME......

    H. Zoom out: where are we?

	I hope to have convinced you that if

	    (a) a computer knew the IP address of a local router; and

	    (b) that computer knew the IP address of the destination; and

	    (c) we have a network that knows how to forward packets

	then

	    --that computer could arrange for packets to travel to its
	    destination

	Okay, but how do we get (a)--(c)?

	(a) two possibilities:

	    --manual configuration
		--BTW, even edge routers get this thing configured
		manually. A third-tier ISP is told: "here's the IP
		address of the other end of this link."
		--If you have a cable modem, it does this	

	    --DHCP

	(b) Naming system: Domain Name System (DNS)

	(c) [DRAW PICTURE OF ROUTING: BGP, OSPF, etc.; ANOTHER FUNCTION
	OF THE NETWORK LAYER]

	WHAT'S NEXT?

	--we do not yet have a way to indicate what application or process
	on the destination computer gets the packet

	--we also don't cleanly handle things like failure, congestion in
	the network, etc.

    I. Transport layer

	Motivation: failure, demultiplexing, flow control, etc.

	DRAW PICTURE:

			layer                          role

		TCP    UDP    ICMP("ping")	{flow control, port space}
			    IP			{forwarding}
			Ethernet		{framing}
		    radio  copper_wires  fiber  {signal propagation}
	    
	
	Several types of error can affect packet delivery
	--Bit errors (e.g., electrical interference, cosmic rays)
	--Packet loss (packets dropped when queues fill on overload)
	--Link and node failure

	In addition, properly delivered frames can be delayed,
	reordered, even duplicated

	How much should OS (or the networking modules) expose to application?
	--Some failures cannot be masked (e.g., server dead)
	--Others can be (e.g., retransmit lost packet)
	--But masking errors may be wrong for some applications (e.g.,
        old audio packet no longer interesting if too late to play)

	UDP and TCP most popular protocols on IP
	--Both use 16-bit _port_ number as well as 32-bit IP address
	--Applications _bind_ to a port and receive traffic to that port
	    (discuss later what the interface is)

	UDP -- User Datagram Protocol
	--Exposes packet-switched nature of Internet
	--Sent packets may be dropped, reordered, even duplicated
        (but generally not corrupted). Application's problem to deal
	with these errors
  
	TCP -- Transmission Control Protocol
	--Provides illusion of a reliable "pipe" between two
	  processes on two different machines
	--Masks lost and reordered packets so apps don't have to worry
	--Handles congestion and flow control

	Uses of TCP
	--Most applications use TCP
	--Easier interface to program to (reliability)
	--Automatically avoids congestion (don't need to worry about
	  taking down network)

	Many issues involved in implementing TCP
	--Wants multiple packets outstanding
	--But want to react to congestion in the network (want to save
	network from congestion collapse)
	--TCP has to "learn" parameters per-connection
	--Connection set-up and tear-down is complicated
	    --sender never knows if its last packet was lost
	    --so has to keep state around after connection close
	--Tons of hacks for good performance

	Issues directly for OS too
	--Have to track unacknowledged data
	    --Keep a copy around until recipient acknowledges it
	    --Keep timer around to retransmit if no ack
	    --Receiver must keep out of order segments and reassemble

	--When to wake process receiving data?
	    --E.g., sender calls write (fd, message, 8000);
	    --First TCP segment arrives, but is only 512 bytes
	    --Could wake recipient, but useless w/o full message
	    --TCP sets PUSH bit at end of the 8000 bytes, telling the
	    receiver to hand the data to the application

	--When to send short segment, vs. wait for more data
	    --Usually send only one unacked short segment (Nagle's algorithm)
	    --But bad for some apps, so provide TCP_NODELAY option

	--Must ack received segments very quickly
	    --Otherwise, effectively increases RTT, increasing
	    bandwidth-delay product but without increase in bandwidth
	    --> useful throughput declines

	Servers typically listen on well-known ports
	    SSH: 22
	    Email (SMTP): 25
	    Finger: 79
	    Web / HTTP: 80

	--Example:  Interacting with www.cs.utexas.edu
	    --Browser resolves IP address of www.cs.utexas.edu 
	    --Browser connects to TCP port 80 on that IP address
	    --Over TCP connection, browser requests and gets home page

---------------------------------------------------------------------------

Aside:

NAT and lab 6

    --can think of NAT as something like a router; sits between the
    outside world and the internal computer

	creates an internal network: 10.0.2.0/24 

	JOS gets: 10.0.2.15
	fake IP router gets: 10.0.2.2

    --in lab, QEMU runs with tcp:<some_port>::7 which means:

	--QEMU will listen on some_port

	--QEMU will forward connections that are to
	    ip_addr_of_machine:some_port to 
	    10.0.2.15:7

---------------------------------------------------------------------------

    J. Application layer

	Example: HTTP

	Normally, HTTP servers, otherwise known as Web servers, run on
	port 80

	when your Web browser connects to a URL, by default it makes
	    requests on port 80, meaning it stamps "80" as the destination
	    port in its packets
	you can direct your Web browser to make requests on any port,
	    though, like this:
		http://<name of some machine>:port_num

	    In that case, the browser itself will address its packets to
	    the IP address that corresponds to the name of the machine
	    and destination port port_num instead of destination port
	    80.
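
	--A minimal sketch of how a client might pull the host and port
	out of such a URL, defaulting to 80. The function name is
	hypothetical, and only plain "http://host[:port][/path]" URLs are
	handled.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HTTP_DEFAULT_PORT 80

/* Split "http://host[:port][/path]" into host and port.
 * Returns 0 on success, -1 if the URL is not of that shape. */
int url_host_port(const char *url, char *host, size_t host_cap, int *port)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);
    if (strncmp(url, prefix, plen) != 0)
        return -1;

    const char *h = url + plen;
    size_t hlen = strcspn(h, ":/");        /* host ends at ':' or '/' */
    if (hlen == 0 || hlen >= host_cap)
        return -1;
    memcpy(host, h, hlen);
    host[hlen] = '\0';

    *port = HTTP_DEFAULT_PORT;
    if (h[hlen] == ':') {
        char *end;
        long p = strtol(h + hlen + 1, &end, 10);
        if (end == h + hlen + 1 || (*end != '\0' && *end != '/'))
            return -1;                     /* no digits, or junk after them */
        if (p < 1 || p > 65535)
            return -1;                     /* ports are 16 bits */
        *port = (int)p;
    }
    return 0;
}
```

	--The host then goes to name resolution and the port becomes the
	destination port of the TCP connection.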

	Messages look like this:

	    Browser --> Server:
		"GET /pics/dog.jpg HTTP/1.0\r\n
		 \r\n"
	    
	    Server --> Browser:  
		"HTTP/1.0 404 Not Found\r\n"
		or
				 
		 "HTTP/1.0 200 OK\r\n
		 header1: value1\r\n
		 header2: value2\r\n
		 \r\n
		 [the bytes in dog.jpg]"

	    [Keep in mind that the above is happening inside TCP, and
	    that TCP is presenting a reliable byte stream to the layers
	    above it.]
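
	--As a sketch of both directions: formatting a GET request and
	pulling the status code out of the first line of a response. The
	function names are made up, and this handles only the bare
	HTTP/1.0 form with no request headers.

```c
#include <stdio.h>

/* Format an HTTP/1.0 GET request for `path` into buf. The empty line
 * (second "\r\n") marks the end of the request.
 * Returns the request length, or -1 if buf is too small. */
int http_build_get(char *buf, size_t cap, const char *path)
{
    int n = snprintf(buf, cap, "GET %s HTTP/1.0\r\n\r\n", path);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Extract the numeric status code from a status line such as
 * "HTTP/1.0 404 Not Found\r\n". Returns -1 if it does not parse. */
int http_status_code(const char *response)
{
    int major, minor, code;
    if (sscanf(response, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

	--Because this rides on a TCP byte stream, the status line may
	arrive split across several reads; a real client buffers until it
	has seen the "\r\n" before parsing.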

	QUESTION: where does NFS sit in this picture?
	    [answer: runs over UDP or TCP on some port, either
	    well-known, or determined with a port mapping service
	    running on the server]

    K. What is the interface to the networking stack?

     --Application programmer classically sees *sockets*. 

	Inspired by pipes 
	    int pipe(int fds[2])
		--Allow Inter-process communication on one machine
		--Writes to fds[1] will be read on fds[0]
		--Can give each file descriptor to a different process
		(with fork)
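
	--A small sketch of the pipe behavior described above, kept to a
	single process for brevity (normally the two ends would be split
	across processes with fork). The function name is made up.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg into a pipe and read it back from the other end.
 * Returns the number of bytes read, or -1 on error. Assumes msg is
 * short enough to fit in the pipe's buffer, so write() does not block. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_cap)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    ssize_t n = -1;
    size_t len = strlen(msg);
    if (write(fds[1], msg, len) == (ssize_t)len)   /* fds[1]: write end */
        n = read(fds[0], out, out_cap);            /* fds[0]: read end  */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```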

	The idea is: let's do the same thing across machines:
	    **SOCKETS**

	Write data on one machine, read it on another

	*sockets* can represent many different network protocols, but:

	--classically an interface to TCP/IP and UDP
	--sometimes an interface to IP or Ethernet (raw sockets)
	--sockets API

	/* senders and receivers */
	int sockfd = socket(AF_INET, SOCK_STREAM /* or SOCK_DGRAM */, 0);
	    [note: with AF_INET in the first position, the setting of
	    SOCK_STREAM vs SOCK_DGRAM controls whether the app's data is
	    going to go over TCP or UDP].
	    
	    [with UDP sockets, send atomic messages that may be
	    reordered or lost]

	    [with TCP sockets, bytes written on one end are read on the
	    other, provided no failures. but no guarantees that reads
	    will return the full amount requested ... or that the data
	    will be packetized according to the number of times the
	    sender called send(). With TCP, you *must* sit there in a
	    loop and keep reading. You know you're done because either
	    (a) the application-level protocol defines where message
	    boundaries begin and end or (b) the other end closed the
	    connection, in which case the read returns 0]

	int rc = close(sockfd);
	int nready = select(nfds, &readfds, &writefds, &exceptfds, &timeout);
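
	--The "sit there in a loop and keep reading" advice for TCP can
	be sketched as below. read_full is a made-up helper name, not
	part of the sockets API; it works on any descriptor, including a
	connected TCP socket.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep reading until `want` bytes have arrived, the peer closes the
 * connection, or an error occurs. Returns the number of bytes read
 * (fewer than `want` only if EOF came first), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, (char *)buf + got, want - got);
        if (n == 0)
            break;                 /* peer closed: no more data is coming */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

	--This covers case (a) when the protocol says how many bytes to
	expect; for case (b), call it with a large `want` and treat the
	short return as end of message.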

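	struct sockaddr_in {
	    short sin_family;           /* AF_INET */
	    unsigned short sin_port;    /* port, in network byte order */
	    uint32_t sin_addr;          /* IPv4 address, in network byte order
	                                   (really a struct in_addr) */
	    char sin_zero[8];           /* padding */
	};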

	/* senders */
	int rc = connect(sockfd, (struct sockaddr *)&addr, addrlen);
	int rc = send(sockfd, buf, len, 0);
	int rc = sendto(sockfd, buf, len, 0, (struct sockaddr *)&addr, addrlen);

	/* receivers */
	int rc = bind(sockfd, (struct sockaddr *)&addr, addrlen);
	int rc = listen(sockfd, backlog_len);
	int connfd = accept(sockfd, (struct sockaddr *)&addr, &addrlen);
	int rc = recv(connfd, buf, len, 0);
	int rc = recvfrom(sockfd, buf, len, 0, (struct sockaddr *)&addr, &addrlen);
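
	--To see the calls above fit together, here is a sketch that
	plays both roles in one process over the loopback interface. It
	assumes a POSIX system with loopback networking available; the
	function name is hypothetical, and a real server and client
	would be separate programs.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg from a "sender" socket to a "receiver" socket over TCP.
 * Returns the number of bytes the receiver got, or -1 on error. */
ssize_t tcp_loopback(const char *msg, char *out, size_t out_cap)
{
    int lfd = -1, cfd = -1, sfd = -1;
    size_t len = strlen(msg), got = 0;
    ssize_t result = -1;
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);          /* 0: let the OS pick a free port */

    /* receiver: socket, bind, listen; then learn which port we got */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 ||
        bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &addrlen) < 0)
        goto done;

    /* sender: socket, connect (the kernel queues the new connection) */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto done;

    /* receiver: accept returns a new descriptor for this connection */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto done;

    if (send(cfd, msg, len, 0) != (ssize_t)len)
        goto done;
    close(cfd);                        /* sender done: receiver will see EOF */
    cfd = -1;

    /* recv may return fewer bytes than were sent, so loop until EOF */
    while (got < out_cap) {
        ssize_t r = recv(sfd, out + got, out_cap - got, 0);
        if (r < 0)
            goto done;
        if (r == 0)
            break;
        got += (size_t)r;
    }
    result = (ssize_t)got;

done:
    if (cfd >= 0) close(cfd);
    if (sfd >= 0) close(sfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

	--Note the two descriptors on the receiving side: the listening
	socket stays open for further connections, while the descriptor
	from accept carries this one connection's bytes.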


	NOTES:

	* connections are named by 5 components:

	    protocol (TCP), local IP address, local port, remote IP
	    address, remote port

	* UDP does not require connected sockets

	* OS tracks all of this state in a PCB (protocol control block).

    --What does kernel see, and what interfaces does it invoke?

	TX direction:

	--usually gets payloads from higher levels and implements
	TCP/IP, UDP, IP, and part of Ethernet

	--usually hands most of an Ethernet frame to the network device

	--but not always: could imagine a Web server implemented
	entirely in the kernel, or even a Web server implemented on a
	network card

	--(in JOS, the entire networking stack is implemented in user
	space. that is the function of the lwIP library.)

	RX direction:

	--when a packet arrives, use 5-tuple (above) to find PCB and
	figure out what to do with packet
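
	--A rough sketch of that lookup. The struct layout, the owner_fd
	field, and the linear scan are all assumptions for illustration;
	a real kernel would typically hash the 5-tuple.

```c
#include <stddef.h>
#include <stdint.h>

struct five_tuple {
    uint8_t  protocol;      /* IP protocol number: 6 for TCP, 17 for UDP */
    uint32_t local_ip;
    uint16_t local_port;
    uint32_t remote_ip;
    uint16_t remote_port;
};

struct pcb {
    struct five_tuple id;
    int               owner_fd;   /* hypothetical: socket that gets the data */
};

static int tuple_equal(const struct five_tuple *a, const struct five_tuple *b)
{
    return a->protocol == b->protocol &&
           a->local_ip == b->local_ip && a->local_port == b->local_port &&
           a->remote_ip == b->remote_ip && a->remote_port == b->remote_port;
}

/* Find the PCB for an arriving packet; NULL means no connection matches. */
struct pcb *pcb_lookup(struct pcb *pcbs, size_t n, const struct five_tuple *pkt)
{
    for (size_t i = 0; i < n; i++)
        if (tuple_equal(&pcbs[i].id, pkt))
            return &pcbs[i];
    return NULL;
}
```

	--Two connections from the same remote host to the same local
	port differ only in remote_port, which is why all five
	components are needed.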

    Note that to avoid lots of copies, OS may not actually store packets
    contiguously. May store linked list of buffers. Each buffer is
    either a packet header or a payload
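
    --A minimal sketch of such a buffer chain. The struct and function
    names are hypothetical; it only shows how a packet's length and
    contents are defined by walking the list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One link in a packet: either a header or a chunk of payload. */
struct pktbuf {
    struct pktbuf *next;
    const uint8_t *data;
    size_t         len;
};

/* Total packet length: the sum over the chain. */
size_t pktbuf_total_len(const struct pktbuf *b)
{
    size_t total = 0;
    for (; b != NULL; b = b->next)
        total += b->len;
    return total;
}

/* Copy the chain into one contiguous buffer (for example, for a device
 * that needs the frame in one piece). Returns the number of bytes
 * copied, or 0 if `out` is too small. */
size_t pktbuf_flatten(const struct pktbuf *b, uint8_t *out, size_t cap)
{
    size_t off = 0;
    if (pktbuf_total_len(b) > cap)
        return 0;
    for (; b != NULL; b = b->next) {
        memcpy(out + off, b->data, b->len);
        off += b->len;
    }
    return off;
}
```

    --With this layout, a lower layer adds its header by linking one
    more buffer at the front rather than copying the payload.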

    Network interface cards (NICs)
 
	--Used to be dumb

	--Now sometimes do lots of stuff

	--You will get a network interface card working in lab 6


    Kernels also do *routing*

	--Suppose a machine has multiple NICs connected to different
	networks. The kernel gets a packet (either from one of the NICs
	or from an application). Which NIC should it go out on?
	    
	--kernel generally looks at the destination address of the
	packet and does a lookup in a table that it maintains:
	    [IP address, prefix-length] --> next-hop 

	next-hop is the physical interface on which to send the packet out

	This is the same routing function that Internet routers do

	there are data structures to make it efficient in time and space
	(radix trees are a decent first cut)
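
	--A sketch of the lookup itself. Note the swap: this uses a plain
	linear scan for longest-prefix match instead of a radix tree, to
	keep the example short. The names are made up, and next_hop is
	just an interface index here.

```c
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t prefix;       /* host byte order */
    int      prefix_len;   /* 0..32 */
    int      next_hop;     /* index of the interface to send on */
};

static uint32_t prefix_mask(int len)
{
    return len == 0 ? 0 : 0xFFFFFFFFu << (32 - len);
}

/* Longest-prefix match by linear scan. Returns the chosen route, or
 * NULL if nothing matches (that is, no default route is installed). */
const struct route *route_lookup(const struct route *tbl, size_t n, uint32_t dst)
{
    const struct route *best = NULL;
    for (size_t i = 0; i < n; i++) {
        uint32_t m = prefix_mask(tbl[i].prefix_len);
        if ((dst & m) != (tbl[i].prefix & m))
            continue;                      /* dst is outside this prefix */
        if (best == NULL || tbl[i].prefix_len > best->prefix_len)
            best = &tbl[i];                /* more specific route wins */
    }
    return best;
}
```

	--A default route is simply an entry with prefix-length 0: it
	matches everything and loses to any more specific entry.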