Class 22
CS 372H  13 April 2010

On the board
------------
1. NFS
2. Networking

---------------------------------------------------------------------------

1. NFS: case study of a network file system

Networked file systems:

    --What's a network file system?
        --Looks like a file system (e.g., FFS) to applications
        --But data potentially stored on another machine
            --Reads and writes must go over the network
            --Also called distributed file systems

    --Advantages of network file systems
        --Easy to share if files available on multiple machines
        --Often easier to administer servers than clients
        --Access way more data than fits on your local disk
        --Network + remote buffer cache faster than local disk

    --Disadvantages
        --Network + remote disk slower than local disk
        --Network or server may fail even when client OK
        --Complexity, security issues

NFS: seminal networked file system (NFS = Network File System)

    * Intro and background
    * How it works
    * Statelessness
    * Transparency
    * Security

A. Intro and background

    --Reasons to study it
        --case study of RPC transparency
        --NFS was very successful
            --Still in widespread use today (we're using it on the
              class machines)
            --Much research uses it
        --Can view much networked file systems research as fixing
          problems with NFS

    --Is this actually a good paper? (Very interesting paper, but
      perhaps not the best example of technical communication. Lots of
      Unix jargon, use of "I" on a multi-author paper (!), thanking of
      co-authors in the acknowledgments (!), references section not
      very useful, etc., etc.)

    --background and context
        --designed in mid 1980s
        --before this, Sun was selling Unix workstations
            --diskless (to save money)
            --"ND" network disk protocol (use one big central disk, and
              let the diskless workstations use it)
            --allowed the disk to live somewhere else, but did not
              allow for a shared file system

    More detail on context: NFS arose in the early-to-mid 1980s. Prior
    to NFS, each computer had its own private disk and file system.
    That worked for expensive central time-sharing systems when there
    weren't many workstations. But in the LAN environment, with
    workstations becoming cheaper, people wanted ways to share files
    within organizations. The goal was to allow a user to sit down at
    any workstation and access his or her files even though the files
    might live on a central server.

    --Advantages:
        --convenience (get your files anywhere)
        --cost (buy workstations without disks)
        --only the sysadmin has to know where files live. shell, user
          programs, etc. do _not_ have to know (way better than
          competitors at the time)

B. How it works

    --What's the software/hardware structure?

        [DRAW PICTURE]

        --array of vnodes in both client and server
            --vnode is like a primitive C++ or Java object, with
              methods
            --represents an open (or openable) file
        --Bunch of generic "vnode operations":
            --lookup, create, open, close, getattr, setattr, read,
              write, fsync, remove, link, rename, mkdir, rmdir,
              symlink, readdir, readlink, ...
        --Called through function pointers, so most system calls don't
          care what type of file system a file resides on
        --NFS implements vnode operations through RPC
            --Client request to server over network, awaits response
            --Each system call may require a series of RPCs
        --System mostly determined by NFS RPC **protocol**

    --How does it work?

        [TRACE RPC FOR OPEN AND WRITE: LOOKUP AND WRITE]

    --nice separation between interface and implementation
        --loopback server
        --replace NFS server altogether
        --can make lots of things *look* like a file system just by
          implementing the NFS interface. extremely powerful technique
        --this gain mostly arises because of the power of RPC and
          modularity, rather than anything about NFS in particular

    --What does a file handle look like?

        [FS ID | inode # | generation #]

        Why not embed the file name in the file handle? (File names can
        change; that would mess everything up. The client needs to use
        an identifier that's invariant across such renames.)

        How does the client know what file handle to send?
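The file handle layout above can be sketched as a small structure. This is a toy in Python, not Sun's implementation; the field, class, and function names are made up for illustration.

```python
from collections import namedtuple

# Toy NFS file handle: [FS ID | inode # | generation #]. In the real
# protocol the handle is an opaque byte string; these fields are
# illustrative.
FileHandle = namedtuple("FileHandle", ["fsid", "inum", "gen"])

class ClientVnode:
    """Client-side vnode: remembers the file handle that LOOKUP
    returned, so later READ/WRITE RPCs know what handle to send."""
    def __init__(self, fh):
        self.fh = fh

def lookup(server_namespace, name):
    """Sketch of the LOOKUP RPC: name -> file handle. Note that no
    file name ever appears inside the handle itself."""
    inum, gen = server_namespace[name]
    return FileHandle(fsid=1, inum=inum, gen=gen)

# A server-side rename changes the name->i-node mapping, but not the
# handle the client already cached with its vnode.
ns = {"paper.tex": (172, 3)}
vn = ClientVnode(lookup(ns, "paper.tex"))
ns["draft.tex"] = ns.pop("paper.tex")   # rename on the server
```

Because the handle names the i-node rather than the path, the client's cached handle survives the rename, which is exactly why the file name is not embedded in the handle.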
        (stored with the vnode)

C. Statelessness

    --What the heck do they mean? The file server keeps files; that's
      certainly state!!

    --What they really mean is that every network protocol request
      contains all of the information needed to carry out that request,
      without relying on anything remembered from previous protocol
      requests.
        --convince yourself of this by looking at the calls
        --but are operations really idempotent?
            --what happens if two RENAMEs are sent, and the reply to
              the first one is lost? The client sends another one, and
              the second one returns an error code, even though the
              operation conceptually succeeded.
            --similar issue with "mkdir", "create", etc.

    --How are READ and WRITE stateless? (Answer: they contain the disk
      address (the i-node at the server) as well as an offset.)

    --What are the advantages and disadvantages?
        +: simplifies implementation
        +: simplifies server failure recovery
        -: messes up traditional Unix semantics; will discuss below

    --What happens if the server reboots while the client has a file
      open?
        --Nothing!
        --Client just uses the same file handle. (file handles are
          usable across server failures.)
        --NOTE: a crashed and rebooted server looks the same to clients
          as a slow server. Which is cool.

    --Why doesn't NFS have RPCs called OPEN() and CLOSE()?

D. Transparency and non-traditional Unix semantics

    --Note: transparency is not just about preserving the syscall API
      (which they do). Transparency requires that the system calls
      *mean* the same things. Otherwise, existing programs may compile
      and run but experience different behavior. In other words,
      formerly correct programs may now be incorrect. (This happened
      with NFS because of its close-to-open consistency.)

    --what is the generation number for?

        (*) What if client A deletes a file, and it (or another client)
        creates a new one that uses the same i-node?
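To make the scenario in (*) concrete, here is a toy server (illustrative Python; the names and the error string are made up stand-ins for the protocol's stale-file-handle error) that stamps each file handle with the i-node's current generation number:

```python
STALE_FH = "ESTALE"   # stand-in for the protocol's stale-FH error

class Server:
    """Toy server keeping a per-i-node generation number on 'disk'."""
    def __init__(self):
        self.gen = {}    # i-node # -> current generation number
        self.data = {}   # i-node # -> file contents

    def allocate(self, inum, contents):
        # Every allocation of an i-node bumps its generation number.
        self.gen[inum] = self.gen.get(inum, 0) + 1
        self.data[inum] = contents
        return (inum, self.gen[inum])    # FH carries the current gen

    def read(self, fh):
        inum, gen = fh
        if gen != self.gen[inum]:
            return STALE_FH   # FH refers to a since-reused i-node
        return self.data[inum]

srv = Server()
old_fh = srv.allocate(7, "client A's file")  # A holds this FH
srv.allocate(7, "a different file")          # i-node 7 deleted, reused
```

Client A's old handle now fails with the stale-FH stand-in instead of silently reading the new file's data.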
            --generation number prevents this
            --client gets a "stale FH" error
            --served file systems must support generation numbers
            --So not fully transparent

        More detail:

        For *all* files that could ever be exposed by NFS, the server
        stores, in the i-node on disk, a generation number. Every time
        the server allocates a given i-node, it increments the i-node's
        generation number. When the server passes a FH to the client
        (say, in response to a LOOKUP RPC from the client), the server
        puts the given i-node's _current_ generation number in the FH.

        How: The way the generation number avoids problems that arise
        from the special case in (*) is as follows: for each request
        the client makes of the server, the server checks to see
        whether the generation number in the client's FH matches the
        on-disk generation number for the i-node in question. If so,
        the client has a current FH, and the special case has not
        arisen. If not, the client's generation number must be older,
        so we are in the special case, and the client gets a "stale FH"
        error when it tries to READ() or WRITE().

        Why: Without the generation number, the special case in (*)
        would cause a client to read and write data it had no business
        reading or writing (since the given i-node now belongs to some
        other file).

    --non-traditional Unix semantics

        (i) we mentioned one example above: error returns on successful
        operations. go through some other examples of new semantics

        (ii) server failure
            --previously, open() failed only if the file didn't exist
            --now, if the server has failed, open() can fail or apps
              can hang
              [fundamental trade-off if the server is remote]

        (iii) deletion of open files
            --What if client A deletes a file that client B has "open"?
                --Unix: my reads still work (file exists until all
                  clients close() it)
                --NFS: my reads fail
            --Why?
                --To get Unix-like behavior using NFS, the server would
                  have to keep track of all kinds of stuff. That state
                  would have to persist across reboots.
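The behavior in (iii) can be made concrete with a toy sketch (illustrative names; it assumes, per the statelessness discussion, that the server keeps no record of which clients have a file open):

```python
class NfsLikeServer:
    """Toy server for (iii): it keeps no list of which clients have a
    file 'open', so a delete takes effect immediately for everyone."""
    def __init__(self):
        self.files = {"shared.txt": "hello"}

    def read(self, name):
        # Stand-in error string for the stale-file-handle failure.
        return self.files.get(name, "ESTALE")

    def remove(self, name):
        del self.files[name]

srv = NfsLikeServer()
fh_b = "shared.txt"        # client B "opens": server records nothing
srv.remove("shared.txt")   # client A deletes the file
result = srv.read(fh_b)    # B's next READ now fails
```

Under local Unix, B's reads would keep working until B's last close(); here the server has no way to know B still cares about the file.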
            --But they wanted a stateless server
            --So NFS just does the wrong thing
                --RPCs fail if another client deletes a file you have
                  open.

        detour:

            --Server must flush to disk before returning (why?)
                --I-node with new block # and new length safe on disk.
                --Indirect block safe on disk.
            --So writes have to be synchronous
            --so why isn't performance bad? caching. not all RPCs
              actually happen
              [NFSv3 handles this a bit better. WRITEs go to the server
              but don't necessarily cause disk accesses at the server.]

            --what kind of caching do they have?
                --Read-caching of data. Why does this help? (re-reading
                  files)
                --Write-caching of data. Why does this help? (see
                  above)
                --Caching of file attributes. Why does this help?
                  (ls -l)
                --Caching of name->fh mappings. Why does this help?
                  (cache a prefix like /home/bob)

        (iv) but once you have a cache, you have to worry about
        coherence and semantics. what kind of coherence/consistency
        does it actually give?

            (Answer: close-to-open)

                A: write(), then close()
                B: open(), read(). B sees A's data.

            Otherwise, B has an "old" picture.

            At a high level, how do they implement it?
                --writing client forces dirty blocks during a close()
                --reading client checks with the server during open()
                  and asks, "is this data current?"

            What's wrong with this?
                (1) what if there's no more space on the server?
                (2) some_proc > out in one window;
                    tail -f out in another

            Why do they give this guarantee instead of a stronger
            guarantee? (Performance. They are trading off the
            semantics for performance.)

        (v) Other transparency things.

            --out of space: write() error vs close() error
            --delete a file when in use. why would you do this?
                if unlink() on the same client, move to .nfsXXX
                if unlink() on some other client? (see (iii) above)
            --chmod -r while a file is open() (permissions are checked
              only when the file is being opened,
              so if the user owns the file but has no write
              permissions, WRITE() will succeed anyway --> different
              from Unix)
            --execute-only implies read, unlike in Unix (in order to
              execute a file, the client must read it, because the CPU
              itself is local. once it has the file, it has the file.
              compare to Unix, where Unix really can execute a file for
              you but not let you read it.)

    Areas of RPC non-transparency

        * Partial failure, network failure
        * Latency
        * Efficiency/semantics tradeoff
        * Security. You can rarely deal with it transparently (see
          below)
        * Pointers. Write-sharing. Portable object references are hard
          under RPC
        * Concurrency (if multiple clients)

        Solution 1: expose RPC to the application
        Solution 2: work harder on transparent RPC

E. Security

    --Only security is via IP address

    --Another case of non-transparency:
        --On the local system: Unix enforces read/write protections
            Can't read my files w/o my password
        --On NFS:
            --Server believes whatever UID appears in the NFS request
            --Anyone on the Internet can put whatever they like in the
              request
            --Or you (on your workstation) can su to root, then su to
              me
                --2nd su requires no password
                --Then NFS will let you read/write my files
        --In other words, to steal data, just adopt the UID of the
          person whose files you're trying to read....or just spoof
          packets.

    --So why aren't NFS servers ridiculously vulnerable?
        --Hard to guess correct file handles.
        --(Which rules out one class of attacks but not spoofed UIDs)

    --Observe: the vulnerabilities are fixable
        --Other file systems do it
        --Require clients to authenticate themselves cryptographically.
        --But very hard to reconcile with statelessness.

F. Concluding note

    --None of the above issues prevent NFS from being useful.
        --People fix their programs to handle the new semantics.
        --Or install firewalls for security.
    --And get most advantages of transparent client/server.

References

    --"RFC 1094": NFS v2
    --"RFC 1813": NFS v3

2. Other distributed file systems (disconnected operation, etc.)
    --disconnected operation: where have we seen this? (answer: git)
    --long literature on this

3. Networking

    --What's a network?
        --just a bunch of interconnected channels
        --railroad, highway, plumbing, communication, telephone
        --computer!!!!

    --computer networks are interesting
        --end-points highly programmable, middle kind of boring (well,
          only kind of).
        --can program all of the nodes!
        --extremely easy to innovate and develop new uses of the
          network
        --contrast: telephone network: end-points ridiculously simple,
          middle has the complexity.
            --worse, can't program most phones, need FCC approval for
              new devices, no visibility, etc.

    --Going to describe the various layers of a network. Case study
      will be what happens when you access a single Web page or send a
      single RPC in NFS.

    --If you're interested in this stuff, take classes in networking!
      Or program away! Or read the RFCs (short for "Request For
      Comments", but despite the name, they are standards). Few things
      are as open and well-documented as the various protocols that
      form, and run over, the Internet.

    --Networks are classically explained as being divided into sharply
      distinguished layers. In reality, things are messier. But it is
      still incredibly useful to think about layering.

    A. Physical layer

        --signals in a medium
            --medium: coaxial cable, twisted pair (Ethernet), fiber,
              radio
            --signals: endless innovation. different electrical
              profiles correspond to different sets of bits
        --some media are point-to-point:
            --fiber, twisted pair
        --some media are a shared transmission medium (coax, radio)
            --any message can be seen by all nodes
            --but now there is contention

        --speed of light matters!
            --300,000 km/sec in a vacuum, slower in fiber
            --New York to CA: ~3000 miles = ~5000 km
            --propagation time: 5000 km / (300,000 km/sec) = ~17 msec
            --round-trip: ~34 msec, assuming no computation
            --Technology improvements are not going to fix this
            --But what the heck? I keep reading that networks keep
              getting faster....
        --*delay* is never going to improve as long as the theory of
          relativity stands
        --throughput -- bits per second -- improves ridiculously well
        --so how do we take advantage of this?
            --concept: bandwidth-delay product

              [DRAW CYLINDER: bandwidth is the width, delay is the
              length]

            --get full network utilization if you've got
              # bytes in flight = bandwidth * delay
            --but what if the network isn't doing bulk transfer?
                --then you'll get poor throughput. ping/pong (send a
                  packet, wait for a response) has terrible throughput
                --this is one reason why concurrency is absolutely
                  critical for good network utilization: a bunch of
                  low-throughput flows may add up to good utilization

        Note that direct physical connectivity is rare.....
            --instead, communications usually "hop" through multiple
              devices

              [DRAW PICTURE: source --> bunch of switches -->
              destination]

            --Allows links and devices to be shared for multiple
              purposes
            --Must determine which bits are part of which messages
              intended for which destinations

        Two ways to create this indirect connectivity:

            --Circuit-switched: provide virtual links. Dump bits in at
              the source; they come out at the destination
                --example: the old telephone network. dialing the
                  number set up a virtual circuit (and before that,
                  human operators set up an actual circuit)
            --Packet-switched:
                --Pack a bunch of bytes together intended for the same
                  destination
                --Slap a _header_ on the packet describing where it
                  should go
            --Most networks today are packet switched

    B. Before we go further, let's look at the big picture, or the
       classic Internet technologies:

       computer - LAN - router - cloud [lots of routers] - router -
       LAN - computer

       [ Web browser ------ TCP ----- IP ]

    C. Link layer

       next time........
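As a numeric footnote to the speed-of-light and bandwidth-delay discussion above, the arithmetic is easy to check. The 5000 km and 300,000 km/sec figures are from the notes; the 1 Gbit/sec link speed is a made-up example.

```python
# Propagation delay, New York to California, from the figures above.
C_KM_PER_SEC = 300_000          # speed of light in a vacuum
distance_km = 5_000
one_way_sec = distance_km / C_KM_PER_SEC   # ~0.0167 sec (~17 msec)
round_trip_ms = 2 * one_way_sec * 1000     # ~33 msec (notes round to ~34)

# Bandwidth-delay product: bytes that must be in flight to keep a
# hypothetical 1 Gbit/sec coast-to-coast link fully utilized.
bandwidth_bytes_per_sec = 1e9 / 8
bdp_bytes = bandwidth_bytes_per_sec * (2 * one_way_sec)   # ~4.2 MB
```

A ping-pong protocol on such a link completes one exchange per ~33 msec round trip no matter how fat the pipe is, which is why concurrency (many outstanding requests or flows) is what actually fills the cylinder.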