FINAL EXAM IS THURSDAY 7 MAY IN THIS ROOM 10:00am -- 11:50am

Implementation choices

Servers & clients together?
  - Common unix+nfs: any machine can be a server and/or a client.
  - Separate modules: servers for files and directories are user programs, so you can configure some machines to offer the services and others not to.
  - Fundamentally different: either the hardware or the software is fundamentally different for clients and servers.
  - Truth: in unix some server code is in the kernel, but other code is a user program (run as root) called nfsd.

File and directory servers together?
  - If yes, less communication.
  - If no, more modular, "cleaner".

Looking up a/b/c when a, a/b, and a/b/c are on different servers
  - The natural soln is for server-a to return the name of server-a/b.
  - Then the client contacts server-a/b, gets the name of server-a/b/c, etc.
  - Alternatively, server-a forwards the request to server-a/b, who forwards it to server-a/b/c.
  - The natural method takes 6 communications, i.e. 3 RPCs (each RPC is a request plus a reply).
  - The alternative takes 4 communications but is not RPC.

Name caching
  - The translation from a/b/c to the inode (i.e. from symbolic to binary name) is expensive even for a centralized system.
  - It is called namei in unix and was once measured to be a significant percentage of all kernel activity.
  - Later unices added "namei caching".
  - Potentially an even greater time saver for dist systems since communication is expensive.
  - Must worry about obsolete entries.

Stateless vs Stateful
  - Should the server keep information BETWEEN requests from a user, i.e. should the server maintain state?
  - What state? Recall that open returns an integer called a file descriptor that is subsequently used in read/write.
  - With a stateless server, each read/write must be self contained, i.e. it cannot refer to the file descriptor. Why?

Advantages of stateless
  - Fault tolerant--no state to be lost in a crash.
  - No open/close needed (saves messages).
  - No space used for tables (state requires storage).
  - No limit on the number of open files (no tables to fill up).
  - No problem if a client crashes (no state to be confused by).

Advantages of stateful
  - Shorter read/write (a descriptor is shorter than a name).
  - Better performance.
      - Since the server keeps track of which files are open, it knows to keep those inodes in memory.
      - But a stateless server could keep a memory cache of inodes as well (evicting via LRU instead of at close, which is not as good).
      - Blocks can be read in advance (read ahead). Of course a stateless server can read ahead too; the difference is that with state it can better decide when accesses are sequential.
  - Idempotency easier (keep seq numbers).
  - File locking possible (the lock is state).
      - Stateless can write a lock file by convention.
      - Stateless can call a lock server.

HOMEWORK 13-6

Caching

There are four places to store a file supplied by a file server (these are NOT mutually exclusive):
  - Server's disk: essentially always done.
  - Server's main memory: normally done.
      - Standard buffer cache.
      - Clear performance gain.
      - Little if any semantics problems.
  - Client's main memory: the one we will study.
      - Considerable performance gain.
      - Considerable semantic considerations.
  - Client's disk: not so common now.

Unit of caching
  - File vs block.
  - Tradeoff of fewer accesses vs storage efficiency.

What eviction algorithm?
  - Exact LRU is feasible because the access rate is low, so you can afford the time to do it (via linked lists); see the sketch below.

HOMEWORK 13-8
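To make the "exact LRU via linked lists" point concrete, here is a minimal sketch in C of a client-side block cache kept on a doubly linked list ordered by recency. The names (cblock, cache_lookup, cache_insert) and the sizes are illustrative, not taken from any real system; the linear scan on lookup is only tolerable because, as noted above, the access rate to this cache is low.

    #include <stdlib.h>
    #include <string.h>

    #define BLOCK_SIZE 8192         /* illustrative block size */
    #define MAX_BLOCKS 256          /* illustrative cache capacity */

    struct cblock {                 /* one cached block (hypothetical layout) */
        int    file_id;             /* which remote file */
        long   block_no;            /* which block of that file */
        char   data[BLOCK_SIZE];
        struct cblock *prev, *next; /* doubly linked list, MRU at the head */
    };

    static struct cblock *lru_head; /* most recently used */
    static struct cblock *lru_tail; /* least recently used = eviction victim */
    static int nblocks;

    static void unlink_block(struct cblock *b)
    {
        if (b->prev) b->prev->next = b->next; else lru_head = b->next;
        if (b->next) b->next->prev = b->prev; else lru_tail = b->prev;
    }

    static void make_mru(struct cblock *b)  /* put b at the head of the list */
    {
        b->prev = NULL;
        b->next = lru_head;
        if (lru_head) lru_head->prev = b;
        lru_head = b;
        if (lru_tail == NULL) lru_tail = b;
    }

    /* Linear scan is fine here only because the access rate is low. */
    struct cblock *cache_lookup(int file_id, long block_no)
    {
        struct cblock *b;
        for (b = lru_head; b != NULL; b = b->next)
            if (b->file_id == file_id && b->block_no == block_no) {
                unlink_block(b);
                make_mru(b);        /* exact LRU: record every reference */
                return b;
            }
        return NULL;                /* miss: caller fetches from the server */
    }

    /* Insert a block just fetched from the server, evicting the exact
     * least recently used block if the cache is full. */
    struct cblock *cache_insert(int file_id, long block_no, const char *data)
    {
        struct cblock *b;
        if (nblocks < MAX_BLOCKS) {
            b = malloc(sizeof *b);
            if (b == NULL) return NULL;
            nblocks++;
        } else {
            b = lru_tail;           /* the true LRU victim */
            unlink_block(b);
        }
        b->file_id  = file_id;
        b->block_no = block_no;
        memcpy(b->data, data, BLOCK_SIZE);
        make_mru(b);
        return b;
    }

A real client cache would also add a hash table to avoid the scan, and would record dirty bits and timestamps, which matter for the consistency policies discussed below.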
Where in the client's memory to put the cache
  - The user's process.
      - The cache will die with the process.
      - No cache reuse among distinct processes.
      - Not done for a normal OS; a big deal in databases (cache management is a well studied DB problem).
  - The kernel (i.e. the client's kernel).
      - A system call is required even for a cache hit.
      - Quite common.
  - Another process.
      - "Cleaner" than in the kernel; easier to debug.
      - Slower.
      - Might get paged out by the kernel!
  - Look at figure 13-10 (handout).

Cache consistency
  - The big question.

Write-through
  - All writes are sent to the server (as well as to the client cache).
  - Hence it does not lower traffic for writes.
  - HOMEWORK 13-10
  - Does not by itself fix values in other caches; you need to invalidate or update the other caches.
  - Can have the client cache check with the server whenever it supplies a block, to ensure the block is not obsolete.
      - Hence you still need to reach the server for all accesses, but at least the reads that hit in the cache only need to send a tiny msg (a timestamp, not data).
      - I guess this would be called lazy invalidation.

Delayed write
  - Wait a while (30 seconds is used in some NFS implementations) and then send one bulk write msg; see the sketch at the end of this section.
  - This is more efficient than a bunch of small write msgs.
  - If a file is deleted quickly, you might never write it.
  - Semantics are now time dependent (and ugly).
  - HOMEWORK 13-11

Write on close
  - Session semantics.
  - Fewer msgs since there are more writes than closes.
  - Not beautiful (think of a file simultaneously opened by two processes).
  - Not much worse than normal (uniprocessor) semantics; the difference is that it appears much more likely to hit the bad case (really, it is much less unlikely).
  - HOMEWORK 13-12
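To illustrate delayed write (write-behind), here is a minimal sketch in C, using the same hypothetical cblock structure as the LRU sketch above. A local write only marks the block dirty; a flusher, run periodically and at close, pushes blocks that have been dirty for about 30 seconds to the server. cache_write, cache_flush, and nfs_write_to_server are illustrative stubs, not a real NFS interface.

    #include <stddef.h>
    #include <time.h>

    #define FLUSH_DELAY 30          /* seconds, as in some NFS implementations */
    #define MAX_DIRTY   128         /* illustrative table size */

    struct cblock;                                      /* as in the LRU sketch */
    extern void nfs_write_to_server(struct cblock *b);  /* hypothetical RPC stub */

    struct dirty_entry {
        struct cblock *block;
        time_t dirtied_at;          /* when the block first became dirty */
        int    valid;
    };

    static struct dirty_entry dirty[MAX_DIRTY];

    /* A local write just records the block as dirty; no message is sent yet. */
    void cache_write(struct cblock *b)
    {
        int i;
        for (i = 0; i < MAX_DIRTY; i++)             /* already scheduled? */
            if (dirty[i].valid && dirty[i].block == b)
                return;
        for (i = 0; i < MAX_DIRTY; i++)             /* grab a free slot */
            if (!dirty[i].valid) {
                dirty[i].block = b;
                dirty[i].dirtied_at = time(NULL);
                dirty[i].valid = 1;
                return;
            }
        nfs_write_to_server(b);     /* table full: fall back to write-through */
    }

    /* Run periodically (and at close): flush blocks dirty for >= 30 seconds.
     * A file written and deleted within the delay may never reach the server,
     * which is exactly the time-dependent semantics noted above. */
    void cache_flush(void)
    {
        time_t now = time(NULL);
        int i;
        for (i = 0; i < MAX_DIRTY; i++)
            if (dirty[i].valid && now - dirty[i].dirtied_at >= FLUSH_DELAY) {
                nfs_write_to_server(dirty[i].block);
                dirty[i].valid = 0;
            }
    }

Calling cache_flush only from close turns this into write on close (session semantics); calling nfs_write_to_server on every cache_write turns it back into write-through.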