---------------- Chapter 13--Distributed File Systems ----------------

File service vs file server

    File service is the specification

    File server is a process running on a machine to implement the
    file service for (some) files on that machine

    A normal distributed system would have one file service
    but perhaps many file servers

        If the system has very different kinds of filesystems, it
        might not be able to have a single file service, as some
        services may not be available on every filesystem

File Server Design

    File

        Sequence of bytes

            Unix

            MS-Dos

            Windows

        Sequence of Records

            Mainframes

            Keys

            We do not cover these filesystems.  They are often
            discussed in database courses

    File attributes

        rwx perhaps a (append)

            This is really a subset of what is called an
            ACL -- access control list
            or a
            Capability

            Get ACLs by reading columns of the access matrix (one
            column per file) and capabilities by reading rows (one
            row per user)
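
            A minimal sketch of reading the access matrix both ways,
            assuming rows are users and columns are files.  The users,
            filenames, and helper names below are hypothetical and
            only illustrate the idea.

```python
# Hypothetical access matrix: rows are users, columns are files.
ACCESS_MATRIX = {
    "alice": {"notes.txt": "rw", "report.txt": "r"},
    "bob": {"notes.txt": "r"},
}


def acl_for(matrix, filename):
    """Read one column: which users hold which rights on this file (an ACL)."""
    return {
        user: rights[filename]
        for user, rights in matrix.items()
        if filename in rights
    }


def capabilities_for(matrix, user):
    """Read one row: the rights this user holds on each file (a capability list)."""
    return dict(matrix.get(user, {}))


print(acl_for(ACCESS_MATRIX, "notes.txt"))
print(capabilities_for(ACCESS_MATRIX, "alice"))
```

            The same matrix is stored either way; the choice is
            whether the rights live with the file (ACL) or with the
            user (capability).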

        owner, group, various dates, size

        dump, autocompress, immutable
            
    Upload/download  vs  remote access

        Upload/download means the only file services supplied are
        read file and write file.

            All mods done on local copy of file

            Conceptually simple at first glance

            Whole file transfers are efficient (assuming you are going
            to access most of the file) when compared to multiple
            small accesses

            Not an efficient use of bandwidth if you access only a
            small part of a large file.

            Requires storage on client

            What about concurrent updates?

                What if one client reads and "forgets" to write for a
                long time and then writes back the "new" version,
                overwriting newer changes from others?
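
                The scenario above can be sketched with a toy
                in-memory server.  The FileServer class and the file
                name are hypothetical; a real server would add
                versioning or locking, which this sketch deliberately
                omits.

```python
class FileServer:
    """Toy upload/download server: only whole-file read and write exist."""

    def __init__(self):
        self.files = {}

    def read_file(self, name):
        return self.files[name]

    def write_file(self, name, data):
        self.files[name] = data


server = FileServer()
server.write_file("todo.txt", "buy milk\n")

# Both clients download the whole file and edit private local copies.
copy_a = server.read_file("todo.txt")
copy_b = server.read_file("todo.txt")

copy_b += "call bob\n"
server.write_file("todo.txt", copy_b)  # client B uploads promptly

copy_a += "pay rent\n"
server.write_file("todo.txt", copy_a)  # client A uploads much later

final = server.read_file("todo.txt")
print(final)  # B's line is gone: A's stale copy replaced the whole file
```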

        Remote access means direct individual reads and writes to the
        remote copy of the file

            File stays on server

            Issue of (client) buffering

                Good to reduce number of remote accesses.

                What about semantics when a write occurs?

                    Note that metadata (e.g. the access time) is
                    written even for a read.  So if you want faithful
                    semantics, every client read must modify metadata
                    on the server, or all requests for metadata (e.g.
                    ls or dir commands) must go to the server.

                Cache consistency question
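
                A minimal sketch of the buffering trade-off, assuming
                a client that caches the data after its first remote
                read and never revalidates it.  The Server and
                CachingClient classes are hypothetical.

```python
class Server:
    """Toy remote file holding a single value; counts remote reads."""

    def __init__(self, data):
        self.data = data
        self.remote_reads = 0

    def read(self):
        self.remote_reads += 1
        return self.data

    def write(self, data):
        self.data = data


class CachingClient:
    """Buffers the file after the first remote read and never revalidates."""

    def __init__(self, server):
        self.server = server
        self.cached = None

    def read(self):
        if self.cached is None:
            self.cached = self.server.read()
        return self.cached


server = Server("v1")
client = CachingClient(server)

first = client.read()   # goes to the server
second = client.read()  # served from the client buffer
server.write("v2")      # some other client updates the file
third = client.read()   # still the buffered copy, now stale

print(server.remote_reads, third)
```

                Only one remote read happens, which is the benefit,
                but the third read returns old data, which is the
                cache consistency question.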

    Directories

        Mapping from names to files/directories

        Contains rules for names of files and (sub)directories

        Hierarchy i.e. tree

        (hard) links

            gives another name to an existing file

            a new directory entry

            The old and new name have equal status

                cd ~
                mkdir dir1
                touch dir1/file1
                ln dir1/file1 file2

                Now ~/file2 is the SAME file as ~/dir1/file1

                    In unix-speak they have the same inode

                Need to do rm twice to actually delete the file

            The owner is NOT changed so

                cd ~
                ln ~joe/file1 file2

            Gives me a link to a file of joe.  Presumably joe set his
            permissions so I can't write it.

            Now joe does

                rm ~/file1

            But my file2 still exists and is owned by joe.  Most
            accounting programs would charge the file to joe (who
            doesn't know it exists).

            With hard links the filesystem becomes a DAG instead of a
            simple tree.
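
            The first hard-link example can be checked with a short
            Python sketch.  It assumes a POSIX-style filesystem that
            supports hard links and uses a temporary directory as a
            stand-in for ~.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as home:  # stand-in for ~
    os.mkdir(os.path.join(home, "dir1"))
    file1 = os.path.join(home, "dir1", "file1")
    file2 = os.path.join(home, "file2")

    open(file1, "w").close()  # touch dir1/file1
    os.link(file1, file2)     # ln dir1/file1 file2

    # Both names refer to the same inode, whose link count is now 2.
    same_inode = os.stat(file1).st_ino == os.stat(file2).st_ino
    links_before = os.stat(file1).st_nlink

    # Removing one name only drops one directory entry.
    os.remove(file1)
    survives = os.path.exists(file2)
    links_after = os.stat(file2).st_nlink

print(same_inode, links_before, survives, links_after)
```

            The file's storage is released only when the last name is
            removed, which is why rm is needed twice.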

        Symlinks

            Symbolic (NOT symmetric).  Indeed asymmetric

            Consider

                cd ~
                mkdir dir1
                touch dir1/file1
                ln -s dir1/file1 file2

            file2 has a new inode.  It is a new type of file called a
            symlink, and its "contents" are the name of the file
            dir1/file1

            When accessed, file2 returns the contents of file1, but it
            is NOT equal to file1.

                If file1 is deleted, file2 "exists" but is invalid

                If a new file1 is created, file2 now points to it.

            Symlinks can point to directories as well

            With symlinks pointing to directories, the filesystem
            becomes a general graph, i.e. directed cycles are
            permitted.
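
            A minimal sketch of the symlink behavior described above,
            assuming a POSIX-style filesystem and a temporary
            directory as a stand-in for ~.  The link name "loop" is
            hypothetical and is there only to show a directed cycle.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as home:  # stand-in for ~
    os.mkdir(os.path.join(home, "dir1"))
    file1 = os.path.join(home, "dir1", "file1")
    file2 = os.path.join(home, "file2")

    with open(file1, "w") as f:
        f.write("old")
    os.symlink(os.path.join("dir1", "file1"), file2)  # ln -s dir1/file1 file2

    # The symlink is its own file whose "contents" are a name.
    stored_name = os.readlink(file2)
    distinct_inode = os.lstat(file2).st_ino != os.stat(file1).st_ino

    # Delete the target: the link still exists but no longer resolves.
    os.remove(file1)
    dangling = os.path.lexists(file2) and not os.path.exists(file2)

    # Recreate the target name: the link resolves to the new file.
    with open(file1, "w") as f:
        f.write("new")
    with open(file2) as f:
        after_recreate = f.read()

    # A symlink to an ancestor directory creates a directed cycle.
    os.symlink(home, os.path.join(home, "dir1", "loop"))
    around = os.path.join(home, "dir1", "loop", "dir1", "loop", "dir1")
    cycle_resolves = os.path.samefile(around, os.path.join(home, "dir1"))

print(stored_name, distinct_inode, dangling, after_recreate, cycle_resolves)
```

            Because the link stores only a name, it is resolved anew
            on every access, which explains both the dangling case and
            the re-pointing case.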