================ Start Lecture #27 ================

Notes:
  1. As expected, Robin has confirmed that the Friday room is 102 WWH, i.e., our classroom.
  2. A practice final is available on the web.

6.3.4: Shared files (links)

Hard Links

Start with an empty file system (i.e., just the root directory) and then execute:

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y

We have the situation shown on the right.


Now execute
ln /B/Y /A/New
This gives the new diagram to the right.

At this point there are two equally valid names for the right-hand yellow file, /B/Y and /A/New. The fact that /B/Y was created first is NOT detectable.
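A minimal sketch of how one could check the "two equally valid names" claim. It rebuilds the example inside a scratch directory that stands in for / (the scratch directory and the function name hard_link_demo are assumptions made for illustration) and compares the i-node numbers behind the two names.

```python
import os
import tempfile


def hard_link_demo():
    """Recreate /B/Y and the hard link /A/New, then inspect both names."""
    with tempfile.TemporaryDirectory() as root:   # stands in for "/"
        os.mkdir(os.path.join(root, "A"))
        os.mkdir(os.path.join(root, "B"))
        y = os.path.join(root, "B", "Y")
        new = os.path.join(root, "A", "New")
        open(y, "w").close()                      # touch /B/Y
        os.link(y, new)                           # ln /B/Y /A/New
        st_y, st_new = os.stat(y), os.stat(new)
        # Same device and same i-node number: both names denote one file.
        same_file = (st_y.st_dev, st_y.st_ino) == (st_new.st_dev, st_new.st_ino)
        # The link count records how many names the file has.
        return same_file, st_y.st_nlink


same_file, link_count = hard_link_demo()
print(same_file, link_count)
```

Nothing in the stat information says which of the two names came first, which is the point made above.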


Assume Bob created /B and /B/Y and Alice created /A, /A/X, and /A/New. Later Bob tires of /B/Y and removes it by executing

rm /B/Y

The file /A/New is still fine (see third diagram on the right). But it is owned by Bob, who can't find it! If the system enforces quotas, Bob will likely be charged (as the owner), but he can neither find nor delete the file (since Bob cannot unlink, i.e., remove, files from /A).

Since hard links are only permitted to files (not directories), the resulting file system is a DAG (directed acyclic graph). That is, there are no directed cycles. We will now proceed to give away this useful property by studying symlinks, which can point to directories.

Symlinks

Again start with an empty file system and this time execute

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y
ln -s /B/Y /A/New

We now have an additional file /A/New, which is a symlink to /B/Y.
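A minimal sketch, again in a scratch directory standing in for / (an assumption, as is the function name symlink_demo), showing that the symlink is a separate file whose contents are the name /B/Y.

```python
import os
import tempfile


def symlink_demo():
    """Recreate /B/Y and the symlink /A/New, then inspect the symlink."""
    with tempfile.TemporaryDirectory() as root:   # stands in for "/"
        os.mkdir(os.path.join(root, "A"))
        os.mkdir(os.path.join(root, "B"))
        y = os.path.join(root, "B", "Y")
        new = os.path.join(root, "A", "New")
        open(y, "w").close()                      # touch /B/Y
        os.symlink(y, new)                        # ln -s /B/Y /A/New
        stores_name = os.readlink(new) == y       # the link holds a path name
        is_link = os.path.islink(new)
        # lstat() looks at the symlink itself; stat() follows it to /B/Y.
        own_inode = os.lstat(new).st_ino != os.stat(y).st_ino
        # Unlike a hard link, the symlink does not raise /B/Y's link count.
        return stores_name, is_link, own_inode, os.stat(y).st_nlink


print(symlink_demo())
```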

The bottom line is that, with a hard link, a new name is created for the file. This new name has equal status with the original name. This can cause some surprises (e.g., you create a link but I own the file). With a symbolic link a new file is created (owned by the creator naturally) that contains the name of the original file. We often say the new file points to the original file.

Question: Consider the hard link setup above. If Bob removes /B/Y and then creates another /B/Y, what happens to /A/New?
Answer: Nothing. /A/New is still a file with the same contents as the original /B/Y.

Question: What about with a symlink?
Answer: /A/New becomes invalid and then valid again, this time pointing to the new /B/Y. (It can't point to the old /B/Y as that is completely gone.)
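The two answers can be checked with a short experiment. This is a sketch under assumptions: it runs in a scratch directory standing in for /, and it uses the hypothetical names /A/Hard and /A/Soft so that the hard link and the symlink can be compared side by side (the notes call both /A/New).

```python
import os
import tempfile


def write(path, text):
    with open(path, "w") as f:
        f.write(text)


def read(path):
    with open(path) as f:
        return f.read()


def remove_and_recreate():
    """Remove /B/Y, create a new /B/Y, and see what each kind of link gives."""
    with tempfile.TemporaryDirectory() as root:   # stands in for "/"
        os.mkdir(os.path.join(root, "A"))
        os.mkdir(os.path.join(root, "B"))
        y = os.path.join(root, "B", "Y")
        hard = os.path.join(root, "A", "Hard")    # hypothetical name
        soft = os.path.join(root, "A", "Soft")    # hypothetical name
        write(y, "old")
        os.link(y, hard)                          # ln /B/Y /A/Hard
        os.symlink(y, soft)                       # ln -s /B/Y /A/Soft
        os.remove(y)                              # rm /B/Y
        # exists() follows the symlink, so a dangling link reports False.
        dangling = not os.path.exists(soft)
        write(y, "new")                           # create another /B/Y
        return read(hard), dangling, read(soft)


print(remove_and_recreate())
```

The hard link still yields the old contents, while the symlink dangles for a while and then yields the contents of the new /B/Y.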

Note:

Shortcuts in Windows contain more than symlinks in Unix. In addition to the file name of the original file, they can contain arguments to pass to the file if it is executable. So a shortcut to

netscape.exe
can specify
netscape.exe //allan.ultra.nyu.edu/~gottlieb/courses/os/class-notes.html
End of Note

What about symlinking a directory?

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y
ln -s /B /A/New

Is there a file named /A/New/Y ?
Yes.

What happens if you execute cd /A/New/.. ?

What did I mean when I said the pictures made it all clear?
Answer: From the file system perspective it is clear. It is not always so clear what programs will do.
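One way to see why programs can disagree about /A/New/.. is to compare two ways of interpreting the path. The sketch below (scratch directory standing in for /, function name dotdot_demo hypothetical) uses os.path.realpath, which follows the symlink before applying "..", and os.path.normpath, which simply cancels "New/.." textually. The kernel's own path resolution behaves like the first; some shells' cd behaves like the second by default.

```python
import os
import tempfile


def dotdot_demo():
    """Compare physical and textual interpretations of /A/New/.."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)       # canonical form, for comparisons
        a = os.path.join(root, "A")
        b = os.path.join(root, "B")
        os.mkdir(a)
        os.mkdir(b)
        open(os.path.join(b, "Y"), "w").close()     # touch /B/Y
        new = os.path.join(a, "New")
        os.symlink(b, new)                          # ln -s /B /A/New
        y_visible = os.path.exists(os.path.join(new, "Y"))   # /A/New/Y
        path = os.path.join(new, "..")              # /A/New/..
        physical = os.path.realpath(path)   # follow New to /B, then go up
        logical = os.path.normpath(path)    # drop "New/.." as text
        return y_visible, physical == root, logical == a


print(dotdot_demo())
```

So the physical answer is / (the parent of /B), while the textual answer is /A.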

6.3.5: Disk space management

All general purpose systems use a (non-demand) paging algorithm for file storage. Files are broken into fixed-size pieces, called blocks, that can be scattered over the disk. Note that although this is paging, it is never called paging.

The file is completely stored on the disk, i.e., it is not demand paging.

Actually, it is more complicated:

  1. Various optimizations are performed to try to have consecutive blocks of a single file stored consecutively on the disk. Discussed below.
  2. One can imagine systems that store only parts of the file on disk with the rest on tertiary storage (some kind of tape).
       1. This would be just like demand paging.
       2. Perhaps NASA does this with their huge datasets.
       3. Caching (as done for example in microprocessors) is also the same as demand paging.
       4. We unify these concepts in the computer architecture course.
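The arithmetic behind breaking a file into fixed-size blocks is the same as for paging. A small sketch, assuming a block size of 4096 bytes (the notes do not fix a size; the function names are also made up):

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Return (logical block number, offset within that block)."""
    return divmod(byte_offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks a file of file_size bytes occupies (rounding up)."""
    return -(-file_size // block_size)


print(locate(10000))         # (2, 1808)
print(blocks_needed(10000))  # 3
```

The logical block number is then mapped to wherever that block happens to live on the disk, just as a page number is mapped to a frame.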

Choice of block size

Storing free blocks

There are basically two possibilities

  1. An in-memory bit map.
  2. Linked list with each free block pointing to next.
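A rough sketch of the two possibilities. Both classes are illustrative only: the class and method names are made up, and the dictionary in LinkedFreeList merely models the "next free block" pointer that would be stored inside each free block on disk.

```python
class BitmapFreeList:
    """One bit per disk block: 0 means free, 1 means in use."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_free(self, n):
        return ((self.bits[n // 8] >> (n % 8)) & 1) == 0

    def allocate(self):
        """Return the lowest-numbered free block, or None if the disk is full."""
        for n in range(self.num_blocks):
            if self.is_free(n):
                self.bits[n // 8] |= 1 << (n % 8)
                return n
        return None

    def free(self, n):
        self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF


class LinkedFreeList:
    """Each free block holds the number of the next free block."""

    def __init__(self, num_blocks):
        # Assumes num_blocks >= 1; initially every block is free.
        self.next_free = {n: n + 1 for n in range(num_blocks - 1)}
        self.next_free[num_blocks - 1] = None
        self.head = 0

    def allocate(self):
        """Take the block at the head of the list, or None if none is left."""
        if self.head is None:
            return None
        n = self.head
        self.head = self.next_free.pop(n)
        return n

    def free(self, n):
        # A freed block is pushed on the front of the list.
        self.next_free[n] = self.head
        self.head = n
```

With the bitmap it is easy to ask whether a particular block is free; the linked list only offers whatever block is at the head.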

6.3.6: File System reliability

Bad blocks on disks

Not so much of a problem now. Disks are more reliable and, more importantly, disks take care of the bad blocks themselves. That is, there is no OS support needed to map out bad blocks. But if a block goes bad, the data in it is usually (though not always) lost.

Backups

All modern systems support full and incremental dumps.
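As an illustration of the difference: a full dump copies everything, whereas an incremental dump copies only what changed since the previous dump. The sketch below just selects the files; it assumes that "changed" means a modification time later than the last dump, which is a simplification (real dump programs may use other criteria), and the function name is made up.

```python
import os


def files_to_dump(top, last_dump_time=None):
    """List the files under top that a dump should copy.

    last_dump_time=None means a full dump (every file); otherwise only files
    modified after last_dump_time (seconds since the epoch) are selected.
    """
    selected = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat() so that a symlink is judged by its own timestamp.
            if last_dump_time is None or os.lstat(path).st_mtime > last_dump_time:
                selected.append(path)
    return sorted(selected)
```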

Consistency

6.3.7: File System Performance

Buffer cache or block cache

An in-memory cache of disk blocks.
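A minimal sketch of such a cache. The replacement policy here is LRU, which is an assumption made for the example (the notes do not name a policy); the class name and the read_from_disk callback are likewise hypothetical.

```python
from collections import OrderedDict


class BlockCache:
    """In-memory cache of disk blocks with (assumed) LRU replacement."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk   # hypothetical disk-read function
        self.blocks = OrderedDict()            # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used
        return data


cache = BlockCache(2, lambda n: f"block-{n}")
for n in (1, 2, 1, 3, 2):
    cache.read(n)
print(cache.hits, cache.misses)   # 1 4
```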

Homework: 29.

Block Read Ahead

When the access pattern ``looks'' sequential, read ahead is employed. This means that, after completing a read() request for block n of a file, the system guesses that a read() request for block n+1 will shortly be issued, so it automatically fetches block n+1.
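A sketch of the idea for a single file. The rule used to decide that the pattern "looks" sequential (the current request is for the block right after the previous one) is an assumption made for the example, as are the class name and the fetch callback.

```python
class ReadAhead:
    """Fetch block n+1 early once reads of one file look sequential."""

    def __init__(self, fetch):
        self.fetch = fetch        # hypothetical function: block number -> data
        self.last_block = None    # block number of the previous read()
        self.prefetched = {}      # blocks fetched before they were requested

    def read(self, n):
        if n in self.prefetched:
            data = self.prefetched.pop(n)     # already here thanks to read ahead
        else:
            data = self.fetch(n)
        # Assumed heuristic: sequential if this read follows the previous block.
        sequential = self.last_block is not None and n == self.last_block + 1
        self.last_block = n
        if sequential and (n + 1) not in self.prefetched:
            self.prefetched[n + 1] = self.fetch(n + 1)
        return data
```

If the guess is wrong, the prefetched block is simply wasted work, so a real system would limit how much it reads ahead.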

Reducing Disk Arm Motion

Try to place blocks that are going to be read in succession near each other.

  1. If the system uses a bitmap for the free list, it can allocate a new block for a file close to the previous block (guessing that the file will be accessed sequentially).

  2. The system can perform allocations in ``super-blocks'', consisting of several contiguous blocks.
  3. For a unix-like file system, the i-nodes can be placed in the middle of the disk, instead of at one end, to reduce the seek time to access an i-node followed by a block of the file.
  4. The system can divide the disk into cylinder groups, each of which is a consecutive group of cylinders.
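The first item in the list above can be sketched as a search outward from the file's previous block. This is only an illustration: the free list is modeled as a Python list of booleans rather than real bits, the tie-breaking rule (prefer the higher block number) is an assumption, and the function name is made up.

```python
def allocate_near(free, previous):
    """Allocate the free block closest to block number previous.

    free is a list of booleans (True = free) standing in for the bitmap.
    Returns the chosen block number, or None if no block is free.
    """
    for distance in range(1, len(free)):
        # Try the block after previous first, guessing sequential access.
        for candidate in (previous + distance, previous - distance):
            if 0 <= candidate < len(free) and free[candidate]:
                free[candidate] = False
                return candidate
    return None


free = [True] * 8
free[3] = False    # the file's previous block
free[4] = False    # already used by someone else
print(allocate_near(free, 3))   # 2
print(allocate_near(free, 3))   # 5
```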

6.3.8: Log-Structured File Systems (unofficial)

A research file system that tries to make all writes sequential. That is, writes are treated as if going to a log file. This project worked with a unix-like file system, i.e., it was i-node based.
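A toy sketch of the "all writes go to a log" idea. Everything here is an assumption made for illustration: the disk is modeled as a Python list that is only ever appended to, each write appends the new data block followed by a new copy of the file's i-node, and an in-memory table (here called inode_map) remembers where the newest copy of each i-node sits in the log.

```python
class LogFS:
    """Toy log-structured store: the 'disk' is an append-only list."""

    def __init__(self):
        self.log = []         # entries are ("data", bytes) or ("inode", pointers)
        self.inode_map = {}   # i-node number -> log position of newest i-node

    def write(self, inode_no, block_index, data):
        # Start from the newest copy of the i-node's block pointers, if any.
        if inode_no in self.inode_map:
            pointers = dict(self.log[self.inode_map[inode_no]][1])
        else:
            pointers = {}
        self.log.append(("data", data))            # new data goes at the end
        pointers[block_index] = len(self.log) - 1
        self.log.append(("inode", pointers))       # so does the updated i-node
        self.inode_map[inode_no] = len(self.log) - 1

    def read(self, inode_no, block_index):
        pointers = self.log[self.inode_map[inode_no]][1]
        return self.log[pointers[block_index]][1]
```

Note that overwriting a block never touches the old copy; the old copy just becomes garbage in the log.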

6.4: Example File Systems (watch this space)

6.4.1: CD-ROM File Systems (skipped)

6.4.2: The CP/M File System

This was done above.

6.4.3: The MS-DOS File System

This was done above.

6.4.4: The Windows 98 File System (skipped)

6.4.5: The Unix V7 File System

This was done above.

Draw the steps to find /allan/gottlieb/linker.java
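The steps can also be traced with a toy model: start at the root i-node, search that directory for the next component, fetch the i-node it names, and repeat. The i-node numbers and the in-memory table below are made up for illustration; on a real disk each step costs a read of the i-node plus a read of the directory's blocks.

```python
ROOT = 1  # made-up i-node number for the root directory

# Hypothetical i-node table: directories map names to i-node numbers.
INODES = {
    1: {"type": "dir", "entries": {"allan": 5}},
    5: {"type": "dir", "entries": {"gottlieb": 9}},
    9: {"type": "dir", "entries": {"linker.java": 14}},
    14: {"type": "file", "data": "contents of linker.java"},
}


def lookup(path, inodes=INODES, root=ROOT):
    """Return the list of i-node numbers visited while resolving path."""
    current = root
    visited = [current]
    for name in (c for c in path.split("/") if c):
        node = inodes[current]
        if node["type"] != "dir":
            raise NotADirectoryError(name)
        if name not in node["entries"]:
            raise FileNotFoundError(name)
        current = node["entries"][name]    # i-node number found in the directory
        visited.append(current)
    return visited


print(lookup("/allan/gottlieb/linker.java"))   # [1, 5, 9, 14]
```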

6.5: Research on File Systems (skipped)

6.6: Summary (read)