Operating Systems

================ Start Lecture #14 ================

6.1.4: File access

There are basically two possibilities, sequential access and random access (a.k.a. direct access). Previously, files were declared to be sequential or random. Modern systems do not do this. Instead all files are random and optimizations are applied when the system dynamically determines that a file is (probably) being accessed sequentially.

  1. With sequential access the bytes (or records) are accessed in order (i.e., n-1, n, n+1, ...). Sequential access is the most common and gives the highest performance. For some devices (e.g., tapes) access “must” be sequential.
  2. With random access, the bytes are accessed in any order. Thus each access must specify which bytes are desired.
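The two access styles can be sketched with the C standard I/O library. This is only an illustration, not code from the course; the helper names read_sequential and read_at are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Sequential access: read the next byte. The file position advances
   implicitly, so the caller never names an offset. */
int read_sequential(FILE *f)
{
    assert(f != NULL);
    return fgetc(f);
}

/* Random (direct) access: the caller must say which byte is wanted.
   Returns the byte at `offset`, or EOF on failure. */
int read_at(FILE *f, long offset)
{
    assert(f != NULL);
    if (fseek(f, offset, SEEK_SET) != 0)
        return EOF;
    return fgetc(f);
}
```

Note that both helpers work on the same kind of file: the file is not declared sequential or random, only the pattern of calls differs.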

6.1.5: File attributes

A laundry list of properties that can be specified for a file. For example:

6.1.6: File operations

Homework: 6, 7.

6.1.7: An Example Program Using File System Calls

Homework: Read and understand “copyfile”.

Notes on copyfile

6.1.8: Memory mapped files (Unofficial)

Conceptually simple and elegant. Associate a segment with each file and then normal memory operations take the place of I/O.

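Thus copyfile does not need fgetc/fputc (or read/write). Instead it is just an in-memory copy, in the style of memcpy. Schematically (note that this loop stops at the first zero byte, as strcpy does; a real copy would be bounded by the file length):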

while ( *(dest++) = *(src++) );

The implementation is via segmentation with demand paging but the backing store for the pages is the file itself. This all sounds great but ...

  1. How do you tell the length of a newly created file? You know which pages were written but not which words in those pages. So a file with one byte, or with 10, looks like a full page.
  2. What if the same file is accessed by both I/O and memory mapping?
  3. What if the file is bigger than the size of virtual memory (will not be a problem for systems built 3 years from now as all will have enormous virtual memory sizes).
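As a rough sketch of the idea on a POSIX system, a file copy can be written with mmap() so that the copy itself is an ordinary memcpy(). The function name copy_mapped is hypothetical and error handling is minimal. Note how the first problem above shows up: the length of the new file must be set explicitly (here with ftruncate()), because the mapping itself only deals in pages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy file `src` to `dst` through memory mappings.
   Returns 0 on success, -1 on failure. */
int copy_mapped(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    if (fstat(in, &st) < 0) { close(in); return -1; }
    size_t len = (size_t)st.st_size;

    int out = open(dst, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return -1; }

    /* Set the exact byte length of the new file explicitly. */
    if (ftruncate(out, (off_t)len) < 0) { close(in); close(out); return -1; }

    int rc = 0;
    if (len > 0) {                       /* mmap rejects zero-length mappings */
        void *from = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in, 0);
        void *to = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, out, 0);
        if (from == MAP_FAILED || to == MAP_FAILED)
            rc = -1;
        else
            memcpy(to, from, len);       /* plain memory copy, no read/write */
        if (from != MAP_FAILED) munmap(from, len);
        if (to != MAP_FAILED) munmap(to, len);
    }
    close(in);
    close(out);
    return rc;
}
```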

6.2: Directories

Unit of organization.

6.2.1-6.2.3: Single-level, Two-level, and Hierarchical directory systems

Possibilities

These are not as wildly different as they sound.

6.2.4: Path Names

You can specify the location of a file in the file hierarchy by using either an absolute or a relative path to the file.

Homework: 1, 9.

6.2.5: Directory operations

  1. Create: Produces an “empty” directory. Normally the directory created actually contains . and .., so it is not really empty.

  2. Delete: Requires the directory to be empty (i.e., to just contain . and ..). Commands are normally written that will first empty the directory (except for . and ..) and then delete it. These commands make use of file and directory delete system calls.

  3. Opendir: Same as for files (creates a “handle”)

  4. Closedir: Same as for files

  5. Readdir: In the old days (of unix) one could read directories as files so there was no special readdir (or opendir/closedir). It was believed that the uniform treatment would make programming (or at least system understanding) easier as there was less to learn.

    However, experience has taught that this was not a good idea since the structure of directories then becomes exposed. Early unix had a simple structure (and there was only one type of structure for all implementations). Modern systems have more sophisticated structures and more importantly they are not fixed across implementations. So if programs just used read() to read directories, the programs would have to be changed whenever the structure of a directory changed. Now we have a readdir() system call that knows the structure of directories. Therefore if the structure is changed only readdir() need be changed.

  6. Rename: As with files.

  7. Link: Add a second name for a file; discussed below.

  8. Unlink: Remove a directory entry. This is how a file is deleted. But if there are many links and just one is unlinked, the file remains. Discussed in more detail below.
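The opendir/readdir/closedir operations above correspond to the POSIX library routines of the same names. A minimal sketch (the function name count_entries is hypothetical) that counts the entries of a directory without knowing anything about its on-disk structure:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Count the entries in directory `path`, skipping "." and "..".
   Returns -1 if the directory cannot be opened. */
int count_entries(const char *path)
{
    assert(path != NULL);
    DIR *dir = opendir(path);                  /* Opendir: obtain a handle */
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {   /* Readdir: one entry per call */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);                             /* Closedir: release the handle */
    return count;
}
```

A result of 0 means the directory is “empty” in the sense required by Delete.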

6.3: File System Implementation

6.3.1: File System Layout

6.3.2: Implementing Files

Contiguous allocation

Homework: 12.

Linked allocation

Consider the following two code segments that store the same data but in a different order. The first is analogous to the linked list file organization above and the second is analogous to the ms-dos FAT file system we study next.

struct node_type {
    float data;
    int   next;    // index of next node on a linked list
} node[100];


float node_data[100];    // the same data values ...
int   node_next[100];    // ... with the next indices stored separately

With the second arrangement the data could be stored far away from the next pointers. In FAT this idea is taken to an extreme: The data, which is large (a disk block), is stored on disk; whereas, the next pointers which are small (an integer) are stored in memory in a File Allocation Table or FAT.

FAT (file allocation table)


Why don't we mimic the idea of paging and have a table giving, for each block of the file, where on the disk that file block is stored? In other words, a “file block table” mapping each file block to its corresponding disk block. This is the idea of (the first part of) the unix inode solution, which we study next.

I-Nodes


Algorithm to retrieve a block

Let's say that you want to find block N
(N=0 is the "first" block) and that
  There are D direct pointers in the inode numbered 0..(D-1)
  There are K pointers in each indirect block numbered 0..K-1

If N < D            // This is a direct block in the i-node
   use direct pointer N in the i-node
else if N < D + K   // This is one of the K blocks pointed to by indirect blk
   use pointer D in the inode to get the indirect block
   use pointer N-D in the indirect block to get block N
else   // This is one of the K*K blocks obtained via the double indirect block
   use pointer D+1 in the inode to get the double indirect block
   let P = (N-(D+K)) DIV K      // Which single indirect block to use
   use pointer P to get the indirect block B
   let Q = (N-(D+K)) MOD K      // Which pointer in B to use
   use pointer Q in B to get block N

6.3.3: Implementing Directories

Recall that a directory is a mapping that converts file (or subdirectory) names to the files (or subdirectories) themselves.

Trivial File System (CP/M)

MS-DOS and Windows (FAT)

Unix/linux

Homework: 27

6.3.4: Shared files (links)

Hard Links

Start with an empty file system (i.e., just the root directory) and then execute:

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y

We have the situation shown on the right.


Now execute
ln /B/Y /A/New
This gives the new diagram to the right.

At this point there are two equally valid names for the right hand yellow file, /B/Y and /A/New. The fact that /B/Y was created first is NOT detectable.


Assume Bob created /B and /B/Y and Alice created /A, /A/X, and /A/New. Later Bob tires of /B/Y and removes it by executing

rm /B/Y

The file /A/New is still fine (see third diagram on the right). But it is owned by Bob, who can't find it! If the system enforces quotas, Bob will likely be charged (as the owner), but he can neither find nor delete the file (since Bob cannot unlink, i.e. remove, files from /A).

Since hard links are only permitted to files (not directories), the resulting file system is a DAG (directed acyclic graph). That is, there are no directed cycles. We will now proceed to give away this useful property by studying symlinks, which can point to directories.

Symlinks

Again start with an empty file system and this time execute

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y
ln -s /B/Y /A/New

We now have an additional file /A/New, which is a symlink to /B/Y.

The bottom line is that, with a hard link, a new name is created for the file. This new name has equal status with the original name. This can cause some surprises (e.g., you create a link but I own the file). With a symbolic link a new file is created (owned by the creator naturally) that contains the name of the original file. We often say the new file points to the original file.

Question: Consider the hard link setup above. If Bob removes /B/Y and then creates another /B/Y, what happens to /A/New?
Answer: Nothing. /A/New is still a file with the same contents as the original /B/Y.

Question: What about with a symlink?
Answer: /A/New becomes invalid and then valid again, this time pointing to the new /B/Y. (It can't point to the old /B/Y as that is completely gone.)
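The two answers can be checked on a POSIX system with link(), symlink(), stat() and lstat(). The helper names below (touch_file, link_count, is_dangling) are hypothetical, and this is only a sketch of how one might observe the behavior.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file if it does not exist, like touch. 0 on success. */
int touch_file(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    return close(fd);
}

/* Number of names (hard links) the file has, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}

/* 1 if `path` is a symlink whose target does not currently exist. */
int is_dangling(const char *path)
{
    struct stat st;
    if (lstat(path, &st) < 0 || !S_ISLNK(st.st_mode))
        return 0;                  /* missing, or not a symlink at all */
    return stat(path, &st) < 0;    /* stat follows the link to the target */
}
```

After link("Y", "Hard") and symlink("Y", "Soft"), removing Y leaves Hard with a link count of 1 and makes Soft dangle; recreating Y makes Soft valid again while Hard still names the old file.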

Note:

Shortcuts in Windows contain more than symlinks in unix. In addition to the file name of the original file, they can contain arguments to pass to the file if it is executable. So a shortcut to

netscape.exe
can specify
netscape.exe //allan.ultra.nyu.edu/~gottlieb/courses/os/class-notes.html
End of Note

What about symlinking a directory?

cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y
ln -s /B /A/New

Is there a file named /A/New/Y ?
Yes.

What happens if you execute cd /A/New/.. ?

What did I mean when I said the pictures made it all clear?
Answer: From the file system perspective it is clear. It is not always so clear what programs will do.

6.3.5: Disk space management

All general purpose systems use a (non-demand) paging algorithm for file storage. Files are broken into fixed size pieces, called blocks, that can be scattered over the disk. Note that although this is paging, it is never called paging.

The file is completely stored on the disk, i.e., it is not demand paging.

Actually, it is more complicated

  1. Various optimizations are performed to try to have consecutive blocks of a single file stored consecutively on the disk. Discussed below.
  2. One can imagine systems that store only parts of the file on disk with the rest on tertiary storage (some kind of tape).

  3. This would be just like demand paging.

  4. Perhaps NASA does this with their huge datasets.

  5. Caching (as done for example in microprocessors) is also the same as demand paging.

  6. We unify these concepts in the computer architecture course.

Choice of block size

We discussed this last chapter.

Storing free blocks

There are basically two possibilities

  1. An in-memory bit map.
  2. Linked list with each free block pointing to next.
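A minimal sketch of the first possibility, an in-memory bit map with one bit per disk block. The disk size, the function names, and the `hint` parameter (start searching near a given block) are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

#define NUM_BLOCKS 64   /* illustrative disk size */

/* One bit per disk block: 1 = in use, 0 = free. */
static unsigned char bitmap[(NUM_BLOCKS + CHAR_BIT - 1) / CHAR_BIT];

static int in_use(int b)
{
    return (bitmap[b / CHAR_BIT] >> (b % CHAR_BIT)) & 1;
}

/* Allocate the first free block at or after `hint`, wrapping around.
   Returns the block number, or -1 if the disk is full. */
int alloc_block(int hint)
{
    assert(hint >= 0 && hint < NUM_BLOCKS);
    for (int i = 0; i < NUM_BLOCKS; i++) {
        int b = (hint + i) % NUM_BLOCKS;
        if (!in_use(b)) {
            bitmap[b / CHAR_BIT] |= (unsigned char)(1u << (b % CHAR_BIT));
            return b;
        }
    }
    return -1;
}

/* Mark block b as free again. */
void free_block(int b)
{
    assert(b >= 0 && b < NUM_BLOCKS);
    bitmap[b / CHAR_BIT] &= (unsigned char)~(1u << (b % CHAR_BIT));
}
```

Passing the last block of a file as the hint tends to place the next block of that file nearby, which a linked free list cannot easily do.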

6.3.6: File System reliability

Bad blocks on disks

Not so much of a problem now. Disks are more reliable and, more importantly, disks take care of the bad blocks themselves. That is, there is no OS support needed to map out bad blocks. But if a block goes bad, the data it held is usually (though not always) lost.

Backups

All modern systems support full and incremental dumps.

Consistency

6.3.7: File System Performance

Buffer cache or block cache

An in-memory cache of disk blocks.

Homework: 29.

Block Read Ahead

When the access pattern “looks” sequential, read ahead is employed. This means that, after completing a read() request for block n of a file, the system guesses that a read() request for block n+1 will shortly be issued, so it automatically fetches block n+1.

Reducing Disk Arm Motion

Try to place blocks that are going to be read in succession near each other.

  1. If the system uses a bitmap for the free list, it can allocate a new block for a file close to the previous block (guessing that the file will be accessed sequentially).

  2. The system can perform allocations in “super-blocks”, consisting of several contiguous blocks.
  3. For a unix-like file system, the i-nodes can be placed in the middle of the disk, instead of at one end, to reduce the seek time to access an i-node followed by a block of the file.

  4. The system can divide the disk into cylinder groups, each of which is a consecutive group of cylinders.

6.3.8: Log-Structured File Systems (unofficial)

A file system that tries to make all writes sequential. That is, writes are treated as if going to a log file. The original research project worked with a unix-like file system, i.e. was i-node based.

6.4: Example File Systems

6.4.1: CD-ROM File Systems (skipped)

6.4.2: The CP/M File System

This was done above.

6.4.3: The MS-DOS File System

This was done above.

6.4.4: The Windows 98 File System

Two changes were made: Long file names were supported and the allocation table was switched from FAT-16 to FAT-32.

  1. The only hard part was to keep compatibility with the old 8.3 naming rule. This is called “backwards compatibility”. A file has two names: a long one and an 8.3 one. If the long name fits the 8.3 format, only one name is kept. If the long name does not fit the 8.3 format, an 8.3 version is produced via an algorithm that works, but the names produced are not lovely.

  2. FAT-32 used 32 bit words for the block numbers, so the FAT could be huge. Windows 98 kept only a portion of the FAT-32 table in memory at a time. (I do not know the replacement policy, number of blocks kept in memory, etc.)

6.4.5: The Unix V7 File System

This was done above.

6.5: Research on File Systems (skipped)

6.6: Summary (read)

The End: Good luck on the final