================ Start Lecture #16 ================

Chapter 4: File Systems

Requirements

  1. Size: Store very large amounts of data.
  2. Persistence: Data survives the creating process.
  3. Access: Multiple processes can access the data concurrently.

Solution: Store data in files that together form a file system.

4.1: Files

4.1.1: File Naming

Very important. A major function of the file system.

4.1.2: File structure

A file is a

  1. Byte stream
  2. (fixed size) Record stream: Out of date
  3. Varied and complicated beast.

4.1.3: File types

Examples

  1. (Regular) files.

  2. Directories: studied below.

  3. Special files (for devices). Uses the naming power of files to unify many actions.
        dir             # prints on screen
        dir > file      # result put in a file
        dir > /dev/tape # results written to tape
        
  4. ``Symbolic'' Links (similar to ``shortcuts''): Also studied below.

``Magic number'': Identifies an executable file.

Strongly typed files:

4.1.4: File access

There are basically two possibilities, sequential access and random access (a.k.a. direct access). Previously, files were declared to be sequential or random. Modern systems do not do this. Instead all files are random and optimizations are applied when the system dynamically determines that a file is (probably) being accessed sequentially.

  1. With Sequential access the bytes (or records) are accessed in order (i.e., n-1, n, n+1, ...). Sequential access is the most common and gives the highest performance. For some devices (e.g. tapes) access ``must'' be sequential.
  2. With random access, the bytes are accessed in any order. Thus each access must specify which bytes are desired.

4.1.5: File attributes

A laundry list of properties that can be specified for a file For example:

4.1.6: File operations

Homework: 2, 3, 4.
Read and understand ``copyfile'' on page 155.

Notes on copyfile

4.1.7: Memory mapped files

Conceptually simple and elegant. Associate a segment with each file and then normal memory operations take the place of I/O.

Thus copyfile does not have fgetc/fputc (or read/write). Instead it is just like memcopy

while ( *(dest++) = *(src++) );

The implementation is via segmentation with demand paging but the backing store for the pages is the file itself. This all sounds great but ...

  1. How do you tell the length of a newly created file? You know which pages were written but not what words in those pages. So a file with one byte or 10, looks like a page.
  2. What if same file is accessed by both I/O and memory mapping.
  3. What if the file is bigger than the size of virtual memory (will not be a problem for systems built 3 years from now as all will have enormous virtual memory sizes).