Operating Systems

================ Start Lecture #28 ================

Block Read Ahead

When the access pattern “looks” sequential, read ahead is employed. This means that after completing a read() request for block n of a file, the system guesses that a read() request for block n+1 will shortly be issued, so it automatically fetches block n+1.

How do you decide that the access pattern looks sequential?
- If a seek system call is issued, the access pattern is not sequential.
- If a process issues consecutive read() system calls for block n-1 and then n, the access pattern is guessed to be sequential.
What if block n+1 is already in the block cache?
Ans: Don't issue the read ahead.
Would it be reasonable to read ahead two or three blocks?
Ans: Yes.
Would it be reasonable to read ahead the entire file?
Ans: No, it could easily pollute the cache evicting needed blocks and could waste considerable disk bandwidth.
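
The rules above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how any particular kernel does it: the class ReadAheadFile, its depth parameter, and the use of a plain set as the block cache are all hypothetical names chosen for illustration.

```python
class ReadAheadFile:
    """Toy model of one open file with sequential-access detection."""

    def __init__(self, cache, depth=1):
        self.cache = cache        # set of cached block numbers (assumed stand-in for the block cache)
        self.depth = depth        # how many blocks to read ahead (assumed tunable)
        self.last_block = None    # block number of the previous read(), if any
        self.disk_reads = []      # log of blocks actually fetched from "disk"

    def _fetch(self, block):
        # Only go to disk when the block is not already cached.
        if block not in self.cache:
            self.disk_reads.append(block)
            self.cache.add(block)

    def seek(self):
        # A seek means the access pattern is not sequential.
        self.last_block = None

    def read(self, n):
        self._fetch(n)
        # Consecutive reads of block n-1 and then n look sequential.
        sequential = self.last_block is not None and n == self.last_block + 1
        self.last_block = n
        if sequential:
            for k in range(n + 1, n + 1 + self.depth):
                self._fetch(k)    # no read ahead is issued for cached blocks
```

With depth=2, reading blocks 4 and then 5 also fetches blocks 6 and 7; a seek() resets the guess, so the next read fetches only the requested block.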

Reducing Disk Arm Motion

Try to place blocks that are going to be read in succession near each other.

If the system uses a bitmap for the free list, it can allocate a new block for a file close to the previous block (guessing that the file will be accessed sequentially).
The system can perform allocations in “super-blocks”, consisting of several contiguous blocks.
- Block cache and I/O requests are still in blocks.
- If the file is accessed sequentially, consecutive blocks of a super-block will be accessed in sequence and these are contiguous on the disk.
For a unix-like file system, the i-nodes can be placed in the middle of the disk, instead of at one end, to reduce the seek time to access an i-node followed by a block of the file.
The system can divide the disk into cylinder groups, each of which is a consecutive group of cylinders.
- Each cylinder group has its own free list and, for a unix-like file system, its own space for i-nodes.
- If possible, the blocks for a file are allocated in the same cylinder group as is the i-node.
- This reduces seek time if consecutive accesses are for the same file.
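
As an illustration of the bitmap idea above, here is a minimal sketch of allocating the free block nearest to a file's previous block. The function name allocate_near and the representation of the bitmap as a Python list of booleans are assumptions made for this example; a real file system would work on packed bits and would usually search within one cylinder group.

```python
def allocate_near(free, prev):
    """Allocate the free block closest to block prev.

    free: list of bools, free[i] is True when block i is free (the bitmap).
    prev: block number of the file's previous block.
    Returns the chosen block number, or None if no block is free.
    """
    n = len(free)
    for dist in range(1, n):
        # On a tie, prefer the block after prev, guessing sequential access.
        for cand in (prev + dist, prev - dist):
            if 0 <= cand < n and free[cand]:
                free[cand] = False    # mark the block as in use
                return cand
    return None
```

The search widens outward from prev one block at a time, so the first hit is the block requiring the least arm motion under this simple distance measure.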

6.3.8: Log-Structured File Systems (unofficial)

A log-structured file system tries to make all writes sequential. That is, writes are treated as if going to a log file. The original research project worked with a unix-like file system, i.e., it was i-node based.

The assumption is that large block caches will eliminate most disk reads, so we need to improve writes.
Buffer writes until there is (say) 1MB to write.
When the buffer is full, write it to the end of the disk (treating the disk as a log).
Thus writes are sequential and hence fast.
The 1MB units on the disk are called (unfortunately) segments. I will refer to the buffer as the segment buffer.
A segment can have i-nodes, direct blocks, indirect blocks, blocks forming part of a file, blocks forming part of a directory. In short a segment contains the most recently modified (or created) 1MB of blocks.
Note that modified blocks are not reclaimed!
The system keeps a map of where the most recent version of each i-node is located. The map is on disk (but the heavily accessed parts will be in the block cache).
So the (most up to date) i-node of a file can be found and from that the entire file can be found.
But the disk will fill with garbage since modified blocks are not reclaimed.
A “cleaner” process runs in the background and examines segments starting from the beginning. It removes overwritten blocks and then adds the remaining blocks to the segment buffer. (This is not trivial.)
Thus the disk is compacted and is treated like a circular array of segments.
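
The interplay of the segment buffer, the map, and the cleaner can be seen in a small toy model. This is a heavily simplified sketch and not the design of the original research system: the class ToyLog is hypothetical, segments hold 4 blocks instead of 1MB, the map points directly at data blocks keyed by (i-node, block number) rather than at i-nodes, and cleaned segments are merely emptied rather than reused in a circular fashion.

```python
SEGMENT_BLOCKS = 4    # blocks per segment; tiny so the example is easy to follow


class ToyLog:
    def __init__(self):
        self.disk = []      # list of segments; each is a list of (inode, block_no, data)
        self.buffer = []    # the segment buffer
        self.where = {}     # (inode, block_no) -> (segment, slot) of the newest on-disk copy

    def write(self, inode, block_no, data):
        # Any on-disk copy is now garbage; the newest copy lives in the buffer.
        self.where.pop((inode, block_no), None)
        self.buffer.append((inode, block_no, data))
        if len(self.buffer) == SEGMENT_BLOCKS:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        seg = len(self.disk)    # always append at the end of the log
        for slot, (inode, block_no, _) in enumerate(self.buffer):
            self.where[(inode, block_no)] = (seg, slot)
        self.disk.append(self.buffer)
        self.buffer = []

    def read(self, inode, block_no):
        # Check the segment buffer first (newest entry wins), then the map.
        for i, b, data in reversed(self.buffer):
            if (i, b) == (inode, block_no):
                return data
        seg, slot = self.where[(inode, block_no)]
        return self.disk[seg][slot][2]

    def clean(self, seg):
        """Drop overwritten blocks of a segment and re-log the live ones."""
        old = self.disk[seg]
        self.disk[seg] = []     # the segment is now free
        for slot, (inode, block_no, data) in enumerate(old):
            if self.where.get((inode, block_no)) == (seg, slot):
                self.write(inode, block_no, data)   # still live: copy forward
```

A block is live only if the map still points at that exact segment and slot; everything else in the segment has been overwritten and is discarded by clean().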

6.4: Example File Systems

6.4.1: CD-ROM File Systems (skipped)

6.4.2: The CP/M File System

This was done above.

6.4.3: The MS-DOS File System

This was done above.

6.4.4: The Windows 98 File System

Two changes were made: Long file names were supported and the allocation table was switched from FAT-16 to FAT-32.

The only hard part was to keep compatibility with the old 8.3 naming rule. This is called “backwards compatibility”. A file has two names: a long one and an 8.3 one. If the long name fits the 8.3 format, only one name is kept. If the long name does not fit the 8.3 format, an 8.3 version is produced via an algorithm that works, but the names produced are not lovely.
FAT-32 uses 32-bit words for the block numbers (of which 28 bits are actually used), so the FAT could be huge. Windows 98 kept only a portion of the FAT-32 table in memory at a time. (I do not know the replacement policy, number of blocks kept in memory, etc.)
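
To give a feel for why the generated names are not lovely, here is a rough sketch of an 8.3 alias generator. It is an assumption-laden simplification and is not claimed to be the exact Windows 98 algorithm: the function short_name, the helper _clean, the set of legal characters, and the choice of six characters followed by ~ and a digit are all illustrative.

```python
import re


def _clean(s):
    # Upper-case and drop characters this sketch treats as illegal in 8.3 names.
    return re.sub(r"[^A-Z0-9_-]", "", s.upper())


def short_name(long_name, taken=()):
    """Return long_name (upper-cased) if it fits 8.3, else a generated alias.

    taken: aliases already used in the directory (hypothetical parameter).
    """
    base, dot, ext = long_name.rpartition(".")
    if not dot:                       # no extension at all
        base, ext = long_name, ""
    fits = (1 <= len(base) <= 8 and len(ext) <= 3
            and _clean(base) == base.upper() and _clean(ext) == ext.upper())
    if fits:
        return long_name.upper()      # only one name needs to be kept
    stem, ext3 = _clean(base)[:6], _clean(ext)[:3]
    for n in range(1, 10):            # try ~1 .. ~9 until one is unused
        alias = "%s~%d" % (stem, n) + ("." + ext3 if ext3 else "")
        if alias not in taken:
            return alias
    raise ValueError("no free alias in this sketch")
```

For example, under these assumptions "Lecture Notes 28.html" becomes "LECTUR~1.HTM", while "README.TXT" already fits and needs no alias.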

6.4.5: The Unix V7 File System

This was done above.

6.5: Research on File Systems (skipped)

6.6: Summary (read)