CS372H Spring 2010 Homework 8

Problem 1:

Consider a very simple file system for a tiny disk. Each sector on the disk holds 2 integers, and all data blocks, indirect blocks, and inodes are 1 disk sector in size (each contains 2 integers). All files stored on disk are interpreted as directories by the file system (there are no "data files"). The file system defines the layout for the following data types on disk:
inode = 1 pointer to a data block + 1 pointer to indirect block

indirect block = 2 pointers to data blocks

directory = a regular file containing zero or more pairs of integers; the first integer of each pair is a file name and the second is the file's inumber

The value "99" signifies a null pointer when referring to a disk block address or directory name. An empty directory has one disk block with the contents "99   99". The inumber for root directory is "/" is 0.

The following data are stored on disk:

inode array:

0 1  2  3  4   5  6  7   8   9  10 11 12 13 14 15
10
6
7
99
   8
99
      3
99
                          

disk blocks:

 0  1  2  3  4   5  6 7 8  9  10 11 12 13 14 15
   32
3
   96
1
      1
99
99
99
99
99
   57
6
              

  1. How many entries can appear in a maximum-sized directory? (Each entry is a pair of integers)

  2. List all directories stored on this disk (full path names) along with the names of the files stored in each directory.

    directory path name inumber indirect blocks data blocks contents (subdirectories)
    /
    0
         

  3. Modify the above data structures to add an empty directory called "87" to directory "/"

    directory path name inumber indirect blocks data blocks contents (subdirectories)
    /87
           

Problem 2:

Is it fundamentally necessary to store on disk the information about the unallocated disk sectors? Explain why.

Problem 3:

The FastFile file system uses an inode array to organize the files on disk. Each inode consists of a user id (2 bytes), three time stamps (4 bytes each), protection bits (2 bytes), a reference count (2 byte), a file type (2 bytes) and the size (4 bytes). Additionally, the inode contains 13 direct indexes, 1 index to a 1st-level index table, 1 index to a 2nd-level index table, and 1 index to a 3rd level index table. The file system also stores the first 436 bytes of each file in the inode.
  1. Assume a disk sector is 512 bytes, and assume that any auxilliary index table takes up an entire sector, what is the maximum size for a file in this system.
  2. Is there any benefit for including the first 436 bytes of the file in the inode?

Problem 4:

Can we implement hard links in DOS (which uses a FAT file system)? Why or why not??

Problem 5:

Can we implement symbolic links in DOS (FAT file system)? If so, show how, and if not, explain why.

Problem 6:

In some early releases of an operating system that shall remain nameless, when a file was deleted, its sectors reverted to the free list but they were not erased. What problems do you think may result from this? Why do you think the blocks were not erased?

Problem 7:

Pooh Software Ltd. is selling a file system that uses a UNIX-like file system with multi-level indexing. For more reliability, the inode array is actually replicated on the disk in two different places. The intent is that if one or a group of sectors that are storing either replica of the array become bad, the system can always recover from the replica. Discuss the effect of having this replicated data structure on performance.

Problem 8:

Consider an indexed file allocation using index nodes (inodes). An inode contains among other things, 7 indexes, one indirect index, one double index, and one triple index.
  1. What usually is stored in the inode in addition to the indexes?
  2. What is the disadvantage of storing the file name in the inode? Where should the file name be stored?
  3. If the disk sector is 512 bytes, what is the maximum file size in this allocation scheme?
  4. Suppose we would like to enhance the file system by supporting versioning. That is, when a file is updated, the system creates new version leaving the previous one intact. How would you modify the inode allocation scheme to support versioning? You answer should consider how a new version is created and deleted.
  5. In a file system supporting versioning, would you put information about the version number in the inode, or in the directory tree? Justify your answer.

Problem 9:

The LoneStar backup system for UNIX works as follows: At the beginning of each week, the backup system traverses the file system tree structure and saves all directories and files. This stage is called full backup. Then, on a daily basis, it performs incremental backups. It does so by traversing the directory tree, and saving only the files whose time stamps show that they have been modified since the previous incremental backup. Note that the LoneStar system does not follow the symbolic links when saving files, instead storing them as symbolic links. This preserves the integrity of the symbolic link.
  1. What happens if the backup system were to follow the symbolic links and save the files the link points to?
  2. Identify problems with this backup scheme.
  3. Instead of doing the above, the KeyStone backup system requires the file system to maintain a bit map for every sector on disk. If a disk sector is updated, the file system sets the corresponding bit to one. When the KeyStone system performs an incremental backup, it saves only the disk sectors whose bits are set, then it resets the entire bit map. Compare between the two backup schemes. What are  the common problems between them? What are the key advantages of each? Which one would you use?

Problem 10:

Explain what happens to the disk in UNIX when a user in a text editor saves a new file called "foo" into the current directory. Assume that foo is 15678 bytes in length, that a disk block is 1024 bytes, that the file system supports up to 2^32 sectors, and that an inode contains 10 direct block pointers, 1 indirect block pointer, 1 doubly-indirect block pointer, and 1 triply-indirect block pointer. The system uses on-disk free maps for tracking free blocks and free inodes. Here's the set of actions that you may assume that the editor makes:
fd = open(foo, O_CREAT|O_WRONLY);
p = &editBuffer;
numBlocks = editBufferSize / 1024;
for (i = 0; i < numBlocks; i++) {
write(fd, p, 1024);
    p += 1024;
}
lengthOfLastPartialBlock = editBufferSize % 1024;
write(fd, p, lengthOfLastPartialBlock);
close(fd);
Assume that creating a file is synchronous -- all writes to disk complete before the call returns. Assume that write() calls are asynchronous (they write data and metadata to memory, but don't force the writes to disk), but that close() does not return until all writes to the file and metadata are safely on disk. Assume that all of the data structures that must be read to perform this action are already in the cache.
  1. Assuming the system does not use logging, explain what blocks get written out to the disk and what each block contains.
  2. Suppose the machine crashes after writing the inode, free map, and data blocks, but before writing any indirect blocks. Describe how, if not fixed when the system reboots, such an inconsistency could cause security violations in which one user could access the files of another user (be specific: How would this happen? Which blocks are vulnerable? Can any of the user's data be exposed to other users? Can any other user's data be exposed to this user?)
  3. Suppose you were to implement write-ahead logging to fix this problem. Describe exactly what your system would write to the log to satisfy this user's requests, when the writes occur, and when the system can "apply" the updates -- copy them to their normal positions on disk.

Problem 11:

  1. An engineer has designed a FAT-like system and he has used 24 bits for each entry. For a 32-GB disk, what is the minimum size of a file allocation in this system? Justify your answer.

  2. Consider an index-based file system with the inode containing 64 indexes, 1 indirect index pointing to a disk block containing an array of direct indexes, and 1 2-level index in the usual way. Assume that each index takes 4 bytes.
    1. What is the maximum file size under this arrangement, if a disk block is 1024 bytes? Explain how do you compute this maximum size.
    2. How many disk accesses does it take to read one disk block at location 3000321 within a file, assuming no caching. Justify your answer.

Problem 12:

Versioning: It is often desirable for users to maintain different versions of the same file (for example, during program development, for recovering in the case of wrong updates applied to a version, etc.). The VMS file system implements versioning by creating a new copy for the file every time it is opened for writing. The system also supported versioning in the naming structure, appending a version number to the name of each file. For instance, when foo.c;1 is updated, the system creates foo.c;2, etc. The system implemented commands to manipulate versions, for example, the PURGE command would delete all versions except the most recent one. Also, users could disable versioning for some files. Discuss the pros and cons for this scheme.

Problem 13:

An implementation of the FSCK program traverses the file system tree and builds two lists of the disk blocks. One list contains the sectors that are shown to be in use, while the other reads the free disk-block information on disk. If the two lists for 4 blocks were as shown:

In use 0 1 0 1
Free 0 0 1 1

Identify what are the problems, if any, and what should fsck do for each block.