Released Friday, April 24, 2015
Due Wednesday, May 6, 2015, 9:00 PM
In this lab, you will add journaling (logging) and crash recovery to a simple file system.
Some notes:
We recommend doing this lab inside the development platform that you set up in the previous labs (the virtual devbox or the CIMS machines).
From within this platform,
change to your lab directory,
use Git to commit changes you've made since handing in Lab 6 (if any),
obtain
the latest version of the course
repository, and then create a local branch called lab7 based on our
origin/lab7 (which is our lab7 branch).
The commands are as follows, but you
may not need all of them (for example, if git status
indicates that
there are no changes to commit):
$ cd ~/cs202 $ git commit -am 'my solution to lab6' Created commit 254dac5: my solution to lab6 3 files changed, 31 insertions(+), 6 deletions(-) $ git pull Already up-to-date. $ git checkout -b lab7 origin/lab7 Branch lab7 set up to track remote branch refs/remotes/origin/lab7. Switched to a new branch "lab7" $ cd lab7
You are now in the lab7 directory. Your next step depends on whether you are using the CIMS machines or the devbox:
crackle1.cims.nyu.edu crackle2.cims.nyu.edu crackle3.cims.nyu.edu crackle4.cims.nyu.edu crackle5.cims.nyu.eduNote that this requirement supersedes the usual list of machines.
Next, install required library files:
$ pwd # make sure that you are in lab7 directory /home/your-netid/cs202/lab7 $ ./install-lab7-for-cims.sh You only need this on CIMS machines, press <ENTER> to continue, or <Ctrl>-C to cancel <ENTER> Copying fuse library.... Installation successful
$ sudo apt-get update # update the apt catalog ..... $ sudo apt-get install fuse libfuse-dev .....
The code provided this lab contains a filesystem that is a simplified version of the one that you completed in lab 6 (see below for the differences). Your job is to add journaling (logging) and crash recovery capabilities. Code is provided in the following files:
The file system for this lab is implemented as a FUSE driver, similar to that of lab 6. A basic overview of FUSE is available in lab 6.
The file system for this lab is similar to lab 6's file system, with a few small changes to make the system easier to work with:
The file system presents an interface by which users can commit transactions. Users can modify the file system by creating, modifying, and deleting files, and then send a commit message to save their changes. Commit messages are sent via an ioctl (I/O control) system call: more information about the ioctl interface is presented in the next section.
From the file system user's perspective, creating and comitting two new files, file1 and file2, will look like this:
$ [send commit ioctl] # end of most recent transaction ... $ touch mnt/file1 $ touch mnt/file2 $ [send commit ioctl] # at this point, file1 and file2 are both committed in a single transaction
Our file system supports this transaction interface with journaling: all operations that modify the file system state are written to a log on the disk before the file system modifies on-disk inodes and blocks. When the file system receives a commit message, a commit entry is written to the log.
The file system makes a guarantee to the user that committed transactions will survive file system crashes, and uncommitted transactions will not. When the commit ioctl returns, the file system makes a guarantee to the user that the commit operation (and all other operations in the transaction) have been persisted to the disk log. (By persisted, we mean that the sector that contains the given entry is actually on disk.)
Our use of logging is simplified, compared to a production logging system (and also compared to the protocols presented in lecture). In particular, our file system will not include checkpoints: this means that the log will continue to grow as the file system is used. When the file system is restarted, the entire on-disk structure (inodes, bitmaps, blocks...) is erased and re-constructed from the log, by replaying committed transactions. You should be able to see from this picture why non-committed transactions do not appear after a crash: they are not replayed from the log.
You will implement some components of the FUSE driver (and hence the file system): adding support for logging, crash recovery, and a transaction abstraction. Because a lot of the implementation is provided, it is important that you familiarize yourself with the provided code and the various file system interfaces.
If you haven't already, read OSTEP 42 to gain background on transactions and journaling.
Like in lab 6, the main file system code that we've provided for you resides in fsdriver.c. This file contains all the FUSE callbacks to handle I/O syscalls, as well as the main function. Once the path to the disk image and the path to the mount point (testfs.img and mnt respectively in the supplied example) have been read from the command line arguments, our FUSE driver maps the specified disk image into memory using the map_disk_image function (defined in disk_map.c), which itself initializes some file system metadata. Then, fsdriver.c calls fuse_main, which handles kernel dispatches to our registered callbacks. These callbacks are similar to the callbacks in lab 6: you will modify callbacks that write to the file system.
The FUSE driver code we provide maps the disk image into virtual memory using the mmap syscall: see lab 6's description of mmap for more information. You can think of this as caching the disk in memory. To write modified, cached contents back to the disk, one uses the command msync. Indeed, you will use msync at various points in this lab because you will need to ensure that particular disk sectors that are modified in memory (namely, those containing certain log entries) have actually "made it" to the disk. This is the underlying mechanism that gives the file system user the assurance that operations that claim to be committed are in fact fully logged.
As mentioned in the previous section, user-level processes send requests to commit to the file system using an ioctl system call. In general, ioctl is a mechanism for sending driver-defined request codes to a driver (the idea is that both the sender and the device know what a given ioctl request number means, but the kernel does not interpret them). The supplied program send_ioctl can be used to send ioctls to your driver:
$ ./send_ioctl mnt/.txn [request code]Request codes are defined in crashmod.h. The following request codes are provided:
So, for example, the following command sends a commit message to your FUSE driver:
$ ./send_ioctl mnt/.txn 4000 # When this program returns, users # of the file system have a guarantee # that their changes have been committed.
It is only after the commit ioctl returns that the file system user has a guarantee that the transaction has taken effect. If there is a crash before the ioctl is called, the transaction has not taken effect. If there is a crash during the commit ioctl, then it depends on whether the commit point was reached. (What is the commit point?)
For this lab, we are assuming the file system has a single client: you do not need to worry about multiple transactions occurring at the same time. The potential source of non-atomic behavior is if the file system crashes.
Simple log data structures and functions to manipulate them reside in
log.h
and log.c
: your job is to integrate
them with the file system.
The file system log in this lab is an array of log entries. Each log entry has an opcode. These opcodes are defined in log.h, e.g. commit entries have an opcode of 19. Almost every user update to the file system will generate a new log entry. Each log entry stores all the necessary information to reapply the effects of the corresponding user update to the file system.
Now, take a look at the union
type log_args_t
defined in log.h
. (A union, in C, is a C datatype that can
represent multiple types of structs.) This union stores all of the
information associated with a file system update (write, truncate,
etc.). User accesses will generate struct log_entry
s; each
log_entry comprises
a log_args_t
plus some log-related metadata.
log.c
, implement log_tx_add
and log_tx_done
.
log_tx_add
takes an opcode, a pointer to a log_args_t
and a timestamp; this function should append
a new entry to the log. The struct log
structure is basically an array of log entries
plus some bookkeeping data fields.
Then nentries
field of struct log
denotes the number of entries in the current log;
the txn_id
field stores the current transaction
id.
Note that you can use nentries to find the latest inserted
entry: it is log->entries[log->nentries-1]
.
Therefore, you should insert the new entry at nentries
and
advance nentries
by 1 when you are done.
New log entries should also store information about their current
transaction, specifically
the transaction number txn_id
, which can be found at log->txn_id
.
log_tx_done
is called when a user process sends an "ioctl
commit" command to the file system. Note that log_tx_done should not return
until the transaction is durable in the log: that way, once "ioctl commit"
returns, the user has a guarantee that the transaction has "taken," in
particular that it will be seen from this point forward, regardless of
whether there are crashes and restarts. (See the description of
IOCTL_COMMIT earlier.)
But log_tx_done must arrange for something else as well.
To provide crash atomicity, log_tx_done
needs to bring
the disk from one valid state to another; no intermediate state
should be visible to observers, even if
there is a crash. You will arrange for this by doing the following in
log_tx_done.
First, add a COMMIT log entry at the end of the log. Then make a call
to msync
to ensure that the COMMIT log entry, as well as
all other log entries in the same transaction, are actually on the disk.
(Use man msync to learn its usage; you want to use the
MS_SYNC flag.)
Note that at this point, this transaction is still not committed; why
not? Finally, increment nentries
and txn_id
,
and persist the
incremented variables on disk by calling
msync
again (and using the MS_SYNC flag again).
We can persist the two variables at the same time because
they will be placed on the same disk sector, and recall from your
reading and lecture that writes of single disk sectors happen
atomically, and that the disk hardware enforces this.
Once this second call to msync
returns,
the transaction is committed.
It's important to make sure that your log works correctly before proceeding to the next exercises.
Type make grade. You should be able to pass the "basic operations" test.
While this test
does not really test your implementation of log_tx_add
and log_tx_done
, it does
check that your implementation of these two functions
does not break existing file system functionality.
To do more comprehensive testing than what the grading script provides, follow the instructions below:
# run fsdriver -d $ test/makeimage.bash # create a new image (with an empty log) $ mkdir mnt # create directory mnt if you do not have mnt already $ build/fsdriver -d testfs.img mntOn a second terminal, issue the following commands:
# send commands $ ./send_ioctl mnt/.txn 5003 $ ./send_ioctl mnt/.txn 5003 $ ./send_ioctl mnt/.txn 5002 $ ./send_ioctl mnt/.txn 4000 $ ./send_ioctl mnt/.txn 5003 $ ./send_ioctl mnt/.txn 5003 $ ./send_ioctl mnt/.txn 5002If you are using one of the CIMS machines, the output from the first terminal should look similar to the following:
# [ FUSE startup output and debug information have been omitted ] ... received ioctl: 5003 adding entry... done ... received ioctl: 5003 adding entry... done ... received ioctl: 5002 dumping log... nentries: 3 entry 0: op: 0, txn_id: 0 entry 1: op: 0, txn_id: 1 entry 2: op: 0, txn_id: 1 ... received ioctl: 4000 attempting to commit... done: 0 ... received ioctl: 5003 adding entry... done ... received ioctl: 5003 adding entry... done ... received ioctl: 5002 dumping log... nentries: 6 entry 0: op: 0, txn_id: 0 entry 1: op: 0, txn_id: 1 entry 2: op: 0, txn_id: 1 entry 3: op: 19, txn_id: 1 entry 4: op: 0, txn_id: 2 entry 5: op: 0, txn_id: 2If you are using the devbox, the output will be slightly different: FUSE will check the file system first, adding some additional entries to the log:
# [ FUSE startup output and debug information have been omitted ] # [ You might see a start-up message similar to "fusermount: failed to open # /etc/fuse.conf:Permission denied". This messsage can be safely ignored. ] ... received ioctl: 5003 adding entry... done ... received ioctl: 5003 adding entry... done ... received ioctl: 5002 dumping log... nentries: 47 entry 0: op: 0, txn_id: 0 entry 1: op: 5, txn_id: 1 #[devbox only] Entries 1 to 44 are additional entries from FUSE: entry 2: op: 5, txn_id: 1 #[devbox only] When testfs.img is mounted, the system will entry 3: op: 5, txn_id: 1 #[devbox only] read from the mounted directory, generating ... #[devbox only] log entries. entry 43: op: 5, txn_id: 1 #[devbox only] entry 44: op: 5, txn_id: 1 #[devbox only] entry 45: op: 0, txn_id: 1 entry 46: op: 0, txn_id: 1 ... received ioctl: 4000 attempting to commit... done: 0 ... received ioctl: 5003 adding entry... done ... received ioctl: 5003 adding entry... done ... received ioctl: 5002 dumping log... nentries: 50 entry 0: op: 0, txn_id: 0 entry 1: op: 5, txn_id: 1 #[devbox only] Read entries from earlier. entry 2: op: 5, txn_id: 1 #[devbox only] entry 3: op: 5, txn_id: 1 #[devbox only] ... #[devbox only] entry 43: op: 5, txn_id: 1 #[devbox only] entry 44: op: 5, txn_id: 1 #[devbox only] entry 45: op: 0, txn_id: 1 entry 46: op: 0, txn_id: 1 entry 47: op: 19, txn_id: 1 entry 48: op: 0, txn_id: 2 entry 49: op: 0, txn_id: 2
During the course of testing your FUSE driver, various combinations of operations may cause your FUSE driver to stop functioning or enter a non-clean state. It may be helpful to search Piazza for information about specific error messages.
More generally, to get your system to a clean starting state, you can do:$ fusermount -u mnt # unmount the driver $ test/makeimage.bash # make a clean testfs.img $ rm -rf mnt # remove the mounting directory and its residents $ mkdir mnt # recreate the mounting directoryor you can invoke the following script, which is equivalent to the lines above:
$ ./startclean.bash
log.c
, implement log_replay
. This function is called
when the driver is (re)started: it should replay all
committed transactions -- and only committed transactions -- from
the beginning of the log to the end of the log
(log->nentries-1
).
There are several ways to do this. One way is
is to use two passes. In the first pass, record the
transaction numbers of transactions that have commit entries. In the
second pass, call
log_entry_install
on all transaction entries that are part of a committed
transaction. log_entry_install
takes a single log entry and modifies the
file system: you will fill in this function in a later exercise.
You can also implement log_replay in a single pass. To avoid undue code
complexity, it may be helpful to model the replay logic as a FSM (finite state machine).
make grade
to test your work so far.
You should be able to pass the "replay" test.
We have provided a full implementation of file system functions in
fsdriver.c
. In order for the file system to support transactions,
all file system operations that modify the disk must have corresponding
entries in the log.
(The following commands modify the disk: write
, link
,
unlink
, rename
, mknod
, and truncate
.)
fs_mknod
in fsdriver.c
to use transactions: currently, it writes changes
directly to the disk using fs_mknod_install
. First, add a log entry for the
operation using log_tx_add
: pass in the opcode OP_MKNOD and
a pointer to a populated log_args_t
. Then, call
fs_mknod_install
with the appropriate arguments.
In this exercise and the next, you will find some or all of the
following library functions useful: strcpy, memcpy,
and time (as usual, use the man pages to learn the usage of
these library calls; for example, man 2 time).
Also implement the OP_MKNOD
case branch in
log_entry_install
(defined in log.c
) by calling fs_mknod_install
with the appropriate arguments.
test_touch_crash.bash
; equivalently, you should pass the
"touch" test when doing make grade.
The test checks the following:
$ touch mnt/[file]
creates a log entry for mknod and creates a file.
Using the same pattern as in Exercise 3,
modify the following functions to use transactions: fs_write
,
fs_link
, fs_unlink
, fs_rename
,
and fs_truncate
. See the prior exercise for standard C
library calls that may be useful.
test/
and make grade
to test your
solution.
fs_write
, you should be able to pass the "echo", "time",
"big_txn" and "many_txn" tests.
fs_truncate
as well as fs_write
, you should be able to pass the "truncate" test.
fs_rename
, you should be able to pass the "mv" test.
fs_link
,
you should be able to pass the "link" test.
fs_link
and fs_unlink
,
you should be able to pass the "unlink" test.
Exercise 5 (Honors).
The logging protocol saves both committed and uncommitted transactions to the log: partially
committed transactions that can never commit waste space. Modify log_replay
so it cleans up uncommitted transactions and compresses the log before returning. On
returning, the
log should only contain contigous, committed transactions. If the driver crashes during
the middle of this clean-up, it shouldn't cause log corruption. You are free to add
necessary fields to log
and/or log_entry
.
Answer the following questions in answers.txt.
If you are working in a pair, only one of the team members should submit.
If you have not done so yet, commit your code.
$ git status # see what files you changed $ git diff # examine your changes $ git commit -am "My solutions to lab7"Then, follow these steps:
$ make handinThis command removes binaries from your directory, archives your directory in a tarball called
lab7-handin.tgz
, and invokes a
script that submits this tarball to our submission server.
Assuming that you performed the submission
setup step in lab1,
you should see something like this:
CLEAN CREATE lab7-handin.tgz SUBMITTING ... ... progress ... Submission successful
After a successful submission, you should receive email confirmation (at your NYU email address).
If you submit multiple times, we will take the latest submission and count slack hours accordingly.
This completes the lab.
Last updated: Wed May 06 17:51:31 -0400 2015 [validate xhtml]