================ Start Lecture #19 ================
Notes:
- Shortcuts in Windows contain more than symlinks in Unix. In addition
to the file name of the original file, they can contain arguments to
pass to the file if it is executable. So a shortcut to
netscape.exe
can specify
netscape.exe //allan.ultra.nyu.edu/~gottlieb/courses/os/class-notes.html
For this reason I withdraw my objection to the name shortcut.
- In answer to a question, I remarked that a hard link is a link to
a file;
whereas a symlink (or shortcut) is a link to a name.
This is a good point and I should have put it in the notes before (it
is there now).
End of Notes
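The file-versus-name distinction can be seen in a small experiment. This is only a sketch, assuming a POSIX system; the names orig, hard, and soft are made up for illustration. Removing the original name leaves the hard link usable (it refers to the file) but leaves the symlink dangling (it refers to the name).

```python
import os
import shutil
import tempfile

def link_demo():
    """Return (hard link usable, symlink usable, symlink still present)."""
    d = tempfile.mkdtemp()
    orig = os.path.join(d, "orig")
    hard = os.path.join(d, "hard")
    soft = os.path.join(d, "soft")
    with open(orig, "w") as f:
        f.write("data")
    os.link(orig, hard)     # hard link: a second name for the same file
    os.symlink(orig, soft)  # symlink: a small file holding the name "orig"
    os.remove(orig)         # removes one name; the file itself survives
    result = (
        os.path.exists(hard),  # follows to the file, which still exists
        os.path.exists(soft),  # follows the stored name, which is gone
        os.path.islink(soft),  # the symlink itself is still there
    )
    shutil.rmtree(d)
    return result
```

On a typical Unix system link_demo() should return (True, False, True).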
What about symlinking a directory?
cd /
mkdir /A; mkdir /B
touch /A/X; touch /B/Y
ln -s /B /A/New
Is there a file named /A/New/Y ?
Yes.
What happens if you execute cd /A/New/.. ?
- Answer: Not clear!
- Clearly you are changing directory to the parent directory of
/A/New. But is that /A or /?
- The command interpreter I use offers both possibilities.
- cd -L /A/New/.. takes you to /A (L for logical).
- cd -P /A/New/.. takes you to / (P for physical).
- cd /A/New/.. takes you to /A (logical is the default).
What did I mean when I said the pictures made it all clear?
Answer: From the file system perspective it is clear. Not always so
clear what programs will do.
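The two readings of /A/New/.. can be reproduced from a program. The sketch below assumes a POSIX system and builds the same A, B, New layout under a scratch directory rather than under /. The logical answer treats the path as text and cancels New against the dot-dot; the physical answer follows the symlink first and then takes the parent.

```python
import os
import shutil
import tempfile

def logical_and_physical_parent():
    # Scratch tree mirroring the example: root/A, root/B, root/A/New -> root/B.
    root = os.path.realpath(tempfile.mkdtemp())
    os.mkdir(os.path.join(root, "A"))
    os.mkdir(os.path.join(root, "B"))
    open(os.path.join(root, "A", "X"), "w").close()
    open(os.path.join(root, "B", "Y"), "w").close()
    os.symlink(os.path.join(root, "B"), os.path.join(root, "A", "New"))

    path = os.path.join(root, "A", "New", "..")
    logical = os.path.normpath(path)    # textual: "New/.." cancels, giving root/A
    physical = os.path.realpath(path)   # follow New to root/B, then go up to root
    sees_y = os.path.exists(os.path.join(root, "A", "New", "Y"))
    shutil.rmtree(root)
    return root, logical, physical, sees_y
```

Here logical plays the role of cd -L and physical the role of cd -P; a program that mixes the two conventions can end up somewhere the user did not expect.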
4.3.4: Disk space management
All general purpose systems use a (non-demand) paging
algorithm for file storage. Files are broken into fixed-size pieces,
called blocks, that can be scattered over the disk.
Note that although this is paging, it is never called paging.
The file is completely stored on the disk, i.e., it is not
demand paging.
Actually, it is more complicated, but this is unofficial (i.e.,
will not appear on exams).
- Various optimizations are
performed to try to have consecutive blocks of a single file stored
consecutively on the disk.
- One can imagine systems that store only parts of the file on disk
with the rest on tertiary storage (some kind of tape).
- This would be just like demand paging.
- Perhaps NASA does this with their huge datasets.
- Caching (as done for example in microprocessors) is also the same
as demand paging.
- We unify these concepts in the computer architecture course.
Choice of block size
- We discussed this before when studying page size.
- Current commodity disk characteristics (not for laptops) result in
about 15ms to transfer the first byte and 10K bytes per ms for
subsequent bytes (if contiguous).
- We will explain the following terms in the I/O chapter.
- Rotation rate is 5400, 7200, or 10,000 RPM (15K just now
available).
- Recall that 6000 RPM is 100 rev/sec or one rev
per 10ms. So half a rev (the average time to rotate to a
given point) is 5ms.
- Transfer rates around 10MB/sec = 10KB/ms.
- Seek time is around 10ms.
- This favors large blocks, 100KB or more.
- But the internal fragmentation would be severe since many files
are small.
- Typical block sizes are 4KB and 8KB.
- Unofficial
- Systems that contain multiple block sizes have been tried
(i.e., the system uses blocks of size A for some files and blocks
of size B for other files).
- Some systems use techniques to try to have consecutive blocks
of a given file near each other as well as blocks of ``related''
files (e.g., files in the same directory).
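The tradeoff can be made concrete with the numbers above (15ms to reach the first byte, then 10KB/ms). The sketch below is only a back-of-the-envelope model: it assumes every block costs a full seek plus rotation, and the function names are made up for illustration.

```python
STARTUP_MS = 15.0       # seek (~10ms) plus half a rotation (~5ms)
RATE_KB_PER_MS = 10.0   # transfer rate once the head is in position

def effective_rate_kb_per_ms(block_kb):
    """KB delivered per ms when each block pays the full startup cost."""
    return block_kb / (STARTUP_MS + block_kb / RATE_KB_PER_MS)

def wasted_fraction(file_kb, block_kb):
    """Fraction of the allocated space lost to internal fragmentation."""
    blocks = -(-file_kb // block_kb)   # ceiling division on integers
    allocated = blocks * block_kb
    return (allocated - file_kb) / allocated
```

Under this model a 4KB block delivers about 0.26KB/ms, under 3% of the peak rate, while a 150KB block reaches 5KB/ms, half the peak. On the other hand, a 1KB file stored in a 4KB block wastes 75% of the block, and the waste grows with block size.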
Storing free blocks
There are basically two possibilities
- An in-memory bit map.
- One bit per block
- If block size is 4KB = 32K bits, that is 1 bit per 32K bits.
- So a 32GB disk (potentially all free) needs 1MB of RAM.
- Variation is to demand page the bit map. This saves space
(RAM) at the cost of I/O.
- Linked list with each free block pointing to next.
- Thus you must do a read for each request.
- But reading a free block is a wasted I/O.
- Instead some free blocks contain pointers to other free
blocks. This has much less wasted I/O, but is more complicated.
- When you read a block of pointers, store them in memory.
- See diagram on right.
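Both schemes can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular file system does it: the disk is simulated with a dictionary, and the names FreeBitmap, GroupedFreeList, and per_block are made up. In the grouped list, each pointer block is itself a free block, so once its pointers are in memory it can be handed out as well.

```python
def bitmap_bytes(disk_bytes, block_bytes):
    """Size in bytes of a free map with one bit per block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8

class FreeBitmap:
    """In-memory bit map; a 1 bit means the block is free."""
    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.bits = bytearray([0xFF]) * ((nblocks + 7) // 8)

    def alloc(self):
        for i in range(self.nblocks):          # first-fit scan
            if self.bits[i // 8] & (1 << (i % 8)):
                self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF
                return i
        return None                            # no free block

    def free(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

class GroupedFreeList:
    """Some free blocks hold pointers to other free blocks."""
    def __init__(self, free_blocks, per_block):
        self.disk = {}    # simulated disk: pointer block -> (next, pointers)
        self.head = None  # block number of the first pointer block
        self.cache = []   # pointers already read into memory
        self.reads = 0    # simulated disk reads
        step = per_block + 1
        chunks = [free_blocks[i:i + step]
                  for i in range(0, len(free_blocks), step)]
        for chunk in reversed(chunks):
            self.disk[chunk[0]] = (self.head, chunk[1:])
            self.head = chunk[0]

    def alloc(self):
        if self.cache:
            return self.cache.pop()            # no I/O needed
        if self.head is None:
            return None                        # no free block
        block = self.head
        self.head, pointers = self.disk.pop(block)   # one disk read
        self.reads += 1
        self.cache = list(pointers)
        return block    # the pointer block itself is now allocatable
```

bitmap_bytes(32 * 2**30, 4 * 2**10) gives 2**20, the 1MB figure above. With ten free blocks and four pointers per block, the grouped list satisfies all ten requests with two reads, instead of one read per request.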
4.3.5: File System reliability
Bad blocks on disks
Not so much of a problem now. Disks are more reliable and, more
importantly, disks take care of the bad blocks themselves. That is,
there is no OS support needed to map out bad blocks. But if a block
goes bad, the data in it is usually (though not always) lost.
Backups
All modern systems support full and
incremental dumps.
- A level 0 dump is called a full dump (i.e., dumps everything).
- A level n dump (n>0) is called an incremental dump and the
standard unix utility dumps
all files that have changed since the most recent dump at a lower
level (the previous level n-1 dump, if the levels are used in sequence).
- Other dump utilities dump all files that have changed since the
last level n dump.
- Keep on the disk the dates of the most recent level i dumps
for all i. In Unix this is traditionally in /etc/dumpdates.
- What about the nodump attribute?
- Default policy (for Linux at least) is to dump such files
anyway when doing a full dump, but not dump them for incremental
dumps.
- Another way to say this is the nodump attribute is honored for
level n dumps if n>=1.
- The dump command has an option to override the default policy
(can specify k so that nodump is honored for level n dumps if n>=k).
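The selection rules above can be sketched as follows. This is a toy model, not the dump utility itself: times are plain integers, and the names reference_time, files_to_dump, honor_level, and policy are made up for illustration. It assumes the comparison is against the most recent dump at any lower level (policy "lower"), with the variant that compares against the last dump at the same level available as policy "same". The default honor_level of 1 reproduces the default policy: full dumps include nodump files and incremental dumps skip them.

```python
def reference_time(dumpdates, level, policy="lower"):
    """dumpdates maps level -> time of the most recent dump at that level."""
    if level == 0:
        return None                       # full dump: copy everything
    if policy == "lower":
        times = [t for lvl, t in dumpdates.items() if lvl < level]
    else:                                 # "same": compare with last level-n dump
        times = [t for lvl, t in dumpdates.items() if lvl == level]
    return max(times) if times else None

def files_to_dump(files, dumpdates, level, honor_level=1, policy="lower"):
    """files maps name -> (mtime, nodump flag); returns sorted names to copy."""
    since = reference_time(dumpdates, level, policy)
    chosen = []
    for name, (mtime, nodump) in files.items():
        if nodump and level >= honor_level:
            continue                      # nodump is honored at this level
        if since is None or mtime > since:
            chosen.append(name)
    return sorted(chosen)
```

For example, with a level 0 dump at time 10 and a level 1 dump at time 20, a level 2 dump copies only files modified after time 20, while a new level 1 dump copies files modified after time 10.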