Operating Systems

2001-2002 Fall
Tues Thurs 5-6:50
Ciww 109

Chapter -1: Administrivia

I start at -1 so that when we get to chapter 1, the numbering will agree with the text.

(-1).1: Contact Information

gottlieb@nyu.edu (best method)
http://allan.ultra.nyu.edu/~gottlieb two el's in allan
715 Broadway, Room 712

(-1).2: Course Web Page

There is a web site for the course. You can find it from my home page, which is http://allan.ultra.nyu.edu/~gottlieb

You can find these lecture notes on the course home page. Please let me know if you can't find them.
I mirror my home page on the CS web site.
I also mirror the course pages on the CS web site.
But, the official site is allan.ultra.nyu.edu. It is the one I personally manage.
The notes will be updated as bugs are found.
I will also produce a separate page for each lecture after the lecture is given. These individual pages might not get updated as quickly as the large page.

(-1).3: Textbook

The course text is Tanenbaum, "Modern Operating Systems", 2nd Edition.

The first edition is not adequate as there have been many changes.
Available in bookstore.
We will cover nearly all of the first 6 chapters.

(-1).4: Computer Accounts and Mailman Mailing List

You are entitled to a computer account; please get it asap.
Sign up for the Mailman mailing list for the course. http://www.cs.nyu.edu/mailman/listinfo/g22_2250_001_fa01
If you want to send mail just to me, use gottlieb@nyu.edu not the mailing list.
Questions on the labs should go to the mailing list. You may answer questions posed on the list as well.
I will respond to all questions; if another student has answered the question before I get to it, I will confirm if the answer given is correct.

(-1).5: Grades

Assuming 3 labs, which is likely, grades will be computed as
.3*LabAverage + .7*FinalExam (but see homeworks below).

(-1).6: Homeworks and Labs

I make a distinction between homeworks and labs.

Labs are

Required.
Due several lectures later (date given on assignment).
Graded and form part of your final grade.
Penalized for lateness.
Computer programs you must write.

Homeworks are

Optional.
Due at the beginning of the next lecture.
Not accepted late.
Mostly from the book.
Collected and returned.
Can help, but not hurt, your grade.

(-1).6.1: Doing Labs on non-NYU Systems

You may solve lab assignments on any system you wish, but ...

You are responsible for any non-nyu machine. I extend deadlines if the nyu machines are down, not if yours are.
Be sure to upload your assignments to the nyu systems.
- In an ideal world, a program written in a high level language like Java, C, or C++ that works on your system would also work on the NYU system used by the grader. Sadly this ideal is not always achieved, despite marketing claims to the contrary. So, although you may develop your lab on any system, you must ensure that it runs on the nyu system assigned to the course.
- If somehow your assignment is misplaced by me and/or a grader, we need to have a copy ON AN NYU SYSTEM that can be used to verify the date the lab was completed.
- When you complete a lab (and have it on an nyu system), do not edit those files. Indeed, put the lab in a separate directory and keep out of that directory. You do not want to alter the dates.

(-1).6.2: Obtaining Help with the Labs

Good methods for obtaining help include

Asking me during office hours (see web page for my hours).
Asking the mailing list.
Asking another student, but ...
Your lab must be your own.
That is, each student must submit a unique lab. Naturally, changing comments, variable names, etc. does not produce a unique lab.

(-1).7: The Upper Left Board

I use the upper left board for lab/homework assignments and announcements. I should never erase that board. Viewed as a file, it is group readable (the group is those in the room), appendable by just me, and (re-)writable by no one. If you see me start to erase an announcement, let me know.

(-1).8: A Grade of ``Incomplete''

It is university policy that a student's request for an incomplete be granted only in exceptional circumstances and only if applied for in advance. Naturally, the application must be before the final exam.

Chapter 0: Interlude on Linkers

Originally called a linkage editor by IBM.

A linker is an example of a utility program included with an operating system distribution. Like a compiler, the linker is not part of the operating system per se, i.e. it does not run in supervisor mode. Unlike a compiler it is OS dependent (what object/load file format is used) and is not (normally) language dependent.

0.1: What does a Linker Do?

Link of course.

When the compiler and assembler have finished processing a module, they produce an object module that is almost runnable. There are two remaining tasks to be accomplished for the object module to be runnable. Both are involved with linking (that word, again) together multiple object modules. The tasks are relocating relative addresses and resolving external references.

0.1.1: Relocating Relative Addresses

Each module is (mistakenly) treated as if it will be loaded at location zero.
For example, the machine instruction
jump 100
is used to indicate a jump to location 100 of the current module.
To convert this relative address to an absolute address, the linker adds the base address of the module to the relative address. The base address is the address at which this module will be loaded.
Example: Module A is to be loaded starting at location 2300 and contains the instruction
jump 120
The linker changes this instruction to
jump 2420
How does the linker know that Module A is to be loaded starting at location 2300?
It processes the modules one at a time. The first module is to be loaded at location zero. So relocating is trivial (adding zero). We say that the relocation constant is zero.
After processing the first module, the linker knows its length (say that length is L1).
Hence the next module is to be loaded starting at L1, i.e., the relocation constant is L1.
In general the linker keeps the sum of the lengths of all the modules it has already processed; this sum is the relocation constant for the next module.
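The running-sum rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the module lengths (5, 6, 2, 3) are assumptions chosen for the example, not part of any lab specification.

```python
def relocation_constants(module_lengths):
    """Return the relocation constant (base address) of each module, in link order."""
    constants = []
    total = 0  # sum of the lengths of the modules processed so far
    for length in module_lengths:
        constants.append(total)
        total += length
    return constants


def relocate(relative_address, base):
    """Convert a module-relative address to an absolute address."""
    return base + relative_address


# Four hypothetical modules of lengths 5, 6, 2 and 3 words.
print(relocation_constants([5, 6, 2, 3]))  # [0, 5, 11, 13]

# The Module A example from the text: base 2300, instruction "jump 120".
print(relocate(120, 2300))  # 2420
```

The first module always gets relocation constant zero, so its relative addresses are unchanged.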

0.1.2: Resolving External References

If a C (or Java, or Pascal) program contains a function call
f(x)
to a function f() that is compiled separately, the resulting object module must contain some kind of jump to the beginning of f.
But this is impossible!
When the C program is compiled, the compiler and assembler do not know the location of f() so there is no way they can supply the starting address.
Instead a dummy address is supplied and a notation made that this address needs to be filled in with the location of f(). This is called a use of f.
The object module containing the definition of f() contains a notation that f is being defined and gives the relative address of the definition, which the linker can convert to an absolute address.

The output of a linker is called a load module because it is now ready to be loaded and run.

To see how a linker works let's consider the following example, which is the first dataset from lab #1. The description in lab1 is more detailed.

The target machine is word addressable and has a memory of 250 words, each consisting of 4 decimal digits. The first (leftmost) digit is the opcode and the remaining three digits form an address.

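Each object module contains three parts, a definition list, the program text itself, and a use list. Each definition is a pair (sym, loc) giving the relative address at which sym is defined. Each use is a pair (sym, loc) giving the relative address of the first word that uses sym. The address field of that word points to the next use or is 777 to end the chain.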

The program text consists of a count N followed by N pairs (type, word), where word is a 4-digit instruction described above and type is a single character indicating if the address in the word is Immediate, Absolute, Relative, or External.
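As a rough illustration of this input format, the sketch below reads whitespace-separated tokens into one record per module. It assumes each list is preceded by its count and that the parts appear in the order definition list, program text, use list, as in the lab's first data set (reproduced in the string). The names parse_modules and INPUT_SET_1 are made up for this example, and no error checking is attempted.

```python
INPUT_SET_1 = """
1 xy 2
5 R 1004  I 5678  E 2777  R 8002  E 7002
1 z 4
0
6 R 8001  E 1777  E 1001  E 3002  R 1002  A 1010
1 z 3
0
2 R 5001  E 4777
1 z 1
1 z 2
3 A 8000  E 1777  E 2001
1 xy 2
"""


def parse_modules(source):
    """Split linker input into modules with 'defs', 'text' and 'uses' lists."""
    tokens = source.split()
    pos = 0

    def read_pairs():
        # Read a count followed by that many (name, number) pairs.
        nonlocal pos
        count = int(tokens[pos])
        pos += 1
        pairs = []
        for _ in range(count):
            pairs.append((tokens[pos], int(tokens[pos + 1])))
            pos += 2
        return pairs

    modules = []
    while pos < len(tokens):
        defs = read_pairs()  # (symbol, relative location of definition)
        text = read_pairs()  # (type letter, 4-digit word)
        uses = read_pairs()  # (symbol, relative location of first use)
        modules.append({"defs": defs, "text": text, "uses": uses})
    return modules


modules = parse_modules(INPUT_SET_1)
print(len(modules))                         # 4
print([len(m["text"]) for m in modules])    # [5, 6, 2, 3]
```

Because the reader only looks at tokens, the line breaks in the input carry no meaning here.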

Input set #1

1 xy 2
5 R 1004  I 5678  E 2777  R 8002  E 7002
1 z 4
0
6 R 8001  E 1777  E 1001  E 3002  R 1002  A 1010
1 z 3
0
2 R 5001  E 4777
1 z 1
1 z 2
3 A 8000  E 1777  E 2001
1 xy 2

The first pass simply produces the symbol table giving the values for xy and z (2 and 15 respectively). The second pass does the real work (using the values in the symbol table).

(Unofficial) Remark: A one-pass approach is faster (less I/O), but it is harder since you need ``fix-up code'' whenever a use occurs in a module that precedes the module with the definition.

            Symbol Table
                xy=2
                z=15

               Memory Map
 +0
 0:       R 1004      1004+0 = 1004
 1:       I 5678               5678
 2: xy:   E 2777 ->z           2015
 3:       R 8002      8002+0 = 8002
 4: ->z   E 7002               7015
 +5    
 0        R 8001      8001+5 = 8006
 1        E 1777 ->z           1015
 2        E 1001 ->z           1015
 3 ->z    E 3002               3015
 4        R 1002      1002+5 = 1007
 5        A 1010               1010
 +11    
 0        R 5001      5001+11= 5012
 1 ->z    E 4777               4015
 +13    
 0        A 8000               8000
 1        E 1777 ->xy          1002
 2 z:->xy E 2001               2002
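The two passes that produce the symbol table and memory map above can be sketched as follows. This is a minimal illustration, not the lab solution: the data set is typed in already parsed, the names pass_one, pass_two and END_OF_CHAIN are assumptions, and none of the error checking a real linker needs (undefined symbols, addresses out of range) is included.

```python
END_OF_CHAIN = 777  # address field value that terminates a use chain in this data set

# Input set #1, already parsed: (definitions, program text, uses) per module.
MODULES = [
    ([("xy", 2)],
     [("R", 1004), ("I", 5678), ("E", 2777), ("R", 8002), ("E", 7002)],
     [("z", 4)]),
    ([],
     [("R", 8001), ("E", 1777), ("E", 1001), ("E", 3002), ("R", 1002), ("A", 1010)],
     [("z", 3)]),
    ([],
     [("R", 5001), ("E", 4777)],
     [("z", 1)]),
    ([("z", 2)],
     [("A", 8000), ("E", 1777), ("E", 2001)],
     [("xy", 2)]),
]


def pass_one(modules):
    """Build the symbol table and the base address of every module."""
    symbols, bases, base = {}, [], 0
    for defs, text, _uses in modules:
        bases.append(base)
        for sym, loc in defs:
            symbols[sym] = base + loc  # relative address -> absolute address
        base += len(text)
    return symbols, bases


def pass_two(modules, symbols, bases):
    """Return the load module as a list of resolved 4-digit words."""
    memory = []
    for (_defs, text, uses), base in zip(modules, bases):
        words = [word for _kind, word in text]
        # Resolve external references by walking each use chain.
        for sym, loc in uses:
            while loc != END_OF_CHAIN:
                opcode, next_loc = divmod(words[loc], 1000)
                words[loc] = opcode * 1000 + symbols[sym]
                loc = next_loc
        # Relocate relative addresses; Immediate and Absolute words are unchanged.
        for i, (kind, _word) in enumerate(text):
            if kind == "R":
                words[i] += base
        memory.extend(words)
    return memory


symbols, bases = pass_one(MODULES)
memory = pass_two(MODULES, symbols, bases)
print(symbols)  # {'xy': 2, 'z': 15}
for address, word in enumerate(memory):
    print(address, word)
```

The printed words agree with the right-hand column of the memory map, read from top to bottom.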

The linker on unix is mistakenly called ld (for loader), which is unfortunate since it links but does not load.

Lab #1: Implement a linker. The specific assignment is detailed on the sheet handed out in class and is due 27 September. The content of the handout is available on the web as well (see the class home page).

End of Interlude on Linkers

Chapter 1: Introduction

Homework: Read Chapter 1 (Introduction)

Levels of abstraction (virtual machines)

Software (and hardware, but that is not this course) is often implemented in layers.
The higher layers use the facilities provided by lower layers.
Alternatively said, the upper layers are written using a more powerful and more abstract virtual machine than the lower layers.
Alternatively said, each layer is written as though it runs on the virtual machine supplied by the lower layer and in turn provides a more abstract (pleasant) virtual machine for the higher layer to run on.
Using a broad brush, the layers are:
1. Applications and utilities
2. Compilers, Editors, Command Interpreter (shell, DOS prompt)
3. Libraries
4. The OS proper (the kernel, runs in privileged/kernel/supervisor mode)
5. Hardware
Compilers, editors, shell, loader, etc. run in user mode.
The kernel is itself normally layered, e.g.
1. ...
2. Filesystems
3. Machine independent I/O
4. Machine dependent device drivers
The machine independent I/O part is written assuming ``virtual (i.e. idealized) hardware''. For example, the machine independent I/O portion simply reads a block from a ``disk''. But in reality one must deal with the specific disk controller.
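One way to picture this split is the sketch below. It is purely illustrative and does not describe any real kernel: the BlockDevice interface, the 512-byte block size, and the RamDisk stand-in driver are all assumptions made for the example. The point is that read_bytes is written only against the idealized ``disk'' and never sees how a block is actually fetched.

```python
from abc import ABC, abstractmethod


class BlockDevice(ABC):
    """The idealized 'disk' seen by machine independent code (hypothetical interface)."""

    BLOCK_SIZE = 512  # assumed block size in bytes

    @abstractmethod
    def read_block(self, block_number):
        """Return exactly BLOCK_SIZE bytes for the given block."""


class RamDisk(BlockDevice):
    """Stand-in for a machine dependent driver; a real one would program a controller."""

    def __init__(self, contents):
        self._contents = contents

    def read_block(self, block_number):
        start = block_number * self.BLOCK_SIZE
        block = self._contents[start:start + self.BLOCK_SIZE]
        return block.ljust(self.BLOCK_SIZE, b"\0")  # pad a short final block


def read_bytes(device, offset, length):
    """Machine independent layer: a byte-range read built from whole-block reads."""
    result = bytearray()
    block_number, skip = divmod(offset, device.BLOCK_SIZE)
    while len(result) < length:
        block = device.read_block(block_number)
        result += block[skip:skip + length - len(result)]
        skip = 0
        block_number += 1
    return bytes(result)


contents = bytes(range(256)) * 4          # 1024 bytes, i.e. two blocks
disk = RamDisk(contents)
data = read_bytes(disk, 510, 4)           # spans the boundary between blocks 0 and 1
print(data == contents[510:514])          # True
```

Supporting a different controller would mean writing another subclass of BlockDevice; read_bytes would not change.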
Often the machine independent part is more than one layer.
The term OS is not well defined. Is it just the kernel? How about the libraries? The utilities? All these are certainly system software, but it is not clear how much is part of the OS.

1.1: What is an operating system?

The kernel itself raises the level of abstraction and hides details. For example a user (of the kernel) can write to a file (a concept not present in hardware) and ignore whether the file resides on a floppy, a CD-ROM, or a hard magnetic disk.

The kernel is a resource manager (so users don't conflict).

How is an OS fundamentally different from a compiler (say)?

Answer: Concurrency! Per Brinch Hansen in Operating Systems Principles (Prentice Hall, 1973) writes:

The main difficulty of multiprogramming is that concurrent activities can interact in a time-dependent manner, which makes it practically impossible to locate programming errors by systematic testing. Perhaps, more than anything else, this explains the difficulty of making operating systems reliable.
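A small simulation can make the time-dependence concrete. The sketch below is not from Brinch Hansen or the text; it assumes two activities, A and B, that each increment a shared counter in two separate steps (load the value, then store the value plus one), and it runs every possible interleaving deterministically rather than using real threads.

```python
from itertools import permutations


def run(schedule):
    """Run one interleaving; schedule is a string of activity names such as 'ABAB'."""
    shared = 0
    register = {}                 # each activity's private copy of the counter
    steps_done = {"A": 0, "B": 0}
    for who in schedule:
        if steps_done[who] == 0:  # first step: load the shared counter
            register[who] = shared
        else:                     # second step: store the incremented value
            shared = register[who] + 1
        steps_done[who] += 1
    return shared


# Every ordering of two steps of A and two steps of B.
schedules = sorted(set("".join(p) for p in permutations("AABB")))
outcomes = {schedule: run(schedule) for schedule in schedules}
for schedule, final in outcomes.items():
    print(schedule, final)
```

Only the two schedules in which one activity finishes before the other starts (AABB and BBAA) leave the counter at 2; the other four leave it at 1. A test that happens to exercise only the first kind of schedule would never reveal the error.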

Homework: 1. (Unless otherwise stated, problem numbers are from the end of the chapter in Tanenbaum.)

1.2 History of Operating Systems

Single user (no OS).
Batch, uniprogrammed, run to completion.
- The OS now must be protected from the user program so that it is capable of starting (and assisting) the next program in the batch.
Multiprogrammed
- The purpose was to overlap CPU and I/O
- Multiple batches
  - IBM OS/MFT (Multiprogramming with a Fixed number of Tasks)
    - The (real) memory is partitioned and a batch is assigned to a fixed partition.
    - The memory assigned to a partition does not change
  - IBM OS/MVT (Multiprogramming with a Variable number of Tasks) (then other names)
    - Each job gets just the amount of memory it needs. That is, the partitioning of memory changes as jobs enter and leave.
    - MVT is a more ``efficient'' user of resources but is more difficult.
    - When we study memory management, we will see that with varying size partitions questions like compaction and ``holes'' arise.
- Time sharing
  - This is multiprogramming with rapid switching between jobs (processes). Deciding when to switch and which process to switch to is called scheduling.
  - We will study scheduling when we do processor management.
Personal Computers
- Serious PC Operating systems such as linux, Windows NT (2000) and (the newest) MacOS are multiprogrammed OSes.
- GUIs have become important. There is debate as to whether the GUI should be part of the kernel.
- Early PC operating systems were uniprogrammed and their direct descendants still are (e.g. Windows ME).