Class 5
CS372H
31 January 2012

On the board
------------

1. Last time
2. Page faults
    --uses
    --costs
3. Page replacement
4. Processes
5. Process control: the shell

---------------------------------------------------------------------------

1. Last time

    --reinforced virtual memory, TLB, page faults

    --today: finish page faults

2. Page faults

    Last time: we discussed the mechanics at a high level (page fault
    handler runs)

    A. Uses
	Last time:
	--paging (virtual memory larger than RAM)
	--distributed shared memory
	--copy-on-write

   	Paging in day-to-day use:

	 --Demand paging 

	 --Growing the stack 

	 --BSS page allocation 

	 --Shared text 

	 --Shared libraries 

	 --Shared memory 
	 
	 --Copy-on-write (fork, mmap, etc.) 

    B. Page faults: costs

    --What does demand paging (i.e., paging from the disk) cost?

	--let's look at average memory access time (AMAT)

	--AMAT = (1-p)*memory access time + p * page fault time,
	where p is the prob. of a page fault.
	
	memory access time ~ 100ns 
	disk access time   ~ 10 ms = 10^7 ns

	--QUESTION: what does p need to be to ensure that paging hurts
	performance by less than 10%?

	1.1*t_M = (1-p)*t_M + p*t_D
	p = .1*t_M / (t_D - t_M) ~ 10^1 / 10^7 = 10^{-6} 

	so only one access out of 1,000,000 can be a page fault!!

	--basically, page faults are super-expensive (good thing the
	machine can do other things during a page fault)
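
	--the arithmetic above, as a small C sketch. The function names
	are made up for these notes; the constants are the 100 ns and
	10 ms figures used above.

```c
#include <assert.h>

/* Average memory access time (ns) for page-fault probability p. */
double amat_ns(double p, double t_mem_ns, double t_fault_ns)
{
    assert(p >= 0.0 && p <= 1.0);
    return (1.0 - p) * t_mem_ns + p * t_fault_ns;
}

/* Largest p for which AMAT stays within (1 + slowdown) * t_mem.
 * Solves (1 + slowdown) * t_M = (1 - p) * t_M + p * t_D for p. */
double max_fault_prob(double slowdown, double t_mem_ns, double t_fault_ns)
{
    assert(t_fault_ns > t_mem_ns);
    return slowdown * t_mem_ns / (t_fault_ns - t_mem_ns);
}
```

	max_fault_prob(0.1, 100.0, 1e7) comes out at about 10^{-6}, and
	plugging that p back into amat_ns gives about 110 ns.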

    --Thrashing is even worse

	Memory overcommitted -- pages tossed out while still needed 
     
	Example:

	    --one program touches 50 pages (each equally likely); only 
	      have 40 physical page frames 
	    
	    --If have enough pages, 100ns/ref 
     
	    --If have too few pages, assume every 5th reference leads
	    to a  page fault 
     
	    --4refs x 100ns  and 1 page fault x 10ms for disk I/O 

	    --this gets us
		5 refs per (10ms + 400ns) ~ 2ms/ref, vs. 100ns/ref = 20,000x slowdown!!!
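
	    --the same example as a C sketch (function names are
	    hypothetical). The first helper is one reading of where "every
	    5th reference" comes from: with uniformly random references,
	    the chance of touching a non-resident page is
	    (50 - 40) / 50 = 1/5.

```c
#include <assert.h>

/* Fault probability when every page is equally likely to be touched
 * and only `frames` of the `pages` pages can be resident at once. */
double uniform_fault_prob(int pages, int frames)
{
    assert(pages > 0 && frames >= 0);
    if (frames >= pages)
        return 0.0;
    return (double)(pages - frames) / pages;
}

/* Mean time per reference (ns) when one reference out of every
 * `refs_per_fault` pays for a disk access and the others pay for a
 * memory access, as in the example above. */
double thrash_ns_per_ref(int refs_per_fault, double t_mem_ns, double t_disk_ns)
{
    assert(refs_per_fault >= 1);
    double total = (refs_per_fault - 1) * t_mem_ns + t_disk_ns;
    return total / refs_per_fault;
}
```

	    thrash_ns_per_ref(5, 100.0, 1e7) is 2,000,080 ns, i.e., about
	    2 ms per reference.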
     

	--What we wanted: virtual memory the size of disk with access
	time the speed of physical memory 

	--What we have here: memory with access time roughly that of disk
	(2 ms/mem_ref compared to 10 ms/disk_access)

	Concept is much larger than OSes: need to pay attention to the
	slow case if it's really slow and common enough to matter.

3. Page replacement

    A. What's the high-level point?

	--when paging to the disk, memory is a cache of the
	disk.

	--We'll mostly ignore the problem of page replacement, except
	for the next ten minutes. (JOS doesn't page. If it runs out of
	physical memory, it returns an error.  Plus, memory is a lot
	cheaper than it used to be, so I'd wager that most of you don't
	run into physical memory limitations very often.)

	--The high-level idea is that when there's a page fault (because
	a page that 'looked' to the process like it was in RAM was
	actually on the disk), the OS has to decide which page to evict.

	--Lots of algorithms for this. We mostly won't discuss them.

	--Most of them use some bits (for accounting inside the page
	structures)

	    --The two big ones are: the Use bit and the Modified bit. 

    B. Some implementation points

	Note that many machines, x86 included, maintain 4 bits per page
	table entry:

	    --*use*: Set when page referenced; cleared by an algorithm like
	    CLOCK (the bit is called "Accessed" on x86)

	    --*modified*: Set when page modified; cleared when page written
	    to disk (the bit is called "Dirty" on x86)

	    --*valid*: Program can reference this page without getting a
	    page fault. Set if page is in memory? [no. it is "only if", not
	    "if". *valid*=1 implies page in physical memory. but page in
	    physical memory does not imply *valid*=1; in other words,
	    *valid*=0 does not imply page is not in physical memory.]

	    --*read-only*: program can read page, but not modify it. Set if
	    page is truly read-only? [no. similar case to above, but
	    slightly confusing because the bit is called "writable". if a
	    page's bits are such that it appears to be read-only, it may or
	    may not be because it is truly "read only". but if a page is
	    truly read-only, it had better have its bits set to be read-only.]
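
	To make the use and modified bits concrete, here is a minimal
	sketch of a CLOCK-style sweep over a flat array of page table
	entries. The flag values are the 32-bit x86 bit positions (named
	as in JOS); the array layout, clock_pick_victim(), and
	needs_writeback() are assumptions made for illustration, not
	anything from a real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_P 0x001u   /* present ("valid") */
#define PTE_W 0x002u   /* writable */
#define PTE_A 0x020u   /* accessed ("use") */
#define PTE_D 0x040u   /* dirty ("modified") */

/* Advance the clock hand over nframes resident pages. A page whose
 * Accessed bit is set gets a second chance: the bit is cleared and
 * the hand moves on. The first page found with the bit already clear
 * is returned as the eviction victim. */
size_t clock_pick_victim(uint32_t *pte, size_t nframes, size_t *hand)
{
    assert(nframes > 0 && *hand < nframes);
    for (;;) {
        size_t i = *hand;
        *hand = (*hand + 1) % nframes;
        if (pte[i] & PTE_A)
            pte[i] &= ~PTE_A;   /* recently used: clear and skip */
        else
            return i;           /* not used since the last sweep */
    }
}

/* A victim must be written to disk first only if it was modified. */
int needs_writeback(uint32_t pte)
{
    return (pte & PTE_D) != 0;
}
```

	The loop always terminates: after at most one full sweep every
	Accessed bit has been cleared.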

	Do we actually need Modified and Use bits in the page tables
	set by the hardware?

	    --[again, x86 calls these the Dirty and Accessed bits]

	    --answer: no.

	    --how could we simulate them?

	    --for the Modified [x86: Dirty] bit, just mark all pages
	    read-only. Then if a write happens, the OS gets a page fault
	    and can set the bit itself. Then the OS should mark the page
	    writable so that this page fault doesn't happen again

	    --for the Use [x86: Accessed] bit, just mark all pages as
	    not present (even if they are present). Then if a reference
	    happens, the OS gets a page fault, and can set the bit,
	    after which point the OS should mark the page present (i.e.,
	    set the PRESENT bit).
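
	    --a toy model of the trick, assuming a made-up struct soft_pte
	    that keeps the "hardware" permission bits next to the OS's own
	    bookkeeping. touch() plays both the MMU (deciding whether to
	    fault) and the fault handler (setting the software bits and
	    then relaxing the hardware bits). None of this is real kernel
	    code.

```c
#include <assert.h>
#include <stdbool.h>

struct soft_pte {
    bool hw_present;    /* what the simulated MMU checks */
    bool hw_writable;
    bool resident;      /* OS bookkeeping: page really is in RAM */
    bool may_write;     /* OS bookkeeping: writes really are allowed */
    bool accessed;      /* software-maintained Use bit */
    bool dirty;         /* software-maintained Modified bit */
};

/* Simulate one reference to the page. Returns the number of page
 * faults taken, or -1 for a fault this sketch does not handle (page
 * truly absent, or a write to a truly read-only page). */
int touch(struct soft_pte *pte, bool is_write)
{
    int faults = 0;
    assert(pte);
    if (!pte->hw_present) {
        if (!pte->resident)
            return -1;              /* genuine fault: needs disk I/O */
        faults++;
        pte->accessed = true;       /* handler records the reference */
        pte->hw_present = true;     /* so the next reference is free */
    }
    if (is_write && !pte->hw_writable) {
        if (!pte->may_write)
            return -1;              /* genuine protection violation */
        faults++;
        pte->dirty = true;          /* handler records the write */
        pte->hw_writable = true;    /* so the next write is free */
    }
    return faults;
}
```

	    The cost is one extra fault per page for the first reference
	    and one for the first write; after that the page runs at full
	    speed until the OS clears the bits again.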

    C. Is caching always a win?  
    
	No. Here are some cases when it may not buy anything:

	    --process doesn't reuse memory

	    --process reuses memory but it doesn't fit.

	    --individually, all processes fit, but too much for the system

	what do we do?

	    --well, in the first two cases, there's nothing you can do,
	    other than restructuring your computation or buying memory
	    (e.g., expensive hardware that keeps entire customer
	    database in RAM)

	    --in the third case, can and must shed load. how?
	
	two approaches:
	    a. working set
	    b. page fault frequency

	a. working set

	    --only run a set of processes s.t. the union of their
	    working sets fits in memory

	    --book defines working set. short version: the pages a process
	    has touched over some trailing window of time
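
	    --a sketch of the "short version", measuring the window in
	    references rather than in time. The trace format, MAX_PAGES,
	    and both function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 1024   /* illustrative bound on page numbers */

/* Number of distinct pages among the last `window` references of
 * trace[0..n). */
size_t working_set_size(const unsigned *trace, size_t n, size_t window)
{
    bool seen[MAX_PAGES] = { false };
    size_t start = n > window ? n - window : 0;
    size_t count = 0;
    for (size_t i = start; i < n; i++) {
        assert(trace[i] < MAX_PAGES);
        if (!seen[trace[i]]) {
            seen[trace[i]] = true;
            count++;
        }
    }
    return count;
}

/* Admission test: do the working sets fit in `frames` page frames?
 * Summing the sizes assumes no page is shared between processes;
 * with sharing, the union is smaller than the sum. */
bool fits_in_memory(const size_t *ws_sizes, size_t nprocs, size_t frames)
{
    size_t total = 0;
    for (size_t i = 0; i < nprocs; i++)
        total += ws_sizes[i];
    return total <= frames;
}
```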

	b. page fault frequency

	    --track the metric (# page faults/instructions executed)

	    --if that thing rises above a threshold, and there is not enough
	    memory on the system, swap out the process
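
	    --the decision rule, sketched in C. struct pff_sample, the
	    threshold, and the "not enough memory" test (free frames below
	    some minimum) are all hypothetical choices; the notes do not
	    say how either would be picked.

```c
#include <assert.h>
#include <stdbool.h>

struct pff_sample {
    unsigned long faults;         /* page faults in the sample period */
    unsigned long instructions;   /* instructions executed in it */
};

/* Swap the process out only if it is faulting too often AND the
 * system is short on memory. */
bool should_swap_out(struct pff_sample s, double threshold,
                     unsigned long free_frames,
                     unsigned long min_free_frames)
{
    assert(s.instructions > 0);
    double rate = (double)s.faults / (double)s.instructions;
    return rate > threshold && free_frames < min_free_frames;
}
```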

---------------------------------------------------------------------------

admin

    --project partners due Friday. we expect two emails from each team.
    the redundancy helps eliminate errors.

    --MikeD concurrency lecture on Thursday

    --Thursday is a discussion day. We'll call on people. Don't show up
    if you haven't done the reading.

---------------------------------------------------------------------------

4. Processes

    write on the board
    ------------------
    
    * what is a process?
    * how do they interact with the operating system?
    * how do they come into being?

    A. What is a process?

	--abstraction of a virtual machine (virtual memory, virtual CPU,
	etc.). instance of a running program.

	    [draw picture]

	    --here's an implementation:

		     PCB
		-----------------
		|   process id  |
		|   state       |   (ready, running, blocked, etc.)
		|   user id     |
		|   IP          |
		|   open files  |
		| VM structures |
		|   registers   |
		|   .....       |  (signal mask, terminal, priority, ...) 
		-----------------

		called "proc" in Unix, "task_struct" in Linux, and "struct
		env" in JOS
    
	    --each one has its own cr3 and hence its own view of virtual
	    memory, which contains:
		--program code (aka "text")
		--constants
		--zeroed-out area for variables
		--stack
		--heap

	    --its own registers

	    --state of OS resources

	    --very little else is actually needed, but a modern process
	    does have a lot of associated information:

		--signal state

		--UID, signal mask, controlling terminal, priority,
		whether being debugged, etc., etc.
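
	    --the PCB picture as a C struct. This is a sketch for a 32-bit
	    x86 machine; every field name, the state enum, and
	    MAX_OPEN_FILES are invented here and do not match Unix's proc,
	    Linux's task_struct, or JOS's struct env.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_OPEN_FILES 16            /* hypothetical per-process limit */

enum proc_state { PROC_READY, PROC_RUNNING, PROC_BLOCKED };

struct saved_regs {                  /* saved when the process is not running */
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
    uint32_t eip;                    /* the "IP" field in the picture */
    uint32_t eflags;
};

struct pcb {
    int pid;
    enum proc_state state;
    int uid;
    struct saved_regs regs;
    uint32_t cr3;                    /* physical address of the page directory */
    int open_files[MAX_OPEN_FILES];  /* -1 marks an unused slot */
};

/* Fill in a fresh PCB for a process that starts executing at `entry`
 * with the address space rooted at `cr3`. */
void pcb_init(struct pcb *p, int pid, int uid, uint32_t cr3, uint32_t entry)
{
    assert(p);
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->uid = uid;
    p->state = PROC_READY;
    p->cr3 = cr3;
    p->regs.eip = entry;
    for (int i = 0; i < MAX_OPEN_FILES; i++)
        p->open_files[i] = -1;
}
```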


	--typically has less privilege than operating system

	    --OS can manipulate the hardware. processes cannot.

	    --OS (obviously) can manipulate OS abstractions. processes cannot.

	--the hardware knows the difference between privileged and
	unprivileged mode (on the x86, these are called ring 0 and ring
	3. The middle rings aren't used in the classical setup, but they
	are used in some approaches to virtualization.)

    B. How do processes interact with the operating system?

	--syscalls: the interface to the operating system. 

	--lots of these

	--on Unix, type "man 2 <syscall>" to get documentation.

	--here are three relevant ones on Unix:

	    int fd = open(const char* path, int flags)
	    write(fd, const void *, size_t)
	    read(fd, void *, size_t)

	--fd is a *file descriptor*. this is an abstraction, provided by
	the operating system, that represents an open file

	--every process can usually expect to begin life with three file
	descriptors already open:
	    0: represents the input to the process (e.g., tied to
		terminal)
	    1: represents the output
	    2: represents the error output

	    these are sometimes known as stdin, stdout, stderr
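
	--a small sketch using read() and write() on file descriptors.
	write_all() and copy_fd() are helper names made up for these
	notes; the system calls are the POSIX ones.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all n bytes of buf to fd, retrying after short writes and
 * EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    assert(fd >= 0);
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy bytes from in_fd to out_fd until end of file. Returns the
 * number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t r = read(in_fd, buf, sizeof buf);
        if (r == 0)
            return total;            /* end of file */
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)r) < 0)
            return -1;
        total += r;
    }
}
```

	copy_fd(0, 1) is roughly a bare-bones cat: it copies the process's
	input to its output without knowing what either is attached to.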

	--we mentioned in class 1 that Unix hides from processes the
	difference between a device and a file. This is an example.

	    --we'll see in ten minutes or so how powerful this is.

	--here are some other system calls (these are included in the
	notes so that you know what the basic interface to a Unix-like
	OS looks like):

	    --int open(char* path, int flags [, int mode]);
	    --int read(int fd, void* buf, int nbytes);
	    --int write(int fd, void* buf, int nbytes);
	    --off_t lseek(int fd, off_t pos, int whence);
	    --int close(int fd);
	    --int kill(int pid, int signal);
	    --void exit(int status);
	    --int fork(void);
	    --int waitpid(int pid, int* stat, int opt);
	    --int execve(char* prog, char** argv, char** envp);
	    --int dup2(int oldfd, int newfd);
	    --int pipe(int fds[2]);
	

    C. How does a process come into being?

	--answer: another system call!
	    
	--in Unix, it is fork()

	--on JOS, it is sys_exofork()

	--fork creates an exact copy (almost; the return value is
	different).
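
	--a sketch of what "copy" means here. fork_copies_memory() is a
	made-up helper: the child changes its copy of x and reports it
	through its exit status, while the parent's x is untouched.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once. The child sets its own copy of x to 2 and exits with
 * that value; the parent records what each side saw.
 * Returns 0 on success, -1 if fork or waitpid failed. */
int fork_copies_memory(int *parent_sees, int *child_saw)
{
    int x = 1;
    assert(parent_sees && child_saw);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {          /* child: fork returned 0 */
        x = 2;               /* changes only the child's copy */
        _exit(x);
    }

    int status;              /* parent: fork returned the child's pid */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    *parent_sees = x;
    *child_saw = WEXITSTATUS(status);
    return 0;
}
```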

	--thus, what happens if a system had two important users, and
	one of them runs a process that executes this code:

	    for (i = 0; i < 10; i++) {
		fork();
	    }

	    while (1) {}

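	[answer: the loop doubles the number of processes on each
	iteration, leaving 2^10 = 1024 processes spinning in while(1),
	so one of the users gets a LOT more of the CPU than the
	other]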
   
	    --what behavior do you want?

		[this actually corresponds to research. OSes are only just
		applying resource containers.]
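
	--to check how many processes the fork loop above produces, here
	is a sketch that runs the same loop inside a child and has every
	resulting process write one byte into a pipe before exiting.
	count_forked() is a made-up name, and the sketch ignores fork()
	failures inside the loop, so the count would come out low if the
	system refused to create a process.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `for (i = 0; i < n; i++) fork();` in a child and return how
 * many processes come out of the loop, or -1 on error. */
int count_forked(int n)
{
    int fds[2];
    assert(n >= 0 && n <= 10);       /* keep the demo small */
    if (pipe(fds) < 0)
        return -1;

    pid_t root = fork();
    if (root < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (root == 0) {
        close(fds[0]);
        for (int i = 0; i < n; i++)
            fork();                  /* every existing process forks */
        /* each process that leaves the loop reports in exactly once */
        if (write(fds[1], "x", 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);                   /* so read() sees EOF at the end */
    int count = 0;
    char c;
    while (read(fds[0], &c, 1) == 1)
        count++;                     /* EOF once all writers have exited */
    close(fds[0]);
    waitpid(root, NULL, 0);
    return count;
}
```

	count_forked(3) returns 8, i.e., 2^3; with n = 10 the same
	doubling gives 1024.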

5. Process control: the shell

    A. How does the shell start programs?

	--example: 
	    $ ls

	[see panel 1 on handout; go line-by-line]

	--calls fork(), which creates a copy of the shell. now there are
	two copies of the shell running

	--then calls exec(), which loads the new program's instructions
	into memory and begins executing them.
	    --(exec invokes the loader, which we'll talk about.)

	while (1) {
		write(1, "$ ", 2);
		readcommand(command, args); // parse input
		if ((pid = fork()) == 0) { // child?
			exec(command, args, 0);
			exit(1); // reached only if exec fails
		} else if (pid > 0) // parent?
			wait(0); // wait for child
		else
			perror("failed to fork");
	}

	--how can shell wait for the end of a process?
	    --with wait() or waitpid() system calls
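
	--a compilable sketch of the fork/exec/wait step, using execvp()
	and waitpid() in place of the exec() and wait() shorthand in the
	loop above. run_command() is a made-up helper; 127 is the exit
	status shells conventionally use for "command not found".

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with arguments argv (NULL-terminated) and wait for it.
 * Returns the child's exit status, or -1 if fork or waitpid failed
 * or the child did not exit normally. */
int run_command(char *const argv[])
{
    assert(argv && argv[0]);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                  /* child */
        execvp(argv[0], argv);
        _exit(127);                  /* reached only if exec failed */
    }

    int status;                      /* parent */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```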

	--WAIT, WHY ARE FORK() AND EXEC() SEPARATE? 
	
	    * We will come back to this.

    B. Redirection

	* What is redirection?

	    $ ls > tmp1

	* How is it implemented?

	    key lines: just before exec, shell does:
		close(1)
		open ("tmp1", O_TRUNC | O_CREAT | O_WRONLY, 0666)
	    open() returns the lowest-numbered unused descriptor, so
	    this automatically assigns tmp1 to be fd 1
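
	    a compilable sketch of those key lines in context.
	    run_to_file() is a made-up helper, and the exit codes 126 and
	    127 are arbitrary choices for "redirect failed" and "exec
	    failed". It assumes fd 0 is open in the caller; otherwise
	    open() would hand back 0 rather than 1.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv with its standard output redirected to `path`.
 * Returns the child's exit status, or -1 on failure. */
int run_to_file(const char *path, char *const argv[])
{
    assert(path && argv && argv[0]);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                  /* child: rewire fd 1, then exec */
        close(1);
        /* lowest unused descriptor is now 1, so open() returns 1 */
        if (open(path, O_TRUNC | O_CREAT | O_WRONLY, 0666) != 1)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);                  /* reached only if exec failed */
    }

    int status;                      /* parent: its own fd 1 is untouched */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

	    The program being run needs no changes: it writes to fd 1 as
	    usual and never learns that fd 1 is now a file.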

	* What about 

	    $ sh < script > tmp1

	    where script contains 
		echo abc
		echo def

	    [draw picture]