Class 4
CS 372H
27 January 2011

On the board
------------

(One handout)

1. Last time
2. What is a process?
3. Context switches
4. x86 memory intro
5. Segmentation

---------------------------------------------------------------------------

1. Last time

    --discussed two important sets of interfaces:
	--hardware/software, OS/user-level

    --regarding hardware/software, some of you asked about why the
    convention is that ebx is callee saved while eax,ecx,edx are caller
    saved. here's an answer.

	    eax: "accumulator" register
	    ebx: "base" register
	    ecx: "count" register
	    edx: "data" register

	    idea was that ebx would point to the base of a data structure
	    (just as ebp meant "base pointer" and pointed to the base of a frame)

	    ebx often points to a data segment (e.g., for a dynamic
	    library's data), by setting it up at the beginning of a function
	    and keeping it constant throughout
	    
	    Because it's a "stable" pointer, makes sense to have the callee
	    save it.

	    eax/ecx/edx are more ephemeral: used for particular calculations
	    and such. Makes sense to require the caller to save them if it
	    needs the values in there.

    --began to discuss the interface between the user-level software and
    the OS

	--syscalls.

	--two example syscalls: fork() and exec()

	--we discussed the power of the fork()+exec() separation

	--handout explains this point in more depth. you can read the
	code on your own. here in class, the main point is that *after*
	the fork() but before the exec(), the *child* shell arbitrarily
	configures its own environment, arranging it to look a certain
	way. the result is that an exec()ed program can wind up doing
	something very different from what its author intended: for
	instance, the output can go to a file instead of to the screen.
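
	[sketch of the redirection trick, not the handout's code. the name
	run_redirected() and the exit code 127 are made up for
	illustration. the child opens the file, makes it fd 1, and only
	then calls exec(), so the new program writes to the file without
	knowing it.]

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its stdout sent to `path`, roughly what a shell does
 * for `cmd > path`. Returns the child's exit status, or -1 on error.
 * (run_redirected is a hypothetical name, not from the handout.) */
int run_redirected(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: rearrange our own environment *before* exec(). */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)   /* fd 1 now names the file */
                _exit(127);
            close(fd);
        }
        execvp(argv[0], argv);                 /* returns only on failure */
        _exit(127);
    }

    /* Parent: wait for the child to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

	calling run_redirected("out.txt", argv) with argv = {"echo",
	"hello", NULL} should leave "hello" in out.txt; echo itself is
	unmodified and unaware of the redirection.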

2. What is a process?

    at the highest level: a process is an instance of a running program
	--browser, text editor, etc.
	--but multiple frames in your text editor are *not* multiple processes

    --registers

    --view of memory, which contains:
	--program code
	--constants 
	--stack
	--heap
   
       --this view is (almost) entirely encapsulated by %cr3, a pointer
       to a set of page tables. [as we'll see when we start on virtual
       memory (today or next time)]

    --signal state

    --state of OS resources, such as file descriptors

    --what else?

	--OS maintains lots of other information, per-process

	--UID, signal mask, controlling terminal, priority, whether
	being debugged, etc., etc.

    --how do processes actually get loaded into memory?

	--happens in two pieces:

	    fork() creates a clone of the parent. **completely
	    identical, save return value of fork() syscall itself**

	    exec() loads in new code. invokes the loader, which puts the
	    program into (virtual) memory so that its functions are
	    where it expects them
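
	    [sketch, under the assumption of a POSIX system: shows that the
	    only thing distinguishing parent from child right after fork()
	    is the return value, and that the child's memory is a copy,
	    not the parent's own. fork_copy_demo() is a made-up name.]

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, let the child change its copy of `x`, and report both views.
 * On success returns 0, stores the parent's x in *parent_x and the
 * value the child saw after its write in *child_x.
 * (fork_copy_demo is a hypothetical name.) */
int fork_copy_demo(int *parent_x, int *child_x)
{
    assert(parent_x != NULL && child_x != NULL);

    int x = 1;                  /* lives in the parent's memory */
    pid_t pid = fork();         /* child starts as an identical copy */
    if (pid < 0)
        return -1;

    if (pid == 0) {             /* fork() returned 0: we are the child */
        x = 2;                  /* changes only the child's copy */
        _exit(x);               /* hand the value back as exit status */
    }

    /* fork() returned the child's pid: we are the parent. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *parent_x = x;
    *child_x = WEXITSTATUS(status);
    return 0;
}
```

	    after the call, the parent still sees x == 1 even though the
	    child stored 2 into "the same" variable.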

    --how is a process implemented?

	     PCB
	-----------------
	|   process id  |
	|   state       |   (running, ready, blocked, etc.)
	|   user id     |
	|   IP          |
	|   open files  |
	| VM structures |
	|   registers   |
	|   .....       |  (signal mask, terminal, priority, ...)
	-----------------

	    called "proc" in Unix, "task_struct" in Linux, and "struct
	    env" in JOS
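
	    [a toy rendering of the box above as a C struct. every field
	    name, the register list, and MAX_OPEN_FILES are assumptions
	    for illustration; this is not the layout of proc,
	    task_struct, or struct env.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum proc_state { PROC_READY, PROC_RUNNING, PROC_BLOCKED };

#define MAX_OPEN_FILES 16            /* arbitrary, for illustration */

struct saved_regs {                  /* 32-bit x86 general-purpose registers */
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
};

struct pcb {                         /* hypothetical layout */
    int               pid;           /* process id */
    enum proc_state   state;
    int               uid;           /* user id */
    uint32_t          eip;           /* saved instruction pointer (IP) */
    int               open_files[MAX_OPEN_FILES];  /* -1 = unused slot */
    uint32_t          cr3;           /* VM structures: page-table root */
    struct saved_regs regs;
};

/* Fill in a fresh PCB: zeroed registers, no open files, ready to run. */
void pcb_init(struct pcb *p, int pid, int uid)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->uid = uid;
    p->state = PROC_READY;
    for (int i = 0; i < MAX_OPEN_FILES; i++)
        p->open_files[i] = -1;
}
```

	    note what is absent: the process's memory contents. the PCB
	    holds only a pointer-like handle (cr3 here) to the VM
	    structures.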
	
	QUESTION: wait, why only registers and such saved? where does
	the process's memory go? [we'll see.....]


3. Context switches

    --When do we switch between processes?

	CPU preemption
	    --happens when kernel gets control. that happens when:

	a. running process traps 
	    (i) performs a system call
		--syscall by one process can result in that process being
		put to sleep (e.g., if it does a read from the disk) or some
		other process becoming runnable (e.g., if it wakes up another
		process with an inter-process message)

	    (ii) exception
		--divide by 0
		--page fault
		--...

	b. periodic timer interrupt
	    if a process uses up its quantum, another one gets scheduled

	    --explanation: timer that invokes the operating system,
	    giving it the opportunity to schedule something out

	    [analogy with handlers of celebrities. celebrity arranges to
	    have handler periodically say to well-wishers, "I'm sorry,
	    we really have to be going". In the analogy, the OS is the
	    celebrity, and the timer is the handler.]

	c. device interrupt
	    disk request completed, new data arrived from network, etc.
	
	    --this might make runnable a previously waiting process
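
	[user-level analogy for case (b), not kernel code: a POSIX
	interval timer delivers SIGALRM to a loop that never yields on
	its own, much as the hardware timer hands control to the kernel.
	the 10ms period and the names on_tick() and spin_until_ticks()
	are arbitrary choices for this sketch.]

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks;

/* Plays the role of the timer interrupt handler. */
static void on_tick(int signo)
{
    (void)signo;
    ticks++;
}

/* Spin until at least `want` ticks have arrived.
 * Returns the number of ticks seen, or -1 on error. */
int spin_until_ticks(int want)
{
    assert(want > 0);

    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    struct itimerval every_10ms = { {0, 10000}, {0, 10000} };
    struct itimerval off = { {0, 0}, {0, 0} };
    ticks = 0;
    if (setitimer(ITIMER_REAL, &every_10ms, NULL) < 0)
        return -1;

    while (ticks < want)
        ;                       /* the "process" never gives up the CPU */

    setitimer(ITIMER_REAL, &off, NULL);     /* stop the timer */
    sigaction(SIGALRM, &old_sa, NULL);      /* restore the old handler */
    return (int)ticks;
}
```

	the loop body contains no call that could block or yield; it
	terminates only because the timer keeps interrupting it.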
	
    --save state; reload other process's state

    --how do we implement context switches?

	--[draw picture from notes]

	--save IP and registers
	--change VM translations

    --context switches can be expensive

    --More detail.......

	--Different contexts:
	    user-level:
		running an application
	    kernel process context (*top-half*):
		running kernel code on behalf of an application (e.g.,
		  syscall)
		handling faults
		kernel-only processes (in-kernel Web server)
	    kernel code not associated with a process
		timer interrupt (hardclock)
		device interrupt (network data, e.g.)
		"softirqs",	"tasklets", ..., in Linux (**bottom half** in Unix)
	    context switch code
		changing address spaces 


	--Transitions:
	
	    user-->top-half: syscall, page fault

	    {user, top-half} --> device/timer interrupt: hardware

	    top half --> {user, context switch}:  return

	    top half --> context switch: sleep (because waiting)

	    context switch --> {user/top half}: inevitable

---------------------------------------------------------------------------

Admin

    --lab1 due yesterday

    --lab 2a due Monday

    --we will give you the option to do pair programming but only after
    lab 2. this decision will be one-time and irrevocable.

    --pair programming means both of you at one keyboard, one screen

---------------------------------------------------------------------------

4. memory intro

   A. top-most idea:
	--let programs use addresses like 0, 0xc000, whatever.
	--OS arranges for hardware to translate these addresses 
	    --what piece of hardware does this? (A: MMU)
	--why doesn't the OS just translate the stuff itself? [slow]
	  
	idea is to fool programs

	but OS also fools itself! (JOS thinks it is running at the top
	of physical memory [0xf0000000], but it is not)

	--draw picture:

	[CPU ---> translation box --> physical addresses]

	that translation box gives us a bunch of things
	    --protection: processes can't touch each other's memory
		--idea: if you cannot name it, you cannot use it. deep idea.
	    --relocation:
		--two instances of program foo are each loaded, each
		think they're using memory addresses like 0,0x1234,
		whatever, but of course they're not using the same
		actual memory cells
	    --sharing:
		--processes share memory under controlled circumstances,
		but that physical memory may show up at very different
		virtual addresses
		--that is, two processes have a different way to refer
		to the same physical memory cells
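
	[toy model of the translation box, to make the three properties
	concrete. the per-process table, the 4KB granularity, and all
	names here are assumptions of the sketch, not a description of
	the x86 mechanisms.]

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  0x1000u
#define NPAGES     16              /* toy address space: 16 virtual pages */
#define UNMAPPED   UINT32_MAX

/* One table per process (hypothetical layout). */
struct addr_space {
    uint32_t frame[NPAGES];        /* virtual page number -> physical base */
};

void as_init(struct addr_space *as)
{
    for (int i = 0; i < NPAGES; i++)
        as->frame[i] = UNMAPPED;
}

/* Returns the physical address, or UNMAPPED if this process has no
 * name for that memory. */
uint32_t translate(const struct addr_space *as, uint32_t vaddr)
{
    uint32_t vpn = vaddr / PAGE_SIZE;
    if (vpn >= NPAGES || as->frame[vpn] == UNMAPPED)
        return UNMAPPED;           /* real hardware would raise a fault */
    return as->frame[vpn] + vaddr % PAGE_SIZE;
}
```

	relocation: give two address spaces different entries for page 1
	and the same virtual address 0x1234 lands in different physical
	cells. sharing: point page 2 of one and page 5 of the other at
	the same physical base. protection: any page left UNMAPPED simply
	cannot be named.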

    B. applied to x86:

	logical [virtual] addresses ---> linear addresses ---> physical addresses

	    --logical addresses are also known as virtual addresses
	    --physical addresses are what is on the CPU's address pins
		--do they address RAM?
		--no, they refer to the physical memory map (i.e.,
		hardware may do more translation)

	the first translation happens via *segment translation*

	the second translation happens via *page translation*

	segmentation is old-school and these days mostly an annoyance
	(but it cannot be turned off!)
	
	    --however, it comes in handy every now and then for
	    things like sandboxing (advanced topic) or thread-local
	    memory (another advanced topic, though by the time the
	    midterm comes around, you should see why segmentation could
	    be useful to the implementer of a threads package)

5. segmentation	

    A. segmentation in general
    
	segmentation means:

	    memory addresses treated like offsets into a contiguous
	    region.

	QUESTION: if segmentation can't be turned off, how do we pretend
	it's not there?

	    setting its mapping to be the identity function:

	    base of 0 and no (i.e., maximum) limit

    B. segmentation on the x86

	linear address = base + virtual_address
	    (virtual_address is the offset here)

	what's the interface to segmentation?

	there are tables:
	    
	    GDT, LDT

	    processor told where these tables live via
		LGDT, LLDT (SGDT, SLDT read the current values back out)

	every instruction comes with an implicit *or* explicit segment
	register (the implicit case is the usual one):

	    pop %ebx	               ; implicitly uses %ss
	    call $0x7000               ; implicitly uses %cs
	    movl $0x1234, (%eax)       ; implicitly uses %ds
	    movl $0x1234, %gs:(%eax)   ; explicitly uses %gs

	    [all references to %eip (such as instruction fetches) use %cs for translation.]
 
	    some instructions can take "far addresses":
	           ljmp $selector, $offset 
   
	a segment register holds a segment selector

	    different registers for the stack (ss), data (ds), code
	    (cs), string [extra] operations (es), other fun stuff (fs,
	    gs)

	a selector indexes into the LDT or GDT, and chooses *which* table
	and which *entry* in that table

	determines base, limit, **protection** (R/W/X, user/kernel, etc.), type
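
	[sketch of how a 16-bit selector splits into "which table" and
	"which entry", following the standard x86 selector layout. the
	struct and function names are made up for illustration.]

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;    /* which entry in the table (bits 15..3) */
    unsigned use_ldt;  /* table indicator, bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;      /* requested privilege level (bits 1..0) */
};

/* Split a segment selector into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index   = sel >> 3;
    f.use_ldt = (sel >> 2) & 1u;
    f.rpl     = sel & 3u;
    return f;
}
```

	for example, selector 0x08 decodes to entry 1 of the GDT at
	privilege level 0, and 0x1b to entry 3 of the GDT at privilege
	level 3.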

	offset better be less than limit
	 
	example #1:
	   
	    say that %ds refers to this descriptor entry:
		base 0x30000000
		limit 0x0f0

	    now, when program does:
		
		mov 0x50, %eax

		what happens?
		
		[0x50 gets translated into 0x3000 0050]

	example #2:

	    what about if program does:

		mov 0x100, %eax ?

	        [error.]
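
	[examples #1 and #2 as a small C model. struct segment and
	seg_translate() are hypothetical names. the check uses the notes'
	rule "offset < limit"; real x86 hardware treats the limit as the
	last valid offset, which does not change either example.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct segment {
    uint32_t base;
    uint32_t limit;
};

/* If offset is within the limit, store base + offset in *linear and
 * return true; otherwise return false (the hardware would fault). */
bool seg_translate(const struct segment *seg, uint32_t offset,
                   uint32_t *linear)
{
    assert(seg != NULL && linear != NULL);
    if (offset >= seg->limit)
        return false;
    *linear = seg->base + offset;
    return true;
}
```

	with base 0x30000000 and limit 0x0f0, offset 0x50 yields
	0x30000050 and offset 0x100 is rejected. a segment with base 0
	and the maximum limit maps nearly every offset to itself, which
	is how segmentation gets "turned off" in practice.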
	
	example #3:
	
	    when JOS begins, the base is -0xf000 0000. that way, the
	    kernel's access to, say, 0xf0010000 gets translated to
	    physical memory access at, say, 0x00010000.

	    preview: later, **paging** will be used to ensure that
	    kernel's access to, say, 0xf0010000 gets translated to
	    physical memory access at, say, 0x00010000....
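
	    [the arithmetic behind the negative base, as a sketch: 32-bit
	    addition wraps around, so a base of -0xf0000000 is the same
	    bit pattern as 0x10000000. the names KERNBASE and
	    boot_linear() are assumptions of this example.]

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0xf0000000u     /* where the kernel thinks it runs */

/* linear = base + virtual, computed mod 2^32 as 32-bit hardware does. */
uint32_t boot_linear(uint32_t vaddr)
{
    uint32_t base = (uint32_t)0 - KERNBASE;  /* "-0xf0000000" == 0x10000000 */
    return base + vaddr;                     /* unsigned add wraps around */
}
```

	    boot_linear(0xf0010000) comes out to 0x00010000, matching the
	    example above.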

	NOTES:

	    --Current privilege level (CPL) is in the low 2 bits of CS

	    --CPL=0 is privileged O/S, CPL=3 is user
	    
	    --can app modify the descriptors in the LDT? it's in memory...
		yes it can. useful for certain things, like one
		user-level program sandboxing another.

	    --app cannot just lower the CPL 

	    --don't confuse LDT and GDT with **IDT** (which you'll see
	    in lab 3)