Class 4
CS 372H
28 January 2010

On the board
------------

1. Last time
2. What is a process?
3. Context switches
4. x86 memory intro
5. segmentation

----------------------------------------------------------------------------

1. last time (answering questions)

    saw gcc calling conventions, virtual machines, privileged v
    unprivileged mode, and how user-level programs and the kernel
    interact

    a. bios questions
    
	--mapped to 0xf0000 to 0xfffff because it is crammed at the top
	of 1 MB of address space (8088 had 20 bit address space, but no
	one thought there would actually be a megabyte of RAM)

	    --they *thought* they were locating the BIOS in an
	    out-of-the-way place

	    --history repeated itself. the ROMs on PCI cards get mapped
	    to the top of the 32 bit address space because, after all,
	    no one is going to have 4 GB of RAM in a computer

	--%eip starts at 0xffff0 because hardware people didn't want to
	make assumptions about where the BIOS started. easier just to
	require that it be at the top of the address space and then
	start with a jump

	    --and why 0xffff0 instead of 0xffffa?

	    --0xffff:0000 in segment:offset is easy to code in hardware

	--8088 had 20 bit address space, so 0xfffff was top of what
	    could be addressed
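
	The segment:offset arithmetic behind 0xffff:0000 can be sketched
	as below. The formula (segment * 16 + offset, truncated to 20
	bits) is the standard 8088 real-mode rule and is not spelled out
	in these notes; the function name real_mode_phys is made up for
	illustration.

```c
#include <stdint.h>

/* Real-mode (8088) address formation: physical = segment * 16 + offset,
 * truncated to 20 bits because the 8088 has only 20 address lines. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    uint32_t addr = ((uint32_t)segment << 4) + offset;
    return addr & 0xFFFFF;   /* wrap at 1 MB, as on the 8088 */
}
```

	With this, real_mode_phys(0xffff, 0x0000) gives 0xffff0, which
	leaves only 16 bytes below the 1 MB boundary: enough room for a
	jump and little else.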
	    
	
    b. how does varargs work? 
    
    (draw picture)

	       --------------
	           %ebp
		local variables
		callee saved registers
		......

	       +------------+   |
	       | arg 2      |   \
	       +------------+    >- previous function's stack frame
	       | arg 1      |   /
	       +------------+   |
	       | ret %eip   |   /
	       +============+   
	       | saved %ebp |   \
	%ebp-> +------------+   |
	       | callee-    |   |
	       | saved regs |   |
	       |   local    |   \
	       | variables, |    >- current function's stack frame
	       |    etc.    |   /
	       |            |   |
	       |            |   |
	%esp-> +------------+   /


	i. someone calls

	    printf("%d %d", 3, 4)

	    [draw picture]

	ii. someone calls

	    printf("%d", 3, 4);

	    <compiler complains (gcc warns about too many arguments for
	    the format); at run time the extra argument is ignored>

	iii. what if someone does

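	    char* foo = "%x %x %x %x %x %x %x %x %x %x %x";

	    printf(foo);

	    ?
	    
	    <in general, compiler cannot know that something is wrong:
	    the format string is not a literal at the call site>

	    printf will just walk up the stack, printing whatever words
	    sit where the eleven arguments would have been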
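
	A minimal sketch of the callee's side of varargs, using the
	standard <stdarg.h> macros. The function sum_ints is hypothetical.
	The point is that the callee cannot tell how many arguments were
	actually pushed; it has to trust something the caller passed
	(here an explicit count, in printf the format string).

```c
#include <stdarg.h>

/* Hypothetical variadic function: sums the `count` ints that follow it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start just past the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument, trusting the caller about its type */
    va_end(ap);
    return total;
}
```

	sum_ints(2, 3, 4) returns 7. Calling sum_ints(5, 3, 4) would
	read three values that were never passed, which is the same
	mistake as the printf(foo) case above.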

    c. [correction: pid is the return value of fork(), not the
    caller's own process id]

	    pid = fork();

	    if (pid == 0) { //I am child
		// rearrange file descriptors
		// set up environment
		// perform arbitrary computations to set up
		//   child's resources
		exec("/bin/ls", arg0, arg1, ...);
	    } else {
		wait(0); // or waitpid(...)
	    }
		
	    in the parent:
		pid is set to pid of child

	    in the child:
		child has a pid 
		but the return value is 0

		how can child learn its own pid?  [getpid() syscall]

		how can child learn pid of its parent? [getppid()
		syscall]
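
	    A runnable version of the fork/exec/wait pattern above, as a
	    sketch. The helper name spawn_and_wait and the exit code 127
	    for a failed exec are choices made here, not something from
	    the lecture; execv() and waitpid() are the real POSIX calls
	    standing in for the generic exec() and wait() above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with `argv` in a child process and
 * return the child's exit status, or -1 on error. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0) {
        /* child: fork() returned 0; getpid() would give our real pid */
        execv(path, argv);
        _exit(127);             /* reached only if execv failed */
    }
    /* parent: pid holds the child's pid */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

	    For example, with argv = { "sh", "-c", "exit 7", NULL },
	    spawn_and_wait("/bin/sh", argv) returns 7 in the parent.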


2. What is a process?

    --registers

    --view of memory, which contains:
	--program code
	--constants 
	--stack
	--heap
   
       --this view is entirely encapsulated by %cr3, a pointer to a set of
       page tables. [as we'll see in a moment]

    --signal state

    --state of OS resources, such as file descriptors

	[draw fd table]

    --what else?

	--OS maintains lots of other information, per-process

	--UID, signal mask, controlling terminal, priority, whether
	being debugged, etc., etc.
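
	One way to picture the list above is as a per-process struct
	kept by the kernel. This is a sketch only: the field names,
	sizes, and the proc_init helper are invented for illustration
	and do not come from any particular kernel.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define MAX_FDS 16   /* arbitrary size for this sketch */

struct saved_regs {              /* registers, saved while the process is not running */
    uint32_t eip, esp, ebp;
    uint32_t eax, ebx, ecx, edx, esi, edi;
    uint32_t eflags;
};

struct proc {
    pid_t    pid;
    pid_t    ppid;
    uid_t    uid;
    struct saved_regs regs;
    uint32_t cr3;                /* pointer to page tables: the view of memory */
    uint32_t pending_signals;    /* signal state, one bit per signal */
    uint32_t signal_mask;
    int      fds[MAX_FDS];       /* fd table; -1 means closed */
    int      priority;
    int      being_debugged;
};

/* Zero everything, record the ids, and mark all descriptors closed. */
void proc_init(struct proc *p, pid_t pid, pid_t ppid)
{
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->ppid = ppid;
    for (int i = 0; i < MAX_FDS; i++)
        p->fds[i] = -1;
}
```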


    --how do processes actually get loaded into memory?

	--happens in two pieces:

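	    fork() creates a clone of the parent. **nearly identical**:
	    same memory contents, registers, and fd table, differing
	    mainly in pid and in the return value of fork()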

	    exec() loads in new code
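
	    A small experiment showing what "clone" means for memory:
	    the child gets its own copy, so a write in the child is not
	    seen by the parent. The function name is made up here, and
	    passing the child's value back through its exit status is
	    just a convenient trick for this sketch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's copy of x after the child has modified its own
 * copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    int x = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        x = 2;          /* changes only the child's copy */
        _exit(x);       /* report the child's value as its exit status */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 2)
        return -1;      /* the child should have seen x == 2 */
    return x;           /* still 1 in the parent */
}
```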

    --linking/loading

	--linking: take compiled .o files and turn them into a coherent
	program image

	--loading: putting the program into (virtual) memory so that its
	functions are where it expects them

    --draw picture:
	
	[hardware registers]
	[OS has pointer to saved state of all of the processes.....]
		    
		    ----


		    ----


		    ----


	QUESTION: wait, why only registers and such saved? where does
	the process's memory go? [we'll see.....]

	
3. Context switches

    --When do we switch between processes?

	CPU preemption
	    --happens when kernel gets control. that happens when:

	a. running process traps 
	    --performs a system call
	    --exception
		--divide by 0
		--page fault
		--...

	b. periodic timer interrupt
	    if a process uses up its quantum, another one gets scheduled

	    --explanation: timer that invokes the operating system,
	    giving it the opportunity to schedule something out

	    [analogy with handlers of celebrities. celebrity arranges to
	    have handler periodically say to well-wishers, "I'm sorry, we really
	    have to be going".]

	c. device interrupt
	    disk request completed, new data arrived from network, etc.
	
    --save state; reload other process's state
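
    "Save state; reload other process's state" can be simulated in a
    few lines of C. This is only a model: struct cpu stands in for the
    hardware registers, the field list is abbreviated, and all names
    are hypothetical. A real kernel does this step in assembly.

```c
#include <stdint.h>

struct context {               /* saved register state of one process */
    uint32_t eip, esp, ebp, eax;
    uint32_t cr3;              /* which page tables, i.e. which view of memory */
};

struct cpu {                   /* stand-in for the hardware registers */
    struct context regs;
    int current;               /* index of the running process */
};

/* Save the running process's registers, then load the next one's. */
void context_switch(struct cpu *cpu, struct context saved[], int next)
{
    saved[cpu->current] = cpu->regs;   /* save state */
    cpu->regs = saved[next];           /* reload other process's state */
    cpu->current = next;
}
```

    Note that only registers (including %cr3) are copied; the
    process's memory stays where it is.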

    --[draw picture from notes]

    --context switches can be expensive

    --More detail.......

	--Different contexts:
	    user-level:
		running an application
	    kernel process context (*top-half*):
		running kernel code on behalf of an application (e.g.,
		  syscall)
		handling faults
		kernel-only processes (in-kernel Web server)
	    kernel code not associated with a process
		timer interrupt (hardclock)
		device interrupt (network data, e.g.)
		"softirqs", "tasklets", ..., in Linux (**bottom half** in Unix)
	    context switch code
		changing address spaces 


	--Transitions:
	
	    user-->top-half: syscall, page fault

	    {user, top-half} --> device/timer interrupt: hardware

	    top half --> {user, context switch}:  return

	    top half --> context switch: sleep (because waiting)

	    context switch --> {user/top half}: inevitable
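
	The transition list above can be written down as a table. This
	is a sketch with made-up names. The notes do not say where
	control goes when an interrupt handler finishes, so the
	INTERRUPT row is deliberately left empty.

```c
#include <stdbool.h>

enum ctx { USER, TOP_HALF, INTERRUPT, CTX_SWITCH, NCTX };

/* allowed[from][to], transcribed from the transition list in the notes.
 * Rows and entries not listed default to false. */
static const bool allowed[NCTX][NCTX] = {
    [USER]       = { [TOP_HALF] = true, [INTERRUPT] = true },
    [TOP_HALF]   = { [USER] = true, [INTERRUPT] = true, [CTX_SWITCH] = true },
    [CTX_SWITCH] = { [USER] = true, [TOP_HALF] = true },
};

bool can_transition(enum ctx from, enum ctx to)
{
    return allowed[from][to];
}
```

	For instance, can_transition(USER, CTX_SWITCH) is false: user
	code reaches the context switch code only by way of the kernel.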

4. x86 memory intro

    top-most idea:
	--let programs use addresses like 0, 0xc000, whatever.
	--OS arranges for hardware to translate these addresses 
	    --what piece of hardware does this? (A: MMU)
	--why doesn't the OS just translate the stuff itself? [too slow:
	every memory reference would have to go through software]
	  
	idea is to fool programs

	but OS also fools itself! (JOS thinks it is running near the top
	of the address space, but physically it is loaded low in memory)

	--draw picture:

	[CPU ---> translation box --> physical addresses]

	that translation box gives us a bunch of things
	    --protection: processes can't touch each other's memory
		--idea: if you cannot name it, you cannot use it. deep idea.
	    --relocation:
		--two instances of program foo are loaded; each thinks
		it is using memory addresses like 0, 0x1234, whatever,
		but of course they're not using the same physical
		memory
	    --sharing:
		--processes share memory under controlled circumstances,
		but that physical memory may show up at very different
		virtual addresses
		--that is, two processes have a different way to refer
		to the same physical memory cells
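
	A toy model of the translation box, to make the three properties
	concrete. Everything here is an assumption made for illustration:
	the notes do not say how the box works inside, and this sketch
	uses a tiny per-process table from virtual page number to
	physical frame number, with invented names and sizes.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NPAGES    8            /* tiny address space, for illustration only */
#define UNMAPPED  UINT32_MAX

/* Hypothetical per-process table: virtual page number -> physical frame. */
struct addr_space {
    uint32_t frame[NPAGES];
};

/* Returns false if the process has no mapping for vaddr. */
bool translate(const struct addr_space *as, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr / PAGE_SIZE;
    if (vpn >= NPAGES || as->frame[vpn] == UNMAPPED)
        return false;          /* protection: cannot name it, cannot use it */
    *paddr = as->frame[vpn] * PAGE_SIZE + vaddr % PAGE_SIZE;
    return true;
}
```

	Relocation: two address spaces map page 1 to different frames,
	so address 0x1234 lands in different physical memory for each.
	Sharing: one maps page 2 and the other page 5 to the same frame.
	Protection: any page left UNMAPPED simply cannot be reached.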

    x86:

	logical addresses ---> linear addresses ---> physical addresses

	    --logical addresses are also known as virtual addresses
	    --physical addresses are what is on the CPU's address pins
		--do they address RAM?
		--no, they refer to the physical memory map (i.e.,
		hardware may do more translation)

	the first translation happens via *segment translation*

	the second translation happens via *page translation*

	segmentation is old school and these days mostly an annoyance
	(but it cannot be turned off!)
	
	    --however, it comes in handy every now and then for
	    things like sandboxing (advanced topic)

	segmentation means:

	    memory addresses treated like offsets into a contiguous
	    region.

	QUESTION: if segmentation can't be turned off, how do we pretend
	it's not there?

	    setting its mapping to be the identity function
		
	    base of 0 and maximum limit (all 4 GB), so that linear
	    address == logical address

5. segmentation	

	linear address = base + virtual_address
	    (virtual_address is the offset here)

	segment register holds segment selector

	    different segment registers for the stack (ss), data (ds),
	    code (cs), string [extra] operations (es), other fun stuff
	    (fs, gs)

	selector indexes into a table of descriptors
	    chooses *which* table
	    chooses the entry in that table

	descriptor determines base, limit, protection, type
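
	How a 16-bit selector is split up, as a sketch. The bit layout
	(index in bits 15..3, table indicator in bit 2, requested
	privilege level in bits 1..0) is the standard x86 one; the struct
	and function names are made up here.

```c
#include <stdint.h>

struct selector_fields {
    unsigned index;   /* bits 15..3: entry number within the table */
    unsigned ti;      /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    unsigned rpl;     /* bits 1..0: requested privilege level */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 1;
    f.rpl   = sel & 3;
    return f;
}
```

	For example, selector 0x08 means entry 1 of the GDT at privilege
	level 0, and 0x1b means entry 3 of the GDT at privilege level 3.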
    
	choice of seg register usually implicit in instruction

	    * ESP uses SS, EIP uses CS, others (mostly) use DS
	    * some instructions can take far addresses:
	          o ljmp $selector, $offset 

	draw example:
	   
	    say that %ds refers to this descriptor entry:
		base 0x30000000
		limit 0x0f0

	    now, when program does:
		
		mov 0x50, %eax

		what happens?
		
		[0x50 is within the limit, so it gets translated into
		0x30000000 + 0x50 = 0x30000050]

	    what about if program does:

		mov 0x100, %eax ?

		[error: 0x100 is beyond the limit 0x0f0, so the hardware
		raises a general protection fault.]
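
	    The two cases above, as a sketch in C. Assumptions made
	    here: the limit is treated as the largest valid offset, the
	    whole access (all `size` bytes) must fit under it, and
	    details such as granularity and expand-down segments are
	    ignored. The names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct segment {
    uint32_t base;
    uint32_t limit;   /* largest valid offset within the segment */
};

/* Translate an access of `size` bytes at `offset`.
 * Returns false where the hardware would fault. */
bool seg_translate(const struct segment *seg, uint32_t offset,
                   uint32_t size, uint32_t *linear)
{
    uint64_t last = (uint64_t)offset + size - 1;   /* last byte touched */
    if (size == 0 || last > seg->limit)
        return false;
    *linear = seg->base + offset;                  /* linear = base + offset */
    return true;
}
```

	    With base 0x30000000 and limit 0x0f0, a 4-byte access at
	    0x50 yields 0x30000050 and one at 0x100 fails. With base 0
	    and limit 0xffffffff, every offset maps to itself, which is
	    how segmentation is made to look like it is not there.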
		    

	QUESTION:

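	    where did the descriptor table come from? and how does the
	    hardware know about it? (the OS builds it in memory and
	    tells the hardware where it is with lgdt for the GDT and
	    lldt for the LDT; the instruction family is [LS][LG]DT)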

	    --Current privilege level (CPL) is in the low 2 bits of CS

	    --CPL=0 is privileged O/S, CPL=3 is user
	    
	    --can app modify the descriptors in the LDT? it's in memory...
		yes it can. useful for certain things, like one
		user-level program sandboxing another.

	    --what about system calls? how do they transfer to kernel?

	    --app cannot just lower the CPL