Class 2
CS 202
03 February 2015

On the board
------------

1. Last time
2. Privileged vs. unprivileged mode
3. Process view of memory; stacks
4. Unix system calls, file descriptors

---------------------------------------------------------------------------

1. Last time

    --introduced course
    
    --discussed OSes and processes at a high level

    --whole point to processes:

        --share the resources of the machine (while one task waits for
        input, another task can do useful work) ...

        --... while giving each task the illusion that it has access to
        a full machine

2. Privileged vs. unprivileged mode

    --the difference between these modes is something that the
    *hardware* understands and enforces

    --the OS runs in privileged mode

        --can mess with the hardware

        --can manipulate OS abstractions (obviously, but worth
        repeating)

    --users' tasks run in unprivileged mode

        --cannot mess with the hardware

        --see a picture of a virtual machine

    --the hardware knows the difference between privileged and
    unprivileged mode (on the x86, these are called ring 0 and ring 3.
    The middle rings aren't used in the classical setup, but they are
    used in some approaches to virtualization.)

    --how do we get into kernel mode?

        3 ways:
        1. exception: user program does something bad, like divide by 0,
           or attempt to access memory it's not supposed to.
        2. trap: user program asks the kernel for help. this happens
           with the assembly instruction "int" on the x86.
        3. interrupt: (i) timer or (ii) device event (I/O completion, key
           press, etc.)

    --how do we get back out of kernel mode, and where do we go?

        "iret" returns to place where user program was.

    --note that when the kernel is invoked, it may not "iret" to the
    same process that was running at the time that the kernel was
    invoked!

        --there could be a context switch

        --we will see roughly how context switches work in the context
        of threads. for processes, see the function x86.c:run() in lab1.

    --[draw context switch picture]

    --[see Tables 6.2 and 6.3 in OSTEP. These may be relabeled
    shortly; if so, look for the figures in Chapter 6.]

3. Process's view of memory; stacks


    think of memory as a contiguous array

    [text/code | data | heap -->    <--- stack | kernel memory]
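
    --a minimal sketch for seeing the picture above on a real machine:
    print one address from each region. the names print_layout and
    global_counter are made up for illustration. exact addresses (and,
    on some systems, even the relative order) vary with the OS and with
    address-space randomization, so no particular output is promised.

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 7;              /* initialized global: data region */

/* Prints one address per region. Returns 0 on success, -1 if malloc fails. */
int print_layout(void)
{
    int local = 0;                       /* local variable: stack */
    int *heap = malloc(sizeof *heap);    /* malloc'd memory: heap */
    if (heap == NULL)
        return -1;

    /* Casting a function pointer to void * is a common POSIX-style
     * idiom for printing; strict ISO C does not guarantee it. */
    printf("text/code: %p\n", (void *)print_layout);
    printf("data:      %p\n", (void *)&global_counter);
    printf("heap:      %p\n", (void *)heap);
    printf("stack:     %p\n", (void *)&local);

    free(heap);
    return 0;
}
```

    --calling print_layout() from main typically shows the stack
    address far above the heap address, matching the picture.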


    pushl %eax   [  subl $4, %esp 
		    movl %eax, (%esp) ]

    
    popl %eax    [ movl (%esp), %eax
		   addl $4, %esp     ]

    
    call 0x12345  [ pseudo:
		      pushl %eip
		      movl $0x12345, %eip]

    ret	       [ pseudo:
			popl %eip ]


   gcc calling conventions

    --above we see how call and ret interact with the stack
	--call: updates %eip and pushes old %eip on the stack
	--ret: updates %eip by loading it with stored stack value

    --but what happens to a function's state, that is, the registers,
    when a function is called? they might need to be saved, or not. 
	
    --purely a matter of convention in the compiler, **not** the
    hardware architecture

    --here's what gcc does:

	--[
		draw blocks of code:
		    
		    main
		    f
		    g

		draw registers:

		    eip
		    ebp
		    eax
		    ecx
	    
		draw stack ]


	--at entry of a function:

	    looks like this: 
	
			arg 3
			arg 2
	                arg 1
		-->esp  [ret_addr]

	    [fill in picture above]

	    %eip points at first instruction of function
	    %esp+4 points at first argument
	    %esp points at return address

	--after ret instruction:

	    %eip contains return address
	    %esp points at arguments pushed by caller
	    %eax contains return value (or trash if function is void)

	    %ecx, %edx may be trashed
	    %ebp, %ebx, %esi, %edi need to look the way that they did at
	      the time of the call
		
	--in other words:

	     %eax, %ecx, %edx are "caller save": caller's job to push
		them on the stack if it wants to save them

	     %ebp, %ebx, %esi, %edi are "callee save": callee's job to
	        push them on the stack on entry (if it will modify
		them), and pop them (meaning restore their values by
		removing them from the stack) just before doing return


	FRAME POINTER

	--here's the picture of the stack when one function calls
	another:

	       +------------+   |
	       | arg 2      |   \
	       +------------+    >- previous function's stack frame
	       | arg 1      |   /
	       +------------+   |
	       | ret %eip   |   /
	       +============+   
        %ebp-> | saved %ebp |   \
	       +------------+   |
	       |            |   |
	       |   local    |   \
	       | variables, |    >- current function's stack frame
	       |  callee-   |   /
	       | saved vars,|
	       | etc.       |
	%esp-> +------------+   /


	--%esp moves to make stack bigger/smaller

	--%ebp points at saved %ebp from previous function
	    --saved %ebps form a chain; can walk the stack
	    --arguments and locals at fixed offsets from ebp

	--function prologue:
	    
		pushl %ebp
		movl %esp, %ebp
	       
	--function epilogue
		
		movl %ebp, %esp
		popl %ebp 
		ret
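
	--to connect the prologue/epilogue to "arguments at fixed offsets
	from ebp", here is a small C function (add3, a made-up name) with
	a hand-written sketch of 32-bit code in the comment. the sketch
	assumes the stack-based convention described above with the frame
	pointer kept; it is not actual gcc output, and an optimizing
	compiler may emit something quite different.

```c
/* Hand-written sketch for 32-bit x86 (AT&T syntax), frame pointer kept:
 *
 *   add3:
 *       pushl %ebp              # prologue: save caller's frame pointer
 *       movl  %esp, %ebp        # %ebp now points at the saved %ebp
 *       movl  8(%ebp), %eax     # arg 1 (4(%ebp) holds the return address)
 *       addl  12(%ebp), %eax    # arg 2
 *       addl  16(%ebp), %eax    # arg 3; return value is left in %eax
 *       movl  %ebp, %esp        # epilogue: discard locals (none here)
 *       popl  %ebp              # restore caller's frame pointer
 *       ret                     # pop return address into %eip
 */
int add3(int a, int b, int c)
{
    return a + b + c;
}
```

	--only %eax is written, and it is caller-save, so this function
	has no callee-save registers to push besides %ebp.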


        --NOTE:

            every function returns once 
                except in Scheme et al., where there is call/cc
                also, setjmp/longjmp in C do something similar

            once it does so, all of the memory on the stack that it used
            should be viewed as "gone". 
                --> therefore if you pass a pointer to function-local
                memory out of the function, the program will have a bug
                (potentially one that will be difficult to track down)
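
            a minimal sketch of this bug and two common ways around it.
            the function names are made up for illustration. the buggy
            version is left in a comment so that the code below is safe
            to compile and run.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* BUG (do not do this): buf lives in the function's stack frame, which
 * should be viewed as "gone" once the function returns.
 *
 *   char *make_greeting_bad(void) {
 *       char buf[16];
 *       strcpy(buf, "hello");
 *       return buf;            // pointer into a dead stack frame
 *   }
 */

/* Fix 1: heap memory outlives the call. The caller must free() it. */
char *make_greeting(void)
{
    char *buf = malloc(16);
    if (buf != NULL)
        strcpy(buf, "hello");
    return buf;
}

/* Fix 2: the caller supplies the storage, so its lifetime is the caller's. */
void fill_greeting(char *out, size_t len)
{
    snprintf(out, len, "hello");
}
```

            usage: char *g = make_greeting(); ... free(g); or declare
            char buf[16] in the caller and pass it to fill_greeting.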


	aside: if you're wondering why the convention is that ebx
	is callee saved while eax,ecx,edx are caller saved, here's
	an answer.

	    eax: "accumulator" register
	    ebx: "base" register
	    ecx: "count" register
	    edx: "data" register

	    idea was that ebx would point to the base of a data
	    structure (just as ebp means "base pointer" and
	    points to the base of a frame).

	    ebx often points to a data segment (e.g., for a dynamic
	    library's data), by setting it up at the beginning of a function
	    and keeping it constant throughout
	    
	    Because it's a "stable" pointer, makes sense to have the
	    callee save it. ("if you're gonna touch this, put it back the
	    way you found it")

	    eax/ecx/edx are more ephemeral: used for particular calculations
	    and such. Makes sense to require the caller to save them if it
	    needs the values in there. ("if you really want to keep
	    these values, save them".)

---------------------------------------------------------------------------	

admin

    lab due on Friday
    
    review sessions have started meeting

    makeup classes; will be videotaped

    questions?

---------------------------------------------------------------------------	

4. system calls, file descriptors

    --going to look at the typical Unix style

    --examples:

        [write on board]

        int fd = open(const char* path, int flags, mode_t mode)
        int rc = write(int fd, const void *, size_t)
        int rc = read(int fd, void *, size_t)

    --fd is a *file descriptor*. this is an abstraction, provided by
    the operating system, that represents an open file
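
    --a minimal sketch of the three calls working together on one file
    descriptor: write a message to a file, rewind, and read it back.
    write_then_read is a made-up helper; the path is whatever the caller
    passes. it also uses lseek and close, which appear in the longer
    list of system calls in these notes.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Writes msg to the file at path, then reads it back into out.
 * Returns the number of bytes read back, or -1 on any error. */
ssize_t write_then_read(const char *path, const char *msg,
                        char *out, size_t outlen)
{
    /* open for reading and writing; create if missing; empty if present */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* the file offset is now past what we wrote; move it back to 0 */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, out, outlen);   /* out is NOT NUL-terminated */
    close(fd);
    return n;
}
```

    --note that the process never names the file again after open();
    every later call refers to it only through fd.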

    --every process can usually expect to begin life with three file
    descriptors already open:
        0: represents the input to the process (e.g., tied to
            terminal)
        1: represents the output
        2: represents the error output

        these are sometimes known as stdin, stdout, stderr
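
    --a minimal sketch of using the pre-opened descriptors directly,
    with no open() call. put is a made-up helper name; STDOUT_FILENO
    and STDERR_FILENO are the names <unistd.h> gives to 1 and 2.

```c
#include <string.h>
#include <unistd.h>

/* Writes the string msg to descriptor fd.
 * Returns the number of bytes written, or -1 on error. */
ssize_t put(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}
```

    --usage: put(STDOUT_FILENO, "result\n") goes to the output, and
    put(STDERR_FILENO, "warning\n") goes to the error output; a shell
    can redirect the two independently.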

    --NOTE: Unix hides from processes the difference between a device
    and a file. this is a very powerful abstraction, as we will see
    soon

    --on Unix/Linux/OS X, type "man 2 <syscall>" to get documentation

    --here are some other system calls (these are included in the
    notes so that you know what the basic interface to a Unix-like
    OS looks like):

        --int open(char* path, int flags [, int mode]);
        --int read(int fd, void* buf, int nbytes);
        --int write(int fd, void* buf, int nbytes);
        --off_t lseek(int fd, off_t pos, int whence)
        --int close(int fd);
        --int kill(int pid, int signal)
        --void exit (int status)
        --int fork(void)
        --int waitpid(int pid, int* stat, int opt)
        --int execve(char* prog, char** argv, char** envp)
        --int dup2 (int oldfd, int newfd)
        --int pipe(int fds[2])
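
    --a minimal sketch combining several of the calls above (pipe,
    fork, write, read, close, waitpid): a child process sends a message
    to its parent through a pipe. child_to_parent is a made-up helper
    name, and error handling is kept deliberately short.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that writes msg into a pipe; the parent reads it into
 * out and NUL-terminates it. Returns bytes read, or -1 on error. */
ssize_t child_to_parent(const char *msg, char *out, size_t outlen)
{
    int fds[2];                     /* fds[0]: read end, fds[1]: write end */
    if (outlen == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);       /* _exit: skip the parent's stdio buffers */
    }

    /* parent: reader */
    close(fds[1]);                  /* otherwise read() never sees end-of-file */
    size_t total = 0;
    ssize_t n;
    while (total < outlen - 1 &&
           (n = read(fds[0], out + total, outlen - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)   /* reap the child */
        return -1;
    return (ssize_t)total;
}
```

    --the pipe's two ends are ordinary file descriptors, so the same
    read() and write() used for files work here unchanged.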