Class 3
CS 202
12 Sep 2023

1. Last Class
2. Stack frames continued
3. Syscall
4. Process/OS control transfer
5. Shell

1. Last class
   - Processes: what? why?
   - Process view of memory and registers
   - Some basic assembly
   - Stack frames

2. Stack frames continued
        - Finally, note that the epilogue for f (starting on line 49)
        does the reverse of the prologue, thus restoring the stack to
        how it was before.


                            |
                            |
                            |
	   +------------+   |
	   | ret %rip   |   /
	   +============+
    %rbp-> | saved %rbp |   \
	   +------------+   |
	   |            |   |
	   |   local    |   \
	   | variables, |    >- current function's stack frame
	   |  call-     |   /
	   | preserved  |  / 
	   |   regs,    | |
	   | etc.       |/
    %rsp-> +------------+   

    /* here's a different example, not the one we worked in class */

    main():

        # set up frame pointer (aka "base pointer")
        pushq %rbp
        movq  %rsp, %rbp

        # push any call-clobbered register that we will need onto the
        # stack, for example:
        pushq %rcx
        pushq %r8
        pushq %r9

        call f
        
        # restore call-clobbered registers that we saved. In the above
        # example, it would be:
        popq %r9
        popq %r8
        popq %rcx

        # epilogue: restore call-preserved registers
	movq %rbp, %rsp
	popq %rbp 
	ret


    f():

        # set up frame pointer (aka "base pointer")
        pushq %rbp
        movq  %rsp, %rbp

        ...


    --what happens to a function's state, that is, the registers,
    when a function is called? they might need to be saved, or not.

    --purely a matter of convention in the compiler, **not** hardware
    architecture

    Unix calling conventions:

        --on x86-64, *arguments* are passed through registers: %rdi, %rsi, %rdx,
        %rcx, %r8, %r9
        (more than six? then spill to stack). And the *return value* is
        passed from callee to caller in %rax.


    // the points here are:

    -  Calling a function requires agreement between caller and callee
    about how arguments are passed, and which of them is responsible for
    saving and restoring registers.

    - In an executing program, the stack is partitioned into a set of
    stack frames, one for each function. The stack frame for the current
    function starts at the base pointer and extends down to the stack
    pointer. 
    
        ** Stack frames are how functional scope in languages like C are
        actually implemented -- allowing each function invocation to
        refer to different variables with the same name.  In other
        words, the programmer thinks they are writing a function with
        local variables; compiler has arranged to implement that with
        stack frames.

    - de-mystifying pointers: a pointer (like "int* foo") is an
    address. that's it. repeat: a pointer is an address. that
    address can be:
        - on the stack
        - on the heap
        - in the text section of the program

    - because of how stack frames work, it's unequivocally a bug
    to pass a pointer from a prior stack frame.

3. System calls

    - System calls are the process's main interface to the operating
    system 
        
        The set of system calls is the API exposed by the kernel to user
        programs

        In other words, **syscalls are the mechanism by which user-level
        programs ask the operating system to do things for them.**

    - To the C programmer, a system call looks exactly like a function
    call: you just issue the function, get a return value, and keep
    going.

        - here are some example system calls:

	        int fd = open(const char* path, int flags)
	        write(fd, const void *, size_t)
	        read(fd, void *, size_t)

	    (Aside: fd is a *file descriptor*. This is an abstraction,
	    provided by the operating system, that represents an open
	    file. We'll come back to this later in the course.)

        - lots more:

            - you will work with a few system calls in lab2: stat(), readdir().

            - you will also work with some system calls in the concurrency lab

            - on Unix, type "man 2 <syscall>" to get documentation.

4. Process/OS control transfers

    - To the C compiler (or the assembly programmer) and the machine as
    a whole, a system call has some key differences versus function
    calls (even though both are transfers of control):
    
	(i) there is a small difference in calling conventions -- a
	process knows that when it invokes "syscall", ALL registers
	(except RAX) are call-preserved. That means that the callee (in
	this case the kernel) is required to save and restore all
	registers (except RAX, which is the exception because that is
	where return values go).
     
        (ii) Rather than using the "call" instruction, the process uses
        a different instruction (helpfully called the `syscall`
        instruction). This causes privilege levels to switch.

        The picture looks like this:

              user-level application
		     |   (open)
		     v            user-level
	    ---------------------------                ^
		    |             kernel-level         |
		    |____>  [table]     open()         |
				           .....       |
				        iret  ---------

    - Vocabulary: when a user-level program invokes the kernel via a
    syscall, it is called *trapping* to the kernel

   
    Key distinction: privileged versus unprivileged mode

        --the difference between these modes is something that the
        *hardware* understands and enforces

        --the OS runs in privileged mode

	    --can mess with the hardware configuration 

        --users' tasks run in unprivileged mode

	    --cannot mess with the hardware configuration

        --the hardware knows the difference between privileged and
        unprivileged mode (on the x86, these are called ring 0 and ring
        3.  The middle rings aren't used in the classical setup, but
        they are used in some approaches to virtualization.)


    - Overall, there are three ways that the OS (also known as the
    kernel) is invoked:

        A. system calls, covered above.

        B. interrupts.
    
            An _interrupt_ is a hardware event; it allows a device,
            whether peripheral (like a disk) or built-in (like a timer)
            to notify the kernel that it needs attention. 

                (As we will see later, timers are essential for
                ensuring that processes don't hog the CPU.)
       
            Interrupts are **implicit**: in most cases, the application
            that was running at the time of the interrupt _has no idea
            that an interrupt even triggered_, despite the fact that
            handling the interrupt requires these high-level steps:
                - process stops running
                - CPU invokes interrupt handler
                - interrupt handler is part of kernel code, so kernel
                starts running
                - kernel handles interrupt
                - kernel returns control

            In other words, from the process's viewpoint, it executed
            continuously, but an omniscient observer would know
            perfectly well that the process was in fact _interrupted_
            (hence the term).

            In order to preserve this illusion, the processor (CPU) and
            kernel have to be designed very carefully to save _all_
            process state on an interrupt, and restore all of it.
            
            We will discuss the underlying mechanisms for these control
            transfers later in the course.

        C. exceptions

            An _exception_ means that the CPU cannot execute the
            instruction issued by the processor.

            Classically (and for this part of the course), you can think
            of this as "the process did something erroneous" (a software
            bug): dividing by 0, accessing a bogus memory address, or
            attempting to execute an unknown instruction. But there are
            non-erroneous causes of exceptions (an example is demand
            paging, as we will see in the virtual memory unit).

            When an exception happens, the processor (the CPU) knows
            about it immediately. The CPU then invokes an 
            _exception handler_ (code implemented by the kernel).
            The kernel can handle exceptions in a variety of ways:

            - kill the process (this is the default, and what is
            happening when you see a segfault in one of your programs).
            
            - signal to your process (this is how runtimes like Java
            generate null-pointer exceptions; processes _register_ to
            catch signals).
            
            - silently handle the exception (this is how the kernel
            handles certain memory exceptions, as in the demand paging
            case).
            
            The mechanisms here relate to those for interrupts.


5. The shell, part I

    --a program that creates processes

    --the human's interface to the computer

    --GUIs (graphical user interfaces) are another kind of shell.

    A. How does the shell start programs?

	--example: 
	    $ ls

Aside:

     - Fork bomb at the bash command prompt:

            $ :(){ : | : & }; :

---------------------------------------------------------------------------

--here are some other system calls (these are included in the
notes so that you know what the basic interface to a Unix-like
OS looks like):

    --int open(char*, int flags, [, int mode]);
    --int read(int fd, void*, int nbytes):
    --int write(int fd, void* buf, int nbytes);
    --off_t lseek(int fd, off_t pos, int whence)
    --int close(int fd);
    --int kill(int pid, int signal)
    --void exit (int status)
    --int fork(void)
    --int waitpid(int pid, int* stat, int opt)
    --int execve(char* prog, char** argv, char** envp)
    --int dup2 (int oldfd, int newfd)
    --int pipe(int fds[2])


[Acknowledgments: Mike Walfish]