Review session 5

Outline:
1. Lab 4 overview [50 min]
    a. Introduction [5 min]
    b. Questions - Lab 4 [5 min]
    c. What are we trying to solve? [7 min]
    d. Surveying the code [33 min]
    e. Review what we've found [5 min]
2. Questions [5 min]

1. Lab 4 overview

a) Introduction [5 min]
    - This review session is dedicated to going over Lab 4
    - To do the labs effectively, we need to build a mental model of what is going on
        - Mapping out the code helps us do this
        - (Reading all of the provided documentation also helps - for the labs, that means reading the entire html file, not just the purple boxes)
    - Today we'll go over what Lab 4 does and walk through the code, section by section
    - The goal is to learn how to map out code: figure out which tasks we need to accomplish and which parts of the code are relevant
    - We need to know the following:
        - At a high level: which portions of the code are associated with each other, and which are associated with what we want to do
        - How functions are logically grouped together
        - What parts of the code modify what
        - Where information is stored

b) Questions - Lab 4
    - Some of them we'll hit while we go through the code; others will be saved until the end

c) What are we trying to solve? [7 min]
    - What is the program? -> a basic OS that runs tasks
    - Two main sections of Lab 4:
        - Scheduling
            - The OS already has one scheduling algorithm
            - The OS probably has most of the mechanics for scheduling (selecting a process, yielding, interrupts) - we need to find them
            - We need to find where scheduling policies are defined
        - Synchronization
            - From the spec: user-level operations can be preempted at any time, while the kernel is never preempted.
            - To get synchronization working properly, we'll need to use either 1) user-space concurrency primitives, 2) atomic operations, or 3) code running in kernel space (since it can't be preempted)
    - Portions of code we're concerned with:
        - Where is the scheduler? How is the policy decided?
          When are new processes chosen?
        - What data structures define a process? How do we get information about processes?
        - Synchronization - what concurrency primitives are available? How do we implement system calls to get code to run in kernel space?

d) Surveying the code [33 min]

[3 min]
    - Multiple approaches to this: read all the header files, start with main and trace control flow...
    - For most programs: the program starts at main()
    - From the lab instructions: the kernel is initialized in kernel.c:start()
        - If the instructions weren't provided, how would we find the start? (docs, or start at a random file and trace the control flow backwards)
    - This program is small enough that we can feasibly read all of the code (this is not always true) - but we're still only interested in certain sections, so we don't want to have to understand every line of every file.

[10 min] -> 13
    - kernel.c
        - proc_array: the container for process data structures
            - NPROCS: 5 ("0" is unused, and 1-4 are for the user processes)
            - process_t: pid, registers, state (empty, runnable, blocked, zombie), exit status
        - *current: a pointer to the currently running process - we don't have to worry about setting this
        - scheduling_algorithm: this determines which scheduling algorithm is run
        - start()
            - First step is process setup... we probably don't have to worry about the init() functions. If we change the process data structure down the line, our init code probably goes here
            - Line 106: sets the scheduling algorithm. Using grep, it looks like this is the only place it is set
            - Invokes the first process: it doesn't look like run() is supposed to return.
            - run() contains assembly to: 1) reload the relevant registers, 2) execute the iret instruction (interrupt return)
                - This is a kernel -> user level switch that the hardware handles when this specific instruction is executed
        - interrupt()
            - Handler code for each type of interrupt.
              Using tags: interrupts are defined in schedos.h
                - Yield: calls schedule()
                - Exit: sets process data, calls schedule()
                - User-specified calls: do nothing, call run()
                - Clock: calls schedule()
            - How we would implement an interrupt: set process data, then call run() or schedule(), depending on whether the process should continue to run or give up control
        - schedule()
            - Decides on a new pid, then calls run()
            - This function is only concerned with the scheduler's policy
            - The variable scheduling_algorithm determines which policy is invoked

[5 min] -> 18
    - kernel.h
        - procstate, process: we already knew what these do, but now we know they're mostly handled by the kernel in kernel.c
        - We have good reason to believe that user processes don't touch these (based on how OSes usually work), but we can also check:
            - via grep: looks like process_t is used only in lower-level code and the kernel.
            - via grep: kernel.h is only included in x86.c, k-loader.c, kernel.c.
            - grep flags: -r (recursive), -i (ignore case), -w (match whole words), -v (invert match)
        - Other items: definitions that don't look relevant to our task
        - Function headers: interrupt/schedule (we know about these), plus functions having to do with registers, controllers, and the console
            - Looking through these: this is the interface to clear the console and read in text

[2 min] -> 20
    - x86.h
        - Register struct
        - A whole bunch of assembly wrappers and flags - doubtful we'll need these, but this is how to find them

[3 min] -> 23
    - process.h
        - System calls - yield and exit are defined here - it looks like they raise an interrupt via assembly while setting a flag. (Do the INT_* items look familiar?)
        - No modifying of "process" data structures here: this part of the code only invokes interrupts
        - If we want to add system calls, we'll want to add them here. Without having to figure out additional assembly instructions, it looks like we get one status code and one argument (copying the interface of sys_exit)

[4 min] -> 27
    - p-schedos-app-1.c
        --> Try to open the app code, looking up the name with !ls.
            To make it easier to copy, enter :new and :!ls, then open! p-schedos-app-1.c. :q closes the window
        - The character to print is defined here. The other apps define their own character and include this file. (When the C preprocessor comes across #include, it literally pastes in the file's contents before passing the result to the compiler.)
        - Apps "write" by manipulating cursorpos directly: the value at the pointer is set to a character code, and the screen is "advanced" by incrementing the pointer. (We saw function definitions for clearing the screen earlier.) This is an unusual interface. It looks like a printf() exists (grep -> lib.h), but the apps don't use it.
        - The apps yield after every character, and yield forever when they are done.

[6 min] -> 33
    - Other files: elf.h, lib.h, x86sync.h
        - elf.h: low-level code
        - lib.h: small version of the C standard library
            - (Looks like there's a printf and putc after all, even though the apps don't use them. Where are they used? Grep brings up many items: try piping the output to grep -v asm and grep -v tags to filter out the ones we don't care about)
            - It looks like they're used in some way in the app assembly code, but we don't really care: the pattern of writing through cursorpos directly works.
        - x86sync.h:
            - atomic_swap, compare_and_swap, fetch_and_add. These could be useful for part 3.2 for implementing locking: atomic_swap is one way to implement a spinlock. (Thinking of mutexes: grep -r mutex . returns nothing - we'll have to implement our own)
            - "inline" header functions work by suggesting to the compiler that the function body should be spliced into the calling code. (This is not a guarantee.)

e) Review what we've found [5 min]
    - Consider the questions from part c:
        - Where is the scheduler? How is the policy decided? When are new processes chosen?
            --> kernel.c; an integer value set in start(); whenever schedule() is invoked
        - What data structures define a process? How do we get information about processes?
            --> process_t in kernel.h; lookups by pid via the global proc_array in kernel.c; info about the currently running process via *current
        - Synchronization - what concurrency primitives are available? How do we implement system calls to get code to run in kernel space?
            --> atomic calls in x86sync.h - no higher-level concurrency primitives, such as mutexes, are available
            --> system calls: interface in process.h, implementation in kernel.c. Interrupt numbers are defined in schedos.h. To create a system call: define it in process.h, and add the correct behavior in kernel.c:interrupt()

2) Questions on Lab 4 or Homework 5
    - Anything not answered or covered in enough detail earlier
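
Sketch for the x86sync.h discussion in part d: the notes say an atomic swap is one way to implement a spinlock. The block below is a minimal illustration of that idea, not the lab's solution. It uses C11 atomic_exchange as a stand-in for the lab's atomic_swap, whose exact signature is not reproduced in these notes, and every name in it (demo_spinlock_t, demo_swap, demo_spin_*) is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock type: 0 means unlocked, 1 means locked. */
typedef struct {
    atomic_uint held;
} demo_spinlock_t;

/* Stand-in for an atomic swap primitive (assumption: the lab's
 * atomic_swap has similar "store new value, return old" semantics). */
static inline uint32_t demo_swap(atomic_uint *addr, uint32_t val) {
    return (uint32_t) atomic_exchange(addr, val);
}

static inline void demo_spin_init(demo_spinlock_t *l) {
    atomic_init(&l->held, 0);
}

/* Single attempt: returns 1 if we took the lock, 0 if it was already held.
 * We took it exactly when the old value was 0. */
static inline int demo_spin_trylock(demo_spinlock_t *l) {
    return demo_swap(&l->held, 1) == 0;
}

/* Keep swapping in 1 until the old value comes back as 0. */
static inline void demo_spin_lock(demo_spinlock_t *l) {
    while (demo_swap(&l->held, 1) != 0) {
        /* busy-wait; a waiter could give up the CPU here instead */
    }
}

/* Release by swapping 0 back in. */
static inline void demo_spin_unlock(demo_spinlock_t *l) {
    demo_swap(&l->held, 0);
}
```

The critical section would sit between demo_spin_lock() and demo_spin_unlock(), for example around the write through cursorpos. Since the apps already yield after every character, yielding inside the wait loop rather than spinning is a natural variation to consider.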