Class 4
CS 372H
28 January 2010

On the board
------------

1. Last time

2. What is a process?

3. Context switches

4. x86 memory intro

5. segmentation

----------------------------------------------------------------------------

1. last time (answering questions)

    saw gcc calling conventions, virtual machines, privileged v
    unprivileged mode, and how user-level programs and the kernel
    interact

    a. bios questions

        --mapped to 0xf0000 to 0xfffff because it is crammed at the top
        of 1 MB of address space (8088 had 20 bit address space, but no
        one thought there would actually be a megabyte of RAM)

        --they *thought* they were locating the BIOS in an
        out-of-the-way place

        --history repeated itself. the ROMs on PCI cards get mapped to
        the top of the 32 bit address space because, after all, no one
        is going to have 4 GB of RAM in a computer

        --%eip starts at 0xffff0 because hardware people didn't want to
        make assumptions about where the BIOS started. easier just to
        require that it be at the top of the address space and then
        start with a jump

        --and why 0xffff0 instead of 0xffffa?
            --0xffff:0000 in segment:offset is easy to code in hardware
            --8088 had 20 bit address space, so 0xfffff was top of what
            could be addressed

    b. how does varargs work?

        (draw picture)

            --------------
            %ebp
            local variables
            callee saved registers
            ......

                   +-----------------+   |
                   | arg 2           |   \
                   +-----------------+    >- previous function's stack frame
                   | arg 1           |   /
                   +-----------------+   |
                   | ret %eip        |   /
                   +=================+
                   | saved %ebp      |   \
            %ebp-> +-----------------+   |
                   | caller saved    |   |
                   | local variables |   |
                   | local           |   \
                   | variables,      |    >- current function's stack frame
                   | etc.            |   /
                   |                 |   |
                   |                 |   |
            %esp-> +-----------------+   /

        i. someone calls printf("%d %d", 3, 4)
            [draw picture]

        ii. someone calls printf("%d", 3, 4)

        iii. what if someone does
                char* foo = "%x %x %x %x %x %x %x %x %x %x %x"
                printf(foo)
             ?
             it will just use the arguments on the stack

    c.
        [correction: pid is the return value of fork(), not the process id]

        pid = fork();
        if (pid == 0) {
            // I am child
            // rearrange file descriptors
            // set up environment
            // perform arbitrary computations to set up
            //   child's resources
            exec("/bin/ls", arg0, arg1, ...);
        } else {
            wait(0);  // or waitpid(...)
        }

        in the parent: pid is set to pid of child
        in the child: child has a pid but the return value is 0

        how can child learn its own pid? [getpid() syscall]
        how can child learn pid of its parent? [getppid() syscall]

2. What is a process?

    --registers

    --view of memory, which contains:
        --program code
        --constants
        --stack
        --heap
        --this view is entirely encapsulated by %cr3, a pointer to a set
        of page tables. [as we'll see in a moment]

    --signal state

    --state of OS resources, such as file descriptors
        [draw fd table]

    --what else?
        --OS maintains lots of other information, per-process
        --UID, signal mask, controlling terminal, priority, whether
        being debugged, etc., etc.

    --how do processes actually get loaded into memory?
        --happens in two pieces:
            fork() creates a clone of the parent. **completely identical**
            exec() loads in new code

        --linking/loading
            --linking: take compiled .o files and turn them into a
            coherent program image
            --loading: putting the program into (virtual) memory so that
            its functions are where it expects them

    --draw picture:

        [hardware registers]

        [OS has pointer to saved state of all of the processes.....]

            ----    ----    ----

    QUESTION: wait, why only registers and such saved? where does the
    process's memory go? [we'll see.....]

3. Context switches

    --When do we switch between processes? CPU preemption

        --happens when kernel gets control. that happens when:

            a. running process traps
                --performs a system call
                --exception
                    --divide by 0
                    --page fault
                    --...

            b. periodic timer interrupt
                if a process uses up its quantum, another one gets
                scheduled
                --explanation: timer that invokes the operating system,
                giving it the opportunity to schedule something out
                [analogy with handlers of celebrities.
                celebrity arranges to have handler periodically say to
                well-wishers, "I'm sorry, we really have to be going".]

            c. device interrupt
                disk request completed, new data arrived from network,
                etc.

    --save state; reload other process's state
        --[draw picture from notes]

    --context switches can be expensive

    --More detail.......

    --Different contexts:

        user-level: running an application

        kernel process context (*top-half*):
            running kernel code on behalf of an application (e.g., syscall)
            handling faults
            kernel-only processes (in-kernel Web server)

        kernel code not associated with a process
            timer interrupt (hardclock)
            device interrupt (network data, e.g.)
            "softirqs", "tasklets", ..., in Linux
            (**bottom half** in Unix)

        context switch code
            changing address spaces

    --Transitions:

        user --> top half: syscall, page fault
        {user, top half} --> device/timer interrupt: hardware
        top half --> {user, context switch}: return
        top half --> context switch: sleep (because waiting)
        context switch --> {user, top half}: inevitable

4. memory intro

    top-most idea:
        --let programs use addresses like 0, 0xc000, whatever.
        --OS arranges for hardware to translate these addresses
        --what piece of hardware does this? (A: MMU)
        --why doesn't the OS just translate the stuff itself? [slow]

    idea is to fool programs but OS also fools itself! (JOS thinks it is
    running at the top of physical memory, but it is not)

    --draw picture:
        [CPU ---> translation box --> physical addresses]

    that translation box gives us a bunch of things

        --protection: processes can't touch each other's memory
            --idea: if you cannot name it, you cannot use it. deep idea.
        --relocation:
            --two instances of program foo are each loaded, each thinks
            it is using memory addresses like 0, 0x1234, whatever, but
            of course they're not using the same memory

        --sharing:
            --processes share memory under controlled circumstances,
            but that physical memory may show up at very different
            virtual addresses
            --that is, two processes can have different ways to refer
            to the same physical memory cells

    x86:

        logical addresses ---> linear addresses ---> physical addresses

        --logical addresses are also known as virtual addresses
        --physical addresses are what is on the CPU's address pins
            --do they address RAM?
            --no, they refer to the physical memory map (i.e., hardware
            may do more translation)

        the first translation happens via *segment translation*
        the second translation happens via *page translation*

        segmentation is old school and these days mostly an annoyance
        (but it cannot be turned off!)
            --however, it comes in handy every now and then for things
            like sandboxing (advanced topic)

        segmentation means: memory addresses treated like offsets into a
        contiguous region.

        QUESTION: if segmentation can't be turned off, how do we pretend
        it's not there?
            setting its mapping to be the identity function
            base of 0 and no limit

5. segmentation

    linear address = base + virtual_address
        (virtual_address is the offset here)

    segment register holds segment selector

    different selectors for the stack (ss), data (ds), code (cs),
    string [extra] operations (es), other fun stuff (fs, gs)

    selector indexes into a table of descriptors
        chooses *which* table
        chooses the entry in that table
        determines base, limit, protection, type

    choice of seg register usually implicit in instruction
        * ESP uses SS, EIP uses CS, others (mostly) use DS
        * some instructions can take far addresses:
            o ljmp $selector, $offset

    draw example:

        say that %ds refers to this descriptor entry:

            base 0x30000000
            limit 0x0f0

        now, when program does:

            mov 0x50, %eax

        what happens?
            [0x50 gets translated into 0x30000050]

        what about if program does:

            mov 0x100, %eax

        ?
            [error. 0x100 is past the limit of 0x0f0]

    QUESTION: where did the descriptor table come from? and how does the
    hardware know about it? ([LS][LG]DT)

    --Current privilege level (CPL) is in the low 2 bits of CS
    --CPL=0 is privileged O/S, CPL=3 is user

    --can app modify the descriptors in the LDT? it's in memory...
        yes it can. useful for certain things, like one user-level
        program sandboxing another.

    --what about system calls? how do they transfer to kernel?
        --app cannot just lower the CPL