Class 3
CS 439
22 January 2013

On the board
------------

1. Last time

2. More about process state

3. Shell: a program that starts processes

4. Threads

5. Introduction to concurrency

---------------------------------------------------------------------------

1. Last time

    --privileged vs. unprivileged mode

    --the concept and purpose of a process

           --->|wait for input|---->|wait for input|
        gcc---------------->

        reduce latency:
            A goes for 80 s, B goes for 20 s
            A-----------> B --> : takes B 100 s
            run A and B concurrently, say for 10 s each, makes B
            finish faster.

    --where processes come from: the fork() call

    --context switches. clarify: it can happen on three events, not
    just hardware timers

        interrupts
            from devices
            from the hardware timer
        traps (system calls)
        exceptions (divide-by-0, page fault, ...)

        when that happens, the OS gets control, saves the state of one
        process, and starts running another

2. More about process state

    [draw state diagram]

    scheduling: lots of things to care about. we'll spend time on this
    later.

    preemption

3. Shell

    --a program that creates processes. makes heavy use of fork/exec

        while (1) {
            write(1, "$ ", 2);
            readcommand(command, args);   // parse input
            if ((pid = fork()) == 0)      // child?
                exec(command, args, 0);
            else if (pid > 0)             // parent?
                wait(0);                  // wait for child
            else
                perror("failed to fork");
        }

    --QUESTION: why is fork different from exec? What the heck?

    --Aside:

        [ what happens if there are two users (two humans) using a
        machine, and one of them does this:

            for (i = 0; i < 10; i++) {
                fork();
            }

            while (1) {}
        ]

    --Detour: What does this do?

        $ ls > tmp1

      How is it implemented?

        --just before exec, shell does:

            close(1)
            open("tmp1", O_TRUNC | O_CREAT | O_WRONLY, 0666)

          which automatically assigns tmp1 to be fd 1 (open() returns
          the lowest-numbered free descriptor, and fd 1 was just
          closed)

    --More generally, lots of things we might want to manipulate in
    the child process: file descriptors, environment, resource limits.

    --yet fork() requires no arguments!
    --Okay, now the power of the fork/exec separation:

        * Contrast with CreateProcess on Windows:

            BOOL CreateProcess(
                name,
                commandline,
                security_attr,
                thr_security_attr,
                inheritance?,
                other flags,
                new_env,
                curr_dir_name,
                .....)

          there's also CreateProcessAsUser, CreateProcessWithLogonW,
          CreateProcessWithTokenW, ...

        * The issue is that any conceivable manipulation of the
          environment of the new process has to be passed through
          arguments, instead of via arbitrary code.

    --There is another example on your handout from last time, which
    we will post

E. What makes a good abstraction?

    --simple but powerful

    --examples we've seen:

        --stdin (0), stdout (1), stderr (2)
          [nice by itself, but when combined with the mechanisms
          below, things get even better]

        --file descriptors

        --fork/exec() separation

    --very few mechanisms lead to a lot of possible functionality

---------------------------------------------------------------------------

admin announcement: some of you should have had section.

    Wednesday sections: Parth Upadhyay

---------------------------------------------------------------------------

4. Threads

    classical picture:

        [draw picture from notes]

    why threads?

        most popular abstraction for concurrency

        way to get multiple execution contexts (stack, registers,
        etc.) seeing the same memory

        allows one process to use multiple CPUs or cores

        overlap I/O and computation

    Web server example: compare the performance of:

        I.  while()
                get network message (URL) from client
                get data from disk
                compose response
                send response

        II. Create a number of threads, and for each thread do:
                get network message (URL) from client
                get data from disk
                compose response
                send response

        [connects back to picture from before of vi, gcc.
        Each thread is overlapped, just like vi, gcc]

    Interface to threads:

        * tid thread_create (void (*fn) (void *), void *);
          Create a new thread, run fn with arg

        * void thread_exit ();

        * void thread_join (tid thread);
          Wait for thread 'thread' to exit

        * plus a lot of synchronization primitives, which we'll see in
          the coming classes

    Threads can be preemptive or not, and implemented by kernel (so
    kernel schedules threads) or in user library (so user library
    schedules threads)

        Under user-level threads, the kernel has no idea that these
        threads exist......

        Under kernel-level threads, the kernel knows about threads

        [these bring different trade-offs]

    For now, assume preemptive

5. Introduction to concurrency

    There are many sources of concurrency.

    QUESTION: what are some of them?

        multiple processors (and common memory), interrupts,
        preemptive scheduling (which is fundamentally possible b/c of
        interrupts, but we list it separately)

    Why is it hard?

        * Hard to reason about all possible interleavings

            1a: x = 1 or x = 2.
            1b: x = 13 or x = 25.
            1c: x = 1 or x = 2 or x = 3
            2: incorrect list structure
            3: incorrect count in buffer

          all of these are called race conditions; not all of them
          are errors, though

        --worst part of errors from race conditions is that a program
        may work fine most of the time but only occasionally show
        problems. why? (because the instructions of the various
        threads or processes or whatever get interleaved in a
        non-deterministic order.)

        --and it's worse than that because inserting debugging code
        may change the timing so that the bug doesn't show up