Class 14
CS 202
26 March 2015

On the board
------------

1. Last time

2. Cooperative multithreading

3. Event-driven programming

4. Disks

---------------------------------------------------------------------------

1. Last time

--finished the memory unit
--discussed uses of page faults, and the costs of page faults
--discussed cache eviction policies, regarding RAM as a cache of the disk
--the discussion of thrashing at the end: the moral of the story is that
  if the workload is not cache-friendly, the eviction policy is
  irrelevant.
    --> in that case, need to restructure the computation, do less work,
        or buy more hardware
--today, let's revisit concurrency a bit: we will study two alternatives
  to preemptive multithreading

2. Cooperative multithreading

--Review: what *is* a kernel-managed thread?
    --basically the same as a process, except that two threads in the
      same process have the same value for %cr3 (and hence share an
      address space)
    --recall: kernel-managed threads are always scheduled preemptively

--Today: we can also have *user*-level threading, in which the kernel is
  completely ignorant of the existence of the threads.

  [draw picture]

        T1    T2    T3
      --------------------
        thr package
      --------------------
        OS
      --------------------
        H/W

--in this case, the threading package is the layer of software that
  maintains the array of TCBs (thread control blocks)

--the threading package has other responsibilities as well:
    --make a new stack for each new thread
    --scheduling!

--user-level threading can be preemptive or non-preemptive
    --to simplify the discussion, let's assume that it is non-preemptive.
      *non-preemptive multithreading* is also called *cooperative
      multithreading*.
    --this means that a context switch takes place only at well-defined
      points: when the thread calls yield(), and when the thread would
      block on I/O

    [What if we wanted to make user-level threads switch
    non-deterministically?
        --deliver a periodic timer interrupt or signal to the thread
          scheduler [setitimer()]. When the scheduler gets the signal,
          it swaps out the current thread.
        --this makes it more complex to program with user-level threads
        --in practice, systems aren't usually built this way, but
          sometimes it is what you want (e.g., if you're simulating some
          OS-like thing inside a process, and you want to simulate the
          non-determinism that arises from hardware timer interrupts).]

--how do context switches interact with I/O calls? the thread calls
  fake_blocking_read(). high-level idea:

    fake_blocking_read() {
        if read would block
            yield()    /* yield() in turn calls swtch() */
    }

  a bit more detail on the call chain above:

    int fake_blocking_read(int fd, char* buf, int num) {
        int nread = -1;
        while (nread == -1) {
            /* this is a non-blocking read() syscall; real code would
             * check that errno is EAGAIN, rather than treating every
             * error as "read would block" */
            nread = read(fd, buf, num);
            if (nread == -1) { /* read would block */
                yield();
            }
        }
        return nread;
    }

    void yield() {
        tid next = pick_next_thread(); /* get a runnable thread */
        tid current = get_current_thread();
        swtch(current, next);
        /* when 'current' is later rescheduled, it starts from here */
    }

--how are context switches implemented?
    --see handout (and the sketch just below)
    --[draw picture of the two stacks]
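To make swtch()'s job concrete, here is a toy cooperative threading
package built on the POSIX ucontext calls (obsolescent in POSIX.1-2008
but still available on, e.g., Linux/glibc). swapcontext() plays the role
of swtch(): it saves the caller's registers and stack pointer and
restores another thread's. Everything else here (spawn(), worker(), the
round-robin policy) is invented for this sketch; this is *not* the
course handout's implementation, which writes swtch() by hand in
assembly.

    /* toy_coop.c -- a minimal sketch, assuming a POSIX system */
    #include <stdio.h>
    #include <stdlib.h>
    #include <ucontext.h>

    #define STACK_SIZE (64 * 1024)
    #define MAX_THREADS 4

    static ucontext_t ctx[MAX_THREADS]; /* saved registers: the "TCB"s */
    static int finished[MAX_THREADS];
    static int current = 0;             /* slot 0 is main() */
    static int nthreads = 1;

    /* Round-robin scheduler: save the caller's context into its slot,
     * restore the next runnable thread's. The caller resumes from here
     * when it is next scheduled. */
    void yield(void) {
        int prev = current;
        do {
            current = (current + 1) % nthreads;
        } while (finished[current]);
        if (current != prev)
            swapcontext(&ctx[prev], &ctx[current]);
    }

    static void worker(int id) {
        for (int i = 0; i < 3; i++) {
            printf("thread %d, step %d\n", id, i);
            yield();           /* cooperative: volunteer the CPU */
        }
        finished[current] = 1; /* mark ourselves done... */
        yield();               /* ...and never run again */
    }

    /* Make a new stack for the thread, per the threading package's
     * responsibilities listed above. */
    static void spawn(int id) {
        ucontext_t *uc = &ctx[nthreads++];
        getcontext(uc);
        uc->uc_stack.ss_sp = malloc(STACK_SIZE);
        uc->uc_stack.ss_size = STACK_SIZE;
        /* standard makecontext idiom for passing an int argument */
        makecontext(uc, (void (*)(void))worker, 1, id);
    }

    int main(void) {
        spawn(1);
        spawn(2);
        while (!finished[1] || !finished[2])
            yield();           /* run the workers until both finish */
        return 0;
    }

Run it and the two workers interleave, but only at their explicit
yield() calls; delete a yield() and that thread hogs the CPU until it
finishes, which is exactly the cooperative-multithreading hazard noted
below.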
--what do you think are the advantages and disadvantages of cooperative
  vs. preemptive multithreading?
    --cooperative:
        --Upside: makes it easier to avoid errors from concurrency
        --Downside: harder to program, because now the threads have to
          be good about yielding, and you might have forgotten to yield
          inside a CPU-bound task.

--downsides of user-level threads vs. kernel-level threads:
    --Can we imagine having two user-level threads truly executing at
      once, that is, on two different processors? (Answer: no. Why? The
      kernel sees only one schedulable entity.)
    --What if the OS handles page faults for the process? (Then a page
      fault in one thread blocks all threads.)
        --(not a huge issue in practice)
    --Similarly, if a thread needs to go to disk, then that actually
      blocks *all* threads (since the kernel won't allow the run-time to
      make a non-blocking read() call to the disk). So what do we do
      about this?
        --extend the API; or
        --live with it; or
        --use elaborate hacks with memory-mapped files (e.g., files are
          all memory-mapped, and the runtime asks to handle its own page
          faults, if the OS allows it)

--[skipped in class] Old debates about user-level threading vs.
  kernel-level threading. The "Scheduler Activations" paper, by Anderson
  et al. [ACM Transactions on Computer Systems 10, 1 (February 1992),
  pp. 53-79], proposes an abstraction that is a hybrid of the two.
    --basically the OS tells the process: "I'm ready to give you another
      virtual CPU (or to take one away from you); which of your
      user-level threads do you want me to run?"
    --so the user-level scheduler decides which threads run, but the
      kernel takes care of multiplexing them

Quick comparison between user-level threading and kernel-level
threading:

    (i) high-level choice: user-level or kernel-level (but can have N:M
        threading, in which N user-level threads are multiplexed over M
        kernel threads, so the choice is a bit fuzzier)

    (ii) if user-level, there's another choice: non-preemptive (also
         known as cooperative) or preemptive

    [be able to answer: why are kernel-level threads always preemptive?]

    --*Only* the presence of multiple kernel-level threads can give:
        --true multiprocessing (i.e., different threads running on
          different processors)
        --asynchronous disk I/O using the POSIX interface [because
          read() blocks and causes the *kernel* scheduler to be invoked]
    --but many modern operating systems provide interfaces for
      asynchronous disk I/O, at least as an extension:
        --Windows
        --Linux has AIO extensions
        --Mac OS X
    --thus, even user-level threads can get asynchronous disk I/O, by
      having the run-time translate calls that *appear* blocking to the
      thread [e.g., thread_read()] into a series of instructions that:
      register interest in an I/O event, put the thread to sleep, and
      switch() to another thread
    --[moral of the story: if you find yourself needing async disk I/O
      from user-level threads, use one of the non-POSIX interfaces!]

3. Event-driven programming

    while (1) {
        events = getEvents(); /* this uses select() or epoll() */
        for (e in events)
            processEvent(e);
    }

The book describes this in detail. The basic idea is that you write a
bunch of *event handlers*.

QUESTION: can the event handlers block? Like, what if processEvent(e)
required the code to call some function f():

    f() {
        make_a_blocking_call();
    }

Is that okay?

ANSWER: no. Why? Because then the other events won't get processed.

QUESTION: so what do we do?

ANSWER: f() conceptually is split into two pieces: the "before" piece
registers interest in some *further* event; the "after" piece will later
be called when the event in question is complete. (A sketch follows
below.)
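Here is one way the loop and the before/after split might look in C,
using select() as getEvents(). All of the names here (handler_t,
register_read_interest(), on_stdin()) are invented for this sketch; they
are not from the book or from any real event library.

    /* mini_event_loop.c -- a minimal sketch, assuming a POSIX system */
    #include <sys/select.h>
    #include <unistd.h>

    #define MAX_FD 64

    typedef void (*handler_t)(int fd);

    static handler_t read_handlers[MAX_FD]; /* "after" pieces, by fd */

    /* The "before" piece of a conceptually blocking operation: instead
     * of blocking, record interest in the fd and return to the loop. */
    void register_read_interest(int fd, handler_t h) {
        read_handlers[fd] = h;
    }

    void event_loop(void) {
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            int maxfd = -1;
            for (int fd = 0; fd < MAX_FD; fd++) {
                if (read_handlers[fd] != NULL) {
                    FD_SET(fd, &rfds);
                    if (fd > maxfd)
                        maxfd = fd;
                }
            }
            if (maxfd < 0)
                return; /* no interest registered: nothing left to do */
            /* getEvents(): block until a registered fd is readable */
            if (select(maxfd + 1, &rfds, NULL, NULL, NULL) <= 0)
                continue;
            for (int fd = 0; fd <= maxfd; fd++) {
                if (read_handlers[fd] != NULL && FD_ISSET(fd, &rfds)) {
                    handler_t h = read_handlers[fd];
                    read_handlers[fd] = NULL; /* one-shot registration */
                    h(fd); /* the "after" piece; it must not block */
                }
            }
        }
    }

    /* Example handler: echo stdin to stdout, one event at a time. */
    static void on_stdin(int fd) {
        char buf[128];
        ssize_t n = read(fd, buf, sizeof buf); /* fd is readable */
        if (n > 0) {
            write(STDOUT_FILENO, buf, (size_t)n);
            register_read_interest(fd, on_stdin); /* re-arm */
        }
    }

    int main(void) {
        register_read_interest(STDIN_FILENO, on_stdin);
        event_loop();
        return 0;
    }

Note the design consequence: on_stdin() cannot loop until end-of-file;
it handles one readable event and re-registers itself. That is the
"split into before and after pieces" pattern from the ANSWER above.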
Advantages and disadvantages:
    --> vs. user-level threading? Not much of an advantage, actually.
    --> vs. preemptively scheduled kernel threads?
        - this model doesn't provide true concurrency
        - can't use multiple CPUs, for example
        + no race conditions, no deadlocks

QUESTION: can the kernel tell the difference between a process that uses
event-driven programming and one that uses user-level threading?

ANSWER: no. They look the same to the kernel: the kernel just sees one
process with one thread.

[thanks to David Mazieres for content in portions of this lecture.]

4. Disks

Disks are *the* bottleneck in many systems (although this becomes less
true every year, as solid-state drives, or SSDs, become cheaper and
cheaper).

[Reference: "An Introduction to Disk Drive Modeling", by Chris Ruemmler
and John Wilkes. IEEE Computer, Vol. 27, No. 3, March 1994, pp. 17-28.]

What is a disk?

--stack of magnetic platters
    --the platters rotate together on a central spindle at 3,600-15,000
      RPM
    --the drive speed drifts slowly over time
    --can't predict the rotational position after 100-200 revolutions

         -------------
        |   platter   |
         -------------
              | |
         -------------
        |   platter   |
         -------------
              | |
         -------------
        |   platter   |
         -------------
              |

--disk arm assembly
    --the arms rotate around a pivot, and all of them move together
    --the pivot offers some resistance to linear shocks
    --the arms carry the disk heads, one for each recording surface
    --the heads read and write data to the platters

[interlude: why are we studying this? disks are still widely in use
everywhere, and will be for some time. Google, Facebook, etc. still pack
their data centers full of cheap, old disks. Also, for them, disk
failure is the common case, not the random/weird case (they have so many
disks that it only makes sense that some are failing at any given
moment), so they can't cram their datacenters with expensive SSDs. As a
second point, it's technical literacy: many file systems were designed
with the disk in mind (sequential access is significantly faster than
random access). You have to know how these things work as a computer
scientist and as a programmer.]

To be continued...