Class 10
CS 372H
18 February 2010

On the board
------------

1. Last time

2. Condition variables
    --Motivation
    --Usage

3. Semaphores

4. Monitors

5. Advice and standards for concurrent programming
    --Advice
    --Standards

---------------------------------------------------------------------------

1. Last time

    --to deal with concurrency, need atomic operations

    --in the presence of multiple CPUs, we get atomic operations by using
    special hardware instructions
        --different options on different architectures
        --test_and_set() very common
        --on the x86, one uses xchg to implement test_and_set()

    --if you have a single CPU, can sometimes get atomic operations by
    turning off interrupts
        --confusingly, applies at both kernel-level and user-level
        --"turning off interrupts" in a user-level thread package means
        "ignoring the signals from the timer that would invoke the
        run-time via the handler of the timer signal"

    --Aside: here's a better-performing spinlock:

        void acquire(Lock* lock) {
            pushcli();
            while (xchg_val(&lock->locked, 1) == 1) {
                /* lock is held: spin on a plain read until it looks
                   free, then retry the atomic exchange */
                while (lock->locked) ;
            }
        }

        void release(Lock* lock) {
            xchg_val(&lock->locked, 0);
            popcli();
        }

    --Today:

        --going to present some more synchronization primitives, along
        with rules for their use (and non-use)

        --the primitives may seem strange, and the rules may seem
        arbitrary: why one thing and not another?

        --there is no absolute answer here

        --**However**, history has tested the approach that we're going
        to present. If you use the recommended primitives and follow
        their suggested use, you will find it easier to write correct
        code

        --For now, just take the recommended approaches as a given, and
        use them for a while. If you can come up with something better
        after that, by all means do so!

        --But please remember three things:

            a. lots of really smart people have thought really hard
            about the right abstractions, so a day or two of thinking
            about a new one or a new use is unlikely to yield an advance
            over the best practices.

            b. the consequences of getting code wrong can be atrocious.
            see for example:

                http://www.nytimes.com/2010/01/24/health/24radiation.html
                http://sunnyday.mit.edu/papers/therac.pdf
                http://en.wikipedia.org/wiki/Therac-25

            c. people who tend to be confident about their abilities
            tend to perform *worse*, so if you are confident you are a
            Threading and Concurrency Ninja and/or you think you truly
            understand how these things work, then you may wish to
            reevaluate.....

                http://www.nytimes.com/2000/01/23/weekinreview/january-16-22-i-m-no-doofus-i-m-a-genius.html

2. Condition variables

    A. Motivation

        --producer/consumer queue

            --very common paradigm. also called "bounded buffer":

                --producer puts things into a shared buffer
                --consumer takes them out
                --producer must wait if buffer is full; consumer must
                wait if buffer is empty

            --shows up everywhere

                --Soda machine: producer is delivery person, consumers
                are soda drinkers, shared buffer is the machine
                --DMA buffers

        --producer/consumer queue using mutexes

            --what's the problem?

            --answer: busy waiting

        --It is convenient to break synchronization into two types:

            --*mutual exclusion*: allow only one thread to access a
            given set of shared state at a time

            --*scheduling constraints*: wait for some other thread to do
            something (finish a job, produce work, consume work, accept
            a connection, get bytes off the disk, etc.)

    B. Usage

        --API

            --void cond_init (Cond *, ...);
                --Initialize

            --void cond_wait(Cond *c, Mutex* m);
                --Atomically unlock m and sleep until c is signaled
                --Then re-acquire m and resume executing

            --void cond_signal(Cond* c);
                --Wake one thread waiting on c

                [UPDATE: in some pthreads implementations, the analogous
                call wakes *at least* one thread waiting on c. Check the
                documentation (or source code) to be sure of the
                semantics.]

            --void cond_broadcast(Cond* c);
                --Wake all threads waiting on c

        --QUESTION: Why must cond_wait both release the mutex and sleep?
        (see handout)

            --Answer: can get stuck waiting.
                Producer: while (count == BUFFER_SIZE)
                Producer:     release()
                Consumer: acquire()
                Consumer: .....
                Consumer: cond_signal(&nonfull)
                Producer: cond_wait(&nonfull)

            --Producer will never hear the signal!

        --QUESTION: Why not use "if"? (Why use "while"?)

            --Answer: we can get an interleaving like this:

                --The signal() puts the waiting thread on the ready list
                but doesn't run it

                --That now-ready thread is ready to acquire() the mutex

                --But a *different* thread (a third thread: not the
                signaler, not the now-ready thread) could acquire() the
                mutex, work in the critical section, and invalidate
                whatever condition was being checked

                --Our now-ready thread eventually acquire()s the mutex...

                --...with no guarantees that the condition it was
                waiting for is still true

            --Solution is to use "while" when waiting on a condition
            variable

            --DO NOT VIOLATE THIS RULE; doing so will (almost always)
            lead to incorrect code

3. Semaphores

    --Don't use these. We're mentioning them only for completeness and
    for historical reasons: they were the first general-purpose
    synchronization primitive, and they were the first synchronization
    primitive that Unix supported.

    --Introduced by Edsger Dijkstra in the late 1960s

        --Dijkstra was a highly notable figure in computer science who
        spent the latter part of his career here at UT

    --Semaphore is initialized with an integer, N

    --Two functions:
        --Down() and Up() [also known as P() and V()]

    --The guarantee is that Down() will return at most N more times than
    Up() is called

    --Basically a counter that, when it reaches 0, causes a thread to
    sleep()

    --Another way to say the same thing:

        --Semaphore holds a count

        --Down() is an atomic operation that waits for the count to
        become positive; it then decrements the count by 1

        --Up() is an atomic operation that increments the count by 1 and
        then wakes up a thread waiting on Down(), if any

    --Don't use these! (Notice that Andrew Birrell [who is a Threading
    Ninja] doesn't even mention them in his paper.)
    --Problems:

        --semaphores are dual-purpose (for mutual exclusion and
        scheduling constraints), so hard to read code and hard to get
        code right

        --semaphores have hidden internal state

        --getting a program right requires careful interleaving of
        "synchronization" and "mutex" semaphores

4. Monitors

    --High-level idea: an object (as in object-oriented systems)

        --in which methods do not execute concurrently; and
        --that has one or more condition variables

    --More detail

        --Every method call starts with acquire(&mutex), and ends with
        release(&mutex)

        --Technically, these acquire()/release() are invisible to the
        programmer because it is the programming language (i.e., the
        compiler+run-time) that is implementing the monitor

            --So, technically, a monitor is a programming language
            concept

            --Book follows this technical definition

            --But technical definition isn't hugely useful because no
            programming languages in widespread usage have true monitors

            --Java has something close: a class in which every method is
            "synchronized" (i.e., implicitly protected by a mutex)

                --Not exactly a monitor because there's nothing forcing
                every method to be synchronized

            --And we can *use* mutexes and condition variables to
            implement our own manual versions of monitors, though we
            have to be careful

        --Given the above, we are going to use the term "monitor" more
        loosely to refer to both the technical definition and also a
        "manually constructed" monitor, wherein:

            --all method calls are protected by a mutex (that is, the
            programmer inserts those acquire()/release() on entry and
            exit from every procedure *inside* the object)

            --synchronization happens with condition variables whose
            associated mutex is the mutex that protects the method calls

        --In other words, we will use the term "monitor" to refer to the
        programming conventions that you should follow when building
        multithreaded applications

            --you must follow these conventions on lab T

    --Example: see handout

    --RULE:

        --hold lock when doing condition variable operations

        --Some (e.g., Birrell) will say: for experts only, no need to
        hold the lock when signaling. IGNORE THIS. Putting the signal
        outside the lock is only a small performance optimization, and
        it is likely to lead you to write incorrect code.

        --to get credit in Lab T, you must hold the associated mutex
        when doing a condition variable operation

    --Different styles of monitors:

        --Hoare-style: signal() immediately wakes the waiter

        --What the book calls Hansen-style: signal() required to be last
        statement in a procedure

        --What everyone else calls Hansen-style and what we will use:
        signal() eventually wakes the waiter. Not an immediate transfer

    --Can we replace SIGNAL with BROADCAST, given our monitor semantics?
    (Answer: yes, always.) Why?

        --while() condition tests the needed invariant. program doesn't
        progress past while() unless the needed invariant is true.

        --result: spurious wake-ups are acceptable....

        --...which implies you can always wake up a thread at any moment
        with no loss of correctness....

        --....which implies you can replace SIGNAL with BROADCAST
        [though it may hurt performance to have a bunch of needlessly
        awake threads contending for a mutex that they will then
        acquire() and release().]

    --Can we replace BROADCAST with SIGNAL?

        --Answer: not always.

        --Example:

            --memory allocator
            --threads allocate and free memory in variable-sized chunks
            --if no memory free, wait on a condition variable
            --now posit:
                --two threads waiting to allocate chunks of memory
                --no memory free at all
                --then, a third thread frees 10,000 bytes
            --SIGNAL alone does the wrong thing: we need to awaken both
            threads

5. Advice and standards for concurrent programming

    A. Advice

        --Use item #2 on handout as a case study....
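The handout itself is not reproduced in these notes. As a stand-in, here is a minimal sketch of what a manually constructed monitor for the producer/consumer problem might look like in C with pthreads. It is one possible rendering, not the handout's code. Assumptions made for illustration: the function names mybuffer_init(), mybuffer_put(), mybuffer_get(), and producer_main(); the condition variable name nonempty; the capacity of 4; and the use of pthread_mutex_lock()/pthread_mutex_unlock() in place of the acquire()/release() named above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BUFFER_SIZE 4   /* hypothetical capacity, chosen for illustration */

/* Shared object: state plus the synchronization variables protecting it. */
typedef struct {
    int items[BUFFER_SIZE];
    int head;                 /* index of the next item to remove */
    int count;                /* number of items currently stored */
    pthread_mutex_t mutex;    /* mutual exclusion constraint */
    pthread_cond_t nonfull;   /* scheduling: producers wait while full */
    pthread_cond_t nonempty;  /* scheduling: consumers wait while empty */
} MyBuffer;

void mybuffer_init(MyBuffer *b) {
    b->head = 0;
    b->count = 0;
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->nonfull, NULL);
    pthread_cond_init(&b->nonempty, NULL);
}

void mybuffer_put(MyBuffer *b, int item) {
    pthread_mutex_lock(&b->mutex);               /* every method: acquire */
    while (b->count == BUFFER_SIZE)              /* "while", never "if" */
        pthread_cond_wait(&b->nonfull, &b->mutex);
    b->items[(b->head + b->count) % BUFFER_SIZE] = item;
    b->count++;
    pthread_cond_signal(&b->nonempty);           /* signal with lock held */
    assert(b->count > 0 && b->count <= BUFFER_SIZE);  /* invariants hold */
    pthread_mutex_unlock(&b->mutex);             /* every method: release */
}

int mybuffer_get(MyBuffer *b) {
    pthread_mutex_lock(&b->mutex);
    while (b->count == 0)
        pthread_cond_wait(&b->nonempty, &b->mutex);
    int item = b->items[b->head];
    b->head = (b->head + 1) % BUFFER_SIZE;
    b->count--;
    pthread_cond_signal(&b->nonfull);
    assert(b->count >= 0 && b->count < BUFFER_SIZE);
    pthread_mutex_unlock(&b->mutex);
    return item;
}

/* Thread code: a high-level loop with no locks or condition variables. */
void *producer_main(void *arg) {
    MyBuffer *b = arg;
    for (int i = 1; i <= 100; i++)
        mybuffer_put(b, i);
    return NULL;
}
```

A consumer thread would be equally plain: a loop that calls mybuffer_get(). Neither thread's code touches the mutex or the condition variables; all of that lives inside the shared object.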
        ** General approach:

        --Decompose problem into objects

            object oriented style of programming: encapsulate shared
            state and synchronization variables inside of objects

            Note:
                (1) Shared objects are separate from threads
                (2) Shared object encapsulates code, synchronization
                variables, and state variables

        --What are threads, what are shared objects on handout?
        (producer/consumer; MyBuffer)

        --Warning: most examples in the book talk about "thread 1's
        code" and "thread 2's code", etc. This is b/c most of the
        "classic" problems were studied before OO programming was
        widespread.

        --Don't manipulate synchronization variables or shared state
        variables in the code associated with a thread; do it with the
        code associated with a shared object.

            --Threads tend to have "main" loops. These loops tend to
            access shared objects. *However*, the "thread" piece of it
            should not include locks or condition variables. Instead,
            locks and CVs should be encapsulated in the shared objects.

            --Why?

                (1) Locks are for synchronizing across multiple threads.
                Doesn’t make sense for one thread to "own" a lock.

                (2) Encapsulation -- details of synchronization are
                internal details of a shared object. Caller should not
                know about these details. "Let the shared objects do the
                work."

            --Common confusion: trying to do synchronization within the
            threads' code (i.e., not following the advice above). No!
            Synchronization should happen within the shared objects.
            Mantra: "let the shared objects do the work".

            --[Note: our earlier examples don't actually follow the
            advice, but that is in part so you can see a full example.
            As we enter the object-oriented world, we are going to
            encapsulate the details inside objects.]

        ** Design approach:

        1. Getting started:

            1a. Identify units of concurrency. Make each a thread with a
            go() method or main loop. Write down the actions a thread
            takes at a high level.

            1b. Identify shared chunks of state. Make each shared
            *thing* an object. Identify the methods on those objects,
            which should be the high-level actions made *by* threads
            *on* these objects. Plan to have these objects be monitors.

            1c. Write down the high-level main loop of each thread.

            Advice: stay high level here. Don't worry about
            synchronization yet. Let the objects do the work for you.

            Separate threads from objects. The code associated with a
            thread should not access shared state directly (and so there
            should be no access to locks/condition variables in the
            "main" procedure for the thread). Shared state and
            synchronization should be encapsulated in shared objects.

            --QUESTION: how does this apply to the example on the
            handout?

                --separate loops for producer(),consumer(), and
                synchronization happens inside MyBuffer.

        Now, for each object:

        2. Write down the synchronization constraints on the solution.
        Identify the type of each constraint: mutual exclusion or
        scheduling. For scheduling constraints, ask, "when does a thread
        wait"?

            --NOTE: usually, the mutual exclusion constraint is upheld
            by the fact that we're programming with monitors.

            --QUESTION: how does this apply to the example on the
            handout?

                --Only one thread can manipulate the buffer at a time
                (mutual exclusion constraint)

                --Producer must wait for consumer to empty slots if all
                full (scheduling constraint)

                --Consumer must wait for producer to fill buffers if all
                empty (scheduling constraint)

        3. Create a lock or condition variable corresponding to each
        constraint

            --QUESTION: how does this apply to the example on the
            handout?

                --Answer: need a lock and two condition variables. But
                lock was sort of a given from the monitor.

        4. Write the methods, using locks and condition variables for
        coordination

    B. Standards

        --see Mike D's "Programming With Threads", linked from lab T

        --You are required to follow this document

        --You will lose points (potentially many!) on the lab and on the
        exam if you stray from these standards

        --Note that in his example in section 4, there needs to be
        another line:

            --right before mutex->release(), he should have:
                assert(invariants hold)

    C. Reflections

        --Number one piece of advice: SAFETY FIRST.

            --Locking at coarse grain is easiest to get right, so do
            that (one big lock for each big object or collection of
            them)

            --Don't worry about performance at first

            --In fact, don't even worry about liveness at first

            --In other words don't view deadlock as a disaster

            --Key invariant: make sure your program never does the wrong
            thing
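To tie the SAFETY FIRST advice back to the memory-allocator example from the monitors section, here is a rough sketch of that allocator written as a coarse-grained monitor in C with pthreads: one big lock, a while-loop around the wait, and BROADCAST on free. Everything named here is hypothetical and chosen only for illustration: Pool, pool_init(), pool_alloc(), pool_free(), waiter_main(), and the 4,000-byte request size. The pool only counts free bytes and hands out no actual memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical byte-counting allocator: tracks only how much is free. */
typedef struct {
    size_t capacity;
    size_t free_bytes;
    pthread_mutex_t mutex;   /* one big lock for the whole object */
    pthread_cond_t freed;    /* waiters sleep here until memory returns */
} Pool;

void pool_init(Pool *p, size_t capacity) {
    p->capacity = capacity;
    p->free_bytes = capacity;
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->freed, NULL);
}

void pool_alloc(Pool *p, size_t n) {
    pthread_mutex_lock(&p->mutex);
    while (p->free_bytes < n)                 /* re-check after every wake-up */
        pthread_cond_wait(&p->freed, &p->mutex);
    p->free_bytes -= n;
    assert(p->free_bytes <= p->capacity);     /* invariants hold */
    pthread_mutex_unlock(&p->mutex);
}

void pool_free(Pool *p, size_t n) {
    pthread_mutex_lock(&p->mutex);
    p->free_bytes += n;
    /* Waiters want different sizes, and one free may satisfy several of
       them, so wake them all and let each re-test its own condition. */
    pthread_cond_broadcast(&p->freed);
    assert(p->free_bytes <= p->capacity);     /* invariants hold */
    pthread_mutex_unlock(&p->mutex);
}

/* Thread code: asks the shared object for 4,000 bytes; no locks here. */
void *waiter_main(void *arg) {
    pool_alloc((Pool *)arg, 4000);
    return NULL;
}
```

With a 10,000-byte pool that starts fully allocated, two threads running waiter_main() can both proceed after a single pool_free(&p, 10000), because the broadcast wakes every waiter. With cond_signal, only one waiter is guaranteed to wake, and the second could sleep even though 6,000 bytes remain free.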