Class 11   CS372H   22 February 2011   (One handout)

On the board
------------

1. Last time
2. Advice
3. Some practice with concurrent programming
4. Trade-offs and problems from locking
    A. deadlock

---------------------------------------------------------------------------

1. Last time

    --spinlocks, mutexes, condition variables, monitors

    --standards: you must follow MikeD's "Programming With Threads",
      linked from lab T
        --You are required to follow this document
        --You will lose points (potentially many!) on the lab and on the
          exam if you stray from these standards
        --Note that in his example in section 4, there needs to be another
          line: right before mutex->release(), he should have:
          assert(invariants hold)

    --more about the standards/advice
        --the primitives may seem strange, and the rules may seem
          arbitrary: why one thing and not another?
        --there is no absolute answer here
        --**However**, history has tested the approach that we're using.
          If you use the recommended primitives and follow their suggested
          use, you will find it easier to write correct code
        --For now, just take the recommended approaches as a given, and
          use them for a while. If you can come up with something better
          after that, by all means do so!
        --But please remember three things:
            a. lots of really smart people have thought really hard about
               the right abstractions, so a day or two of thinking about a
               new one or a new use is unlikely to yield an advance over
               the best practices.
            b. the consequences of getting code wrong can be atrocious.
               see for example:
                    http://www.nytimes.com/2010/01/24/health/24radiation.html
                    http://sunnyday.mit.edu/papers/therac.pdf
                    http://en.wikipedia.org/wiki/Therac-25
            c. people who tend to be confident about their abilities tend
               to perform *worse*, so if you are confident that you are a
               Threading and Concurrency Ninja and/or you think you truly
               understand how these things work, then you may wish to
               reevaluate.....
                --Dunning-Kruger effect
                --http://www.nytimes.com/2000/01/23/weekinreview/january-16-22-i-m-no-doofus-i-m-a-genius.html

    --MikeD stands on the desk when proclaiming the standards

2. Advice

    A. Top-level piece of advice: SAFETY FIRST.

        --Locking at coarse grain is easiest to get right, so do that (one
          big lock for each big object or collection of them)
        --Don't worry about performance at first
        --In fact, don't even worry about liveness at first
        --In other words, don't view deadlock as a disaster
        --Key invariant: make sure your program never does the wrong thing

    B. More detailed advice: design approach

        [We will use item #1 on the handout as a case study.....]

        --Here's a four-step design approach:

        1. Getting started:

            1a. Identify the units of concurrency. Make each a thread with
                a go() method or main loop. Write down the actions a
                thread takes at a high level.

            1b. Identify shared chunks of state. Make each shared *thing*
                an object. Identify the methods on those objects, which
                should be the high-level actions made *by* threads *on*
                these objects. Plan to have these objects be monitors.

            1c. Write down the high-level main loop of each thread.

            Advice: stay high level here. Don't worry about
            synchronization yet. Let the objects do the work for you.

            Separate threads from objects. The code associated with a
            thread should not access shared state directly (and so there
            should be no access to locks/condition variables in the "main"
            procedure for the thread). Shared state and synchronization
            should be encapsulated in shared objects.

            --QUESTION: how does this apply to the example on the handout?

                --separate loops for producer() and consumer(), and the
                  synchronization happens inside MyBuffer

        Now, for each object:

        2. Write down the synchronization constraints on the solution.
           Identify the type of each constraint: mutual exclusion or
           scheduling. For scheduling constraints, ask, "when does a
           thread wait?"

            --NOTE: usually, the mutual exclusion constraint is satisfied
              by the fact that we're programming with monitors.

            --QUESTION: how does this apply to the example on the handout?

                --Only one thread can manipulate the buffer at a time
                  (mutual exclusion constraint)
                --Producer must wait for the consumer to empty slots if
                  all are full (scheduling constraint)
                --Consumer must wait for the producer to fill buffers if
                  all are empty (scheduling constraint)

        3. Create a lock or condition variable corresponding to each
           constraint

            --QUESTION: how does this apply to the example on the handout?

                --Answer: we need a lock and two condition variables. But
                  the lock was sort of a given from the monitor.

        4. Write the methods, using locks and condition variables for
           coordination
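            --to make step 4 concrete, here is a minimal sketch of a
              bounded-buffer monitor in C with pthreads. This is not the
              handout's code: the names (MyBuffer, BUF_SIZE, buffer_put,
              buffer_get) are illustrative. Note that the producer() and
              consumer() threads would just call buffer_put()/buffer_get():
              "let the shared objects do the work."

                #include <pthread.h>

                #define BUF_SIZE 10

                typedef struct {
                    int             items[BUF_SIZE];
                    int             count, in, out;  /* # full slots, insert index, remove index */
                    pthread_mutex_t mutex;           /* mutual exclusion constraint */
                    pthread_cond_t  not_full;        /* scheduling: producer waits when full */
                    pthread_cond_t  not_empty;       /* scheduling: consumer waits when empty */
                } MyBuffer;

                void buffer_init(MyBuffer *b) {
                    b->count = b->in = b->out = 0;
                    pthread_mutex_init(&b->mutex, NULL);
                    pthread_cond_init(&b->not_full, NULL);
                    pthread_cond_init(&b->not_empty, NULL);
                }

                void buffer_put(MyBuffer *b, int item) {
                    pthread_mutex_lock(&b->mutex);
                    while (b->count == BUF_SIZE)          /* wait in a while loop, per the standards */
                        pthread_cond_wait(&b->not_full, &b->mutex);
                    b->items[b->in] = item;
                    b->in = (b->in + 1) % BUF_SIZE;
                    b->count++;
                    pthread_cond_signal(&b->not_empty);   /* a consumer may now proceed */
                    pthread_mutex_unlock(&b->mutex);
                }

                int buffer_get(MyBuffer *b) {
                    int item;
                    pthread_mutex_lock(&b->mutex);
                    while (b->count == 0)
                        pthread_cond_wait(&b->not_empty, &b->mutex);
                    item = b->items[b->out];
                    b->out = (b->out + 1) % BUF_SIZE;
                    b->count--;
                    pthread_cond_signal(&b->not_full);    /* a producer may now proceed */
                    pthread_mutex_unlock(&b->mutex);
                    return item;
                }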
    C. More advice

        1. Don't manipulate synchronization variables or shared state
           variables in the code associated with a thread; do it with the
           code associated with a shared object.

            --Threads tend to have "main" loops. These loops tend to
              access shared objects. *However*, the "thread" piece of it
              should not include locks or condition variables. Instead,
              locks and CVs should be encapsulated in the shared objects.

            --Why?

                (a) Locks are for synchronizing across multiple threads.
                    It doesn't make sense for one thread to "own" a lock.

                (b) Encapsulation -- the details of synchronization are
                    internal details of a shared object. The caller should
                    not know about these details. "Let the shared objects
                    do the work."

            --Common confusion: trying to acquire and release locks inside
              the threads' code (i.e., not following this advice). Bad
              idea! Synchronization should happen within the shared
              objects. Mantra: "let the shared objects do the work".

            --Note: our first example of condition variables -- 4c on the
              handout from last class (l10-handout) -- doesn't actually
              follow this advice, but that is in part so you can see all
              of the parts working together.

        2. A different way to state what's above:

            --You want to decompose your problem into objects, as in the
              object-oriented style of programming.

            --Thus:

                (1) A shared object encapsulates code, synchronization
                    variables, and state variables

                (2) Shared objects are separate from threads

            --Warning: most examples in the book talk about "thread 1's
              code" and "thread 2's code", etc. This is because most of
              the "classic" problems were studied before OO programming
              was widespread.

3. Practice with concurrent programming

    --the sleeping barber question from a prior midterm is posted (as
      today's reading). use it as practice

    --we guarantee to test concurrent programming on the midterm

    --today, we work a different example:

        --workers interact with a database
        --motivation: banking, airlines, etc.
        --readers never modify the database
        --writers read and modify the data
        --using only a single mutex lock would be overly restrictive.
          Instead, we want:
            --many readers at the same time
            --only one writer at a time

    --let's follow the concurrency advice from last time (and above).....

        1. Getting started

            a. what are the units of concurrency? [readers/writers]
            b. what are the shared chunks of state? [the database]
            c. what does the main function look like?

                read()
                    check in   -- wait until no writers are accessing the DB
                    check out  -- wake up a waiting writer, if appropriate
                write()
                    check in   -- wait until no readers or writers are
                                  accessing the DB
                    check out  -- wake up waiting readers or writers

        2. and 3. Synchronization constraints and objects

            --a reader can access the DB when there are no writers
              (condition: okToRead)
            --a writer can access the DB when there are no other readers
              or writers (condition: okToWrite)
            --only one thread manipulates the shared variables at a time.
              NOTE: **this does not mean only one thread in the DB at a
              time** (mutex)

        4. write the methods

            --inspiration required:

                int AR = 0;    // # active readers
                int AW = 0;    // # active writers
                int WR = 0;    // # waiting readers
                int WW = 0;    // # waiting writers

            --see handout for the code
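            --the handout has the actual code; below is a minimal sketch
              of what such a monitor typically looks like, in C with
              pthreads. This is not the handout's version: the names
              (Database, db_init, db_read, db_write) are illustrative, and
              this variant gives waiting writers priority over new readers.

                #include <pthread.h>

                typedef struct {
                    int AR, AW, WR, WW;        /* active/waiting readers and writers */
                    pthread_mutex_t mutex;     /* protects only these four counters */
                    pthread_cond_t  okToRead;
                    pthread_cond_t  okToWrite;
                } Database;

                void db_init(Database *db) {
                    db->AR = db->AW = db->WR = db->WW = 0;
                    pthread_mutex_init(&db->mutex, NULL);
                    pthread_cond_init(&db->okToRead, NULL);
                    pthread_cond_init(&db->okToWrite, NULL);
                }

                void db_read(Database *db) {
                    /* check in */
                    pthread_mutex_lock(&db->mutex);
                    db->WR++;
                    while (db->AW > 0 || db->WW > 0)   /* wait while writers are active or waiting */
                        pthread_cond_wait(&db->okToRead, &db->mutex);
                    db->WR--;
                    db->AR++;
                    pthread_mutex_unlock(&db->mutex);

                    /* Execute req: read from the DB. The mutex is NOT held
                       here, so many readers can be in this section at once. */

                    /* check out */
                    pthread_mutex_lock(&db->mutex);
                    db->AR--;
                    if (db->AR == 0 && db->WW > 0)     /* last reader out wakes one writer */
                        pthread_cond_signal(&db->okToWrite);
                    pthread_mutex_unlock(&db->mutex);
                }

                void db_write(Database *db) {
                    /* check in */
                    pthread_mutex_lock(&db->mutex);
                    db->WW++;
                    while (db->AR > 0 || db->AW > 0)   /* wait while any reader or writer is active */
                        pthread_cond_wait(&db->okToWrite, &db->mutex);
                    db->WW--;
                    db->AW++;
                    pthread_mutex_unlock(&db->mutex);

                    /* Execute req: modify the DB as the only active thread. */

                    /* check out */
                    pthread_mutex_lock(&db->mutex);
                    db->AW--;
                    if (db->WW > 0)                    /* prefer a waiting writer... */
                        pthread_cond_signal(&db->okToWrite);
                    else if (db->WR > 0)               /* ...else wake all waiting readers */
                        pthread_cond_broadcast(&db->okToRead);
                    pthread_mutex_unlock(&db->mutex);
                }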
            --QUESTION: why not just hold the lock all the way through
              "Execute req"? (Answer: the whole point was to provide more
              concurrency, i.e., to move away from exclusive access.)

            --QUESTION: what if we had shared locks? The implementation of
              shared locks is given on the handout

---------------------------------------------------------------------------

--Go over survey feedback and labs

    --it seems like people are a bit frustrated with the labs
    --note that detective work is part of the game; that's part of what
      you're learning
    --you need to use a combination of cognitive tools (deduction) and
      technical tools (grep, ctags, etags, etc.)

---------------------------------------------------------------------------

4. Trade-offs and problems from locking

Locking (in all its forms: mutexes, monitors, semaphores) raises many
issues:

    A. deadlock
    B. starvation
    C. priority inversion
    D. broken modularity

    .....

    A. Deadlock

        --see handout: simple example based on two locks (a sketch of this
          pattern also appears after the examples below)

        --see handout: more complex example
            --M calls N
            --N waits
            --but let's say the condition can only become true if N is
              invoked through M
            --now the lock inside N is unlocked, but M remains locked;
              that is, no one is going to be able to enter M and hence N.

        --can also get deadlocks with condition variables
            --lesson: it is dangerous to hold locks (M's mutex in the case
              on the handout) when crossing abstraction barriers

        --deadlocks without mutexes:

            --the real issue is resources and how they are requested

            --non-computer example **[picture of bridge]**
                --the bridge only allows traffic in one direction
                --Each section of the bridge can be viewed as a resource.
                --If a deadlock occurs, it can be resolved if one car
                  backs up (preempt resources and roll back).
                --Several cars may have to be backed up if a deadlock
                  occurs.
                --Starvation is possible.

            --other example:
                --one thread/process grabs the disk and then tries to grab
                  the scanner
                --another thread/process grabs the scanner and then tries
                  to grab the disk
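        --here is a minimal sketch of the simple two-lock pattern in C
          with pthreads (this is not the handout's code; the lock and
          thread names are made up):

            #include <pthread.h>

            pthread_mutex_t mutex_a = PTHREAD_MUTEX_INITIALIZER;
            pthread_mutex_t mutex_b = PTHREAD_MUTEX_INITIALIZER;

            void *thread1(void *arg) {
                pthread_mutex_lock(&mutex_a);   /* holds a ...          */
                pthread_mutex_lock(&mutex_b);   /* ... and waits for b  */
                /* ... critical section ... */
                pthread_mutex_unlock(&mutex_b);
                pthread_mutex_unlock(&mutex_a);
                return NULL;
            }

            void *thread2(void *arg) {
                pthread_mutex_lock(&mutex_b);   /* holds b ...          */
                pthread_mutex_lock(&mutex_a);   /* ... and waits for a  */
                /* ... critical section ... */
                pthread_mutex_unlock(&mutex_a);
                pthread_mutex_unlock(&mutex_b);
                return NULL;
            }

            /* If thread1 acquires mutex_a and thread2 acquires mutex_b
               before either gets its second lock, each waits forever for
               the other: hold-and-wait plus circular wait, i.e., deadlock. */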
        --how do we get around deadlock?

            (i) ignore it: worry about it when it happens

            (ii) detect and recover: not great
                --could imagine attaching a debugger
                --not really viable for production software, but works
                  well in development
                --the threads package can keep track of the
                  resource-allocation graph --see book
                --For each lock acquired, order it with the other locks
                  held
                --If a cycle occurs, abort with an error
                --Detects potential deadlocks even if they do not occur

            (iii) avoid it algorithmically

                --banker's algorithm (see book)
                    --very elegant but impractical
                    --if you're using the banker's algorithm, the
                      gameboard looks like this:

                        ResourceMgr::Request(ResourceID resc, RequestorID thrd) {
                            acquire(&mutex);
                            assert(system in a safe state);
                            while (state that would result from giving
                                   resource to thread is not safe) {
                                wait(&cv, &mutex);
                            }
                            update state by giving resource to thread
                            assert(system in a safe state);
                            release(&mutex);
                        }

                      Now we need to determine whether a state is safe....
                      To do so, see book

                    --disadvantages of the banker's algorithm:
                        --requires every single resource request to go
                          through a single broker
                        --requires every thread to state its maximum
                          resource needs up front. unfortunately, if
                          threads are conservative and claim they need
                          huge quantities of resources, the algorithm will
                          reduce concurrency

            (iv) prevent them by careful coding

                --negate one of the four conditions:
                    1. mutual exclusion
                    2. hold-and-wait
                    3. no preemption
                    4. circular wait
                --can sort of negate 1:
                    --put a queue in front of resources, like the printer
                    --virtualize memory
                --not much hope of negating 2
                --can sort of negate 3:
                    --consider physical memory: it is virtualized with VM,
                      so the OS can take a physical page away from one
                      process and give it to another!
                --what about negating #4?
                    --in practice, this is what people do
                    --idea: impose a partial order on locks. That is,
                      establish an order on all locks and make sure that
                      every thread acquires its locks in that order.
                      (a sketch appears at the end of these notes)
                    --why this works:
                        --we can view deadlock as a cycle in the resource
                          acquisition graph
                        --a partial order implies no cycles and hence no
                          deadlock
                    --three bummers:
                        1. it is hard to represent CVs inside this
                           framework. it works best only for locks.
                        2. the compiler can't check at compile time that
                           the partial order is being adhered to, because
                           the calling pattern is impossible to determine
                           without running the program (thanks to function
                           pointers and the halting problem)
                        3. Picking and obeying the order on *all* locks
                           requires that modules make public their locking
                           behavior, and requires them to know about other
                           modules' locking. This can be painful and
                           error-prone.
                --we will see Linux's filemap.c as an example of the
                  complexity introduced by having a locking order

            (v) Static and dynamic detection tools

                --See, for example, these citations, the citations
                  therein, and the papers that cite them:

                    Engler, D. and K. Ashcraft. RacerX: effective, static
                    detection of race conditions and deadlocks. Proc. ACM
                    Symposium on Operating Systems Principles (SOSP),
                    October 2003, pp. 237-252.
                    http://portal.acm.org/citation.cfm?id=945468

                    Savage, S., M. Burrows, G. Nelson, P. Sobalvarro, and
                    T. Anderson. Eraser: a dynamic data race detector for
                    multithreaded programs. ACM Transactions on Computer
                    Systems (TOCS), Volume 15, No. 4, November 1997,
                    pp. 391-411.
                    http://portal.acm.org/citation.cfm?id=265927

                  There is a long literature on this stuff.

                --Disadvantage of dynamic checking: it slows the program
                  down
                --Disadvantage of static checking: many false alarms (the
                  tool says "there is a deadlock", but in fact there is
                  none) or else missed problems
                --Note that these tools get better every year. I believe
                  that Valgrind has a race and deadlock detection tool.
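            --to make the lock-ordering idea in (iv) concrete, here is a
              minimal sketch in C with pthreads (illustrative; not from
              the handout or from Linux): pick one global order on locks
              -- here, by address -- and make every thread that needs both
              locks acquire them through this helper. With this
              discipline, the two-lock deadlock sketched earlier cannot
              happen, because both threads acquire the locks in the same
              order.

                #include <pthread.h>
                #include <stdint.h>

                /* Acquire two mutexes in a globally consistent order
                   (here: by address), which rules out circular wait. */
                void lock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2) {
                    if ((uintptr_t)m1 < (uintptr_t)m2) {
                        pthread_mutex_lock(m1);
                        pthread_mutex_lock(m2);
                    } else {
                        pthread_mutex_lock(m2);
                        pthread_mutex_lock(m1);
                    }
                }

                void unlock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2) {
                    pthread_mutex_unlock(m1);
                    pthread_mutex_unlock(m2);
                }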