# CS202 (003): Operating Systems

Concurrency V

Instructor: Jocelyn Chen

# Quiz Time!

## Last Time

Example 1

```
T1:                     T2:
acquire(mutexA);        acquire(mutexB);
acquire(mutexB);        acquire(mutexA);
// do some stuff        // do some stuff
release(mutexB);        release(mutexA);
release(mutexA);        release(mutexB);
```

Example 2: see the code in the handout

```
class M {
private:
    Mutex mutex_m;

    // instance of monitor N
    N another_monitor;

    // Assumption: no other objects
    // in the system hold a pointer
    // to our "another_monitor"

public:
    M();
    ~M();
    void methodA();
    void methodB();
};
```

```
class N {
private:
    Mutex mutex_n;
    Cond cond_n;
    int navailable;

public:
    N();
    ~N();
    void* alloc(int nwanted);
    void free(void*);
};
```

```
M:                          N:
acquire(&mutex_m);
n.alloc(nwanted)
                            acquire(&mutex_n)
                            navailable < nwanted
                            release(&mutex_n)
acquire(&mutex_m);
```

The thread that waits inside N (because `navailable < nwanted`) releases `mutex_n` but still holds `mutex_m`, so a second thread that tries to `acquire(&mutex_m)` blocks and cannot get into M.

Deadlock happens when all four conditions are present:

- (1) Mutual exclusion
- (2) Hold and wait
- (3) No pre-emption
- (4) Circular wait

# Preventing deadlock

- **Ignore It!** "Admit defeat."
- **Detect & Recover**: works in development, not really viable for production.
- **Avoid Algorithmically**: there are ways, but we don't cover them in this class<sup>1</sup>.
- **Negate Any of the Conditions**
  - Mutual exclusion: put a queue for accessing resources
  - Hold and wait: not likely to work
  - No preemption: not likely to work
  - Circular dependency: put a partial order on locks (=> no cycles)
- **Static/Dynamic Analysis**
  - Static: detect potential errors without running the code<sup>2</sup>
  - Dynamic: detect (potential) errors during/after execution<sup>3</sup>

# Other progress issues

**Starvation**: a thread waits indefinitely (if it has low priority and/or the resource is contended).

**Priority Inversion**

| T1 (highest priority) | T2 (middle priority) | T3 (lowest priority) |
| --- | --- | --- |
| | | hold the lock |
| start | | |
| waiting for lock | | |
| | start | |
| | preempt T3 | |
| | running | |

Why does T2 control the CPU?
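
Going back to the "put partial order on locks" item under "Negate Any of the Conditions": the sketch below shows one way Example 1 could be made deadlock-free. It assumes POSIX threads, and the helper names `lock_pair`, `unlock_pair`, and `run_demo`, as well as the choice of ordering the two mutexes by address, are assumptions made for illustration and do not come from the lecture or the handout.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static pthread_mutex_t mutexA = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutexB = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0; /* only touched while both mutexes are held */

/* Hypothetical helper: always acquire the lower-addressed mutex first,
 * so every thread follows the same order and no circular wait can form.
 * Assumes x != y. */
static void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    if ((uintptr_t)x > (uintptr_t)y) {
        pthread_mutex_t *tmp = x;
        x = y;
        y = tmp;
    }
    pthread_mutex_lock(x);
    pthread_mutex_lock(y);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    pthread_mutex_unlock(x);
    pthread_mutex_unlock(y);
}

/* Like T1 in Example 1: names the locks as (A, B). */
static void *worker_ab(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        lock_pair(&mutexA, &mutexB);
        shared_counter++;
        unlock_pair(&mutexA, &mutexB);
    }
    return NULL;
}

/* Like T2 in Example 1: names the locks as (B, A), yet lock_pair
 * still acquires them in the same global order as worker_ab. */
static void *worker_ba(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        lock_pair(&mutexB, &mutexA);
        shared_counter++;
        unlock_pair(&mutexB, &mutexA);
    }
    return NULL;
}

/* Runs both workers to completion and returns the final counter. */
int run_demo(void)
{
    pthread_t t1, t2;
    shared_counter = 0;
    int rc1 = pthread_create(&t1, NULL, worker_ab, NULL);
    int rc2 = pthread_create(&t2, NULL, worker_ba, NULL);
    assert(rc1 == 0 && rc2 == 0);
    (void)rc1;
    (void)rc2;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Because both workers end up taking the mutexes in the same order, the circular-wait condition cannot hold, and `run_demo()` should finish with each of the 20000 increments applied. Ordering by address is only one possible convention; a fixed, documented ranking of locks serves the same purpose.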
# Priority inversion - potential fixes

Solution 1 (T3 temporarily runs at the highest priority)

| T1 (highest priority) | T2 (middle priority) | T3 (highest priority) |
| --- | --- | --- |
| | | hold the lock |
| start | | |
| waiting for lock | | |
| | | finish |
| | | release the lock |
| acquire the lock | | |
| running | | |
| ... | ... | ... |

Solution 2 (T3 disables interrupts so it cannot be preempted)

| T1 (highest priority) | T2 (middle priority) | T3 (lowest priority) |
| --- | --- | --- |
| | | hold the lock |
| start | | |
| waiting for lock | | |
| | | disable interrupt |
| | | finish |
| | | release the lock |
| acquire the lock | | |
| running | | |
| ... | ... | ... |

Solution 3

Don't handle it. Design the code wisely so that only processes/threads of adjacent priority share the lock.

# Performance issues and tradeoffs

- Implementations of spinlocks/mutexes can be **expensive**
  - Mutex costs:
    - instructions to execute "mutex acquire"
    - sleep/wake-up brings resource cost
  - Spinlock costs:
    - cross-talk among CPUs
    - cache line bounces
    - fairness issues
- Coarse locks **limit** available parallelism
  - Only one CPU can execute anywhere in the part of your code protected by a lock
- Fine-grained locking leads to complexity and hence bugs
  - See "filemap.c" in the handout

But you should still start with coarse locks!

# Programmability issues

**Loss of modularity**

- To avoid deadlock, you need to understand how the parts of a program call each other.
- You also need to know whether a library function is thread-safe when you call it. If not, add a mutex!

**What's the fundamental problem?**

The shared-memory programming model is hard to use correctly.

# Some moments of reality about interleaving

Remember sequential consistency?

Modern multi-CPU hardware does not guarantee sequential consistency.

Where is the bug?

```
struct foo {
    int abc;
    int def;
};
static int ready = 0;
static mutex_t mutex;
static struct foo* ptr = 0;

void
doublecheck_alloc()
{
    if (!ready) { /* <-- accesses shared variable w/out holding mutex */
        mutex_acquire(&mutex);
        if (!ready) {
            ptr = alloc_foo(); /* <-- sets ptr to be non-zero */
            ready = 1;
        }
        mutex_release(&mutex);
    }
    return;
}
```

# Yet, if you use mutex correctly...
- You don't have to worry about arbitrary interleaving
  - Critical sections execute atomically
- You don't have to worry about what the hardware is truly doing
  - The threading library and compiler do the hard work for you

## That does not apply if you do low-level programming

- MUST ensure the compiler is not reordering key instructions
- MUST know the memory model (of the hardware)
- MAY need to know when to insert memory barriers

```
move $1, 0x10000   # write 1 to memory address 0x10000
move $2, 0x20000   # write 2 to memory address 0x20000
MFENCE
move $3, 0x10000   # write 3 to memory address 0x10000
move $4, 0x30000   # write 4 to memory address 0x30000
```

If any memory write after **MFENCE** (in program order) is visible to another CPU, then that other CPU also sees all memory writes before the **MFENCE**.

- "acquire" and "release" in mutexes need memory barriers
- "xchg" on x86 includes an implicit memory barrier

Where is the bug?

```
struct foo {
    int abc;
    int def;
};
static int ready = 0;
static mutex_t mutex;
static struct foo* ptr = 0;

void
doublecheck_alloc()
{
    if (!ready) { /* <-- accesses shared variable w/out holding mutex */
        mutex_acquire(&mutex);
        if (!ready) {
            ptr = alloc_foo(); /* <-- sets ptr to be non-zero */
            ready = 1;
        }
        mutex_release(&mutex);
    }
    return;
}
```
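
One reading of the bug: the first `if (!ready)` runs without the mutex, so on hardware that does not guarantee sequential consistency (or with a reordering compiler) another CPU may observe `ready == 1` before the write to `ptr` is visible to it. The sketch below shows one possible repair, assuming C11 atomics and POSIX threads in place of the lecture's `mutex_t`. The body of `alloc_foo()` is a hypothetical stand-in, and returning `ptr` (the original returns nothing) is a change made here only so the result can be checked. The simpler alternative is to drop the unlocked check and always take the mutex.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    int abc;
    int def;
};

static atomic_int ready = 0;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static struct foo *ptr = NULL;

/* Hypothetical stand-in for the lecture's alloc_foo(). */
static struct foo *alloc_foo(void)
{
    struct foo *f = malloc(sizeof *f);
    if (f != NULL) {
        f->abc = 1;
        f->def = 2;
    }
    return f;
}

struct foo *doublecheck_alloc(void)
{
    /* Acquire load: a thread that sees ready == 1 here is also
     * guaranteed to see the earlier writes to ptr and *ptr. */
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        pthread_mutex_lock(&mutex);
        /* The mutex already orders this second check. */
        if (!atomic_load_explicit(&ready, memory_order_relaxed)) {
            ptr = alloc_foo();
            if (ptr != NULL) {
                /* Release store: publishes ptr before ready. */
                atomic_store_explicit(&ready, 1, memory_order_release);
            }
        }
        pthread_mutex_unlock(&mutex);
    }
    return ptr;
}
```

The acquire/release pair plays the role that the memory barriers inside "acquire" and "release" play for a mutex: the fast path stays lock-free, but the ordering between `ptr` and `ready` is now stated explicitly instead of being left to the hardware.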