Class 18
CS 439
21 March 2013

On the board
------------

1. Last time

2. Revisit locking

---------------------------------------------------------------------------

1. Last time

    I/O and disks. thanks Parth

    some people wonder why we study the geometry. answer is that it
    affects how systems are built.

2. Revisit locking

    A. Recall game plan for managing concurrency:

        --build a lock/unlock primitive with hardware support

            if there's one CPU, we implement lock/unlock as
            disable/enable interrupts

            if there are multiple CPUs, we use spinlocks

                we saw one type of spinlock last time

                today, study another one: MCS locks.

        --then we build higher-level abstractions from the low-level
        lock/unlock:

            mutexes
            monitors/CVs

    B. Review disadvantages of locking

        --hard to get right (though the advice we give you helps)
        --performance
        --performance/complexity trade-off
        --starvation
        ....

    C. Focus on performance. Use it as an excuse to cover some things.

        quick digression:

        --_dance hall_ architecture: any CPU can "dance with" any
        memory equally (equally slowly)

        --NUMA (non-uniform memory access): each CPU has fast access
        to some "close" memory; slower to access memory that is
        further away

            --AMD Opterons are like this
            --Intel CPUs are moving toward this
            --see first page of handout

        --two further choices: cache coherent or not. in the former
        case, hardware runs a cache coherence (cc) protocol to
        invalidate caches when a local change happens. in the latter
        case, it does not. former case is far more common.

        let's assume ccNUMA machines...back to performance issues....
        the performance issues are:

        (i) fairness

            --one CPU gets lock because the memory holding the
            "locked" variable is closer to that CPU

            --allegedly, Google had fairness problems on Opterons (I
            have no proof of this)

        (ii) lots of traffic over memory bus: if lots of contention
        for lock, then cache coherence protocol creates lots of
        remote invalidations every time someone tries to do a lock
        acquisition

        (iii) cache line bounces (same reason as (ii))

        (iv) locking inherently reduces concurrency

        mitigation of (iv): more fine-grained locking

        mitigation of (i)--(iii): better locks

        --MCS locks

            --see handout

            --advantages

                --guarantees FIFO ordering of lock acquisitions
                (addresses (i))

                --spins on local variable only (addresses (ii),
                (iii))

                --[not discussing this, but: works equally well on
                machines with and without coherent caches]

        --NOTE: with fewer cores, simple spinlocks are better. why?
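
        to make the two MCS advantages concrete, here is a minimal
        user-space sketch, assuming C11 atomics and pthreads. it is
        NOT the handout's code: the names (mcs_node, mcs_lock,
        mcs_acquire, mcs_release, mcs_demo) are made up for
        illustration, and the sched_yield() calls in the spin loops
        are an addition so the demo behaves when there are more
        threads than cores (a kernel lock would just spin). the swap
        on the tail pointer gives the FIFO order; each waiter spins
        only on the flag in its own node.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiting thread (hypothetical names throughout). */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;   /* successor in the wait queue */
    atomic_bool locked;                /* true while this thread must wait */
} mcs_node;

/* The lock is just a pointer to the tail of the waiter queue. */
typedef struct {
    _Atomic(mcs_node *) tail;
} mcs_lock;

static void mcs_acquire(mcs_lock *l, mcs_node *me) {
    atomic_store(&me->next, (mcs_node *)NULL);
    atomic_store(&me->locked, true);
    /* Atomically append ourselves; the order of swaps is the FIFO order. */
    mcs_node *pred = atomic_exchange(&l->tail, me);
    if (pred != NULL) {
        atomic_store(&pred->next, me);     /* link in behind predecessor */
        while (atomic_load(&me->locked))   /* spin on OUR node only */
            sched_yield();                 /* assumption: user-space demo */
    }
}

static void mcs_release(mcs_lock *l, mcs_node *me) {
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to swing tail back to empty. */
        mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected,
                                           (mcs_node *)NULL))
            return;
        /* Someone swapped in but has not linked yet; wait for the link. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);    /* hand the lock to successor */
}

/* Demo harness: threads increment a shared counter under the lock. */
#define MAX_THREADS 64
static mcs_lock demo_lock;
static long demo_counter;   /* protected by demo_lock */
static long demo_iters;

static void *demo_worker(void *arg) {
    (void)arg;
    mcs_node me;            /* queue node lives on this thread's stack */
    for (long i = 0; i < demo_iters; i++) {
        mcs_acquire(&demo_lock, &me);
        demo_counter++;
        mcs_release(&demo_lock, &me);
    }
    return NULL;
}

/* Runs nthreads workers, each doing iters increments; returns the total. */
long mcs_demo(int nthreads, long iters) {
    pthread_t tids[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    atomic_store(&demo_lock.tail, (mcs_node *)NULL);
    demo_counter = 0;
    demo_iters = iters;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, demo_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return demo_counter;
}
```

        if mutual exclusion holds, mcs_demo(4, 1000) should return
        4000. note where the traffic goes: a contended acquire
        touches the shared tail once, then spins on its own node's
        flag; a release writes one remote flag (the successor's).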