CS202 Review Session 4

[Xiangyu Gao](https://xiangyug.github.io/), TA from Fall 2021
Edited by Khanh Nguyen, TA Spring 2022
Edited by Jinli Xiao, TA Spring 2023
Edited by Sophia Watts, TA Spring 2024
Edited by Saeed Bafarat, TA Fall 2024
Edited by Andrew Hua, TA Fall 2025
Edited by Shiv Dhar, TA Spring 2026

1. Process vs thread
   1.1 Link to concurrency
2. What is concurrency?
3. Approach to concurrency
   3.1 Concurrency Commandments
   3.2 Condition Variables
   3.3 Advice for solving problems
4. C++ primer
5. Lab 3
   5.1 Overview
   5.2 Lab 3 architecture
   5.3 Notable files
   5.4 Common pitfalls
6. Sequential Consistency
7. Q&A
8. Resources

---------------------------------------------------------------------

1. Process vs thread

Process:
- An instance of a program
- Each process has its own memory space and system resources (such as file handles and network connections)
- Processes are isolated, so the crash of one process won't affect others

Thread:
- A unit of execution within a process
- A process may have one thread (single-threaded) or many threads (multi-threaded)
- Different threads in a process share the same address space and resources
- Threads in a process have the same view of memory, which is exactly what allows them to work constructively on the same jobs/functionality
- When we think about concurrency in this class, we are generally thinking in the context of threads and how they interact with shared state

Process Memory Visualization:

┌────────────────┐
│ Stack Thread 1 │<──┐
├────────────────┤   │
│ Stack Thread 2 │<─┐│
├────────────────┤  ││
│ Stack Thread 3 │<┐││
└────────────────┘ │││
|                | │││
|                | │││
├────────────────┤ │││
│                │ │││
│                │<┘││
│      Heap      │<─┘│
│                │<──┘
│                │
├────────────────┤
│      Data      │
├────────────────┤
│  Program Code  │
└────────────────┘

1.1 Link to concurrency

- It is critical to understand that all threads share this same view of memory -> this is exactly what distinguishes threads from processes, which have separate views of memory; more on this in the VM unit
- Stacks are separate, but the heap and code are shared -> threads interact with both, which is what allows concurrency to occur
- Importantly, both processes and threads have their scheduling controlled by the kernel -> we have to be extra careful with our code exactly because we can be interrupted at any time
- A future class will briefly explore ways in which we can implement cooperative scheduling (user-level threading), but for now, and for the purposes of the code you write in this class, we live in a world where threads cannot plan scheduling around each other

2. What is concurrency?

- In essence: multiple things happening at the same time, making use of certain shared resources.
- Why concurrency?
  - It allows one process to work on multiple tasks at once
  - Think of any kind of job/problem that can be decomposed into multiple independent parts, for example neural networks -> lots of mathematical work that can be split into independent steps and processed concurrently -> exactly why we like GPUs for this work
  - A particularly useful case of this is the ability to work separately on I/O and computation -> waiting on, for example, a disk read doesn't have to mean all computation is stopped; more on this in the I/O lecture down the line
- How do issues arise?
- Programs sometimes need to share resources -> for example, multiple threads may need to update the same shared object in memory
  - This can easily cause problems if they access the object concurrently, resulting in one thread reading (or even writing) inconsistent data
  - A real-life example of shared resources is GitHub: users take the state, then locally change it before pushing / pull-requesting changes to change the shared state
- Two sources of concurrency trouble:
  - Temporal -> to do with context switches and scheduling
    - Even in a single-CPU system, scheduling decisions made by the kernel can cause threads' execution to become interleaved
    - Simple operations like incrementing a variable decompose into multiple steps in assembly, so not being able to enforce some notion of atomicity can lead to a range of issues
  - Spatial -> to do with physical parallelism
    - In a multi-CPU system, two physical processors can access shared memory at the same physical instant -> leads to even weirder behavior
- How do we manage concurrency?
  - Various concurrency models and primitives have been developed to control access to shared resources -> e.g. monitors, message passing, mutexes, condition variables, and semaphores
  - In CS202, we will be using a monitor, which is a combination of mutexes and condition variables, to handle concurrency
    - Mutexes give us the ability to control access to critical sections by implementing mutual exclusion <- one thread entering the critical section blocks others from doing so
    - CVs give us the ability to enforce scheduling constraints -> they offer an element of control over which thread(s) enter the critical section next, based on the relevant constraints

3. Approach to concurrency

3.1 Concurrency Commandments

Rule 1: Always acquire/release locks at the beginning/end of methods
- This ensures that a given object's methods will largely be executed exclusively
- If doing this means there is a lot of redundant work happening with the mutex held, you're probably not applying enough modularity -> consider whether parts of your code can be placed in separate methods

Rule 2: Always hold the lock when doing condition variable operations

Rule 3: A thread in wait() must be prepared to be restarted at any time, not just when another thread calls signal
- In other words, if you have:

      if (!safe_to_proceed()) {
          wait(mtx);
      }

  you MUST replace the if with a while loop, because the wait could be spontaneously woken and proceed without it being guaranteed to be safe:

      while (!safe_to_proceed()) {
          wait(mtx);
      }

Rule 4: Do not call sleep()
- Often (or always, in the context of this class) there is a better way to implement whatever you may be trying to implement

Implied rule: always code your concurrency the same way
- Being consistent will allow you to look back a day later, or a week later, and still understand your code instead of having to decode each ad-hoc implementation. An ad-hoc version may be more efficient code, but it's less efficient coding.

3.2 Condition variables

- We always call cond_wait() after acquiring the mutex. Why doesn't this block other threads?
  - cond_wait() releases the mutex, waits for a signal, and re-acquires the mutex once the signal is received
- Why do we need to re-check the constraint after being woken (as enforced by the while loop)?
  - Even while waiting on a CV, a thread may be woken without a signal, AND
  - Even if the thread was woken by a signal, much can have changed between the signal initially being sent and the signalled thread resuming execution (e.g.
if other threads were woken first)
- signal vs broadcast
  - It is correct (but inefficient) to replace signal with broadcast -> a large number of threads could be woken redundantly, but they should all check the constraint before proceeding anyway
  - It is not correct to replace broadcast with signal -> consider the malloc example from class, or even simply a producer-consumer setup where producers produce more units than consumers consume

3.3 Advice for solving problems

General policy:
- Safety and correctness are of higher priority than efficiency -> it's okay to go super coarse with your locks (at least at first)
- Specifically:
  - Coarse-grained locking locks a lot of (or even all of) the state when any thread is acting -> safer, but less concurrent
  - Fine-grained locking breaks state up into smaller "lockable" chunks -> this is harder, since you need to keep track of more concurrency structures

1. Planning (the most important part):
   1a. Identify units of concurrency. Write down the threads and the actions they'd take at a high level.
   1b. Identify shared chunks of state. Plan to have these objects be monitors.
   1c. Write down the high-level main loop of each thread. Advice: stay high level here. Don't worry about synchronization yet. Let the objects do the work for you.
2. Write down the synchronization constraints on the solution. Identify the type of each constraint: mutual exclusion or scheduling. For scheduling constraints, ask, "when does a thread wait?" and "what does it wait for?"
3. Create a lock or condition variable corresponding to each constraint.
4. Write the methods, using locks and condition variables for coordination. Separate threads from objects. Shared state and synchronization should be encapsulated in shared objects.

4. C++ primer

- Why C++: It allows us to create and use classes.
- Dynamic Allocation:
  - Rule of thumb: new = heap allocate, delete = heap free
  - Heap-allocate an int:

        int* p = NULL;
        p = new int;
        *p = 123;
        // OR
        int* p = new int;
        *p = 123;
        // OR
        int* p = new int(123);

  - Free the int:

        delete p;

  - Heap-allocate an int array:

        int* arr = new int[10];
        arr[0] = 123;
        arr[1] = 456;
        arr[2] = 789;
        ...

  - Free the int array:

        delete[] arr;

- Constructor & Destructor:
  - Destructor syntax: classname::~classname(), e.g. TaskQueue::~TaskQueue()
  - The opposite of a constructor
  - Used to clean up memory and destroy mutexes and condition variables
- Printing:
  - printf(...) or std::cout << "..."
  - If you want to use C's printf(...), just #include <cstdio> and use the C strings you are used to
  - [cppreference](cppreference.com) is a great resource for both the C and C++ libraries that you use

5. Lab 3

5.1 Overview

- Goal: Create a store where suppliers can add items or change their attributes, while customers buy items if they have the budget
- Step 1:
  - What could the shared state be?
  - What could a coarse-grained approach look like? -> a single lock and CV for the whole store
  - What could a fine-grained approach look like? -> each product gets its own lock
  - What types would users fall into, and how would they interact with the state?

5.2 Lab 3 architecture

- Tasks (that describe some sort of production or purchase transaction) are generated by the RequestGenerators and fed into TaskQueues
- Customer and supplier threads (created using their own generators) receive tasks from their respective TaskQueues
- Don't let the shared use of the word "generators" confuse you: the RequestGenerators generate objects (which should not be an unfamiliar procedure to any of you), while the customer and supplier generators generate threads <- this may be newer for some of you!
- Supplier and customer threads attempt to make requests, while the RequestHandlers implement the relevant handling of each type of request within the EStore

5.3 Notable files

- sthread.cpp: Our thin wrapper around pthread; use this as your threading library
- TaskQueue.cpp: The task queue that suppliers and customers dequeue from to operate on
- estoresim.cpp: The main entry point for the code, which creates the threads
- EStore.cpp: The shared object that suppliers and customers operate on
- RequestHandlers.cpp: Handlers so that a worker thread knows what to do

5.4 Common pitfalls

- Forgetting to release the mutex
  - One of the reasons this lab starts by teaching grep -> you can use it to quickly locate every call to smutex_lock in your code; ensure each such call has a corresponding unlock
  - As an extension, make sure that no matter how a function ends, any locked mutex is unlocked
- Forgetting to free/destroy resources in the destructor
  - Again, grep is useful for finding initializations of objects, so you can ensure each one has a corresponding destruction
- Improper condition checking -> ensure you don't hard-code any values that may change (e.g. price)
  - Code of the form:

        int condition = some_condition;
        while (condition) {
            ....
        }

    is wrong, because condition might rely on shared variables and so needs to be recomputed inside the loop
  - Example: you want to buy an item but do not have the budget. If you cache the price once and then loop on while (budget < price), the cached price is never recomputed even though another thread may change it; recheck the shared state on each iteration instead
- Deadlock -> not following a defined order for acquiring locks will lead to deadlock
- For reference, the canonical monitor pattern:

        acquire(m);
        while (!p) {
            wait(m, cv);
        }
        signal(cv2);
        release(m);

6. Sequential Consistency

- "Informally, sequential consistency implies that operations appear to take place in some total order, and that that order is consistent with the order of operations on each individual process" (https://jepsen.io/consistency/models/sequential)

Mental Model:

    T1: ---------A1------------A2------
    T2: ------B1-------------B2-------

In the sequential consistency model, we can think of this as a game where you have to provide an ordering of all the events across all the threads. The catch is that you can't re-order events within a single thread; however, events across different threads can be re-ordered.

For instance, here are some valid orderings:
- B1-A1-A2-B2
- B1-A1-B2-A2
- A1-B1-A2-B2

And here are some invalid orderings:
- B1-A2-A1-B2 // A2 happens before A1, but both are in T1
- B2-B1-A1-A2 // B2 happens before B1, but both are in T2

7. Q&A + Conclusion

8. Resources

- Jepsen: https://jepsen.io/consistency/models/sequential
- cppreference: https://en.cppreference.com/w/