CS202: HW 5

These problems should be done on your own. They are not to be turned in. Getting help from AI (besides being ruled out by the course policies) will actually hurt you, since the point of these questions isn’t so you can deliver an answer somewhere and check a box; it’s for you to gain the practice and experience of working through the problems. You will need that general skill (of absorbing something by practicing it) in life, and in this semester you will need the specific problem-solving skills that are emphasized in these homeworks.

Time-of-check-to-time-of-use (TOCTTOU) bugs

Alice and Bob each have an account in a bank. Bob wants to transfer money to Alice. (We write the code below in terms of the synchronization primitives that you will see in Lab 3.)

// assume all the variables are initialized correctly
double alice_balance, bob_balance;
smutex_t mtx;

bool
transferBob2Alice(double trans) {
  if (bob_balance > trans) {
    smutex_lock(&mtx);
    bob_balance = bob_balance - trans;
    alice_balance = alice_balance + trans;
    smutex_unlock(&mtx);
    return true;
  }
  return false;
}

The implementation of function transferBob2Alice is not correct.

What's wrong? (Give a problematic interleaving.)
State the fix in one sentence.

Deadlock

The bank decides to use fine-grained locking. Here is its implementation:

// assume all the variables are initialized correctly
double balance[2]; // 0 for alice, 1 for bob
smutex_t mtx[2];    // 0 for alice, 1 for bob

bool transfer(int from, int to, double trans) {
  smutex_lock(&mtx[from]);
  smutex_lock(&mtx[to]);

  bool result = false;
  if (balance[from] > trans) {
    balance[from] = balance[from] - trans;
    balance[to] = balance[to] + trans;
    result = true;
  }

  smutex_unlock(&mtx[to]);
  smutex_unlock(&mtx[from]);
  return result;
}

Write down an interleaving that results in deadlock.
Keeping the same data structures, rewrite transfer() to eliminate the possibility of deadlock.

More Deadlock

You are programming an application that has a very large array as a data structure, and multiple threads of control concurrently operate on the data structure. Performing a given operation on the array requires a thread to modify several array items at once, but the application logic requires all of the modifications associated with a given operation to happen atomically. Also, you cannot predict at compile time which items your code will be accessing together.

You approach the problem as follows. (1) For maximum concurrency, you create a lock for each item in the array. (2) When a thread performs an operation, it: (a) acquires all of the locks for the items that the operation needs to modify, then (b) modifies the array items, and then (c) releases all of the locks. This approach provides the required atomicity, but it creates the risk of deadlock.
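
The approach described above can be sketched as follows. This is only an illustration under stated assumptions: the names `item_lock`, `item_acquire`, `item_release`, `add_to_items`, and `get_item` are hypothetical, the array is shrunk to four items, and a C11 `atomic_flag` spinlock stands in for whatever lock type the application would really use. The sketch takes the locks in whatever order the operation lists its items and does nothing about the deadlock risk noted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NITEMS 4

/* Hypothetical stand-in for a per-item lock: a C11 atomic_flag used as a
 * spinlock. Any mutex type would serve the same purpose. */
static atomic_flag item_lock[NITEMS] = {
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};
static int items[NITEMS];

static void item_acquire(size_t i) {
    while (atomic_flag_test_and_set(&item_lock[i])) {
        /* spin until the current holder clears the flag */
    }
}

static void item_release(size_t i) {
    atomic_flag_clear(&item_lock[i]);
}

/* Adds delta to every item named in idx[0..n-1] as a single operation.
 * Assumes idx contains no duplicates (a duplicate would spin forever). */
void add_to_items(const size_t *idx, size_t n, int delta) {
    for (size_t k = 0; k < n; k++) {
        assert(idx[k] < NITEMS);
        item_acquire(idx[k]);       /* (a) acquire every needed lock */
    }
    for (size_t k = 0; k < n; k++)
        items[idx[k]] += delta;     /* (b) modify the items */
    for (size_t k = 0; k < n; k++)
        item_release(idx[k]);       /* (c) release every lock */
}

/* Reads one item under its lock. */
int get_item(size_t i) {
    assert(i < NITEMS);
    item_acquire(i);
    int v = items[i];
    item_release(i);
    return v;
}
```

No other thread can observe a half-finished update to the listed items, because every lock is held for the whole of step (b).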

In this scenario, what do you do to avoid deadlock? Answer briefly (no more than two sentences).

Priority Inversion

In this problem, the system has three tasks: one at high priority, one at medium priority, and one at low priority. Assume that the intent is to schedule according to strict priority (although we will see that this intent will be thwarted). Some assumptions:

The system runs one task at a time (so assume a single CPU).
All three tasks are begun before the first task ends.
If a task with higher priority is ready to run, it will preempt the running task (note that if a thread is waiting on a mutex that is owned by another thread, then the waiting thread is NOT ready to run!).
Preemption can happen inside the critical section (just as when you code using mutexes in application space).
If a thread cannot continue (for example because it is waiting for a mutex), it yields.

Here are the three tasks:

smutex_t res;

void highPriority() {
  ... // do something
  smutex_lock(&res);
  ... // handle resource
  smutex_unlock(&res);
  printf("A ");
}

void mediumPriority() {
  ... // do something
  printf("B ");
}

void lowPriority() {
  smutex_lock(&res);
  ... // handle resource
  smutex_unlock(&res);
  ... // do something
  printf("C ");
}

Which of the following outputs are possible?
```
 A B C
 A C B
 B A C
 C B A
```
Explain.

A common bug in C coding

Consider the following code:

#include <stdlib.h>
#include <stdio.h>

// A box. Each box has an ID and a pointer to the box that resides inside
// of it. If the box has nothing inside of it, inner_box should be equal
// to NULL.
struct box {
    int id; 
    struct box *inner_box;
};

// Insert box: places the box "inner" inside of the box "outer".
// Since "outer" is being modified, we pass a pointer to "outer".
// Since "inner" is not being modified, we pass in "inner" directly.
void insert_box(struct box* outer, struct box inner) {
    printf("insert box: placing id %d inside id %d\n", inner.id, outer->id);
    outer->inner_box = &inner;
}

// Print box: prints a box and the box inside of it. This function
// is recursive and will end once a box is empty.
void print_box(struct box* first, int level) {
    int i;
    if (!first) 
        return;

    for (i=0; i < level; ++i) {
        printf("- ");
    }       
    printf("id: %d\n", first->id);
    print_box(first->inner_box, level+1);
}

int main() {
    // Create three boxes.
    struct box box1 = { .id = 37, .inner_box = NULL };
    struct box box2 = { .id = 12, .inner_box = NULL };
    struct box box3 = { .id = 19, .inner_box = NULL };

    // The box ordering should be box1 -> box2 -> box3
    insert_box(&box1, box2);
    insert_box(&box2, box3);
    
    // Print the boxes starting from the outside box.
    print_box(&box1, 0);

    return 0;
}

Using the program and the comments, describe at a high level what this code is intended to do. What output did the programmer expect the program to generate?

What happens when the program is run? (Copy/paste it into a file called box.c, compile it, and run it.)

What is the error in the programmer's reasoning?

Change this program to produce the intended output.

Reader Writer Spinlock (Challenge)

This problem is intended to be somewhat harder than the others. You will implement a multiple-reader, single-writer lock as a spinlock. Here is the description:

  struct sharedlock {
    int value; // when the lock is created, value is initialized to 0
  };

It allows multiple readers OR one single writer, and there are four functions:
```
  reader_acquire(struct sharedlock*)
  reader_release(struct sharedlock*)
  writer_acquire(struct sharedlock*)
  writer_release(struct sharedlock*)
```
We have given you the first of these; your task is to write the other three. Each of those three functions needs only a single line of code.
When the lock is unlocked (no readers or writers holding the lock), its value is 0.
When there are one or more readers holding the lock (that is, multiple threads have completed reader_acquire() but have not called reader_release()), the lock's value equals the number of readers.
When the lock is held by a writer (i.e., a thread has made it past writer_acquire() but has not called writer_release()), its value is -1.
We are unconcerned here with fairness, efficiency, or starvation; just write something that is safe and that eventually allows a waiting thread, reader or writer, to make progress, even though a waiting writer may have to wait until there are no readers.
Assume that the lock is never acquired by an interrupt handler, so you don’t need to worry about enabling and disabling interrupts. You may also assume that the hardware provides sequential consistency.
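
The encoding of the lock's state in a single integer can be restated as a small helper. This is only an illustration of the three cases listed above; the enum and the function `classify` are hypothetical names and are not part of the lock's interface.

```c
#include <assert.h>

/* Hypothetical names, introduced only for this illustration. */
enum lock_state { STATE_UNLOCKED, STATE_READERS, STATE_WRITER };

/* Classifies a snapshot of a sharedlock's value field:
 *    0      -> nobody holds the lock
 *    n > 0  -> n readers hold the lock
 *   -1      -> one writer holds the lock */
enum lock_state classify(int value) {
    assert(value >= -1);    /* no other value is a legal state */
    if (value == -1)
        return STATE_WRITER;
    if (value == 0)
        return STATE_UNLOCKED;
    return STATE_READERS;
}
```

A snapshot can be stale by the time the caller acts on it, which is why the given reader_acquire() rechecks the value with an atomic compare-and-exchange rather than trusting a plain read.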

You will likely need to call two atomic primitives, described below:

int cmpxchg_val(int* addr, int oldval, int newval): This is an atomic operation that compares oldval to *addr, and if the two are equal, it sets *addr = newval. It returns the old contents of *addr.
void atomic_decrement(int* arg): This atomically performs *arg = *arg - 1.

(We also include their pseudocode and inline assembly implementations in an appendix. However, you do not need this appendix material to do the problem.)
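
To experiment with the two primitives outside the course environment, one option is a sketch built on the GCC/Clang `__sync` builtins. That choice is an assumption made here for portability; the course's own versions use inline assembly. The function names and signatures match the descriptions above.

```c
#include <assert.h>

/* Sketch using GCC/Clang __sync builtins (an assumption; not the
 * course-provided implementation). */

/* Atomically: if *addr == oldval, store newval into *addr.
 * Returns the value *addr held before the operation, whether or not
 * the store happened. */
int cmpxchg_val(int *addr, int oldval, int newval) {
    assert(addr != 0);
    return __sync_val_compare_and_swap(addr, oldval, newval);
}

/* Atomically performs *arg = *arg - 1. */
void atomic_decrement(int *arg) {
    assert(arg != 0);
    __sync_fetch_and_sub(arg, 1);
}
```

For example, with `int x = 5`, the call `cmpxchg_val(&x, 5, 9)` returns 5 and leaves x equal to 9. A following `cmpxchg_val(&x, 5, 1)` returns 9 and leaves x unchanged, because the comparison fails. The caller can tell the two cases apart by comparing the return value with the oldval it passed in.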

  // we are giving you the code for the first of the four functions:
  void reader_acquire(struct sharedlock* lock) {
    int curr_val;
    while (1) {

      // spin while a writer owns the lock
      while ((curr_val = lock->value) == -1) {}

      assert(curr_val >= 0);

      // try to atomically increment the count, based on our best
      // guess of how many readers there had been. if we were
      // wrong, keep looping. if we got it right, then we
      // succeeded in incrementing the count atomically, and we
      // can proceed.
      if (cmpxchg_val(&lock->value, curr_val, curr_val + 1) == curr_val)
        break;
    }
    // lock->value now contains curr_val + 1
  }

Write the other three functions! (Again, each needs only a single line of code.)

Appendix: Implementation of atomic primitives

Here are the pseudocode and the x86 inline assembly implementations of the atomic primitives used in the reader-writer spinlock question.

cmpxchg_val()

  /* pseudocode */
  int cmpxchg_val(int* addr, int oldval, int newval) {
    LOCK: // remember, this is pseudocode
    int was = *addr;
    if (*addr == oldval)
      *addr = newval;
    return was;
  }

  /* inline assembly */
  int cmpxchg_val(int* addr, int oldval, int newval) {
      int was;
      asm volatile("lock cmpxchg %3, %0"
        : "+m" (*addr), "=a" (was)
        : "a" (oldval), "r" (newval)
        : "cc");
       return was;
  }

atomic_decrement()

  /* pseudocode */
  void atomic_decrement(int* arg) {
    LOCK: // remember, this is pseudocode
    *arg = *arg - 1;
  }

  /* inline assembly */
  void atomic_decrement(int* arg) {
    // "+m" marks *arg as both read and written; no separate input operand is needed
    asm volatile("lock decl %0" : "+m" (*arg) : : "cc");
  }