CS439 Spring 2013 Lab 2: Multi-threaded programming

Handed out Monday, January 28, 2013
Part A due Monday, February 4, 2013, 9:00 PM
Part B due Friday, February 8, 2013, 9:00 PM

Introduction

Administrative Note: You are doing this lab in pairs. See this page for details of what we expect.

In this lab, you will implement a model of an online store. Your model will support safe parallel updates to a central inventory database, which maintains stock of all items available for sale from the online store. Updates to this inventory database will come from both suppliers (who add goods to the inventory and enable or disable deals on items, such as discounts and free shipping) and customers (who, by purchasing items from the online store, remove them from the inventory). In the model, customers and suppliers will be represented by different threads, all executing in parallel.

Though your store is only a model, the concepts you will learn and use throughout this lab could be applied similarly in a real-world setting. Suppliers and customers would be on different computers, talking to a central inventory database over the Internet. Requests sent over the Internet from suppliers and customers would be handled by different threads on the central inventory computer, and any operations on the database would have to be done with proper synchronization, just as in your model.

Getting Started

Use Git to commit your Lab sh source, fetch the latest version of the course repository, and then create a local branch called lab2 based on our lab2 branch, origin/lab2:

tig% cd ~/cs439/labs
tig% git commit -am 'my solution to labsh'
Created commit 254dac5: my solution to labsh
 3 files changed, 31 insertions(+), 6 deletions(-)
tig% git pull
Already up-to-date.
tig% git checkout -b lab2 origin/lab2
Branch lab2 set up to track remote branch refs/remotes/origin/lab2.
Switched to a new branch "lab2"
tig% make tidy
Removing ...
tig% 

Lab Requirements

This lab is divided into two parts, A and B. Run make turnin-partA before the Part A deadline, at which point your code must implement all of the Part A exercises. By the Part B deadline, your code must implement all of the Part B exercises (use make turnin-partB).

Hand-In Procedure

When you are ready to hand in your lab code and write-up, run make turnin-parti in the lab directory, where i is the part you want to turn in (A or B). This will first do a make clean to clean out any .o files and executables, and then create a tar file called lab2i-handin.tar.gz with the entire contents of your lab directory and submit it via the CS turnin utility. If you submit multiple times, we will take the latest submission and count lateness accordingly.

We will be grading your code largely by inspection, as it is impossible to test the correctness of multi-threaded code programmatically. Thus, you will be required to follow a set of coding conventions (which we provide later in this lab) as you complete the exercises for this project. If you do not follow these conventions, you will get a poor grade. Though we are not providing a grading script, you should still write some test cases for your code where appropriate to ensure that your code is not obviously wrong (this may not catch subtle bugs, but it should catch glaring errors).

A Quick Introduction to C++

The right way to think of shared state is as shared objects. We're therefore going to use C++, C's object-oriented descendant, for this project.

Don't worry. For the subset of C++ relevant to this project, the learning curve will be short (especially assuming you already know Java.) Furthermore, we will provide template code to which you will add the details; this should largely insulate you from having to learn much C++ syntax.

Read A Quick Introduction to C++, and you should be good to go. Note that this document was written over a decade ago, so a few of the comments on the state of standards and tools are a bit out of date (for example, the document warns against using C++ templates because debuggers didn't understand them well back then; this warning is much less applicable today.) Nonetheless, it provides a good, quick overview of the key ideas to use (and some issues/pitfalls to avoid.)

Working with Threads

Before you begin the assignment, read Coding Standards for Programming with Threads, by Mike Dahlin. You are required to follow these standards for this project. Because it is impossible to determine the correctness of a multithreaded program via testing, grading on this project will primarily be based on reading your code, not on running tests. Your code must be clear and concise. If your code is not easy to understand, then your grade will be poor, even if the program seems to work. In the real world, unclear multi-threaded code is extremely dangerous -- even if it "works" when you write it, how will the programmer who comes after you debug it, maintain it, or add new features? Feel free to sit down with the TA or instructor during office hours for code inspections before you turn in your project (note: to take advantage of this, start on time, which is probably earlier than you think).

To simplify your task, we supply a simple thread package (written by Mike Dahlin) on top of the standard POSIX thread library (which is known as pthreads). The idea is to shield you from irrelevant detail. This way, you use the standard package but you also focus on the project at hand. However, you are not required to use the wrapper; you may instead use pthreads if you so choose. The code for the simple thread package (which we will refer to hereafter as sthreads) we provide is in sthread.cpp and sthread.h.

The package provides threads (sthread_ts), mutex locks (smutex_ts), and condition variables (scond_ts) as well as some other utility functions that you may need. We suggest that you read over these functions and know how to use them. It may be helpful to write a couple of example programs using sthreads before starting this project. For more information, see the man pages for the pthreads library functions used in the sthread.cpp code.

You should keep the following in mind as you code these labs:

Part A: Task Queue and Coarse-Grained Store Synchronization

For our model, we will simulate a fixed number of customers and suppliers. For each customer and each supplier, there will be a unique thread representing that customer or supplier. We will often refer to these threads throughout this document as worker threads. These worker threads will get jobs to work on from a task queue, which you will implement. There will be one queue for all customer threads and one queue for all supplier threads. This task queue must allow for multiple worker threads to simultaneously attempt to add or remove jobs while still maintaining the integrity of the queue's internal data structures by using locks (i.e., your task queue must be thread-safe).

It is a common pattern in multi-threaded programming to have a single (or small number of) task queue(s) for a large number of threads. A thread-safe task queue makes the job of allocating work to threads (and having threads allocate work to other threads, should the case arise) easy to do and (relatively) easy to reason about.

Remember, you must follow the restrictive coding standards specified. There are specific reasons for these rules: we believe that if you follow these rules, you are very likely to learn to write good multi-threaded code. Conversely, if you violate these rules, we fear that you may not learn this material as well. Code that fails to conform to these rules is incorrect and will receive little credit when this lab is graded.

Exercise 1. Look through the queue interface in TaskQueue.h and the documentation for its methods in TaskQueue.cpp. Fill out the rest of the members in class TaskQueue in TaskQueue.h to finish the definition of your task queue class, then implement the rest of the queue methods in TaskQueue.cpp. You are welcome to use any standard C++ container to help build the functionality of your queue (such as a std::queue or std::deque), or you can create your own data structures (like a linked list) and add any needed helper structures to do the same.

A task in the task queue is represented by a struct Task (also found in TaskQueue.h), which consists of a pointer to a function and an argument to be passed to that function. When a worker thread removes a task from the task queue, it should call the function given in the struct Task with the argument given in the Task.
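
To make the idea concrete, here is a minimal sketch of a thread-safe queue written as a monitor. It is not the interface in TaskQueue.h: the names SketchTask, SketchQueue, addOne, and runOneTask are hypothetical, and it uses the standard C++ std::mutex and std::condition_variable as stand-ins for smutex_t and scond_t. Treat it as an illustration of the pattern, not as the expected solution.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical stand-in for the lab's struct Task: a function plus its argument.
struct SketchTask {
    void (*handler)(void*);
    void* arg;
};

// Minimal thread-safe queue: one lock protects all state, and one
// condition variable lets consumers wait until the queue is non-empty.
class SketchQueue {
public:
    void enqueue(SketchTask t) {
        std::unique_lock<std::mutex> guard(lock_);
        tasks_.push(t);
        nonEmpty_.notify_one();  // signal while still holding the lock
    }

    // Blocks until a task is available, then removes and returns it.
    SketchTask dequeue() {
        std::unique_lock<std::mutex> guard(lock_);
        while (tasks_.empty()) {  // always re-check the condition in a loop
            nonEmpty_.wait(guard);
        }
        SketchTask t = tasks_.front();
        tasks_.pop();
        return t;
    }

    int size() {
        std::unique_lock<std::mutex> guard(lock_);
        return static_cast<int>(tasks_.size());
    }

private:
    std::mutex lock_;
    std::condition_variable nonEmpty_;
    std::queue<SketchTask> tasks_;
};

// Example handler: adds one to the int that the argument points to.
inline void addOne(void* arg) { *static_cast<int*>(arg) += 1; }

// What a worker does with a task: remove it, then call its function
// with its argument.
inline int runOneTask() {
    int counter = 41;
    SketchQueue q;
    q.enqueue(SketchTask{addOne, &counter});
    SketchTask t = q.dequeue();
    assert(q.size() == 0);
    t.handler(t.arg);
    return counter;
}
```

Note that every method acquires the lock on entry and that dequeue waits inside a while loop, never an if; both habits are the kind of thing the coding standards ask for.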

Run make to make sure your queue code compiles. For now, there's no executable to run.

A task queue is little good to worker threads if there is no work available to be put on it. We have provided some code to generate work requests to be put on your task queues in RequestGenerator.h and RequestGenerator.cpp. Become familiar with this code. Specifically, there are two subclasses of the main class RequestGenerator whose purpose you should know: class CustomerRequestGenerator and class SupplierRequestGenerator. The former, CustomerRequestGenerator, generates requests for customer threads to perform (such as buying items). The latter, SupplierRequestGenerator, generates requests for supplier threads to perform (such as adding or removing items from inventory, or putting items on sale). You will use these generator classes to implement task generator threads, which will produce work for customer and supplier threads.

Before you start implementing any thread functions, however, you should actually have some threads running. In the main source file for our simulator, estoresim.cpp, is a function called startSimulation. This function kicks off all threads in the simulator (generators, customers, and suppliers) and then, after starting all threads, waits for them to finish. When all threads have finished working, the simulation is complete.
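
The start-then-wait shape described above can be sketched as follows. This is only an illustration: it uses std::thread in place of sthread_create, and FinishedCount, sketchWorker, and startAndJoin are hypothetical names, not part of the provided code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared counter, protected by its own lock.
struct FinishedCount {
    std::mutex lock;
    int count = 0;
};

// Hypothetical worker body; the lab's real workers live in estoresim.cpp.
inline void sketchWorker(FinishedCount* finished) {
    std::unique_lock<std::mutex> guard(finished->lock);
    ++finished->count;
}

// Start numWorkers threads, then wait for every one of them to finish.
inline int startAndJoin(int numWorkers) {
    assert(numWorkers >= 0);
    FinishedCount finished;
    std::vector<std::thread> threads;
    threads.reserve(numWorkers);
    for (int i = 0; i < numWorkers; ++i) {
        threads.emplace_back(sketchWorker, &finished);
    }
    for (std::thread& t : threads) {
        t.join();  // returns only after that thread has finished
    }
    return finished.count;  // safe to read: all workers have been joined
}
```

For example, startAndJoin(8) starts eight workers and returns 8 once all of them have run.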

Exercise 2. Read the documentation for, and then implement, the startSimulation function in estoresim.cpp. Use sthread_create to create threads. The worker thread functions supplierGenerator, supplier, customerGenerator and customer all reside in estoresim.cpp. Use the provided class Simulation to keep track of task queues and the number of customers and suppliers. When you have finished writing your code, run make run-sim to run the simulator. You may want to have the threads produce some output to make sure that you are starting them correctly.

Don't worry about the meaning of the fineMode variable in Simulation for now; just make sure it is set to the value passed in the parameter useFineMode.

You may also find it useful in this exercise, and in others throughout this lab, to call printf in some choice places (where worker threads start, for instance). The code we provide you doesn't print anything to the terminal, and so when you run the simulator with make run-sim, you won't see any output unless you put the printfs in yourself.

At the moment, all of the threads you have created start and then immediately stop, as none of their associated functions are implemented (they just return when called). However, since you have actual threads running, you can proceed to implement the generator, customer, and supplier thread functions. If you put printfs in these functions as suggested, you should see all the threads print to the terminal.

Exercise 3. Implement the customer, supplier, supplierGenerator, and customerGenerator functions in estoresim.cpp. Read the code referred to above (e.g. RequestGenerator.cpp) and the comments for these functions to get an idea of what they should be doing. The generator functions should look similar to each other, as should the supplier and customer functions to each other. Produce some output from these threads to make sure they're running and that jobs are being pushed between them properly.

For the supplierGenerator and customerGenerator functions, you will also have to implement the enqueueStops method in RequestGenerator.cpp.

Run make run-sim to run the simulator. You should see the output of your worker threads. You can stop the simulator by pressing Control-C.

Now you have many customers and suppliers working on jobs generated by the generator threads. But there is no inventory for items in the store, nor are any of the work functions produced by the request generators implemented. So, for the moment, our simulator spawns many workers in parallel to do very little. For the workers to have anything to work on, there should be an inventory of items to purchase from and add to. To this end, we have provided the skeleton of this inventory for you as class EStore, in the files EStore.h and EStore.cpp.

Exercise 4. Design and implement the EStore class, filling in all the method skeletons we provide and adding any new methods you may find necessary. Read through all the provided comments in the files related to the EStore class (but don't worry about the buyManyItems method for now). The EStore will have to keep track of many Items and ensure that all modifications to Items in the EStore are synchronized.

You should implement the EStore as a monitor; that is, there should be a single lock on the entire store, which is acquired upon entering any of the store's methods and released upon exit.

As a reminder, the buyItem method in EStore should wait for an item to become available or on discount if necessary. So, if a customer comes in looking to buy an item outside of the given budget, the customer should wait until the item is being sold at a sufficiently high discount and then buy, instead of just immediately returning.
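
The waiting behavior can be sketched with a one-item store written as a monitor. Everything here is an assumption made for illustration: SketchStore and buyAfterDiscount are hypothetical, the discount is assumed to be a fraction between 0 and 1, and std::mutex and std::condition_variable stand in for the sthreads types. The real EStore tracks many items and has a different interface.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical single-item store written as a monitor.
class SketchStore {
public:
    SketchStore(double price, int quantity)
        : price_(price), discount_(0.0), quantity_(quantity) {}

    // Supplier side: change the discount, then wake every waiting customer
    // so each one can re-check its own budget.
    void setDiscount(double discount) {
        std::unique_lock<std::mutex> guard(lock_);
        discount_ = discount;
        changed_.notify_all();
    }

    // Customer side: wait until the item is in stock and affordable,
    // then buy one unit and return the price paid.
    double buyItem(double budget) {
        std::unique_lock<std::mutex> guard(lock_);
        while (quantity_ == 0 || price_ * (1.0 - discount_) > budget) {
            changed_.wait(guard);  // releases the lock while waiting
        }
        assert(quantity_ > 0);
        --quantity_;
        return price_ * (1.0 - discount_);
    }

private:
    std::mutex lock_;
    std::condition_variable changed_;
    double price_;
    double discount_;  // assumed to be a fraction in [0, 1]
    int quantity_;
};

// A customer with a budget of 60 cannot afford the item at 100, so it
// waits; once the supplier sets a 50% discount, the purchase goes through.
inline double buyAfterDiscount() {
    SketchStore store(100.0, 1);
    double paid = 0.0;
    std::thread customer([&] { paid = store.buyItem(60.0); });
    store.setDiscount(0.5);
    customer.join();
    return paid;
}
```

The sketch works whichever thread runs first: if the discount is set before the customer arrives, the while condition is already false and the customer buys immediately.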

Run make run-sim to make sure your code compiles and doesn't have any segfault-inducing bugs in the constructor for class EStore. For the moment, no threads are actually interacting with the EStore, so if you put any printfs in its methods, you won't see them output anything just yet.

There is only one piece remaining to make the simulator work now: the handlers for the jobs created by the request generators and pushed to the worker threads. These jobs consist of adding and removing items from the EStore inventory or setting discounts on items (for suppliers) and of purchasing items (for customers). We've provided skeletons for these handlers in RequestHandlers.cpp. Before you implement these handlers, you may find it helpful to read through all the different kinds of requests that exist in Request.h.

Exercise 5. Go through and implement each handler in RequestHandlers.cpp one-by-one. Print a message at the beginning of each handler which contains the name of the handler and the fields of the request struct passed to it (you don't need to print the value of the store pointer). This will produce a trace of the work done by your threads as they process jobs.

Your handler functions should largely make use of the methods you implemented previously in EStore.

After implementing all the handlers (including stop_handler), your simulator may not terminate on its own. This will happen in the case where a customer is blocked on buying an item, but the item never gets discounted enough for the customer to buy, which will happen sometimes and should be expected. In this case, you will still need to kill the simulator with Control-C.

Run make run-sim after you implement each handler and make sure that requests are being dispatched to your new handlers.

Here's some example output from the staff solution to part A of the lab to give you an idea of what running your simulator should look like. You don't have to worry about matching our output format exactly.

Handling AddItemReq: item_id - 77, quantity - 94, price - $6370.15, discount - 0.00
Handling BuyItemReq: item_id - 83, budget - $14308.86
Handling BuyItemReq: item_id - 35, budget - $33853.86
Handling AddItemReq: item_id - 92, quantity - 22, price - $5167.49, discount - 0.00
Handling BuyItemReq: item_id - 62, budget - $9900.27
Handling AddItemReq: item_id - 90, quantity - 64, price - $5201.59, discount - 0.00
Handling BuyItemReq: item_id - 26, budget - $6805.40
Handling AddItemReq: item_id - 26, quantity - 37, price - $892.72, discount - 0.00
...
Handling ChangeItemPriceReq: item_id - 96, new_price - $7007.23
Handling BuyItemReq: item_id - 46, budget - $19934.51
Handling ShippingCostReq: new shipping cost - $29.21
Handling BuyItemReq: item_id - 79, budget - $30974.88
Handling AddItemReq: item_id - 64, quantity - 42, price - $8883.28, discount - 0.00
...
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.

You should now have a working store simulator. Run make run-sim a few times and watch the output of your workers as they process jobs. If you notice anything suspicious in the output, go back and check to make sure you did your synchronization correctly and that you are following the multi-threaded coding guidelines.

This ends part A of the lab. Make sure you have committed all your changes, then run make turnin-partA.

Part B: Fine-Grained Store Synchronization

In part A, you implemented a store simulator using coarse-grained synchronization: there was one lock on the entire EStore that kept concurrent accesses from stepping on each other, which effectively serialized execution of the program (as no two threads could modify the inventory in EStore at the same time). However, you might have noticed that you don't necessarily need to lock the entire inventory if two threads are trying to modify two different items in the inventory. Thus, in part B, we will explore an approach that exposes more parallelism, by using fine-grained locking. Instead of using one large lock on the entire EStore, you will use locks over smaller units of code. In this way, multiple threads that aren't touching the same data can operate on the inventory concurrently.

Keep in mind throughout the next exercises that you should be able to run the simulator using your old coarse-grained locking approach by running make run-sim and using your yet-to-be-implemented fine-grained locking approach by running make run-sim-fine. The latter command, as might be expected, runs the simulator in "fine-grained mode" (this is related to the mysterious fineMode variable you saw before, which is set to false with make run-sim, but set to true with make run-sim-fine). You may want to abstract some of your locking with methods whose function varies depending on the value of fineMode. To reiterate, your simulator must use your coarse-grained locking approach from part A when we run make run-sim and use your new fine-grained locking approach to be done in part B when we run make run-sim-fine!
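
One possible way to abstract the locking is a small helper that picks the right mutex for an item depending on the mode. This is a sketch under assumptions: ModeAwareLocks, lockFor, unlockFor, sameLock, and modeDemo are hypothetical names, and std::mutex stands in for smutex_t. How you structure this in your own EStore is up to you.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical helper: in coarse mode every item maps to the single
// store-wide lock; in fine mode every item has a lock of its own.
class ModeAwareLocks {
public:
    ModeAwareLocks(bool fineMode, int numItems)
        : fineMode_(fineMode), itemLocks_(numItems) {}

    void lockFor(int itemId) { pick(itemId).lock(); }
    void unlockFor(int itemId) { pick(itemId).unlock(); }

    // True when both items are protected by the same mutex.
    bool sameLock(int a, int b) { return &pick(a) == &pick(b); }

private:
    std::mutex& pick(int itemId) {
        assert(itemId >= 0 && itemId < static_cast<int>(itemLocks_.size()));
        return fineMode_ ? itemLocks_[itemId] : storeLock_;
    }

    bool fineMode_;
    std::mutex storeLock_;
    std::vector<std::mutex> itemLocks_;
};

inline bool modeDemo() {
    ModeAwareLocks coarse(false, 4);
    ModeAwareLocks fine(true, 4);

    coarse.lockFor(1);
    coarse.unlockFor(1);

    // In fine mode, two different items can be held at the same time.
    fine.lockFor(1);
    fine.lockFor(2);
    fine.unlockFor(2);
    fine.unlockFor(1);

    return coarse.sameLock(1, 2) && !fine.sameLock(1, 2);
}
```

The callers stay identical in both modes; only the mapping from item to lock changes.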

Exercise 6. Change your EStore implementation to allow multiple threads to access different items at the same time. If two threads try to access the same item at the same time, then only one should be allowed to modify the item at a time.
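
As an illustration of per-item locking, the sketch below gives each item its own lock, so threads touching different items never contend while threads touching the same item take turns. SketchItem, SketchInventory, and fineGrainedDemo are hypothetical names, and std::mutex stands in for smutex_t.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical item that carries its own lock.
struct SketchItem {
    std::mutex lock;
    int quantity = 0;
};

class SketchInventory {
public:
    explicit SketchInventory(int numItems) : items_(numItems) {}

    // Only the lock of the touched item is held.
    void addStock(int itemId, int count) {
        assert(itemId >= 0 && itemId < static_cast<int>(items_.size()));
        std::unique_lock<std::mutex> guard(items_[itemId].lock);
        items_[itemId].quantity += count;
    }

    int quantity(int itemId) {
        assert(itemId >= 0 && itemId < static_cast<int>(items_.size()));
        std::unique_lock<std::mutex> guard(items_[itemId].lock);
        return items_[itemId].quantity;
    }

private:
    std::vector<SketchItem> items_;
};

// Thread a works on item 0 while b and c share item 1. The first pair
// never blocks each other; b and c serialize on item 1's lock.
inline int fineGrainedDemo() {
    SketchInventory inv(2);
    auto addMany = [&inv](int itemId) {
        for (int i = 0; i < 1000; ++i) {
            inv.addStock(itemId, 1);
        }
    };
    std::thread a(addMany, 0);
    std::thread b(addMany, 1);
    std::thread c(addMany, 1);
    a.join();
    b.join();
    c.join();
    return inv.quantity(0) + inv.quantity(1);
}
```

No update is lost, so the demo ends with 1000 units of item 0 and 2000 units of item 1.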

Don't worry about modifying the implementation of buyItem. In fine-grained mode, buyItem is not called. You will implement the fine-grained mode version of buyItem, called buyManyItems, in the next exercise.

Run make run-sim-fine to make sure your code still runs. Since you are now running in fine-grained mode, BuyItem requests are not being generated; instead, BuyManyItems requests are generated. This function is still unimplemented, so only suppliers will be making any modifications to the state of the EStore at the moment.

To flex the new fine-grained locking features of the simulator, you must implement the handler and EStore functionality for the BuyManyItems request. This request requires that many items be looked at (with synchronization!) and, if the right conditions hold, modified (i.e. purchased).
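
When one operation needs several item locks at once, a common way to avoid deadlock is to acquire them in a fixed global order, for example by ascending item id. The sketch below illustrates that technique only. It is not the buyManyItems specification: SketchPricedItem, buyAllOrNothing, and orderedLockDemo are hypothetical, and shipping cost and discounts are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical item with its own lock.
struct SketchPricedItem {
    std::mutex lock;
    int quantity = 0;
    double price = 0.0;
};

// Buys one unit of every listed item if all are in stock and the total
// fits the budget; otherwise changes nothing. Returns true on success.
inline bool buyAllOrNothing(std::vector<SketchPricedItem>& items,
                            std::vector<int> ids, double budget) {
    // Sort and de-duplicate so that every thread locks in ascending id
    // order and no thread tries to lock the same item twice.
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());

    for (int id : ids) {
        assert(id >= 0 && id < static_cast<int>(items.size()));
        items[id].lock.lock();
    }

    // All locks are held, so the check and the update are atomic together.
    bool ok = true;
    double total = 0.0;
    for (int id : ids) {
        if (items[id].quantity == 0) ok = false;
        total += items[id].price;
    }
    if (total > budget) ok = false;
    if (ok) {
        for (int id : ids) --items[id].quantity;
    }

    // Release in reverse order of acquisition.
    for (auto it = ids.rbegin(); it != ids.rend(); ++it) {
        items[*it].lock.unlock();
    }
    return ok;
}

// Single-threaded demo, so reading the fields without locks is safe here.
inline bool orderedLockDemo() {
    std::vector<SketchPricedItem> items(3);
    for (auto& item : items) {
        item.quantity = 1;
        item.price = 10.0;
    }
    bool first = buyAllOrNothing(items, {2, 0}, 25.0);   // total 20: succeeds
    bool second = buyAllOrNothing(items, {0, 1}, 25.0);  // item 0 sold out: fails
    return first && !second && items[0].quantity == 0 &&
           items[1].quantity == 1 && items[2].quantity == 0;
}
```

Because every thread acquires locks in the same order, no cycle of threads waiting on each other can form.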

Exercise 7. Implement the EStore method buyManyItems and its corresponding handler in RequestHandlers.cpp. Make sure that, as with previous request handlers, you print out the name of the request and the fields of the request struct passed to the handler at the beginning of the buyManyItems handler.

Run make run-sim-fine. You should now see buyManyItems requests being handled by threads. You should also make sure that your old code still functions correctly with make run-sim.

Here's some sample output from the staff solution to part B. As before, your output doesn't have to look exactly like ours, but this should give you an idea of what we'd like to see when we run your simulator.

Handling AddItemReq: item_id - 62, quantity - 91, price - $4901.27, discount - 0.00
Handling BuyManyItemsReq: item_ids - 15 35 49 77 86 92 93 
Handling BuyManyItemsReq: item_ids - 26 40 63 
Handling AddItemReq: item_id - 36, quantity - 69, price - $53.11, discount - 0.00
...
Handling BuyManyItemsReq: item_ids - 21 29 35 37 58 74 93 95 
Handling AddItemReq: item_id - 43, quantity - 30, price - $3360.28, discount - 0.00
Handling BuyManyItemsReq: item_ids - 4 11 43 63 76 
Handling BuyManyItemsReq: item_ids - 18 
...
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.
Handling StopHandlerReq: Quitting.

This completes the required set of exercises for part B. However, if you feel up to the task and think you might be able to use a few points of extra credit, we've provided some challenge problems for you to try. If you implement challenge problems, write up which challenge problems you did and give an overview of how you solved them in the file answers.txt in your lab directory. Make sure to git add answers.txt so that your write-up is included in your turn-in. These challenges are listed in the documentation for buyManyItems in EStore.cpp and are stated below. [UPDATE] If you attempt these challenge problems, you must create a new branch called lab2-challenge and commit your challenge code there. The lab2 branch should only include the code for the mandatory parts of this lab; we will grade under this assumption.

Challenge! Implement a version of the EStore::buyManyItems method that will wait until the order can be fulfilled instead of giving up. The implementation should not wake up threads unnecessarily. For instance, if an item decreases in price, only threads that are waiting to buy an order that includes that item should be signaled (though all such threads should be signaled).

Challenge! Ensure that the shipping cost and store discount do not change while processing an order in EStore::buyManyItems.

This ends part B of the lab. Make sure you have committed all your changes, then run make turnin-partB.

Acknowledgements

Parts of the documentation for this lab, as well as the sthreads library, were taken from Mike Dahlin's Network Scheduler lab.


Last updated: Thu Feb 07 06:37:53 -0600 2013