Designing multicore-aware software is a vital part of writing modern, performant code. Through the 1990s, processors grew steadily faster, but around the turn of the millennium CPUs began to hit physical limits. To keep delivering more powerful computers, manufacturers began packing multiple cores into each processor, and today even consumer CPUs may have dozens of cores. Most software, however, is conceived by the programmer as a linear progression of instructions. Compilers can find some parallelization opportunities, but not enough to take full advantage of multiple cores. This course covers modern techniques for exploiting multiple cores, working from the very lowest level upwards.

Students are expected to have completed a programming-intensive course. Previous experience with C++ is required, but no experience writing threaded or multithreaded C++ programs is necessary. Unlike the graduate version of this course, prior coursework in Operating Systems (OS) is not required. The course includes lectures, homework, cumulative labs, a final presentation, and midterm and final exams.

Labs & Assignments

A series of labs will challenge students to make the concepts taught in the course concrete. The labs will be implemented in C++ (not C). An introductory lab (Lab 0) ensures that all students are roughly on the same page with the necessary tools (gcc/g++, gdb, and git). The remaining labs build successive pieces of a larger multicore system that uses genetic algorithms to solve problems.

Several homework assignments will also be given to evaluate students' grasp of the theoretical concepts taught in class, in preparation for the final exam.

Academic integrity will be strictly enforced, with zero tolerance for cheating. Please refamiliarize yourself with the department's academic integrity policy. In this class, you may show a colleague how to use a given tool, and you may discuss strategies for solving the problems with a colleague if (and only if) you note it in writing in your assignment hand-in. This kind of collaboration is encouraged. However, each student must type, compile, debug, benchmark, and be able to discuss every aspect of their own code. You are not allowed to look at a colleague's code, and copying code from existing resources (whether online or offline, public or private) will teach you nothing and is subject to severe disciplinary action, from a zero on the assignment up to an F in the class and possible further action. This isn't just boilerplate: infractions are taken seriously, and they waste time and disrespect your classmates and the instructor.

Mailing List

Unlike previous semesters, the NYU Classes mailing list system will be used. Questions about the lecture material and readings should go to the mailing list before contacting the instructor directly. If you don't believe you've been added to the mailing list, please contact the instructor.


The readings column is updated each week. Note that the readings listed are the ones you should or must (as noted) read by that class; they're not the readings assigned on that day.

Date | Lecture Material | Assignments & Reading
Tuesday, May 28, 2019
Lecture 1 (Slides)
Syllabus, course outline
The Advent of the Multicore Processor
Thursday, May 30, 2019
Lecture 2 (Slides)
A C++ Crash-Course
Tuesday, June 4, 2019
Lecture 3 (Slides)
Parallelism, Concurrency, and Performance
Thursday, June 6, 2019
Lecture 4 (Slides)
Programming Models
GDB Review
Tuesday, June 11, 2019
Lecture 5 (Slides)
Know Your Hardware
Thursday, June 13, 2019
Lecture 6 (Slides)
Parallel Programming Primitives
Tuesday, June 18, 2019
Lecture 7 (Slides)
Mutual Exclusion
Lab 0 Review
Homework 0 Review
Thursday, June 20, 2019
Lecture 8 (Slides)
Semaphores and Condition Variables
Tuesday, June 25, 2019
Lecture 9 (Slides)
Reader-Writer Locks
Thursday, June 27, 2019
Lecture 10 (Slides)
Barriers and Thread Pools
Homework 2 Review
  • Lab 1 Due
Tuesday, July 2, 2019
Lecture 11 (Slides)
Synchronized Structures Part 1: Queues
Thursday, July 4, 2019
No Class
Tuesday, July 9, 2019
Midterm Quiz
Thursday, July 11, 2019
Lecture 12 (Slides)
Synchronized Structures Part 2
Tuesday, July 16, 2019
Lecture 13 (Slides)
Multicore Correctness
Thursday, July 18, 2019
Lecture 14 (Slides)
Multicore Performance Evaluation
Tuesday, July 23, 2019
Lecture 15 (Slides)
Heterogeneous Multicore
Thursday, July 25, 2019
Lecture 16 (Slides)
Synchronized Structures Part 3
Tuesday, July 30, 2019
Lecture 17 (Slides)
Transactional Memory
  • Homework 4 Due
  • Lab 3 Due
Thursday, August 1, 2019
Lecture 18 (Slides)
Lab 4 Assigned / Lab Workday (Maze Library)
Tuesday, August 6, 2019
Presentations I
Thursday, August 8, 2019
Presentations II
Tuesday, August 13, 2019
Final Exam
  • Lab 4 Due (LAST ALLOWED DAY: 11:59:59pm, August 14th!)
Thursday, August 15, 2019
Final Exam Review
Lab 4 Review

Supplemental Readings

  1. "Parallel Programming for Multicore and Cluster Systems" (You must be logged into NYU network)
  2. Herb Sutter, The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, 30(3), March 2005.
  3. How to survive the multicore software revolution?
  4. Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency (You must be logged into NYU network)
  5. The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It (You must be logged into NYU network)
  6. A Primer on Memory Consistency and Cache Coherence (You must be logged into NYU network)
  7. The Problem With Threads
  8. A Runtime Implementation of OpenMP Tasks (You must be logged into NYU network)
  9. IPC considered harmful for multiprocessor workloads
  10. Computer Architecture Performance Evaluation Methods (You must be logged into NYU network)
  11. Effective Performance Measurement and Analysis of Multithreaded Applications
  12. Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance
  13. The Impact of Performance Asymmetry in Emerging Multicore Architectures
  14. Transactional Memory (You must be logged into NYU network)
  15. Unlocking Concurrency
  16. Performance-Aware Multicore Programming (You must be logged into NYU network)
  17. The Common Case Transactional Memory Behavior of Multithreaded Programs