Multicore Processors: Architecture & Programming
Prof. Mohamed Zahran   (aka Prof. Z)
mzahran AT cs DOT nyu DOT edu
Wednesdays 5:10-7:00 pm  Location: WWH 1302
Office Hours: Tuesdays 2:00-4:00pm (WWH 320)


Welcome, students, to the Spring 2016 edition of the Multicore Processors: Architecture & Programming course. I will keep updating this page regularly. If you have questions related to the course, feel free to email me at mzahran (at) . Here is some basic information:

Mailing List

Sign up for the Mailman mailing list for the course, if the system hasn't signed you up already.
You can do so by clicking here. Please follow the mailing list etiquette. 
  • Use the Reply command to contribute to the current thread, but NOT to start another topic.
  • If quoting a previous message,  try to trim off irrelevant parts.
  • Use a descriptive Subject: field when starting a new topic.
  • Do not use one message to ask two unrelated questions.

Lectures

    Date Lecture Readings Comments
    1. 1/27 Multicore/Manycore Revolution   
    • Section 1.2 from #1 
    • The whole of #2
    2. 2/3 Parallelism and Concurrency
    • Sections 2.2 and 2.3 in #1
    • Section 2 from #3
    • Section 1.2 from #4
    • hw1 assigned
    3. 2/10 Know Your Hardware
    • Chp 1 from #5
    • Chp 2 from #6
    4. 2/17 Gentle Introduction to Parallel Programming
    • 3.1, 3.2, and 3.3 from #1
    5. 2/24 Threads ... Pthreads (Part I)
    • hw2 assigned
    5. 3/2 Threads ... Pthreads (Part II)
    • #7
    6. 3/9 OpenMP
    • Good tutorial here  
    • lab1 assigned
        3/16 No class: Spring Recess
    7. 3/23 OpenMP (cont'd)
    8. 3/30 Other Concurrency Models
    • Haskell
    • Cilk++
    • Transactional Mem: #8, #9
    • lab2 assigned
    9. 4/6 Multiple Multicore: MPI
    • A good tutorial here
    • hw3 assigned
    10. 4/13 Performance Evaluation
    • Section 2.1, 2.2, 2.3, 2.4 from #10
    • #11
    • #12
    12. 4/20 Putting it All Together
    13. 4/27 Presentations:
    • Memory allocator for multithreaded programs (Morten and Hasan)
    • Tumalloc (Thomas Uttaro)
    • Performance and Productivity Analysis (Jiaming Zhang and William Emmett Brantley)
    • Processing data in parallel (Shaodong Li)
    • Concurrent Red-Black Tree (Yew Wong)
    • Parallel Matrix Inversion (Sourjya Dutta)
    14. 5/4 Presentations:
    • Automatic Parallelization Tool (Yu Huang)
    • Conv-N on multicore (Yuhan Zhao)
    • Loop Perforation (Zahra Ghodsi)
    • Two-Programming Models (Scott Michael Buckfelder)
    • Parallel Data Structure (Tailin Lo)
    • Programming Lang. Comparison (Tory Peterschild)

    Note about using MPI:

    Before doing any MPI programming, do the following once you log in to your CIMS account:
    ssh to one of the computational nodes (e.g. crunchy1, crunchy3, crunchy4, crunchy5, or crunchy6 ... No crunchy2!)
    Type the following:
         module load openmpi-x86_64

    Reading Material

    Below you will find the reading material we will use in this course.
    Next to each lecture above, you will find the reading assignment.
    It gives a number, corresponding to the list below, and the section(s) to read from that material.
    If no section numbers are indicated, you need to read the whole thing.
    This list will grow as we proceed.
    1. Parallel Programming for Multicore and Cluster Systems (You must be logged into NYU network)
    2. Herb Sutter, The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, 30(3), March 2005.
    3. How to survive the multicore software revolution?
    4. Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency   (You must be logged into NYU network)
    5. The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It   (You must be logged into NYU network)
    6. A Primer on Memory Consistency and Cache Coherence   (You must be logged into NYU network)
    7. The Problem With Threads
    8. Transactional Memory (You must be logged into NYU network)
    9. The Common Case Transactional Memory Behavior of Multithreaded Programs
    10. Computer Architecture Performance Evaluation Methods (You must be logged into NYU network)
    11. Effective Performance Measurement and Analysis of  Multithreaded Applications
    12. Performance-Aware Multicore Programming  (You must be logged into NYU network)


    Before executing your code, log in to your CIMS account and ssh to one of the computational nodes
    (e.g. crunchy1, crunchy3, crunchy4, crunchy5, or crunchy6 ... No crunchy2!)

    Homework Assignments


    Interesting Links (Geeky Stuff)

    If you have an interesting link, please email it to the instructor and it will find its way to this page (with an acknowledgment to you of course!).

    Future Chips (targeting both software and hardware folks interested in parallel programming)

    Designing and Building Parallel Programs

    Multicore info

    The trouble with multicore

    Computer Architecture

    Introduction to parallel computing

    The Landscape of Parallel Computing
