CSCI-GA.3033-016
Multicore Processors: Architecture & Programming
Prof. Mohamed Zahran   (aka Prof. Z)
mzahran AT cs DOT nyu DOT edu
Wednesdays 5:10-7:00 pm  Location: WWH 312
Office Hours: Tuesdays 2:00-4:00pm (WWH 320)


Announcements

Welcome, students, to the Spring 2018 edition of the Multicore Processors: Architecture & Programming course. I will update this page regularly. If you have questions about the course, feel free to email me at mzahran (at) cs (dot) nyu.edu. Here is some basic information:



Where to Find Stuff


Lectures

Date Lecture Readings Comments
1. 1/24 Multicore/Manycore Revolution   
  • Section 1.2 from #1 
  • The whole of #2
2. 1/31 Parallelism and Concurrency
  • Sections 2.2 and 2.3 in #1
  • Section 2 from #3
  • Section 1.2 from #4
  • hw1 assigned
3. 2/7 Know Your Hardware
  • Chp 1 from #5
  • Chp 2 from #6
4. 2/14 Know Your Hardware (cont'd)
5. 2/21 Gentle Introduction to Parallel Programming
  • 3.1, 3.2, and 3.3 from #1
  • hw2 assigned
  • projects assigned
6. 2/28 OpenMP
  • lab 1 assigned
7. 3/7 Class cancelled due to snowstorm
    3/14 No class: Spring Recess
8. 3/21 Class cancelled due to snowstorm
9. 3/28 OpenMP (cont'd)
10. 4/4 MPI
A good tutorial here
  • lab 2 assigned
11. 4/11 Other Concurrency Models
12. 4/18 Performance Evaluation
13. 4/25 Presentations:
1. Comparing: OpenMP & OpenACC (Shuyi Yu & Mi Fang)
2. Comparing: OpenMP & OpenCL (Xialiang Liu & Tian Zhao)
3. Parallelize & study graph-analytics benchmark (Andres Davila & Jiin Nam)
4. Bottleneck analysis of graph analytics benchmarks (Weisen Zhao & Kailin Luo & Jiayun Yu)
5. Parallelize & study graph-analytics benchmark (Shaheer Haroon & Chris Moirano)
6. Parallelizing bloom filter (Yik Wai Ng & Dayou Du)
7. Parallelize large matrix factorization using OpenMP (Arnav Kansal & Ojas Deshpande)
8. Parallel SAT solver (Guangyu Liu)
9. Bottleneck analysis of web crawler (Cheng, Zhengyue)
14. 5/2 Presentations:
1. Parallel processing of data streams using MPI (Munir Contractor & Elaina Tan)
2. Coherence protocols comparison (Chun Hao (Kevin) Yang)
3. Cache replacement policy analysis for shared cache (Samidha Shirke)
4. Cache replacement policies study for parallel programs (Da Huo & Huri Ma)
5. Cache replacement policy for shared cache using NN (Rihan Yang & Yida Xu)
6. Processing in Memory (Salil Kapur & Aviral Khattar)
7. Best core configuration for a parallel application (Xi Huang & Yulu Qian)
8. Parallel Computation Model (Yuheng Xia)
9. Proximal policy optimization on multicore (Ilya Kulikov)
10. Parallelize Latent Dirichlet Allocation (Yunfei Lu)


Reading Material

Below you will find the reading material we will use in this course.
The reading assignment for each lecture is listed next to that lecture in the table above.
Each entry gives a number, corresponding to the list below, and the section(s) to read from that material.
If no section numbers are indicated, you need to read the whole thing.
This list will grow as we proceed.
Enjoy!
  1. Parallel Programming for Multicore and Cluster Systems (You must be logged into the NYU network)
  2. Herb Sutter, The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, 30(3), March 2005.
  3. How to survive the multicore software revolution?
  4. Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency (You must be logged into the NYU network)
  5. The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It (You must be logged into the NYU network)
  6. A Primer on Memory Consistency and Cache Coherence (You must be logged into the NYU network)
  7. Transactional Memory (You must be logged into the NYU network)
  8. The Common Case Transactional Memory Behavior of Multithreaded Programs

Labs

Homework Assignments
Note: All homework assignments are equally weighted. Each assignment is due at 11:55pm on the due date, submitted through NYU Classes.
You can submit up to 3 days late. However, a penalty of (-1) is applied for each late day, up to a maximum of (-3).

Project
Project (Students)
1. Coherence protocols comparison (Chun Hao (Kevin) Yang)
2. Comparing: OpenMP & OpenACC (Shuyi Yu & Mi Fang)
3. Parallel processing of data streams using MPI (Munir Contractor & Elaina Tan)
4. Comparing: OpenMP & OpenCL (Xialiang Liu & Tian Zhao)
5. Parallelize & study graph-analytics benchmark (Andres Davila & Jiin Nam)
6. Parallel Computation Model (Yuheng Xia)
7. Proximal policy optimization on multicore (Ilya Kulikov)
8. Cache replacement policies study for parallel programs (Da Huo & Huri Ma)
9. Parallelize Latent Dirichlet Allocation (Yunfei Lu)
10. Parallelizing bloom filter (Yik Wai Ng & Dayou Du)
11. Processing in Memory (Salil Kapur & Aviral Khattar)
12. Parallelize large matrix factorization using OpenMP (Arnav Kansal & Ojas Deshpande)
13. Best core configuration for a parallel application (Xi Huang & Yulu Qian)
14. Bottleneck analysis of graph analytics benchmarks (Weisen Zhao & Kailin Luo & Jiayun Yu)
15. Cache replacement policy analysis for shared cache (Samidha Shirke)
16. Parallelize & study graph-analytics benchmark (Shaheer Haroon & Chris Moirano)
17. Bottleneck analysis of web crawler (Cheng, Zhengyue)
18. Cache replacement policy for shared cache using NN (Rihan Yang & Yida Xu)
19. Parallel SAT solver (Guangyu Liu)

Interesting Links (Geeky Stuff)

If you have an interesting link, please email it to the instructor and it will find its way to this page (with an acknowledgment to you of course!).

Designing and Building Parallel Programs

HPCWire

Introduction to parallel computing