Processors: Architecture & Programming
Prof. Mohamed Zahran
(aka Prof. Z)
mzahran AT cs DOT nyu DOT edu
Thursdays 5:10-7:00 pm
Location: WWH 1302
Office Hours: Tuesdays 4:30-6:30pm (WWH 320)
Welcome, students, to the Multicore Processors: Architecture & Programming course!
I will keep updating this page regularly. If you have questions related to the course, feel free to email me at mzahran (at) cs.nyu.edu. Here is some basic information:
- I am sure you have heard the words multicore and manycore processors a lot. In this course we will study multicore/manycore processors in detail, covering both the hardware and the software aspects. We will learn why multicore processors are here to stay, why the future of computing depends on our ability and competence to build them and deal with them, how to use them to solve many real-life problems, what the challenges are, where we are heading, and much more. Multicore/manycore processors represent a turning point in computing history, so we had better be good at dealing with them!
- There will be no textbook for this course. Reading material, drawn from research papers, will be posted on this webpage alongside the lecture slides.
- Here is the course syllabus
- Our grader: Rohit Gupta rg1960 (at) nyu.edu
Sign up for the Mailman mailing list for the course, if the system hasn't signed you up already. You can do so by clicking here. Please follow the mailing list etiquette:
- Use the Reply command to contribute to the current thread, but NOT to start a new one.
- If quoting a previous message, trim off the parts that are not relevant to your reply.
- Use a descriptive Subject: field when starting a new topic.
- Do not use one message to ask two unrelated questions.
- Do NOT make the mistake of sending your completed project assignment to the mailing list!
Below you will find the reading material we will use in this course. Next to each lecture, above, you will find the assigned reading. It indicates a number, corresponding to the list below, and the section(s) to read from that material. If no section numbers are indicated, you need to read the whole thing.
- Programming for Multicore and Cluster Systems (You must be logged into NYU network)
- Herb Sutter, The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, 30(3), March 2005.
- How to survive the multicore software revolution?
- Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency (You must be logged into NYU network)
- The Memory System: You Can't Avoid It, You Can't Ignore It, You Can't Fake It (You must be logged into NYU network)
- A Primer on Memory Consistency and Cache Coherence (You must be logged into NYU network)
- The Problem With Threads
- Serialization Sets: A Dynamic Dependence-Based Parallel Execution Model
- IPC considered harmful for multiprocessor workloads
- Computer Architecture Performance Evaluation Methods (You must be logged into NYU network)
- Effective Performance Measurement and Analysis of Multithreaded Applications
- Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance
- The Impact of Performance Asymmetry in Emerging Multicore Architectures
- Transactional Memory (You must be logged into NYU network)
- Unlocking Concurrency
In many of the labs, we will be using the multi2sim simulator.
Here are the steps to set it up:
- Download this file on your CIMS account.
- tar -xvf multi2sim-4.2.tar.gz
- You should now have a directory called multi2sim-4.2
- cd multi2sim-4.2
- You should now have a subdirectory called bin
- cd bin
- The simulator binary is called m2s
lab1: [Due Oct 15th] You will need memory configuration and cpu configuration files for this lab.
If you get errors due to the lack of 32-bit libraries, you can use the following legacy 32-bit machine for compilation.
lab2: [Due Oct 17th] You will use the CIMS machine for this lab.
lab3: [Due Nov 21st] You will use the multi2sim simulator and this configuration file.
HW1 Due Sep 26 - sol - Comments
HW2 Due Oct 24 - sol - Comments
HW3 Due Dec 5th
You are free to pick your own project, after you discuss its idea with me first.
However, here is a list of suggested projects that you can pick from.
- Pick a memory-bound non-trivial application and find ways to enhance its performance through parallelization.
- Given a hardware configuration and an application, can we say something about the optimal number of threads before the performance drops?
- Pick a non-trivial application to parallelize and compare the serial and parallel versions
- Compare several parallel programming models using several applications
- Predict the scalability of a code through static analysis and confirm it experimentally. That is, given an application, what is the peak theoretical performance, relative to sequential code, that we can expect?
- Adaptive scalability of programs: write a parallel program that can check its performance at runtime and adjust its scalability to get the best performance.
- Given a parallel program, what is the best interconnect among cores to ensure the best performance?
- Refuting Amdahl's law
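Several of the project ideas above hinge on predicting speedup and choosing a thread count. As a starting point, here is a minimal sketch of Amdahl's law, which bounds the speedup of a program on n cores when only a fraction p of its serial runtime is parallelizable (the p and n values below are illustrative inputs, not measurements from any real application):

```python
def amdahl_speedup(p, n):
    """Amdahl's law: upper bound on speedup with n cores when a
    fraction p of the serial runtime is perfectly parallelizable."""
    return 1.0 / ((1.0 - p) + p / n)

# With p = 0.9, no number of cores can beat 1 / (1 - p) = 10x.
for n in (2, 4, 8, 16):
    print(f"{n:2d} cores -> {amdahl_speedup(0.9, n):.2f}x")
```

Plotting this bound against measured speedups of a real application is one quick way to see how far a "refutation" of Amdahl's law would have to go (for example, via Gustafson-style scaled workloads).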
Schedule of Presentations
- Project description report - 5% - Due Oct 3rd
- Literature survey report - 10% - Due Oct 29th
- Design and experimental setup report - 10% - Due Nov 7th
- Final report - 15% - Due Dec 5th
- Presentation - 10% - Dec 5th and 12th
(Note: Presentation lists are ordered)
Day 1: Dec 5th
Day 2: Dec 12th
- Interconnection aware application: Jianfeng Lin and Kanwarpreet Randhawa
- Linux commands parallelization: Tao Feng and Yisheng Liao
- Intelligent Scalability: Abhinand Sundararajan and Anshul Jain
- Code Scalability Predictor: Yang Cui
- Multi-Dimensional tree parallelization: Yumin Lee
- Program-aware Interconnect: Numair Khan
- Best Interconnect Selection: Charles Drews
- OpenMP vs Cilk: Shitian Ren
- Concurrency in Java and C++: Keyi Chen
- Memory Page Replacement: Darshit Dave
- Parallel Programming with Python: Shichao An
Links (Geeky Stuff)
If you have an interesting link, please email it to the instructor and it will find its way to this page (with an acknowledgment to you, of course). These links target both software and hardware folks interested in parallel computing:
- Building Parallel Programs
- The trouble with multicore
- Introduction to parallel computing
- The Landscape of Parallel Computing