Overview
This course considers the challenge of converting high-level algorithmic ideas into efficient parallel code. We will see how this challenge is greatly simplified by modern programming languages and techniques, especially functional programming, which makes it possible to quickly develop efficient, scalable, and correct implementations. Students will learn how to design, analyze, implement, and evaluate the performance of parallel algorithms across a variety of problem domains (e.g., graph analysis, computational geometry, numerical algorithms, image processing, etc.). In the second half of the course, students will complete a parallel programming project of their own design.
Prerequisites: CSCI-GA.2110 Programming Languages. Familiarity with functional programming and recursive programming techniques is helpful.
Topics include:
- work and span, parallelism
- sequential baselines, work-efficiency, speedup
- parallel functional programming
- parallel algorithm design techniques: divide-and-conquer, contraction, prefix doubling, etc.
- parallel data structures: sequences, sets, tables/dictionaries, graphs, etc.
- higher-order parallel primitives: map, reduce, filter, scan, etc.
- determinism and non-determinism
- parallel algorithms from a variety of domains: sorting, searching, order statistics, text/image/audio processing, graph analysis, computational geometry, numerical algorithms, etc.
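To give a flavor of the higher-order primitives listed above, here is a small Standard ML sketch of an exclusive prefix sum (scan) over a plain list. This sequential version only illustrates the specification of scan; it is not MPL's parallel implementation, and the function name `scanl` is ours.

```sml
(* Illustrative only: exclusive prefix sum ("scan") on a plain list.
   Given a combining function f and an initial value, return the list of
   partial results (excluding the last) together with the final total.
   A parallel implementation (e.g., via contraction) computes the same
   result with low span; this sequential version is just the spec. *)
fun scanl f init xs =
  let
    fun go acc [] = ([], acc)
      | go acc (x :: rest) =
          let val (prefixes, total) = go (f (acc, x)) rest
          in (acc :: prefixes, total) end
  in
    go init xs
  end

(* scanl (op +) 0 [1, 2, 3, 4] evaluates to ([0, 1, 3, 6], 10) *)
```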
Schedule
(Note: tentative—subject to change)
| Week | Date | Lecture | Notes | Homework |
|---|---|---|---|---|
| 0 | Mon Jan 20 | no lecture—MLK day (university holiday) | | |
| 1 | Mon Jan 27 | introduction, parallel hardware, parallelism vs concurrency, parallel functional programming, MaPLe programming language, `par`, scheduling, (self-)speedup | notes (pdf, md), code. Recommended reading: APS Ch 2 Sec 1, 2; Ch 7; Ch 8 | hw1 released |
| 2 | Mon Feb 3 | work and span, language-based cost model, recurrences, divide-and-conquer, `reduce` | notes (pdf). Recommended reading: APS Ch 2 Sec 3; Ch 26 Sec 1; Ch 28 Sec 4 | hw1 due; hw2 released |
| 3 | Mon Feb 10 | work efficiency, recurrences (cont.), contraction, parallel prefix sums, `scan`, sequences | notes (pdf). Recommended reading: Blelloch93; lib docs: Seq | hw2 due; hw3 released |
| 4 | Tue Feb 18 (note: lecture Tue instead of Mon) | `filter`, flatten, tree flattening, sorting, parallel divide-and-conquer merging | notes (pdf). Recommended reading: APS Ch 36 | hw3 due |
| 5 | Mon Feb 24 | trees, parallel ordered sets and tables/dictionaries, parallel augmented maps | notes (pdf). Recommended reading: Just Join; PAM | hw5 released |
| 6 | Mon Mar 3 | graphs: undirected and directed, sparse representations, parallel traversals | notes (pdf). Recommended reading: APS Ch 52; Ch 54; Ligra | hw5 due; hw6 released |
| 7 | Mon Mar 10 | graphs (cont.): parallel traversals, contraction | | |
| | Tue Mar 11 | | | hw6 due |
| 8 | Mon Mar 17 | the parallel zoo: parallelism in Rust, Java, Go, ISPC, CUDA, Futhark, etc. | | project proposals due |
| | Fri Mar 21 | | | proposal revisions due (if applicable) |
| 9 | Mon Mar 24 | no lecture—spring break | | |
| 10 | Mon Mar 31 | advanced topics: fusion, eliminating intermediate allocation | | |
| 11 | Mon Apr 7 | advanced topics: dynamic programming, bottom-up scheduling | | |
| 12 | Mon Apr 14 | advanced topics: randomized parallel algorithms | | project checkpoint due |
| 13 | Mon Apr 21 | advanced topics: on-the-fly concurrency and non-determinism, parallel hashing and hash tables | | |
| 14 | Mon Apr 28 | advanced topics: scheduling by work-stealing | | |
| 15 | Mon May 5 | project presentations | | projects due |
| | Thu May 8 (10:00am–11:50am) | project presentations | | |
Policies
Grading: homework assignments (50%), final project (50%)
Deadlines: All deadlines are at 5:00pm (eastern time) on the date listed in the schedule.
Late Submissions: 10% score penalty for each day late. Submissions will not be accepted more than one week late.
Academic Integrity: Please review the department academic integrity policy. In this course, you are permitted to discuss assignments with other students as long as all discussion adheres to the following "whiteboard policy". Discussion may take place at a whiteboard (or on a scrap of paper, etc.), but no record of the discussion may be kept (all notes must be erased or discarded, no audio or video recording, etc.) and you must allow at least two hours to pass after the discussion before working on the assignment. Being able to recreate any solution from memory is considered proof that you actually understand the solution. If you collaborate with someone in this way on an assignment, you must list their name(s) in your submission. Copying solutions or any other work is a serious offense.
Accommodations: If you are in need of accommodations due to a disability or otherwise, please contact the instructor: s (dogoodt) we!stricluckk (a!t) nyrobotsu (do!t) ed!u
Project
In the second half of the course, students will complete a self-directed programming project worth half of their overall grade. The project can be completed individually or in groups of two. The goal of the project is to develop a parallel application which achieves real parallel speedups.
Project proposals will be due halfway through the semester (please see the schedule). The specifics of the project are up to the students; the only requirements are (1) the project must be the students' original work, and (2) the project must have a significant parallel programming component, ideally demonstrating speedups on a real-world problem.
For the programming component of the project, we recommend using MPL (see below). However, if desired, another programming language can be used, with approval of the instructor. In the project proposal, students should clearly state what tools and programming language(s) they intend to use.
Proposal Guidelines. The proposal should be 1-2 pages, and should clearly state the following information:
- Who is working on the project. Groups of two are encouraged, or you may complete the project individually. Groups of three are okay as well, but note that the scope of the project should be a bit larger in this case.
- A brief introduction and motivation for the project (2-3 paragraphs). If your project is inspired by existing research or projects available online, please cite them and describe how your project is related.
- The goals of the project. Please clearly state two sets of goals:
  - 3-week “checkpoint goal”: everything you hope to complete in the first three weeks of the project. (Note that a brief writeup describing your work-in-progress will be due on Monday, April 14.)
  - The final goals of the project: everything you plan to complete before the end of the semester.
- Tools, languages, libraries, resources, etc. Please briefly describe what programming languages/libraries/tools you plan to use and what computing resources you will use to complete the project. (Please see https://cims.nyu.edu/dynamic/systems/resources/computeservers/ for a list of computing resources provided by Courant; we recommend using these for your projects.)
An example project goal could be to present a complete implementation and performance evaluation (either theoretical, empirical, or some combination of the two) of an interesting real-world parallel algorithm. In this case, a reasonable 3-week checkpoint could be to have an initial working implementation (perhaps not very efficient yet or not yet rigorously tested, but at least something "on track" towards the final project goals).
In general, all projects should include a parallel programming component, where you either write (or generate) code that can be run on parallel hardware such as multicore CPUs, or GPUs, etc.
Note that, while working on the project, you might realize that your original stated goals are not attainable. This is okay, and is a normal aspect of research. In this case, you can reach out to the instructor to plan an alternative goal during the course of the project.
MaPLe (MPL)
Homework assignments will use the MaPLe programming language, a high-level parallel programming language that offers a number of features making it simpler and safer to write efficient parallel code. The MaPLe language is based on Standard ML. Students do not need to be familiar with MPL or Standard ML in advance; we will introduce both as part of the course.
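As a small taste of the language, here is a sketch of parallel Fibonacci using fork-join parallelism, in the style of the mpl-tutorial. It assumes MPL's `ForkJoin.par` primitive, which evaluates two functions in parallel and returns their results as a pair; a practical version would also add granularity control.

```sml
(* Sketch, assuming MPL's ForkJoin.par (as introduced in mpl-tutorial):
   the two recursive calls are evaluated in parallel, and their results
   come back as a pair. *)
fun fib n =
  if n < 2 then n
  else
    let
      val (a, b) = ForkJoin.par (fn () => fib (n - 1), fn () => fib (n - 2))
    in
      a + b
    end
```

In practice, you would switch to a sequential version below some cutoff (granularity control) so that the overhead of `par` does not dominate the small subproblems; the tutorial covers this in its Parallelism and Granularity Control section.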
Resources
Slack. We'll use Slack for questions and discussions outside of lecture. An invite link will be sent at the beginning of the class.
Textbook. There is no required textbook. The content of this course is roughly based on the free textbook Algorithms: Parallel and Sequential, by Umut A. Acar and Guy Blelloch.
Learning MaPLe. We will introduce this language as part of the course. If you would like to get a head start, we recommend taking a look at mpl-tutorial, especially the first few sections (Hello World, Parallelism and Granularity Control, and Trees). A number of programming examples are available here and here. MaPLe is based on Standard ML; to familiarize yourself with the syntax, we recommend this guide.
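For orientation, here are a few lines of Standard ML, illustrative only (all names below are our own examples, not course code):

```sml
(* Basic Standard ML syntax at a glance. *)
val xs = [1, 2, 3]               (* immutable list *)
fun double x = 2 * x             (* function definition *)
val ys = List.map double xs      (* ys = [2, 4, 6] *)

(* Functions are often defined by pattern matching on their input. *)
fun sum [] = 0
  | sum (x :: rest) = x + sum rest

val total = sum ys               (* total = 12 *)
```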
Compute Servers. Courant has a number of compute servers available for students to use for assignments, research, etc. For this course we recommend students use the following machines, each of which has 32 cores (64 threads) and 256GB of memory.
- crunchy1.cims.nyu.edu
- crunchy2.cims.nyu.edu
- crunchy5.cims.nyu.edu
- crunchy6.cims.nyu.edu
Information about accessing these servers is available here. You will need a CIMS account. If you do not already have a CIMS account, please follow the instructions here.
We recommend adding the following to your local SSH configuration, replacing YOUR_CIMS_USERNAME with your CIMS account name (usually the same as your NYU NetID).
```
Host cims-access
    HostName access.cims.nyu.edu
    User YOUR_CIMS_USERNAME

Host cims-crunchy1
    ProxyCommand ssh cims-access nc crunchy1.cims.nyu.edu 22
    User YOUR_CIMS_USERNAME
```
You can similarly add configurations for the other machines, crunchy2.cims.nyu.edu, etc. You should then be able to log into a crunchy server like so:
```
$ ssh cims-crunchy1
```