Overview
This course considers the challenge of converting high-level algorithmic ideas into efficient parallel code. We will see how this challenge is greatly simplified by modern programming languages and techniques, especially functional programming, which makes it possible to quickly develop efficient, scalable, and correct implementations. Students will learn how to design, analyze, implement, and evaluate the performance of parallel algorithms across a variety of problem domains (e.g., graph analysis, computational geometry, numerical algorithms, image processing). In the second half of the course, students will complete a parallel programming project of their own design.
Prerequisites: CSCI-GA.2110 Programming Languages. Familiarity with functional programming and recursive programming techniques is helpful.
Topics include:
- work and span, parallelism
- sequential baselines, work-efficiency, speedup
- parallel functional programming
- parallel algorithm design techniques: divide-and-conquer, contraction, prefix doubling, etc.
- parallel data structures: sequences, sets, tables/dictionaries, graphs, etc.
- higher-order parallel primitives: `map`, `reduce`, `filter`, `scan`, etc.
- determinism and non-determinism
- parallel algorithms from a variety of domains: sorting, searching, order statistics, text/image/audio processing, graph analysis, computational geometry, numerical algorithms, etc.
Schedule
(Note: tentative—subject to change)
| Week | Date | Lecture | Homework |
|---|---|---|---|
| 0 | Mon Sep 2 | no lecture: Labor Day (university holiday) | |
| 1 | Mon Sep 8 | introduction, parallel hardware, parallelism vs concurrency, parallel functional programming, MaPLe, `par`, scheduling, (self-)speedup | hw1 released |
| 2 | Mon Sep 15 | work and span, language-based cost model, recurrences, divide-and-conquer, `reduce` | hw1 due; hw2 released |
| 3 | Mon Sep 22 | work efficiency, recurrences (cont.), contraction, parallel prefix sums, `scan`, sequences | hw2 due; hw3 released |
| 4 | Mon Sep 29 | `filter`, flatten, tree flattening, sorting, parallel divide-and-conquer merging | hw3 due; hw4 released |
| 5 | Mon Oct 6 | 2D computational geometry, parallel convex hull (quickhull), nearest neighbors | hw4 due; hw5 released |
| 6 | Tue Oct 14 | graphs: undirected and directed, sparse representations, parallel traversals *(note: lecture Tue instead of Mon)* | hw5 due; hw6 released |
| 7 | Mon Oct 20 | versioned sequences, `inject`, graphs (cont.), edge contraction, star contraction, counting connected components | hw6 due |
| 8 | Mon Oct 27 | the parallel zoo: parallelism in Go, Java, C++, Rust, etc. | project proposals due |
| | Fri Oct 31 | | proposal revisions due (if applicable) |
| 9 | Mon Nov 3 | on-the-fly concurrency, compare-and-swap, hashing, hash tables, lock freedom | |
| 10 | Mon Nov 10 | scheduling by work-stealing | |
| 11 | Mon Nov 17 | pseudo-randomness, splittable pseudorandom number generators, pedigrees and DPRNG, random shuffling | project checkpoint |
| 12 | Mon Nov 24 | trees, parallel ordered sets and tables/dictionaries, parallel augmented maps | |
| 13 | Mon Dec 1 | dynamic programming, bottom-up scheduling, subset sum, seam carving | |
| 14 | Mon Dec 8 | project presentations | projects due |
| 15 | Mon Dec 15 | project presentations *(note: 10:00am–11:50am)* | |
Policies
Grading: homework assignments (60%), final project (40%)
Deadlines: All deadlines are at 5:00pm (eastern time) on the date listed in the schedule.
Late Submissions: 10% score penalty for each day late. Submissions will not be accepted more than one week after the deadline.
Academic Integrity: Please review the department academic integrity policy. In this course, you are permitted to discuss assignments with other students as long as all discussion adheres to the following "whiteboard policy". Discussion may take place at a whiteboard (or on a scrap of paper, etc.), but no record of the discussion may be kept (all notes must be erased or discarded, no audio or video recording, etc.) and you must allow at least two hours to pass after the discussion before working on the assignment. Being able to recreate any solution from memory is considered proof that you actually understand the solution. If you collaborate with someone in this way on an assignment, you must list their name(s) in your submission. Copying solutions or any other work is a serious offense.
Accommodations: If you are in need of accommodations due to a disability or otherwise, please contact the instructor: s (dogoodt) we!stricluckk (a!t) nyrobotsu (do!t) ed!u
Project
In the second half of the course, students will complete a self-directed programming project, responsible for 40% of their overall grade. The project can be completed individually or in groups of two. The goal of the project is to develop a parallel application which achieves real parallel speedups.
Project proposals will be due halfway through the semester (please see the schedule). There are three requirements: (1) the project must be the students' original work, (2) the project must have a significant parallel programming component, and (3) the parallel programming component must achieve speedup in practice due to parallelism.
For the parallel programming component of the project, you must use a dynamically scheduled parallel programming language or library, preferably based on structured fork-join parallelism. All of the following are acceptable.
- MaPLe
- ParlayLib (C++)
- Rayon (Rust)
- Go
- Fork/Join Java
If you wish to use any other language or library, you must first discuss with the instructor and get approval. (Note that Python will not be approved.)
Only the parallel programming component of the project has this restriction. The rest of the project is up to you, and we encourage thinking about incorporating your parallel component into a larger application. Note however that the project will be primarily evaluated on the parallel programming component alone.
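As a rough illustration of the structured fork-join style expected for the parallel component, the following Go sketch sums an array by divide-and-conquer: it forks the left half into a goroutine and joins on a channel before returning. The grain-size cutoff is an arbitrary illustrative choice, not a tuned value.

```go
package main

import "fmt"

// sumDC sums xs by divide-and-conquer. Below a grain-size cutoff it
// runs sequentially; above it, the two halves run in parallel,
// mirroring structured fork-join (fork the left half, join before
// returning). A sketch, not a tuned implementation.
func sumDC(xs []int64) int64 {
	const grain = 1 << 12
	if len(xs) <= grain {
		var s int64
		for _, x := range xs {
			s += x
		}
		return s
	}
	mid := len(xs) / 2
	left := make(chan int64, 1)
	go func() { left <- sumDC(xs[:mid]) }() // fork
	right := sumDC(xs[mid:])                // compute the other half ourselves
	return <-left + right                   // join
}

func main() {
	xs := make([]int64, 1_000_000)
	for i := range xs {
		xs[i] = 1
	}
	fmt.Println(sumDC(xs)) // 1000000
}
```

The same shape (sequential base case under a grain size, parallel recursion above it) carries over directly to MaPLe's `par`, ParlayLib, Rayon, and Fork/Join Java.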
Project Proposal Guidelines
The proposal should be exactly 1 page, and should clearly state the following information:
- Who is working on the project. Groups of two are encouraged, or you may complete the project individually.
- A brief introduction and motivation for the project (1-2 paragraphs). If your project is inspired by existing research or projects available online, please cite them and describe how your project is related.
- A clear description of the parallel programming component. This should include at least one key algorithm or computational problem which you intend to parallelize. Describe the problem carefully. What is the input? What is the output? What data structures will be involved? How do you plan to test your solution for correctness? How do you plan to evaluate the performance of your solution?
-
Project timeline.
Please clearly state two sets of goals:
- 3-week “checkpoint goal”: everything you hope to complete in the first three weeks of the project.
- The final goals of the project: everything you plan to complete before the end of the semester.
- Tools, languages, libraries, resources, etc. Please briefly describe what programming languages/libraries/tools you plan to use and what computing resources you will use to complete the project. (Please see https://cims.nyu.edu/dynamic/systems/resources/computeservers/ for a list of computing resources provided by Courant; we recommend using these for your projects.)
Note that, while working on the project, you might realize that your original stated goals are not attainable. This is okay, and is a normal aspect of research. In this case, you can reach out to the instructor to plan an alternative goal during the course of the project.
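For the performance-evaluation part of the proposal, a simple and common methodology is self-speedup: time the same program on one worker and on p workers and report T1/Tp. Below is a hedged Go sketch of this measurement, where `work` is a hypothetical stand-in for your own parallel component and `runtime.GOMAXPROCS` bounds the number of threads running Go code.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// work is a stand-in for your parallel component: a divide-and-conquer
// sum that forks goroutines above a grain size.
func work(xs []int64) int64 {
	const grain = 1 << 14
	if len(xs) <= grain {
		var s int64
		for _, x := range xs {
			s += x
		}
		return s
	}
	mid := len(xs) / 2
	left := make(chan int64, 1)
	go func() { left <- work(xs[:mid]) }()
	right := work(xs[mid:])
	return <-left + right
}

// timeIt runs work once with the given worker bound and reports the
// elapsed wall-clock time.
func timeIt(procs int, xs []int64) time.Duration {
	runtime.GOMAXPROCS(procs)
	start := time.Now()
	work(xs)
	return time.Since(start)
}

func main() {
	xs := make([]int64, 50_000_000)
	t1 := timeIt(1, xs)               // sequential-ish baseline, T1
	tp := timeIt(runtime.NumCPU(), xs) // all cores, Tp
	fmt.Printf("self-speedup: %.2fx\n", float64(t1)/float64(tp))
}
```

In practice you should run each configuration several times and take the minimum or median, and also compare against a genuinely sequential baseline (no parallel overheads) to report work-efficiency, not just self-speedup.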
MaPLe (MPL)
Homework assignments will use the MaPLe programming language, a high-level parallel programming language which offers a number of features making it simpler and safer to write efficient parallel code. The MaPLe language is based on Standard ML. Students do not need to already be familiar with MaPLe or Standard ML; we will introduce these as part of the course.
Resources
Slack. We'll use Slack for questions and discussions outside of lecture. An invite link will be sent at the beginning of the class.
Textbook. There is no required textbook. The content of this course is roughly based on the free textbook Algorithms: Parallel and Sequential, by Umut A. Acar and Guy Blelloch.
Learning MaPLe. We will introduce this language as part of the course. If you would like to get a head start, we recommend taking a look at mpl-tutorial, especially the first few sections (Hello World, Parallelism and Granularity Control, and Trees). A number of programming examples are available here and here. MaPLe is based on Standard ML; to familiarize yourself with the syntax, we recommend this guide.
Compute Servers. Courant has a number of compute servers available for students to use for assignments, research, etc. For this course we recommend students use the following machines, each of which has 32 cores (64 threads) and 256GB of memory.
- crunchy1.cims.nyu.edu
- crunchy2.cims.nyu.edu
- crunchy5.cims.nyu.edu
- crunchy6.cims.nyu.edu
Information about accessing these servers is available here. You will need a CIMS account; if you do not already have one, please follow the instructions here. We recommend adding the following to your local SSH configuration, replacing YOUR_CIMS_USERNAME with your CIMS account name (this is usually the same as your NYU NetID).
```
Host cims-access
    HostName access.cims.nyu.edu
    User YOUR_CIMS_USERNAME

Host cims-crunchy1
    ProxyCommand ssh cims-access nc crunchy1.cims.nyu.edu 22
    User YOUR_CIMS_USERNAME
```
You can similarly add configurations for the other machines (crunchy2.cims.nyu.edu, etc.). You should then be able to log into a crunchy server like so:

```
$ ssh cims-crunchy1
```