Graphics Processing Units (GPUs): Architecture and Programming
Prof. Mohamed Zahran
(aka Prof. Z)
mzahran AT cs DOT nyu DOT edu
Tuesdays: 7:10PM - 9:00PM WWH 1302
Office Hours (WWH 320): Tuesdays 4:30-6:30pm
Welcome students! ... to
the Graphics Processing Units course, Fall 2013 edition.
I will keep updating this page regularly. If you have
questions related to this course, feel free to email me.
Here is some basic information:
This course examines the architecture and capabilities of
modern GPUs (graphics processing units).
Many computations can be performed
faster on the GPU than on a traditional CPU.
This is why GPUs are now present in almost all computers, and the
majority of Top 500 supercomputers in the world are built around them.
GPUs are now used for a diverse set of applications, not only
traditional graphics applications, which introduces the concept of
general-purpose GPUs, or GPGPUs.
In this course, we will cover architectural aspects of modern GPUs.
We will also learn how to program GPUs to solve different types of
problems and how to make the best use of their hardware.
Ram Charan Mohan
email: rcm372 (at) nyu.edu
Final exam: Dec 17, 7:10-9pm Room 1302
Sign up for the Mailman mailing
list for the course, if you have not done so already. You can manage your subscription by clicking here.
Please follow the mailing list etiquette.
- Use the Reply command to
contribute to the current thread, but NOT to start
a new topic.
- Use your NYU email address, not any other.
- If quoting a previous message, try to trim off
irrelevant parts.
- Use a descriptive Subject: field when starting a new topic.
- Do not use one message to ask two unrelated questions.
- Do NOT make the mistake of sending your
completed project assignment to the mailing list!
Final exam of Spring 2012
Assignment 1: Due Sep 17th, 2013
Assignment 2: Due Oct 22nd, 2013 - Solutions - Comments
Assignment 3: Due Nov 26th, 2013 - Solutions
Lab 1: Due Oct 1st, 2013 - Comments
Lab 2: Due Nov 5th, 2013
Lab 3: Due Nov 19th, 2013
CUDA GPU Servers at NYU
OpenCL GPU Servers at NYU
Suggested projects (but feel free to suggest your own):
- What is the relationship between the utilization and performance?
- Parallelize an application and compare it with its sequential counterpart. The application
must be non-trivial and must not have been parallelized before (or you must
do better than the previous version).
- Given a kernel, can you build a tool that tries to suggest the best configuration (grid, block, and threads)?
- Build a tool to predict the scalability of a kernel.
- Build a tool that suggests the potential of parallelization, on GPU, of sequential code.
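
To make "configuration (grid, block, and threads)" concrete, here is a minimal sketch of the arithmetic behind a 1D launch configuration. It is only an illustration, not part of the course materials: the names `launch_config` and `idle_threads` are hypothetical, the default of 256 threads per block is an arbitrary starting point, and the 1024-thread limit is an assumption that holds for recent NVIDIA devices but not for older ones. It is written in Python to keep the arithmetic visible; the same computation would normally sit in host code before a CUDA kernel launch.

```python
# Hypothetical helpers: none of these names come from the course materials.
WARP_SIZE = 32                 # threads per warp on NVIDIA GPUs
MAX_THREADS_PER_BLOCK = 1024   # assumed limit; older devices allow only 512


def launch_config(n_elements, threads_per_block=256):
    """Return (blocks_per_grid, threads_per_block) for a 1D kernel
    that assigns one thread to each element."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    if not 0 < threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("threads_per_block out of range")
    if threads_per_block % WARP_SIZE != 0:
        raise ValueError("threads_per_block should be a multiple of the warp size")
    # Ceiling division: enough blocks to cover every element.
    blocks_per_grid = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


def idle_threads(n_elements, threads_per_block=256):
    """Number of launched threads that fall past the end of the data."""
    blocks, tpb = launch_config(n_elements, threads_per_block)
    return blocks * tpb - n_elements
```

A real tool for this project would go well beyond this, for example by taking register and shared-memory usage into account, but the idle-thread count already shows one cost of a poorly chosen block size.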
- Problem definition report (5%) -> Due Oct 8th
  - What is the specific topic (detailed problem definition)?
  - What are the input and output?
  - Why is it suitable for GPU?
- Survey and design report (5%) -> Due Nov 5th
  - Survey of previous implementations in the literature.
  - What are the shortcomings of those implementations?
  - How will your design avoid these shortcomings?
  - Give a detailed design of your program:
    - Data structure
    - Main functions used for CPU and kernels in GPU
    - Optimizations that you will consider and why (be specific to your project)
- Project results and final report (10%) -> Due Dec 3rd
  - Cover page: Project name, your name(s)
  - Section 1: High-level project description
  - Section 2: Design:
    - Your data structure
    - How you partition your data
    - How you decide on threads, blocks, and grid
  - Section 3: Optimizations you have done in your code to speed things up
  - Section 4: Experimental setup
    - Machines on which you tested your code
    - Input you used to test your implementation
    - How you checked for correctness
  - Section 5: Experimental Results
  - Section 6: Conclusions (Was CUDA suitable for this project? What could have
    been done better? What are the shortcomings of your implementation?)
You need to email the source code as well as a readme file telling us how to
compile and run your code, in a single zip file named after your
Links (Geeky stuff about GPUs)
CUDA C programming guide
CUDA occupancy calculator
First digital 3D rendered film (Thanks William Ward)
Interview with Ed Catmull (Thanks William Ward)
NVIDIA GPU computing seminars
For CUDA developers
Series of CUDA articles at Dr. Dobb's
OpenCL 2.0 reference card
GPU Simulators and Tools:
Multi2sim (simulates both GPUs and multicore)
GPUOcelot (dynamic compilation for PTX)