Graphics Processing Units (GPUs): Architecture and Programming
Prof. Mohamed Zahran
(aka Prof. Z)
mzahran AT cs DOT nyu DOT edu
Thursdays: 7:10PM — 9:00PM WWH 1302
Office Hours: Wednesdays 5-7pm at WWH 328
Welcome, students, to
the Graphics Processing Units course, Spring 2012 edition.
I will keep updating this page regularly. If you have
questions related to this course, feel free to email me.
Here is some basic information:
This course examines the architecture and capabilities of
modern GPUs (graphics processing units).
Many computations can be performed
faster on the GPU than on a traditional CPU.
This is why GPUs are now present in almost all
computers, and many of the Top 500 supercomputers in the world are built
with GPUs. GPUs are now used for a diverse set of applications, not only
traditional graphics applications, which gives rise to the concept of
general-purpose GPUs, or GPGPUs.
In this course, we will cover architectural aspects of modern GPUs.
We will also learn how to program GPUs to solve different types of problems.
Our Grader: Young-Rok Bahn yrbahn (at) gmail dot com
Final Exam: May 10th, 2012 (same place and time as the lecture)
Sign up for the Mailman mailing
list for the course. You can do so by clicking here.
Please follow the mailing list etiquette.
- Use the Reply command to
contribute to the current thread, but NOT to start a new one.
- If quoting a previous message, try to trim off the parts that are not relevant to your reply.
- Use a descriptive Subject: field when starting a new topic.
- Do not use one message to ask two unrelated questions.
- Do NOT make the mistake of sending your
completed project assignment to the mailing list!
History of GPUs
Modern GPUs: A HW Perspective. The following papers: 1 and 2
Mainly slides, and skim chapter 7
The following papers: 1 and 2
Chapter 11 (up to 11.6) and the following papers: 1 and 2
Putting It All Together
- Homework 1 - Due Feb 9th
- Homework 2 (problems 4.1, 4.2, and 4.3 from the text) - Due March 1st (sol)
- Homework 3 (problem 6.7 from the text) - Due March 29th (sol)
- Problem definition report -> Due Mar 8th
- What is the specific topic (detailed problem definition)?
- What are the input and output?
- Why is it suitable for GPU?
- Survey and design report -> Due Mar 22nd
- Survey of previous implementations in literature.
- What are the shortcomings of those implementations?
- How will your design avoid these shortcomings?
- Give a detailed design of your program:
- Data structure
- Main functions used for CPU and kernels in GPU
- Optimizations that you will consider and why (be specific to your project)?
- Project results and final report -> Due Apr 26th [Report as hardcopy and your code by email to both instructor and TA]
- Cover page: Project name, your name(s)
- Section 1: High-level project description
- Section 2: Design:
- Your data structure
- How you partition your data
- How you decide on threads, blocks, and grid
- Section 3: Optimizations you have done in your code to speed things up
- Section 4: Experimental setup
- Machines on which you tested your code
- Input you used to test your implementation
- How you checked for correctness
- Section 5: Experimental Results
- Section 6: Conclusions (Was CUDA suitable for this project? What could have
been done better? What are the shortcomings of your implementation?)
You need to email the source code as well as a readme file telling us how to
compile and run your code, in a single zip file named after your
Links (Geeky stuff about GPUs)
Important link: CUDA GPU Servers at NYU
CUDA occupancy calculator
First digital 3D rendered film (Thanks William Ward)
Interview with Ed Catmull (Thanks William Ward)
NVIDIA GPU computing seminars
For CUDA developers
Series of CUDA articles at Dr. Dobb's
GPU Simulators and Tools:
Multi2sim (simulates both GPUs and multicore)
GPUOcelot (dynamic compilation for PTX)