Center for Data Science and Computer Science
Lecture time and location: Mondays, 7:10pm-9:00pm at Cantor, room 101
Office hours time and location: Tuesdays, 4:00pm-5:00pm at 715 Broadway, room 1005
Big Data Lab: Thursdays, 7:10pm-8:00pm at CIWW, room 109. Always bring your laptop for the lab.
Nandita Chugh (nc1267 at nyu.edu)
Office hours: Thursday, 2 - 3 pm in 715 Broadway, room 1005.
Subodh Narayanrao Tiwaskar (subodh at nyu.edu)
Office hours: Wednesday, 1 - 2 pm in 715 Broadway, room 1005.
Sumit Gupta (s.gupta at nyu.edu)
Office hours: Wednesday, 5 - 6 pm in 715 Broadway, room 1005.
Big Data requires the storage,
organization, and processing of data at a scale and efficiency that go
well beyond the capabilities of conventional information
technologies. In this course, we will study the state of the art in
big data management: we will learn about algorithms, techniques and
tools needed to support big data processing. In addition, we will
examine real applications that require massive data analysis and how
they can be implemented on Big Data platforms.
The course will consist of lectures based on both textbook material
and scientific papers. It will also include programming assignments
that will give students hands-on experience in building
data-intensive applications using existing Big Data platforms,
including Amazon AWS. Besides lectures given by the instructor, we
will also have guest lectures by experts in some of the topics we will
Syllabus and Estimated Times
The schedule for classes, lecture notes, and required reading will be
available at http://www.vistrails.org/index.php/Course:_Big_Data_2014
The course consists of three main modules, in which we will tentatively
cover the following topics:
- Background: the evolution of data management and overview of
traditional data management techniques. (4 weeks)
- Big Data Foundations and Infrastructure: cloud computing and
MapReduce; data management for big data. (4 weeks)
- Big Data Algorithms and Techniques: query processing; finding
similar items; graph analysis; frequent itemset mining; interactive
data analysis and visualization. (6 weeks)
The readings for this course will consist of research papers and two recent books that are freely available for download on the Web:
Workload and Requirements
The workload will consist of online quizzes, using the Gradiance system and
For programming assignments, the instructor and graders will run
your code, and your grade will depend on the correctness of the
outputs. Therefore, you must strictly follow the guidelines given for
the programming assignments to ensure that we are able to run them.
Programming assignments must be done individually, unless
otherwise noted. Students must design and program their own solutions
-- copying from other students or any other source is not
Students are required to observe the following rules about academic honesty:
Lateness policy: Late quizzes and assignments will not be
accepted without a note from your physician or from your
The grade for the course will be based on:
- Programming Assignments (50%)
- Quizzes (15%)
- Final Exam (35%)
You will need to access Gradiance for your quizzes at
http://www.newgradiance.com/services. Here's a link to a guide on
how to use Gradiance:
Register and use the class token A3A872C4. Make sure to use
your official NYU email and ID when you register.
The quizzes appear to be sets of multiple-choice questions. But you
should think of the questions as if you were asked to work an
ordinary, "long-answer" question. Work that question and keep the
answer handy on a piece of paper. The multiple-choice question will
typically sample your knowledge of the correct answer.
You can try the work as many times as you like, and we hope everyone
will eventually get 100%. Also notice that you have to wait 10
minutes between openings, so brute-force random guessing will not
If you need to reach the instructor, send email to bigdata.nyu AT gmail DOT com
The class mailing list is: http://www.cs.nyu.edu/mailman/listinfo/csci_ga_2568_001_sp14
We thank Amazon for the AWS in Education Coursework grant, which gives students taking this course access to its cloud infrastructure.
Last modified: Fri Jan 24 13:20:11 EST 2014