Big Data and ML — Spring 2022

Help? Campuswire
When? 7:10 PM to 9:10 PM
Where? Global Center 475 and Zoom
Day | Time | Who | Where
Tuesday | 4-5 pm | Panda | 60FA Room 405
Thursday | 4:30-5:30 pm | Vinayak | 60FA Room 406

Course Aims

The aim of this course is to spend a semester studying and understanding scheduling problems and approaches adopted in modern clusters. We will take a broad view of the types of scheduling problems we consider, and aim to have the course be largely driven by student interests.

Tentative Schedule and Syllabus

Date | Topic & Readings | Other
01/26 | Introduction: Course Mechanics and Overview | Lecture 1, Whiteboard
02/02 | Scheduling: Fairness and Workloads | Lecture 2, Whiteboard
02/09 | Fairness continued | Whiteboard
02/16 | What cluster schedulers look like in practice | Homework 1 out, Whiteboard
02/23 | Scheduling: Preemption and Low Latency | Final project proposal due, Homework 1 due
03/02 | Scheduling Machine Learning | Whiteboard
03/16 | Spring Break |
03/23 | Scheduling and Privacy | Whiteboard
03/30 | Federated Learning | Whiteboard
04/06 | No Class |
04/13 | Network Scheduling | Whiteboard
04/20 | More low-latency scheduling | Whiteboard
04/27 | Potpourri (no reading) |
05/04 | Poster session and presentations |


Grading will be based on the quality of your work and presentation. The grade breakdown is as follows (this may change before the semester begins):

  • 30% for four homework assignments.
  • 15% for class participation. This 15% will be split into two portions:
    1. 5% for participating in in-class discussion.
    2. 10% for either writing a note (on Campuswire) or presenting results in class for one (or more) of the papers being discussed that day. You must sign up ahead of time (no later than 5pm on the day of the class) by posting on Campuswire. You can present as many times as you want, and we will give you feedback. We will only consider your two best presentations when deciding your grade.
  • 25% for the final project: This should be done in groups of 2 or 3 people. You can either (a) explore a new research idea, or (b) work on a significant implementation project. For (a), you should work on a project that could eventually lead to a paper at SoCC, OSDI, SOSP, or a similar conference. For (b), we recommend either finding an existing open source project and extending or contributing to it (e.g., developing a new scheduling policy for Kubernetes or Apache YARN), or developing a sufficiently large project of your own.

    We will have 2 intermediate project checkpoints to give you early feedback on project progress. You are encouraged to use Campuswire and other class communication channels to ask questions and get help from others in the class.

  • 10% for the midterm and 20% for the final exam. By default, both are in person.
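
As a rough illustration of how the breakdown above combines into a single course grade, here is a minimal Python sketch. The weights come from the list above. Everything else is an assumption made for illustration only: the function and key names are hypothetical, each component score is taken to be a fraction between 0 and 1, and the two best presentations are simply averaged (the syllabus does not say how they are combined).

```python
# Weights taken from the grade breakdown; they sum to 100%.
WEIGHTS = {
    "homework": 0.30,
    "discussion": 0.05,      # in-class discussion
    "presentation": 0.10,    # notes or in-class paper presentations
    "project": 0.25,
    "midterm": 0.10,
    "final_exam": 0.20,
}


def presentation_score(scores):
    """Combine presentation scores using only the two best.

    ASSUMPTION: the two best scores are averaged; the syllabus only
    says that the two best presentations are considered.
    """
    best = sorted(scores, reverse=True)[:2]
    return sum(best) / len(best) if best else 0.0


def course_grade(components):
    """Weighted sum of component scores, each a fraction in [0, 1]."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical student, for illustration only.
student = {
    "homework": 0.90,
    "discussion": 1.00,
    "presentation": presentation_score([0.5, 0.9, 0.7]),
    "project": 0.85,
    "midterm": 0.80,
    "final_exam": 0.75,
}
print(f"Course grade: {course_grade(student):.1%}")
```

Presenting more than twice can only help under this assumption, since lower scores are dropped before averaging.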