Big Data and ML — Spring 2023

Instructors
Help? Campuswire (use code 0187 to join)
When? Wednesday 7:10 PM to 9:10 PM
Where? Warren Weaver Hall 312 and Zoom
Hours
Day Time Who Where
TBD TBD Panda 60FA Room 405

Course Aims

This course spends a semester studying distributed tracing: how it is used today and how it can be expanded. The course will be somewhat project-heavy. Throughout the semester we will read papers and follow Distributed Tracing in Practice (DTP), a recent book on the topic that can be accessed freely online through the library.

Tentative Schedule and Syllabus

Date Topic & Readings Other
01/25 Introduction: Course Mechanics and Overview Slides
02/01
  • Chapter 1 DTP
  • Dapper
  • Pick a language and follow instructions at OpenTelemetry to collect an initial set of logs.
Slides
02/08
  • Reading: Pivot Tracing
  • Reading: Snicket
  • Practical: Set up Istio on microk8s. Official documentation on installing microk8s can be found here, and documentation on installing Istio as an addon can be found here and here.
  • Get your program from last week running on your microk8s setup. You can find some instructions here or here, and there is a wealth of content on the internet about this. If you have trouble, post on Campuswire, and the other students (and I) will try to help you out.
Whiteboard
02/15
  • Reading: Friday
  • Reading: DeMI
  • CloudLab: If you had problems with Istio, please go to CloudLab to sign up, and ask to join the nyu-bdml-class project. Use the Getting Started guide to set up a single machine, and install microk8s and Istio on it. You might want to write a script for this, since CloudLab machines are granted only for short durations (a few weeks at most).
  • Trace collection: Use Istio to collect traces from the program you developed in the first class. You can find documentation for this here.
  • Trace collection: Change the trace sampling frequency, and record how performance, memory, and disk usage change.
Whiteboard
02/22 Whiteboard
03/01 Whiteboard
03/08 MIDTERM MIDTERM
03/15 SPRING BREAK
03/22
  • READING: tprof
  • READING: GMTA
  • READING: TraVista
  • Final Project: By 11:59pm on Friday 03/24 post a note on Campuswire (1 per group) stating what you are doing for your final project, and a rough timeline (at the granularity of weeks) for when things will be done. You should of course also start working on the final project now.
Whiteboard
03/29 Whiteboard
04/05 Whiteboard
04/12 Whiteboard
04/19 Whiteboard
04/26 Whiteboard
05/03 Poster Session
Report is due on 05/08.

Grading

Grading will be based on quality of work and presentation. The grade breakdown is as follows (this might change before the semester begins):

  • 30% for three homework assignments. These are checkpoints that require you to collect the work you have done for class and submit it, rather than projects you need to do from scratch.
  • 15% for class participation. This 15% will be split into two portions:
    1. 5% for participating in in-class discussion.
    2. 10% for helping other students on Campuswire, writing tutorials or notes, and class presentations.
  • 25% for the final project: This should be done in groups of 2 or 3 people.
  • 10% Midterm, 20% Final exam. By default both are in-person.