Inference and Representation

DS-GA-1005 and CSCI-GA.2569 Fall 2014


This course covers graphical models, causal inference, and advanced topics in statistical machine learning. A graphical model is a probabilistic model, where the conditional dependencies between the random variables are specified via a graph. Graphical models provide a flexible framework for modeling large collections of variables with complex interactions, as evidenced by their wide domain of application, including for example machine learning, computer vision, speech and computational biology. This course will provide a comprehensive survey of learning and inference methods in graphical models.

Prerequisite: DS-GA-1003/CSCI-GA.2567 (Machine Learning and Computational Statistics), or permission of instructor. Please note that seats are limited and exceptions to the prerequisite are only likely to be granted to PhD students.

General information

Lecture: Tuesdays, 7:10-9:00pm, in Warren Weaver Hall 101
Recitation/Laboratory (required for all students): Thursdays, 5:10-6:00pm in Silver Center 401

David Sontag   
dsontag {@ | at}
Lab instructor:
Yacine Jernite

jernite {@ | at}

Prasoon Goyal

pg1338 {@ | at}

Office hours: Tuesdays, 10:30-11:30am. Location: Center for Data Science, 726 Broadway, 7th Floor

Grading: problem sets (55%) + midterm exam (20%) + final exam (20%) + participation (5%). Problem Set policy

Book (required): Kevin Murphy, Machine Learning: a Probabilistic Perspective, MIT Press, 2012. I recommend the latest (4th) printing, as the earlier editions had many typos. You can tell which printing you have as follows: check the inside cover, below the "Library of Congress" information. If it says "10 9 8 ... 4" you've got the (correct) fourth print.

Mailing list: To subscribe to the class list, follow instructions here.


Draft Schedule (Readings refer to the Murphy book)

Week Date Topic Readings Assignments
1 Sept 2
Introduction, Bayesian networks
Chapter 10 (Bayesian networks)
Sec. 17.3, 17.4, 17.5.1 (HMMs)

How to write a spelling corrector (optional)

Introduction to Probabilistic Topic Models (optional)

ps1 due Sept. 15 at 10pm

2 Sept 9
Undirected graphical models [Slides]

Conditional random fields
Sec. 9-9.2.5 (exponential family)
Sec. 19-19.4.4, 19.6 (MRFs, CRFs)

An Introduction to Conditional Random Fields (section 2)

Original paper introducing CRFs (optional)

Application to camera lens blur (optional)

Application to predicting wind flow (optional)

Sept 16
Learning and causal inference
[Slides] [BN structure slides]

Maximum likelihood learning, structure learning, causal networks
Sec. 26-26.6 (structure learning & causal inference)
Sec. 2.8.2 (KL-divergence)

Overview of causal inference in statistics by Judea Pearl (sec. 3.2; optional)

Paper on causal inference used for computational advertising (optional)

Science paper on learning gene regulatory networks (optional)

Learning causal networks using interventions (optional)
ps2 due Sept. 26 at 5pm
Sept 23
Exact inference [Slides]

Variable elimination, treewidth
Chapter 20

ps3 (data) due Oct. 10 at 5pm

Sept 30
Belief propagation
[Slides] [Notes]
Sec. 22.2 (loopy belief propagation)

Using loopy BP for stereo vision (optional)


Oct 7
Integer linear programming
(MAP inference)

Linear programming relaxations, dual decomposition
Sec. 22.6 (MAP inference)

Derivation relating dual decomposition & LP relaxations

Introduction to Dual Decomposition for Inference (optional)

Integer Programming for Bayesian Network Structure Learning (optional)

Oct 21

(no class Oct 14, Fall Recess; will hold office hours on 10/14)
Midterm exam (in class)


Oct 28
Variational inference [Slides]

Mean-field approximation
Sec. 21.1-21.4

Graphical models, exponential families, and variational inference (optional)

Stochastic Variational Inference (optional)

ps4 (data) due Nov. 17 at 5pm

Nov 4
Monte-Carlo methods for inference [Slides 1] [Slides 2]

Importance sampling, Metropolis Hastings, Gibbs sampling
Sec. 23.1-23.4
Sec. 17.2.3
Sec. 24.1-24.4


Nov 11
More variational inference [Slides]
TRW, loopy BP
Sec. 22.3-22.4


Nov 18 Discovering latent variables using the method-of-moments
[Slides 1] [Slides 2]

Guest lecture by Daniel Hsu
A Method of Moments for Mixture Models and Hidden Markov Models (Sec. 1 & 2)

Method of Moments for Topic Modeling (optional)

11 Nov 25 Structured prediction
Sec. 19.7 (also, review 19.6 again)

Tutorial on Structured Prediction (optional)

Original paper introducing structured perceptron (optional)

Cutting-Plane Training of Structural SVMs (optional)

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs (optional)

ps5 (data) due Dec. 8 at 5pm 

12 Dec 2
Learning Markov networks
Slides 1] [Slides 2]

Pseudo-likelihood, deep generative models
Sec. 19.5, 26.7-26.8
Sec. 27.7, 28.2

Notes on pseudo-likelihood

An Introduction to Conditional Random Fields (section 4; optional)

Approximate maximum entropy learning in MRFs (optional)

13 Dec 9
Advanced topics in learning

Probabilistic programming

Talk for recent ICML 2014 paper on deep generative models (optional)

Probabilistic programming: overview and bibliography (optional)

Probabilistic programming: BUGS (optional)

Probabilistic programming: Church book (optional)

Dec 16
Final exam (in class)

Acknowledgements: Many thanks to the Toyota Technological Institute, Hebrew University, UC Berkeley, and Stanford University for sharing material used in slides and homeworks

Reference materials

Problem Set policy

I expect you to try solving each problem set on your own. However, when being stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:

  1. You may discuss a problem with any student in this class, and work together on solving it. This can involve brainstorming and verbally discussing the problem, going together through possible solutions, but should not involve one student telling another a complete solution.

  2. Once you solve the homework, you must write up your solutions on your own, without looking at other people's write-ups or giving your write-up to others.

  3. In your solution for each problem, you must write down the names of any person with whom you discussed it. This will not affect your grade.

  4. Do not consult solution manuals or other people's solutions from similar courses.
Late submission policy During the semester you are allowed at most two extensions on the homework assignment. Each extension is for at most 48 hours and carries a penalty of 25% off your assignment.