Introduction to Machine Learning

Course#: CSCI-UA.0480-002

Instructor: Mehryar Mohri

TA: Andres Munoz Medina


Course Description

This course introduces several fundamental concepts and methods for machine learning. The objective is to familiarize the audience with some basic learning algorithms and techniques and their applications, as well as general questions related to analyzing and handling large data sets. Several publicly available software libraries and data sets will be used to illustrate the application of these algorithms. The emphasis will thus be on machine learning algorithms and applications, with some broad explanation of the underlying principles. The main topics covered are:

- Basic machine learning concepts and examples.
- Basic probability notions.
- Bayesian inference.
- Nearest-neighbor algorithms.
- On-line learning (Halving, Weighted Majority, Perceptron, Winnow).
- Support vector machines.
- Kernel methods.
- Decision trees.
- Ensemble methods (Boosting, Bagging).
- Logistic regression.
- Density estimation, ML, MAP, Maxent models.
- Multi-class classification (Conditional Maxent models, binary classifiers and error-correction codes).
- Regression (linear regression, Kernel Ridge Regression, Lasso, neural networks).
- Clustering (K-means, DT clustering).
- Dimensionality reduction (PCA, KPCA).
- Introduction to reinforcement learning.
- Elements of learning theory.
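As a small taste of the first algorithms on this list, here is a minimal sketch of a nearest-neighbor classifier in plain Python. The data set and function name are illustrative examples, not course material:

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest
    training points under Euclidean distance."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy 2-D data set with two well-separated classes.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
print(knn_predict(train, (0.15, 0.1)))  # query near the "a" cluster
```

In the course itself, such algorithms will be illustrated with the publicly available libraries and data sets mentioned above rather than toy code.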

Location and Time

Room 109 Warren Weaver Hall,

251 Mercer Street.

Mondays and Wednesdays 3:30PM - 4:45PM.

Prerequisite

None beyond standard mathematical maturity: the course will introduce all the basic concepts needed in probability and statistics.

Projects and Assignments

There will be a mid-term exam, about five assignments, and a project. The final grade is essentially the average of the exam, assignment, and project grades. As in all CS courses, the standard high level of integrity is expected from all students.

Lectures

- Lecture 01: Introduction.
- Lecture 02: Basic probability notions.
- Lecture 03: Bayesian learning.
- Lecture 04: Nearest-Neighbor Algorithms.
- Lecture 05: On-line learning.
- Lecture 06: Perceptron and Winnow.
- Lecture 07: Convex optimization.
- Lecture 08: Support Vector Machines (SVMs).
- Lecture 09: Kernel methods.
- Lecture 10: Decision trees.
- Lecture 11: Boosting.
- Lecture 12: Bagging.
- Lecture 13: Multi-class classification.
- Lecture 14: Density estimation, Maximum Likelihood, Maxent models.
- Lecture 15: Regression.
- Lecture 16: Ranking.

Textbooks

There is no single textbook covering the material presented in this course. Here is a list of books recommended for further reading in connection with the material presented:

- Tom Mitchell. *Machine Learning*. McGraw-Hill, 1997.
- Richard O. Duda, Peter E. Hart, and David G. Stork. *Pattern Classification (2nd edition)*. Wiley, New York, 2001.

Software

Here is a list of some of the software tools used to illustrate the applications of the algorithms discussed:

Technical Papers

An extensive list of recommended papers for further reading will be provided in the lecture slides.

Homework

- Homework 1 [solution]
- Homework 2 [solution]
- Homework 3 [solution]

Exams