Introduction To Machine Learning

David Sontag

Undergraduate Division

Computer Science

In this undergraduate-level class on machine learning, students will learn about the theoretical foundations of machine learning and how to apply machine learning to solve new problems. Machine learning is an exciting and fast-moving field of computer science with many recent consumer applications (e.g., Microsoft Kinect, Google Translate, the iPhone's Siri, digital camera face detection, Netflix recommendations, Google News) and applications within the sciences and medicine (e.g., predicting protein-protein interactions, species modeling, detecting tumors, personalized medicine).

In the first part of the course, we will cover supervised prediction algorithms including linear and logistic regression, support vector machines, and nearest-neighbor methods. In the second part of the course, we will cover clustering (e.g., K-means), dimensionality reduction (e.g., PCA), recommender systems, density estimation, Bayesian networks, and time-series modeling (e.g., hidden Markov models). Students will learn about "big data" and how to do large-scale machine learning.

The course will have a strong focus on real-world problems, and the applications mentioned earlier will be used to motivate each of the lectures. Assignments will involve real applications of machine learning.

Grading: about 7 homework assignments, a midterm, and a final exam.

Prerequisites: Students should be very comfortable with basic mathematics and have good programming skills. Some knowledge of probability theory and statistics, linear algebra, and multivariable calculus will be helpful. Basic Algorithms (CSCI-UA.0310) is a prerequisite, although well-qualified students taking it during Fall 2012 will be accepted with permission of the instructor. Please contact the professor with any additional questions.

Textbooks: No textbook is required for this class. However, where possible, we will suggest readings in the following two (optional) textbooks:

Students may find it helpful to purchase one of these books. Bishop's book is much easier to read, whereas Murphy's book has substantially more depth and coverage (and is up to date).

Readings: Excited about the class and want to get an early start on the reading? The assigned reading for the first week is the (freely available) first chapter, "Introduction to Machine Learning", of Murphy's book.