Instructors of record: Yann LeCun, Denis Zorin
Coordinator: Alfredo Canziani
Credits: 2
Grading: Pass/Fail

Offered: Spring 2021

This team-taught course provides a high-level overview of the key ideas and technologies that have led to revolutionary changes in Artificial Intelligence (AI) and to the explosive growth in its practical applications. Taught by a team of NYU's top experts in artificial intelligence, led by Turing Award winner Yann LeCun, the course will introduce students to the fundamentals of AI and its key sub-areas, including machine learning, natural language processing, and computer vision, as well as its applications in several domains.

The course will be complementary to “Thinking, Learning, and Consciousness in Humans and Machines”, which considers the impact of AI technologies on society and the relationship between human and artificial intelligence. Our course focuses on the key scientific ideas underlying revolutionary advances in AI technology.

The course will consist of a sequence of 14 lectures, approximately 45 minutes each. The lectures will be pre-recorded and watched asynchronously. There will be a weekly discussion session with that week's presenters (usually two), and an online discussion board. Assessment will be based on brief weekly online questionnaires.

 

Syllabus of Lectures

  1. Introduction; Deep Learning revolution in AI. (Yann LeCun)
  2. AI for understanding visual information: computer vision. Machine learning foundations of computer vision (Rob Fergus)
  3. Vision continued (Laurens van der Maaten)
  4. Natural language processing: machine learning foundations, language generation, dialogue (Kyunghyun Cho)
  5. Engineering behind AI dialogue systems (Emily Dinan)
  6. Speech understanding and synthesis (Michael Picheny)
  7. Autonomous robots: reinforcement learning with robotics applications (Lerrel Pinto)
  8. Machine learning for robotic manipulation (Andy Zeng)
  9. AI and games: how machines learned to beat humans in games (Noam Brown)
  10. AI for medicine: saving lives with machine learning. (Rajesh Ranganath)
  11. Machine learning in medical imaging: challenges and opportunities (Krzysztof J. Geras)
  12. Fairness in AI (Emily Denton)
  13. Conclusion: What is next in AI? (Yann LeCun)

Lesson 1 – Deep Learning revolution in AI

Slides and episodes: W1S2, W1E2.1, W1E2.2, W1E2.3, W1E2.4.

Lecturer: Yann LeCun

Silver Professor of Computer Science, Data Science, and Neural Science. VP and Chief Scientist, Facebook; ACM Turing Award Laureate

Biography

Yann LeCun is VP & Chief AI Scientist at Facebook and Silver Professor at NYU affiliated with the Courant Institute of Mathematical Sciences & the Center for Data Science. He was the founding Director of Facebook AI Research and of the NYU Center for Data Science. He received an Engineering Diploma from ESIEE (Paris) and a PhD from Sorbonne Université. After a postdoc in Toronto he joined AT&T Bell Labs in 1988, and AT&T Labs in 1996 as Head of Image Processing Research. He joined NYU as a professor in 2003 and Facebook in 2013. His interests include AI, machine learning, computer perception, robotics, and computational neuroscience. He is the recipient of the 2018 ACM Turing Award (with Geoffrey Hinton and Yoshua Bengio) for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing", a member of the National Academy of Engineering and a Chevalier de la Légion d’Honneur.

Lecture topics: Brief history of AI, what AI can do, basic ideas of ML, supervised learning, gradient descent, multilayer networks, ConvNets
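The mechanics of gradient descent, one of the topics listed above, can be sketched in a few lines of Python. This is a toy example invented for illustration (fitting a line to five points), not material from the lecture itself:

```python
# Minimal illustration of gradient descent: fit y = w*x + b to data
# by repeatedly stepping the parameters against the loss gradient.

def gradient_descent(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error L = (1/n) * sum((w*x + b - y)^2)
        dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw  # step downhill in w
        b -= lr * db  # step downhill in b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 2), round(b, 2))  # recovers slope 2 and intercept 1
```

The same loop, with the gradients computed automatically and many more parameters, is the core of how deep networks are trained.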

Reading materials

Upon completion, students will be able to

  • Explain the origins of the surge of interest in AI.

  • Describe what modern AI systems can and cannot do.

  • Explain what Machine Learning and Deep Learning are.

  • Describe the basic building blocks of simple neural networks.

 

Lesson 2.1 – AI for understanding visual information

Slides and episodes: W2S1, W2E1.1, W2E1.2, W2E1.3.

Lecturer: Robert Fergus

Professor of Computer Science, NYU; Research Scientist, Google DeepMind

Biography

Rob Fergus is a Professor of Computer Science at NYU. He is also a research scientist at Google DeepMind. He has degrees from Cambridge, Oxford and Caltech, all in Electrical Engineering. Before coming to NYU, he spent two years as a postdoc at MIT. His research focuses on machine learning, computer vision and deep learning.

Lecture topics: Computer vision, object recognition, generative image models, self-supervised learning.

Reading materials

Upon completion, students will be able to

have a basic understanding of how object classification and detection are performed with deep convolutional models. They will also comprehend how these models can be used to generate images, both photorealistic and artistic.

 

Lesson 2.2 – Computer vision in the real world

Slides and episodes: W2S2, W2E2.1, W2E2.2, W2E2.3, W2E2.4.

Lecturer: Laurens van der Maaten

Research Director, Facebook AI Research

Biography

Laurens van der Maaten is a Research Director at Facebook AI Research in New York, where he leads FAIR's New York site. Prior to that, he worked as an Assistant Professor at Delft University of Technology (The Netherlands) and as a postdoctoral researcher at the University of California, San Diego. He received his PhD from Tilburg University (The Netherlands) in 2009. With collaborators from Cornell University, he won the Best Paper Award at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017. He is an editorial board member of IEEE Transactions on Pattern Analysis and Machine Intelligence and regularly serves as an area chair for premier conferences, including NeurIPS, ICML, and CVPR. Laurens is interested in a variety of topics in machine learning and computer vision.

Lecture topics: Computer vision, object recognition.

Reading materials

Upon completion, students will be able to

understand the various ways in which computer vision techniques can be used in real-world products. They will also have an understanding of some of the aspects that are important when computer vision is employed in the real world: in particular, the need for large-scale representative training data, the importance of making predictions in a computationally efficient way, and the necessity of analyzing how well a computer-vision system works for all its potential users.

Lesson 3.1 – Autonomous robots: teaching robots to drive and fly safely

Slides and episodes: W3S1, W3E1.1, W3E1.2.
Lecturer: Lerrel Pinto
Assistant Professor of Computer Science

Biography

Lerrel Pinto is an assistant professor of Computer Science. His research interests focus on learning and computer vision for robots. He received a PhD degree from CMU in 2019; prior to that he received an MS degree from CMU in 2016, and a B.Tech in Mechanical Engineering from IIT-Guwahati. His work on large-scale learning for grasping received the Best Student Paper award at ICRA 2016. Several of his works have been featured in popular media like TechCrunch, MIT Tech Review and BuzzFeed among others.

Lecture topics: This lecture will cover machine learning techniques for a variety of real-world applications such as robotic manipulation, navigation, and self-driving cars. We will explore the core technology and ideas that make these applications possible and discuss areas that require additional research. Topics will include neural networks for predicting robotic actions; reinforcement learning for dexterous manipulation; and imitation learning from humans.

Reading materials

Upon completion, students will be able to

  • Understand how deep neural networks can be used for robotics

  • Understand the foundations of reinforcement learning

  • Understand the foundations of imitation learning from experts.
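As a toy illustration of the reinforcement-learning foundations listed above, the following sketch runs tabular Q-learning on a hypothetical five-state corridor where the agent is rewarded only for reaching the rightmost state. The environment and all parameter values are invented for illustration and are not from the lecture:

```python
import random

# Tabular Q-learning on a 1-D corridor: states 0..4, actions 0 (left)
# and 1 (right), reward 1.0 for reaching state 4.

def q_learning(n_states=5, episodes=1000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if rng.random() < eps:
                a = rng.randrange(2)               # explore a random action
            else:
                a = 0 if q[s][0] > q[s][1] else 1  # exploit (ties go right)
            s2 = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: move q[s][a] toward reward + discounted best future value
            q[s][a] += alpha * (reward + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
policy = [1 if q[s][1] >= q[s][0] else 0 for s in range(4)]
print(policy)  # the learned greedy policy moves right at every state
```

Real robotics problems replace the table with a neural network, but the trial-and-error update rule is the same idea.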

 

Lesson 3.2 – Machine Learning for Robotic Manipulation

Slides and episodes: W3S2, W3E2.1, W3E2.2, W3E2.3, W3E2.4, W3E2.5.
Lecturer: Andy Zeng
Research Scientist in Robotics at Google AI

Biography

Andy Zeng is a Research Scientist in Robotics at Google AI. His research lies at the intersection of robotics, machine learning, and computer vision. In particular, he is interested in developing algorithms that enable machines to intelligently perceive and interact with the physical world and improve themselves over time. Andy received his Bachelor's degrees in Computer Science and Mathematics (‘15) at UC Berkeley and received his PhD in Computer Science (‘19) at Princeton University. He led machine learning for Team MIT-Princeton at the Amazon Picking Challenge (‘17), winning 1st place (stow task) worldwide. He is a recipient of several awards including the Best System Paper Award at RSS (‘19) and the Best Manipulation System Paper Award at Amazon (‘18), and has been a finalist for Best Paper Awards at ICRA’20, RSS’19, and IROS’18. His research has been recognized through the Gordon Y.S. Wu Fellowship in Engineering and Wu Prize (‘16), the NVIDIA Fellowship (‘18), and the Princeton SEAS Award for Excellence (‘18), and his work has been featured in many popular press outlets, including the New York Times, BBC, and Wired.

Lecture topics

  • Challenges of supervised robot learning.

  • Examples of self-supervision in robotics.

  • Closing the perception-action loop.

  • Spatial structure and residual physics in robot learning.

Reading materials

Upon completion, students will be able to

  • Summarize some of the key challenges in robot learning.

  • Describe the role of machine learning in closing the perception-action loop.

  • Give examples of using self-supervision for robot learning.

  • Discuss ideas that improve robot learning efficiency: shape symmetries for grasping, residual physics.

 

Lesson 4.1 – Natural language processing: machine learning foundations, language generation, dialogue

Slides and episodes: W4S1, W4E1.1, W4E1.2, W4E1.3, W4E1.4, W4E1.5.
Lecturer: Kyunghyun Cho
Associate Professor of Computer Science and Data Science

Biography

Kyunghyun Cho is an associate professor of computer science and data science at New York University and a CIFAR Associate Fellow. He was a postdoctoral fellow at the University of Montreal until Summer 2015 under the supervision of Prof. Yoshua Bengio, after receiving MSc and PhD degrees from Aalto University in April 2011 and April 2014, respectively, under the supervision of Prof. Juha Karhunen, Dr Tapani Raiko and Dr Alexander Ilin. He tries his best to find a balance between machine learning, natural language processing, and life, but almost always fails to do so.

Lecture topics: NLP, language generation, machine translation, and what happens in dialogue systems like GPT.

Reading materials

Upon completion, students will be able to

understand the attention mechanism, which has become one of the most fundamental building blocks in modern natural language processing systems, and know the scientific thought process by which the attention mechanism was invented and developed. Furthermore, students will learn about various applications in which the attention mechanism and sequence models are used, such as dialogue modelling.
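The core computation of (dot-product) attention can be sketched in a few lines. This toy example, with a single two-dimensional query against two keys and no learned projections, is invented for illustration only:

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability, then normalize exponentials.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    # Score each key against the query (dot product), turn the scores
    # into weights with softmax, then return the weighted average of values.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attention([5.0, 0.0], keys, values)  # query aligns with the first key
print([round(x, 2) for x in out])  # output dominated by the first value
```

In a real NLP model the queries, keys, and values are learned projections of word representations, and this operation lets every output position softly "look at" the most relevant input positions.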

 

Lesson 4.2 – Building an open-domain dialogue agent

Slides and episodes: W4S2, W4E2.1, W4E2.2, W4E2.3, W4E2.4.
Lecturer: Emily Dinan
Research Engineer, Facebook AI Research

Biography

Emily Dinan is a Research Engineer at Facebook AI Research in New York. Her research interests include conversational AI, natural language processing, and fairness and responsibility in these fields. Recently she has focused on methods for preventing conversational agents from reproducing biased, toxic, or otherwise harmful language. Prior to joining FAIR, she received her master's degree in Mathematics from the University of Washington.

Lecture topics

Can we train an AI agent to speak to humans in the way that humans speak to each other? Much progress has been made in the field of open-domain dialogue by training large-scale sequence-to-sequence models on data from the internet. Recent models can hold short, intelligible conversations on topics ranging from eating ramen to playing frisbee. However, many limitations still remain: these models tend to hallucinate knowledge, repeat and contradict themselves, and reproduce harmful language from the training data. We will define the problem of creating an open-domain dialogue agent and discuss recent advances in NLP that led to progress in this field, as well as the limitations of state-of-the-art models, remaining challenges, and unanswered questions.

Reading materials

Advanced materials (optional)

Upon completion, students will be able to

understand the goal of creating a superhuman open-domain dialogue agent and recent progress towards this goal, based on advances in natural language processing and large-scale computing. Students will furthermore learn the limitations of existing approaches, such as lack of knowledge, lack of long-term memory, spurious repetitions, and contradiction, as well as bias and toxicity. Finally, students will understand the difficulty of evaluating dialogue agents due to the unconstrained nature of the problem.

 

Lesson 5.1 – AI for Speech Processing: Overview and Recent Developments

Slides and episodes: W5S1, W5E1.1, W5E1.2, W5E1.3, W5E1.4.
Lecturer: Michael Picheny
Research Professor, Computer Science and Data Science, Formerly IBM T. J. Watson Research Center

Biography

Michael Picheny received his BS, MS, and PhD degrees from the Massachusetts Institute of Technology, Cambridge. He recently joined New York University as a research professor in Courant Computer Science and the Center for Data Science, after many years as the senior manager of speech technologies in IBM Research AI at the IBM T.J. Watson Research Center, Yorktown Heights, New York. He has worked in speech recognition since 1981, when he joined IBM. He served as an associate editor of IEEE Transactions on Acoustics, Speech, and Signal Processing from 1986 to 1989 and as the chair of the Speech Technical Committee of the IEEE Signal Processing Society from 2002 to 2004. He was a member of the board of the International Speech Communication Association from 2005 to 2013 and was named an International Speech Communication Association Fellow in 2014. He was the general chair of the 2011 IEEE Automatic Speech Recognition and Understanding Workshop. He is currently a distinguished industry speaker of the IEEE Signal Processing Society. He has published numerous papers in journals and at conferences on nearly every aspect of speech recognition. He is a Fellow of the IEEE.

Lecture topics

The lecture will cover the basics of four key long-standing problems in speech processing today: Speech Recognition, Speech Synthesis, Speaker Verification, and Spoken Language Understanding, and how these areas have been revolutionized through breakthroughs in Deep Learning.

Reading materials

Upon completion, students will be able to

have a basic understanding of key issues in speech processing, and of how deep learning's huge impact on performance in these tasks has finally made it possible to create deployable, usable systems that now penetrate all aspects of society.

 

Lesson 5.2 – Adversarial search and learning, regret minimization, AI for Go, AI for poker

Slides and episodes: W5S2, W5E2.1, W5E2.2, W5E2.3.
Lecturer: Noam Brown
Research Scientist, Facebook AI Research NYC

Biography

Noam Brown is a Research Scientist at Facebook AI Research. He earned his PhD at Carnegie Mellon University. His research interests focus on multi-agent artificial intelligence and computational game theory. He has applied this research toward creating Libratus and Pluribus, the first AIs to defeat top humans in no-limit poker. Pluribus was named one of Science Magazine’s Top Ten Breakthroughs of the Year for 2019 and Libratus was a finalist for the same award in 2017. Noam received a NeurIPS Best Paper award in 2017, the 2017 Allen Newell Award for Research Excellence, and the 2019 Marvin Minsky Medal for Outstanding Achievements in AI. He was named a 2019 Innovator Under 35 by MIT Tech Review.

Lecture topics: Adversarial search and learning, regret minimization, multi-agent learning, and AI systems for Go and poker such as AlphaZero, Libratus, and Pluribus.

Reading materials

Optional reading materials

Upon completion, students will be able to

  • Explain the concept of a Nash equilibrium

  • Explain how multi-agent learning differs from learning in single-agent settings

  • Describe regret minimization

  • Describe AlphaZero and Pluribus at a high level
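Regret minimization, one of the outcomes above, can be illustrated with regret matching in self-play on Rock-Paper-Scissors, where each player's average strategy approaches the game's Nash equilibrium (1/3, 1/3, 1/3). This is a toy sketch invented for illustration, not code from the lecture:

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """+1 if action a beats b, -1 if it loses, 0 on a tie."""
    return 0 if a == b else (1 if (a - b) % 3 == 1 else -1)

def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    return [p / total for p in positives] if total > 0 else [1.0 / ACTIONS] * ACTIONS

def train(iterations=20000, seed=0):
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strats = [strategy_from_regrets(r) for r in regrets]
        acts = [rng.choices(range(ACTIONS), weights=s)[0] for s in strats]
        for p in range(2):
            opp = acts[1 - p]
            got = payoff(acts[p], opp)
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than what we got.
                regrets[p][a] += payoff(a, opp) - got
                strategy_sums[p][a] += strats[p][a]
    total = sum(strategy_sums[0])
    return [s / total for s in strategy_sums[0]]  # player 0's average strategy

avg = train()
print([round(p, 2) for p in avg])  # close to uniform, the Nash equilibrium
```

Counterfactual regret minimization, the algorithm family behind Libratus and Pluribus, extends this same idea to games with hidden information and sequential decisions.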

 

Lesson 6.1 – AI for medicine: saving lives with machine learning

Slides and episodes: W6S1, W6E1.1, W6E1.2, W6E1.3.
Lecturer: Rajesh Ranganath
Assistant Professor of Computer Science

Biography

Rajesh Ranganath is an Assistant Professor at the Courant Institute at NYU in Computer Science and at the Center for Data Science. He is also part of the CILVR group. His research interests centre on easy-to-use probabilistic inference, understanding the role of randomness and information in model building, and machine learning for healthcare. Before joining NYU, he completed his PhD at Princeton working with Dave Blei, and his undergraduate degree at Stanford, both in computer science. He has also spent time as a research affiliate at MIT’s Institute for Medical Engineering and Science.

Lecture topics: Healthcare data, AI models for medical data, with examples in sepsis and COVID-19.

Reading materials

Upon completion, students will be able to

  • Understand the potential and challenges in AI for medicine

  • Explain the basic pipeline of building and translating a model to care

  • Understand the basics of prediction vs. causality, interpretation, and bias

 

Lesson 6.2 – Machine learning in medical imaging: challenges and opportunities

Slides and episodes: W6S2, W6E2.1, W6E2.2, W6E2.3.
Lecturer: Krzysztof Geras
Assistant Professor of Radiology

Biography

Krzysztof is an assistant professor at the NYU School of Medicine and an affiliated faculty member at the NYU Center for Data Science. His main interests are in unsupervised learning with neural networks, model compression, transfer learning, evaluation of machine learning models, and applications of these techniques to medical imaging. He previously did a postdoc at NYU with Kyunghyun Cho, a PhD at the University of Edinburgh with Charles Sutton, and an MSc as a visiting student at the University of Edinburgh with Amos Storkey. His BSc is from the University of Warsaw.

Lecture topics: Explainable AI, learning from very large images, applications in medical imaging, breast cancer diagnosis.

Reading materials

Upon completion, students will be able to

  • Understand the breadth of applications of machine learning in medical imaging.

  • Understand the opportunities and challenges for machine learning in medical imaging.

  • Explain how different machine learning tasks (classification, segmentation, detection, localization, reconstruction) are applied in medical image analysis.

  • Explain how neural networks are practically applied to the classification of high-resolution medical images.

 

Lesson 7 – Data ethics and algorithmic fairness

Slides and episodes: W7S1, W7E1.1, W7E1.2, W7E1.3.
Lecturer: Emily Denton
Senior Research Scientist, Google Brain

Biography

Emily Denton is a Research Scientist on Google’s Ethical AI team, studying the societal impacts of machine learning and AI technology. Their recent research centres on critically examining the norms, values, and work practices that structure the development and use of machine learning datasets. Prior to joining Google, Emily received their PhD in machine learning from the Courant Institute of Mathematical Sciences at New York University, focusing on unsupervised learning and generative modelling of images and video.

Lecture topics

  • Data ethics

  • Algorithmic fairness

Reading materials

(Optional) Reading materials

Race After Technology

Upon completion, students will be able to

  • Identify ethical issues associated with machine learning research and applications

  • Identify ethical issues along the dataset development pipeline

  • Improve ability to critically reflect on research culture, norms, and values

 

Lesson 8 – The future of AI

Slides and episodes: W7S2, W7E2.1, W7E2.2, W7E2.3, W7E2.4, W7E2.5.
Lecturer: Yann LeCun
Silver Professor of Computer Science, Data Science, and Neural Science. VP and Chief Scientist, Facebook; ACM Turing Award Laureate

Biography

Yann LeCun is VP & Chief AI Scientist at Facebook and Silver Professor at NYU affiliated with the Courant Institute of Mathematical Sciences & the Center for Data Science. He was the founding Director of Facebook AI Research and of the NYU Center for Data Science. He received an Engineering Diploma from ESIEE (Paris) and a PhD from Sorbonne Université. After a postdoc in Toronto he joined AT&T Bell Labs in 1988, and AT&T Labs in 1996 as Head of Image Processing Research. He joined NYU as a professor in 2003 and Facebook in 2013. His interests include AI, machine learning, computer perception, robotics, and computational neuroscience. He is the recipient of the 2018 ACM Turing Award (with Geoffrey Hinton and Yoshua Bengio) for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing", a member of the National Academy of Engineering and a Chevalier de la Légion d’Honneur.