End-to-end Optimization of Discrete Latent Structures
Speaker: Mohammad Norouzi, Google
Location: 60 Fifth Avenue, Room 150
Date: April 3, 2019, 2 p.m.
Host: Subhash Khot
Discrete latent variables offer computational efficiency and interpretability, and they enable machine learning models to interact with traditional software such as databases. Despite these attractive properties, discrete variables are difficult to optimize because they entail discontinuous objective functions for which direct stochastic relaxation is often ineffective. This talk introduces two general approaches to optimizing discrete variable models, inspired by value-based and policy-based reinforcement learning, with applications to end-to-end speech recognition and program synthesis. For structured losses such as word error rate in speech recognition, I present a dynamic programming algorithm that identifies the optimal extension of each partial sequence, together with a way to distill such optimal extensions into a neural network; this yields a significant reduction in word error rate compared to standard probabilistic models. For unstructured losses (e.g., sparse success/failure feedback), I present a novel way to leverage a memory buffer of promising trajectories to reduce the variance of policy gradient estimates, yielding state-of-the-art weakly supervised program synthesis from natural language input. Finally, I discuss geometric latent structures and present a novel approach to learning optimal sets of category-specific keypoints for pose estimation.
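The dynamic-programming idea for structured losses is close in spirit to optimal completion distillation: given a partial hypothesis, compute its edit distance to every prefix of the ground-truth sequence, and read off the next tokens that keep the achievable error minimal. A minimal sketch of that idea (the function name and the `</s>` end marker are illustrative, not from the talk):

```python
def optimal_extensions(prefix, target, eos="</s>"):
    """Tokens that extend `prefix` toward the lowest edit distance to `target`.

    dist[j] holds the edit distance between `prefix` and target[:j],
    computed row by row with the standard Levenshtein recurrence.
    """
    T = len(target)
    dist = list(range(T + 1))  # distance of the empty prefix to each target prefix
    for c in prefix:
        prev, dist = dist, [dist[0] + 1]
        for j in range(1, T + 1):
            dist.append(min(prev[j] + 1,                          # delete c
                            dist[j - 1] + 1,                      # insert target[j-1]
                            prev[j - 1] + (c != target[j - 1])))  # substitute
    best = min(dist)
    # If target[:j] is among the cheapest matches, target[j] is an optimal
    # next token; matching the full target makes end-of-sequence optimal.
    return {target[j] if j < T else eos for j in range(T + 1) if dist[j] == best}
```

For example, `optimal_extensions("cap", "cat")` returns `{'t', '</s>'}`: after one substitution error, emitting `t` or stopping are equally good continuations, and a model can be trained to put its probability mass on exactly this set.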
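The memory-buffer idea for sparse rewards can be illustrated on a toy search problem: rewarded trajectories found during sampling are stored, the buffer's contribution to the policy gradient is computed exactly by enumeration (weighted by each trajectory's current probability), and only the remainder is estimated by sampling, which shrinks the variance of the estimate. Everything below (the two-token task, the names, the learning rate) is an illustrative toy sketch, not code from the talk:

```python
import math
import random

random.seed(0)

SEQ_LEN, VOCAB = 2, 2
TARGET = (1, 0)  # the single rewarded "program" in this toy task
logits = [[0.0] * VOCAB for _ in range(SEQ_LEN)]  # per-position categorical policy

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    z = sum(e)
    return [v / z for v in e]

def seq_prob(seq):
    p = 1.0
    for i, a in enumerate(seq):
        p *= softmax(logits[i])[a]
    return p

def grad_log_prob(seq):
    # d log pi(seq) / d logits per position: one-hot minus probabilities
    return [[(1.0 if k == a else 0.0) - softmax(logits[i])[k]
             for k in range(VOCAB)] for i, a in enumerate(seq)]

def sample():
    seq = []
    for i in range(SEQ_LEN):
        p, r, acc = softmax(logits[i]), random.random(), 0.0
        for k in range(VOCAB):
            acc += p[k]
            if r <= acc:
                seq.append(k)
                break
    return tuple(seq)

def reward(seq):
    return 1.0 if seq == TARGET else 0.0  # sparse success/failure feedback

buffer, lr = set(), 0.5
for step in range(300):
    a = sample()
    if reward(a) > 0:
        buffer.add(a)
    # Exact (zero-variance) gradient contribution from the buffer.
    grad = [[0.0] * VOCAB for _ in range(SEQ_LEN)]
    for b in buffer:
        w, gb = seq_prob(b) * reward(b), grad_log_prob(b)
        for i in range(SEQ_LEN):
            for k in range(VOCAB):
                grad[i][k] += w * gb[i][k]
    # Sampled correction for trajectories outside the buffer. In this toy,
    # every rewarded sample was just added to the buffer, so this term is
    # zero; it matters when the buffer cannot hold all good trajectories.
    if a not in buffer and reward(a) > 0:
        ga = grad_log_prob(a)
        for i in range(SEQ_LEN):
            for k in range(VOCAB):
                grad[i][k] += reward(a) * ga[i][k]
    for i in range(SEQ_LEN):
        for k in range(VOCAB):
            logits[i][k] += lr * grad[i][k]
```

Once the rewarded sequence enters the buffer, its gradient term is computed deterministically rather than relied upon to be re-sampled, so the policy converges to the target without the high variance of plain REINFORCE under sparse rewards.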
Mohammad Norouzi is a Senior Research Scientist at Google Brain in Toronto, working in Geoff Hinton’s group. He received his PhD in computer science at the University of Toronto in 2016, under the supervision of David Fleet, working on scalable similarity search algorithms. His PhD was supported by a Google US/Canada PhD fellowship in machine learning. His research interests span a broad range of topics in deep learning, natural language processing, and computer vision, with a focus on statistical generative models and reinforcement learning algorithms and applications.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.