Numerical Analysis and Scientific Computing Seminar

An Optimal Control Framework for Efficient Training of Deep Neural Networks

Speaker: Lars Ruthotto, Emory University

Location: Warren Weaver Hall 1302

Date: Nov. 10, 2017, 10 a.m.


One of the most promising areas in artificial intelligence is deep learning, a form of machine learning that uses neural networks containing many hidden layers. Recent successes have led to breakthroughs in applications such as speech and image recognition. However, more theoretical insight is needed to create a rigorous scientific basis for designing and training deep neural networks, increasing their scalability, and explaining their reasoning.

In this talk, we present a new mathematical framework that simplifies designing, training, and analyzing deep neural networks. It is based on the interpretation of deep learning as a dynamic optimal control problem similar to path-planning problems. We will illustrate how this understanding helps design, analyze, and train deep neural networks. In particular, we will focus on ways to ensure the stability of the dynamics in both the continuous and discrete settings and on ways to exploit discretization to obtain adaptive neural networks.
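
As a rough illustration of the dynamical-systems viewpoint, and not taken from the talk itself, one common way to make it concrete is to read a residual-style layer update y + h * tanh(K y + b) as one forward Euler step of the ordinary differential equation y'(t) = tanh(K y(t) + b). The sketch below is a minimal pure-Python example under that assumption. The names layer_step and forward, the 2x2 weight matrix K, the bias b, and the choice to share weights across layers are all hypothetical and chosen only to keep the example short.

```python
import math


def layer_step(y, K, b, h):
    """One residual-style layer, read as one forward Euler step of size h."""
    n = len(y)
    # Affine transformation z = K y + b (K is a list of rows).
    z = [sum(K[i][j] * y[j] for j in range(n)) + b[i] for i in range(n)]
    # Residual update: y <- y + h * tanh(z).
    return [y[i] + h * math.tanh(z[i]) for i in range(n)]


def forward(y0, K, b, T, n_layers):
    """Propagate y0 through n_layers layers covering the time interval [0, T]."""
    h = T / n_layers  # step size shrinks as the network gets deeper
    y = list(y0)
    for _ in range(n_layers):
        y = layer_step(y, K, b, h)
    return y


# Hypothetical weights, shared across all layers for simplicity.
K = [[0.0, 1.0], [-1.0, 0.0]]
b = [0.0, 0.0]
y0 = [1.0, 0.5]

# The same continuous dynamics discretized with 8 and with 64 layers.
coarse = forward(y0, K, b, T=1.0, n_layers=8)
fine = forward(y0, K, b, T=1.0, n_layers=64)

# Largest componentwise difference between the two network outputs.
gap = max(abs(c - f) for c, f in zip(coarse, fine))
print("coarse:", coarse)
print("fine:  ", fine)
print("gap:   ", gap)
```

Under these assumptions, the shallow and deep networks approximate the same underlying trajectory, which is the sense in which the number of layers can be treated as a discretization choice. Whether and how the talk uses this particular discretization is not stated in the abstract.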

The talk is joint work with Eldad Haber and based in part on the manuscripts: