Colloquium Details

Towards a Less Conservative Theory of Machine Learning: Unstable Optimization and Implicit Regularization

Speaker: Jingfeng Wu, UC Berkeley

Location: 60 Fifth Avenue 150

Date: March 6, 2026, 11 a.m.

Host: Prof. Joan Bruna

Synopsis:

Deep learning’s empirical success challenges the “conservative” nature of classical optimization and statistical learning theories. Classical theory mandates small stepsizes for training stability and explicit regularization for complexity control. Yet deep learning leverages mechanisms that thrive beyond these traditional boundaries. In this talk, I present a research program dedicated to building a less conservative theoretical foundation by demystifying two such mechanisms:
1. Unstable Optimization: I show that large stepsizes, despite causing local oscillations, accelerate the global convergence of gradient descent (GD) in overparameterized logistic regression.
2. Implicit Regularization: I show that the implicit regularization of early-stopped GD statistically dominates explicit $\ell_2$-regularization across all linear regression problem instances.
I further showcase how the theoretical principles lead to practice-relevant algorithmic designs (such as Seesaw for reducing serial steps in large language model pretraining). I conclude by outlining a path towards a rigorous understanding of modern learning paradigms.

Speaker Bio:

Jingfeng Wu is a postdoctoral fellow at the Simons Institute for the Theory of Computing at UC Berkeley. His research focuses on deep learning theory, optimization, and statistical learning. He earned his Ph.D. in Computer Science from Johns Hopkins University in 2023. Prior to that, he received a B.S. in Mathematics (2016) and an M.S. in Applied Mathematics (2019), both from Peking University. In 2023, he was recognized as a Rising Star in Data Science by the University of Chicago and UC San Diego.

Notes:

In-person attendance is available only to those with active NYU ID cards.
