Numerical Analysis and Scientific Computing Seminar

Random-then-Greedy Procedure: From Empirical Risk Minimization to Gradient Boosting Machine

Speaker: Hai-hao (Sean) Li, Google Research and University of Chicago

Location: Warren Weaver Hall 1302

Date: Nov. 22, 2019, 10 a.m.


Randomized algorithms and greedy algorithms are both widely used in practice. In this talk, I'll present a random-then-greedy procedure, and showcase how it improves two important machine learning algorithms: stochastic gradient descent and gradient boosting machine.
In the first part of the talk, we propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward those observations with higher current losses. On the theory side,  we show that the proposed method minimizes a new ordered modification of the empirical average loss, and is guaranteed to converge at a sublinear rate to a global optimum for convex loss and to a critical point for weakly convex (non-convex) loss.  Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, the numerical experiments show that our proposed method consistently improves the test errors compared with the standard mini-batch SGD in various models including SVM, logistic regression, and deep learning problems. 

The gradient boosting machine (GBM) is one of the most successful supervised learning algorithms, and it has been the dominant method in many data science competitions, including Kaggle and KDDCup.  In the second part of the talk, we present the Random-then-Greedy Gradient Boosting Machine (RtGBM), which lowers the cost per iteration and achieves improved performance in theory as well as practice.

Haihao (Sean) Lu is a visiting researcher at Google Research NYC. He will join the University of Chicago as an assistant professor in Fall 2020. Prior to Google, he obtained his PhD degree in Operations Research and Applied Mathematics at MIT, advised by Prof. Robert Freund. His research interests are at the intersection of optimization and machine learning, with a focus on developing both theoretical analysis and real computational tools for challenging “big data” problems.