Computational Mathematics and Scientific Computing Seminar
Neural network Hessian approximation using the fast multipole method
Speaker: George Biros, Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin
Location: Warren Weaver Hall (online)
Date: Nov. 20, 2020, 10 a.m.
Fast multipole methods (FMM) are ubiquitous in science and engineering and are a core computational primitive for many production codes in simulation and data analysis. In the first part of my talk, I will first describe GOFMM (Geometry-Oblivious Fast-Multipole Method), an FMM method that can be used to compress an arbitrary SPD matrix. For many (but not all) problems of practical interest, GOFMM enables an approximate matrix-vector multiplication in O(N log N) or even O(N) time. Compression requires O(N log N) storage and work. In general, GOFMM belongs to the hierarchical matrix approximation methods. In particular, it generalizes FMM to a purely algebraic setting by only requiring the ability to sample matrix entries. Neither geometric information (i.e., point coordinates) nor knowledge of how the matrix entries have been generated is required, thus, the term “geometry oblivious”. In the second part of my talk, I will describe using GOFMM to approximate the Gauss-Newton Hessian of a fully connected net. The algorithm uses double randomization, both for GOFMM and for the entries of the Hessian. I will present results for simple autoencoders and compare with the K-FAC and global low-rank Hessian approximations.
This is joint work with Chao Chen and Chenhan Yu.
See https://cs.nyu.edu/dynamic/news/seminars/1/ for how to join the seminar via Zoom