Modern learning problems in computer vision, natural language
processing, computational biology, and other areas are often based
on large data sets of tens of thousands to millions of training
instances. But how do we scale standard learning algorithms, such as
Support Vector Machines, Kernel Ridge Regression, Spectral Clustering,
and Kernel Principal Component Analysis, to such orders of magnitude?
In this talk, I will discuss solutions to this problem using
sampling-based matrix approximation techniques. I will address several
fundamental theoretical, algorithmic, and empirical questions
associated with this class of algorithms: Which approximation should
be used? How should sampling be performed? How well do these
approximations work in theory? How well do they work for practical
applications? I will also report the results of extensive experiments
on large-scale data sets.
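The abstract leaves the specific approximation open; the Nystrom method is one widely used sampling-based technique in this family. Purely as an illustration (uniform column sampling, an RBF kernel, and all names and parameters are choices for this sketch, not taken from the talk), a minimal NumPy version might look like:

    import numpy as np

    def rbf_kernel(X, Y, gamma=0.5):
        # Pairwise RBF kernel values between the rows of X and the rows of Y.
        sq = (np.sum(X**2, axis=1)[:, None]
              + np.sum(Y**2, axis=1)[None, :]
              - 2.0 * X @ Y.T)
        return np.exp(-gamma * sq)

    def nystrom_factors(X, m, gamma=0.5, seed=None):
        # Uniformly sample m landmark points and return the Nystrom factors,
        # giving the rank-m approximation  K ~= C @ W_pinv @ C.T,
        # where C = K[:, idx] and W = K[np.ix_(idx, idx)].
        rng = np.random.default_rng(seed)
        idx = rng.choice(X.shape[0], size=m, replace=False)
        C = rbf_kernel(X, X[idx], gamma)   # n x m slice of the kernel matrix
        W = C[idx]                         # m x m block on the sampled points
        return C, np.linalg.pinv(W)

    # Usage: approximate a 2000 x 2000 kernel matrix from 100 sampled columns.
    X = np.random.default_rng(0).normal(size=(2000, 5))
    C, W_pinv = nystrom_factors(X, m=100, seed=0)

    # The full n x n matrix is formed here only to measure the error; in
    # practice downstream solvers work directly with the two small factors.
    K = rbf_kernel(X, X)
    rel_err = np.linalg.norm(K - C @ W_pinv @ C.T) / np.linalg.norm(K)
    print(f"relative Frobenius error: {rel_err:.3f}")

The point of the factored form is that the n x m matrix C and the m x m pseudo-inverse of W suffice for kernel methods such as those named above, so the full n x n kernel matrix never has to be computed or stored.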