
Nonlinear Dimensionality Reduction and Manifold Learning
Nonlinear CCA and PCA by Alignment of Local Models
 In the same way that you can align local representations
of linear subspace models to get globally nonlinear models, you can
also align representations of linear input-output models such as
Canonical Correlation Analysis to get nonlinear CCA.
A special case of our method, when applied to only a single manifold,
reduces to the Laplacian Eigenmaps algorithm.
 Jakob Verbeek, Nikos Vlassis and I had a paper entitled
Nonlinear CCA and PCA by Alignment of Local Models
describing this work in NIPS 2003.
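For reference, the Laplacian Eigenmaps special case mentioned above can be sketched in a few lines of NumPy. This is an illustrative version only (not the alignment algorithm from the paper): it uses a binary symmetrized k-nearest-neighbor graph, and the neighbor count and embedding dimension are my own choices.

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=5, dim=2):
    """Illustrative Laplacian Eigenmaps sketch: build a symmetric k-NN
    adjacency graph, then embed with the bottom generalized eigenvectors
    of the graph Laplacian (skipping the constant eigenvector)."""
    n = X.shape[0]
    # Pairwise squared distances; exclude self-matches
    D2 = np.square(X[:, None, :] - X[None, :, :]).sum(-1)
    np.fill_diagonal(D2, np.inf)
    nbrs = np.argsort(D2, axis=1)[:, :n_neighbors]

    # Binary adjacency, symmetrized so the graph is undirected
    W = np.zeros((n, n))
    for i in range(n):
        W[i, nbrs[i]] = 1.0
    W = np.maximum(W, W.T)

    # Generalized problem L v = lambda D v, solved via the
    # symmetrically normalized Laplacian D^{-1/2} L D^{-1/2}
    deg = W.sum(axis=1)
    L = np.diag(deg) - W
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    vals, vecs = np.linalg.eigh(d_inv_sqrt @ L @ d_inv_sqrt)
    return d_inv_sqrt @ vecs[:, 1:dim + 1]
```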
Automatic Alignment of Local Representations
 A big challenge when getting many experts to learn in concert is
to get their hidden representations to agree in some way.
In this paper, we
present an automatic alignment procedure which maps the disparate
internal representations learned by several local dimensionality reduction
experts into a single, coherent global coordinate system for the
original data space. Our algorithm can be applied to any set of
experts, each of which produces
a low-dimensional local representation
of a high-dimensional input. Unlike the Global Coordination
paper below (which coordinates models by modifying their objective
functions) our algorithm is invoked after training and applies
an efficient eigenmethod to postprocess the
trained models. It has no local optima and the size of the system it
must solve scales with the number of local models rather than the
number of original data points, making it more efficient than
model-free algorithms such as Isomap or LLE.
 A paper on this work, with Yee Whye Teh:
Automatic Alignment of Local Representations
appeared in NIPS 14.
Stochastic Neighbour Embedding
 Nonlinear dimensionality reduction algorithms can often be
formulated in terms of what they preserve about the original data.
Some, like PCA, preserve variance; others, like MDS or Isomap, preserve
large distances (metric, non-metric or geodesic); still others, like SOM
or LLE, preserve nearby neighbours. In this paper
we describe a probabilistic approach to the task of embedding
high-dimensional objects into a low-dimensional space in a way that
preserves the *distribution of neighbor identities*.
A Gaussian is centered on each object in the
high-dimensional space and the densities under this Gaussian are used
to define a probability distribution over all the potential neighbors
of the
object. The aim of the embedding is to approximate this distribution
as well as possible when the same operation is performed in the
low-dimensional space. The natural cost function is a sum of
Kullback-Leibler divergences, one per object. This leads to a simple
gradient for adjusting the positions of objects in the low-dimensional
space. Unlike other dimensionality reduction methods,
this probabilistic framework makes it easy to represent
each high-dimensional object by a mixture of widely separated
low-dimensional points. This allows the embeddings of ambiguous
objects, like the document count vector for the word "bank", to be
close to the embeddings of both "river" and "finance" without
forcing outdoor concepts to be located close to corporate ones.
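The neighbor-matching scheme described above can be sketched as follows. This is an illustrative NumPy version under simplifying assumptions: it uses a single fixed Gaussian width rather than the per-point variances of the full method, plain gradient descent without momentum, and learning-rate and iteration settings chosen arbitrarily.

```python
import numpy as np

def sne_embed(X, dim=2, sigma=1.0, lr=0.1, n_iter=500, seed=0):
    """Illustrative SNE sketch: match Gaussian neighbor distributions
    between the high- and low-dimensional spaces by gradient descent
    on a sum of KL divergences, one per object."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)

    # High-dimensional neighbor distributions p_{j|i}
    # (fixed sigma here; the full method picks per-point variances)
    D = np.square(X[:, None, :] - X[None, :, :]).sum(-1)
    P = np.exp(-D / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    P /= P.sum(axis=1, keepdims=True)

    Y = 1e-2 * rng.standard_normal((n, dim))
    for _ in range(n_iter):
        d = np.square(Y[:, None, :] - Y[None, :, :]).sum(-1)
        Q = np.exp(-d)
        np.fill_diagonal(Q, 0.0)
        Q /= Q.sum(axis=1, keepdims=True)
        # Gradient of sum_i KL(P_i || Q_i) with respect to y_i:
        #   2 * sum_j (p_{j|i} - q_{j|i} + p_{i|j} - q_{i|j}) (y_i - y_j)
        M = P - Q + P.T - Q.T
        grad = 2.0 * (M[:, :, None] * (Y[:, None, :] - Y[None, :, :])).sum(axis=1)
        Y -= lr * grad
    return Y
```

Note that this sketch represents each object by a single low-dimensional point; the mixture representation that handles ambiguous objects like "bank" is not shown.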
 As an example, here's a map of the NIPS
volumes 0-12 authors as produced by the SNE algorithm.
 A paper on SNE, written with Geoff Hinton:
Stochastic Neighbor Embedding
appeared in NIPS 14.
Global Coordination
 High-dimensional data that lies on or near a low-dimensional
manifold can be described by a collection of local linear models.
Such a description, however, does not provide a global
parameterization of the manifold, arguably an important goal
of unsupervised learning.
In this paper, we show how to learn a collection of local linear
models that solves this more difficult problem. Our local linear
models are represented by a mixture of factor analyzers, and the
``global coordination'' of these models is achieved by adding a
regularizing term to their objective function. The regularizer breaks
a degeneracy in the mixture model's parameter space, favoring models
whose internal coordinate systems are aligned
in a consistent way. As a result, the internal coordinates change
smoothly and continuously as one traverses a connected path on the
manifold, even when the path crosses the domains of many different
local models. The regularizer takes the form of a Kullback-Leibler
divergence and illustrates an unexpected application of variational
methods: not to perform approximate inference in intractable
probabilistic models, but to learn more useful internal
representations in tractable ones.
 Our paper,
Global Coordination of Local Linear Models appeared in
NIPS 13 (joint work with Lawrence Saul and Geoff Hinton).
Locally Linear Embedding (LLE)

Exploratory data analysis and visualization play an
increasingly important role in all areas of science.
The need to analyze large amounts of multivariate data
raises the fundamental problem of dimensionality reduction: how to
discover compact representations of high-dimensional data. Here we
introduce locally linear embedding (LLE), an unsupervised learning
algorithm that is able to discover the global structure of nonlinear
manifolds. LLE uses local symmetries and linear reconstructions to
compute low-dimensional, neighborhood-preserving embeddings of
multivariate data. Applied to images of faces, LLE discovers a
coordinate representation of facial attributes; applied to documents
of text, it co-locates, in a continuous semantic space, words with
similar contexts.
(with Lawrence Saul, AT&T Labs)
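The three steps of LLE (find nearest neighbors, solve for local reconstruction weights, embed via an eigenproblem) can be sketched as follows. This is an illustrative NumPy version; the neighbor count, the regularization of the local Gram matrix, and its constant are my own choices, not values from the paper.

```python
import numpy as np

def lle(X, n_neighbors=5, dim=2, reg=1e-3):
    """Illustrative LLE sketch: reconstruct each point linearly from its
    neighbors, then find low-dimensional coordinates that preserve the
    same reconstruction weights."""
    n = X.shape[0]
    # Step 1: k nearest neighbors of each point (excluding itself)
    D = np.square(X[:, None, :] - X[None, :, :]).sum(-1)
    np.fill_diagonal(D, np.inf)
    nbrs = np.argsort(D, axis=1)[:, :n_neighbors]

    # Step 2: reconstruction weights, one regularized linear solve per point
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                     # neighbors centered on x_i
        G = Z @ Z.T                               # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()               # weights sum to one

    # Step 3: bottom eigenvectors of M = (I - W)^T (I - W),
    # discarding the constant eigenvector
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:dim + 1]
```

The quadratic forms in steps 2 and 3 are what make the algorithm free of local optima: each stage reduces to a linear solve or an eigenproblem.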
 A paper describing this
work appeared in the Dec 22, 2000 issue of Science.
Here is a link to the
online version of the paper.
 A longer version of the LLE paper is also available;
it appeared in volume 4 of JMLR.
Learning nonlinear dynamics
Sam Roweis, Vision, Learning and Graphics Group,
NYU, www.cs.nyu.edu/~roweis
