Computer Science NASC Seminar
A Randomized Approximate Nearest Neighbors Algorithm
Andrei Osipov, Yale University
October 26, 2012
10:00AM
Warren Weaver Hall, Room 1302
251 Mercer Street
New York, NY, 100121110
(Directions)
Fall 2012 NASC Seminars Calendar
Synopsis
We present a randomized algorithm for the approximate nearest neighbor problem in ddimensional
Euclidean space. Given N points {xj} in Rd, the algorithm attempts to find
k nearest neighbors for each of xj , where k is a userspecified integer parameter. The
algorithm is iterative, and its CPU time requirements are proportional to T ·N · (d· (log d)+
k · (d + log k) · (logN)) + N · k2 · (d + log k), with T the number of iterations performed.
The memory requirements of the procedure are of the order N · (d + k).
A byproduct of the scheme is a data structure, permitting a rapid search for the k nearest
neighbors among {xj} for an arbitrary point x 2 Rd. The cost of each such query is
proportional to T · (d · (log d) + log(N/k) · k · (d + log k)), and the memory requirements for
the requisite data structure are of the order N · (d + k) + T · (d + N).
The algorithm utilizes random rotations and a basic divideandconquer scheme, followed
by a local graph search. We analyze the scheme’s behavior for certain types of distributions
of {xj}, and illustrate its performance via several numerical examples.
