Usual coordinates:
Friday, 2:15PM
Room 1314
Warren Weaver Hall
251 Mercer Street
Spring 2015 Schedule
Upcoming talks
Friday May 1
2:15PM
WWH 1314
Ilya Razenshteyn (MIT CSAIL)
Optimal Data-Dependent Hashing for Approximate Near Neighbors
Abstract: The classic Approximate Near Neighbor problem (ANN) can be formulated as
follows. Given a dataset of points the goal is to preprocess it so that,
given a query point close to some data point, return a data point that
is not too far from the query. If one insists on the space being
subquadratic in the number of points (the most important regime in
practice), the only known technique for tackling ANN until recently was
Locality-Sensitive Hashing (LSH) introduced by Indyk and Motwani in
1998. Since then, LSH has been proved very influential both in theory
and in practice.
During the talk, I will introduce LSH and state certain fundamental
limitations of it. Then, I will talk about a recent line of work that
overcomes them. That is, I will show first data structures for ANN that
are provably better than the best possible LSH-based ones. In
particular, for the first time, we are able to improve upon
[Indyk-Motwani 1998] for the Hamming distance, and [Andoni-Indyk 2006]
for the Euclidean distance.
The key technique is *data-dependent hashing*: we tailor the hash family
to a given set of data points; this approach has strong parallels with
the practice of ANN. Our hashing schemes are *optimal* in the class of
data-dependent hashing. The key technical tool is a regularity-type
decomposition: we show how to partition a *worst-case* dataset into
several parts that look *random*.
Abstract: Many machine learning tasks can be posed as structured prediction,
where the goal is to predict a labeling or structured object. For
example, the input may be an image or a sentence, and the output is a
labeling such as an assignment of each pixel in the image to
foreground or background, or the parse tree for the sentence. Despite
marginal and MAP inference for many of these models being NP-hard in
the worst-case, approximate inference algorithms are remarkably
successful and as a result structured prediction is widely used.
What makes these real-world instances different from worst-case
instances? One key difference is that in all of these applications,
there is an underlying "ground truth" which structured prediction is
aiming to find. In this talk, I will introduce a new theoretical
framework for analyzing structured prediction algorithms in terms of
their ability to achieve small Hamming error. We study the
computational and statistical trade-offs that arise in this setting,
and illustrate a setting where polynomial-time algorithms can perform
optimal prediction, despite the corresponding MAP inference task being
NP-hard.
Joint work with Amir Globerson, Tim Roughgarden, and Cafer Yildirim.
List of previous talks
Friday January 30
2:15PM
WWH 1314
Michael Kapralov (IBM Watson)
Sample-Optimal Fourier Sampling in Any Constant Dimension
Abstract: We present an algorithm that computes a k-sparse approximation to any signal from O(k\log N) Fourier measurements of a length N signal. This matches the known lower
bound of O(k \log(N/k)) up to constant factors for any k\leq N^{1-\delta}. The algorithm runs in near-linear time, and provides the so-called \ell_2/\ell_2 guarantee. Our algorithm extends to higher dimensions, leading to sample complexity of O_d(k\log N), which is again optimal up to constant factors for any constant d. This is the first sample optimal algorithm for these problems.
Using similar techniques, we also obtain an algorithm with slightly suboptimal sample complexity O(k\log N (\log\log N)^2) and a sub-linear time O(k \log^{O(1)} N) for any constant d. This generalizes the result of [IKP] to higher dimensions.
We also present preliminary experimental evaluation of our sample-optimal near-linear time algorithm, indicating that the empirical sampling complexity of the algorithm is comparable to that of other recovery methods known in the literature, while providing strong provable guarantees on the recovery quality.
(joint work with Piotr Indyk)
Friday February 6
2:15PM
WWH 1314
Ofer Shayevitz (Tel Aviv University)
An Upper Bound on the Sizes of Multiset-Union-Free Families
Abstract: Two families of subsets of [n] are called multiset-union-free
if all their pairwise multiset unions are distinct. Despite much effort
over the years, not much is known on the largest possible sizes of such families,
and a wide gap remains between the best known constructions and upper bounds.
In this work we derive a new upper bound on the sizes of families that possess this property,
improving a result by Urbanke and Li. To that end, we introduce a soft variation of the
Sauer-Perles-Shelah Lemma, that is then used in conjunction with an information-theoretic
argument for a more general setup.
Joint work with Or Ordentlich.
Friday February 13
2:15PM
WWH 1314
Elliot Anshelevich (RPI)
Stable Matching, Friendship, and Altruism
Abstract: We will discuss both integral and fractional versions of "correlated stable matching"
problems. Each player is a node in a social network and strives to form a good match with a
neighboring player; the player utilities from forming a match are correlated. We consider
the existence, computation, and inefficiency of stable matchings from which no pair of players
wants to deviate. We especially focus on networks where players are embedded in a social context,
and may incorporate friendship relations or altruism into their decisions.
When the benefits from a match are the same for both players, we show that incorporating the
well-being of other players into their matching decisions significantly decreases the price of
stability, while the price of anarchy remains unaffected. Furthermore, a good stable matching
achieving the price of stability bound always exists and can be reached in polynomial time. We
extend these results to more general matching rewards, when players matched to each other may
receive different utilities from the match. For this more general case, we show that incorporating
social context (i.e., "caring about your friends") can make an even larger difference, and greatly
reduce the price of anarchy. Finally, we extend most of our results to network contribution games,
in which players can decide how much effort to contribute to each incident edge, instead of simply
choosing a single node to match with.
Friday March 13
2:15PM
WWH 1314
Justin Thaler (Yahoo! Labs)
Approximate Degree and the Method of Dual Polynomials
Abstract: The \eps-approximate degree of a Boolean function is the minimum degree of a real polynomial that
point-wise approximates f to error \eps. Approximate degree has wide-ranging applications in theoretical computer science,
from computational learning theory to communication, query, and circuit complexity. Despite its importance,
our understanding of approximate degree remains somewhat limited, with few general results known.
The focus of this talk will be on a relatively new method for proving lower bounds on approximate degree:
specifying \emph{dual polynomials}, which are dual solutions to a certain linear program capturing the
approximate degree of any function. After surveying earlier work on approximate degree, I will describe
how the method of dual polynomials has recently enabled progress on several open problems.
Based on joint work with Mark Bun
Friday March 27
2:15PM
WWH 1314
Michael Forbes (IAS)
Dimension Expanders via Rank Condensers
Abstract: Expander graphs are sparse graphs with good connectivity properties and they have
become ubiquitous in theoretical computer science. Dimension expanders are a linear-algebraic
variant where we ask for a constant number of linear maps that expand subspaces of a vector space
(instead of subsets of vertices). After their definition 10 years ago by Barak, Impagliazzo, Shpilka
and Wigderson there are now two constructions of constant-degree dimension expanders, both of which
suggest dimension expanders are more complicated than expander graphs.
In this work, we give a new construction of constant-degree dimension expanders (over large fields)
which is quite simple. It follows from an emerging theory of linear-algebraic pseudorandomness where
the rank of a subspace plays the role of the min-entropy of a random variable. In particular, we use
the recent near-optimal construction of subspace designs by Guruswami and Kopparty (based on Wronskians)
to construct a near optimal "lossy rank condenser". This condenser, in addition to a tensoring operation,
yields the desired dimension expanders.
Joint work with Venkatesan Guruswami
Thursday April 16
3:00PM
WWH 201
Ankur Moitra (MIT)
Tensor Prediction, Rademacher Complexity and Random 3-XOR
Abstract: Here we study the tensor prediction problem, where the goal is to accurately predict the entries of a low rank,
third-order tensor (with noise) given as few observations as possible. We give algorithms based on the sixth level of the
sum-of-squares hierarchy that work with roughly m = n^3/2 observations, and we complement our result by showing that any
attempt to solve tensor prediction with fewer observations through the sum-of-squares hierarchy would run in moderately
exponential time. In contrast, information theoretically roughly m = n observations suffice.
This work is part of a broader agenda of studying computational vs. statistical tradeoffs through the sum-of-squares hierarchy.
In particular, for linear inverse problems (such as tensor prediction) the natural sum-of-squares relaxation gives rise to a
sequence of norms. Our approach is to characterize their Rademacher complexity. Moreover, both our upper and lower bounds are
based on connections between this, and the task of strongly refuting random 3-XOR formulas, and the resolution proof system.
This talk is based on joint work with Boaz Barak
Friday April 24
2:00PM
WWH 512
Yuval Filmus (IAS)
On the Coppersmith-Winograd approach to matrix multiplication
Abstract:
Ever since Strassen's O(n^{2.81}) matrix multiplication algorithm stunned the mathematical community,
the quest for fast matrix multiplication algorithms has been a holy grail in computer science.
At first progress was fast, culminating in Coppersmith and Winograd's O(n^{2.376}) algorithm of 1987.
Recently interest in the area has reawakened due to work of Stothers, Vassilevska-Williams and Le Gall,
who managed to improve the exponent slightly from 2.376 to 2.373.
Roughly speaking, Coppersmith and Winograd constructed an edifice turning an "identity"
(some kind of mathematical object) into a fast matrix multiplication algorithm.
Applying their method on the identity TCW, they obtained an O(n^{2.388}) algorithm.
Applying it to TCW^2 (identities can be squared!), they obtained their O(n^{2.376}) algorithm.
They stopped there since computing was a lot more cumbersome in the 1980s.
Using modern computers, Stothers, Vassilevska-Williams and Le Gall were able to analyze
higher powers of TCW (up to TCW^32), and so reduced the exponent to 2.373.
Our talk answers the following question:
What is the limit of this approach?
We show that this approach cannot yield an exponent smaller than 2.372.
No prior knowledge will be assumed.
Joint work with Andris Ambainis (University of Latvia) and Francois Le Gall (University of Tokyo).