Usual coordinates:
Friday, 2:15PM
Room 1314
Warren Weaver Hall
251 Mercer Street
Spring 2015 Schedule
List of previous talks
Friday January 30
2:15PM
WWH 1314
Michael Kapralov (IBM Watson)
Sample-Optimal Fourier Sampling in Any Constant Dimension
Abstract: We present an algorithm that computes a k-sparse approximation to a length-N signal from O(k\log N) Fourier measurements. This matches the known lower
bound of \Omega(k \log(N/k)) up to constant factors for any k\leq N^{1-\delta}. The algorithm runs in near-linear time and provides the so-called \ell_2/\ell_2 guarantee. Our algorithm extends to higher dimensions, leading to sample complexity of O_d(k\log N), which is again optimal up to constant factors for any constant dimension d. This is the first sample-optimal algorithm for these problems.
Using similar techniques, we also obtain an algorithm with slightly suboptimal sample complexity O(k\log N (\log\log N)^2) and sub-linear running time O(k \log^{O(1)} N) for any constant d. This generalizes the result of [IKP] to higher dimensions.
We also present a preliminary experimental evaluation of our sample-optimal near-linear time algorithm, indicating that its empirical sample complexity is comparable to that of other recovery methods known in the literature, while providing strong provable guarantees on the recovery quality.
(joint work with Piotr Indyk)
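To make the \ell_2/\ell_2 guarantee concrete, here is a small numpy sketch. It only illustrates the benchmark appearing in the guarantee (the best k-term approximation error of the spectrum), not the sampling algorithm from the talk; the "recovered" vector below is a stand-in of my own construction.

    import numpy as np

    # Toy illustration of the l2/l2 guarantee (not the algorithm from the
    # talk). A recovery x_hat of a spectrum xf satisfies the guarantee if
    #   || xf - x_hat ||_2 <= C * min_{k-sparse y} || xf - y ||_2,
    # and the minimum is attained by keeping the k largest-magnitude entries.

    rng = np.random.default_rng(0)
    N, k = 1024, 10

    xf = np.zeros(N, dtype=complex)                 # k-sparse spectrum...
    support = rng.choice(N, size=k, replace=False)
    xf[support] = rng.normal(size=k) + 1j * rng.normal(size=k)
    xf += 0.01 * (rng.normal(size=N) + 1j * rng.normal(size=N))  # ...plus noise

    def best_k_term(v, k):
        """Best k-sparse approximation: keep the k largest-magnitude entries."""
        out = np.zeros_like(v)
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
        return out

    opt = np.linalg.norm(xf - best_k_term(xf, k))   # the benchmark error

    # Stand-in for the output of a recovery algorithm run on few samples:
    x_hat = best_k_term(xf, k) + 0.005 * rng.normal(size=N)
    print(np.linalg.norm(xf - x_hat) / opt)         # the constant C achieved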
Friday February 6
2:15PM
WWH 1314
Ofer Shayevitz (Tel Aviv University)
An Upper Bound on the Sizes of Multiset-Union-Free Families
Abstract: Two families of subsets of [n] are called multiset-union-free
if all their pairwise multiset unions are distinct. Despite much effort
over the years, little is known about the largest possible sizes of such
families, and a wide gap remains between the best known constructions and
upper bounds. In this work we derive a new upper bound on the sizes of
families that possess this property, improving a result of Urbanke and Li.
To that end, we introduce a soft variation of the Sauer-Perles-Shelah Lemma,
which is then used in conjunction with an information-theoretic argument
in a more general setup.
Joint work with Or Ordentlich.
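For small n, the multiset-union-free property is easy to check by brute force. A minimal sketch (the function name and example families are mine, not from the paper):

    from collections import Counter

    def is_multiset_union_free(F1, F2):
        """Check that all pairwise multiset unions of A in F1 and B in F2 are
        distinct; an element lying in both A and B appears twice in the union."""
        seen = set()
        for A in F1:
            for B in F2:
                union = tuple(sorted((Counter(A) + Counter(B)).elements()))
                if union in seen:
                    return False
                seen.add(union)
        return True

    print(is_multiset_union_free([{1, 2}, {3}], [{1}, {2, 3}]))  # True
    print(is_multiset_union_free([{1}, {2}], [{1}, {2}]))        # False: the
    # union of {1} and {2} equals the union of {2} and {1}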
Friday February 13
2:15PM
WWH 1314
Elliot Anshelevich (RPI)
Stable Matching, Friendship, and Altruism
Abstract: We will discuss both integral and fractional versions of "correlated stable matching"
problems. Each player is a node in a social network and strives to form a good match with a
neighboring player; the player utilities from forming a match are correlated. We consider
the existence, computation, and inefficiency of stable matchings from which no pair of players
wants to deviate. We especially focus on networks where players are embedded in a social context,
and may incorporate friendship relations or altruism into their decisions.
When the benefits from a match are the same for both players, we show that incorporating the
well-being of other players into their matching decisions significantly decreases the price of
stability, while the price of anarchy remains unaffected. Furthermore, a good stable matching
achieving the price of stability bound always exists and can be reached in polynomial time. We
extend these results to more general matching rewards, when players matched to each other may
receive different utilities from the match. For this more general case, we show that incorporating
social context (i.e., "caring about your friends") can make an even larger difference, and greatly
reduce the price of anarchy. Finally, we extend most of our results to network contribution games,
in which players can decide how much effort to contribute to each incident edge, instead of simply
choosing a single node to match with.
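In the symmetric ("correlated") setting of the abstract, where both endpoints of a matched edge receive the same reward, a stable matching always exists and a simple greedy procedure finds one. A minimal sketch (the edge-list format and names are my own):

    def greedy_stable_matching(edges):
        """Greedy maximum-weight-first matching.

        edges: list of (weight, u, v). When both endpoints of a matched edge
        receive its weight as utility, processing edges in decreasing weight
        yields a stable matching: any would-be blocking pair (u, v) was
        considered at some point, and if it was skipped, one endpoint was
        already matched by an edge of weight at least w(u, v), so that
        endpoint does not prefer (u, v).
        """
        matched = {}
        for w, u, v in sorted(edges, reverse=True):
            if u not in matched and v not in matched:
                matched[u] = v
                matched[v] = u
        return matched

    # Example: a triangle where the heaviest edge wins.
    print(greedy_stable_matching([(3, 'a', 'b'), (2, 'b', 'c'), (1, 'a', 'c')]))
    # {'a': 'b', 'b': 'a'}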
Friday March 13
2:15PM
WWH 1314
Justin Thaler (Yahoo! Labs)
Approximate Degree and the Method of Dual Polynomials
Abstract: The \eps-approximate degree of a Boolean function f is the minimum degree of a real polynomial that
point-wise approximates f to error \eps. Approximate degree has wide-ranging applications in theoretical computer science,
from computational learning theory to communication, query, and circuit complexity. Despite its importance,
our understanding of approximate degree remains somewhat limited, with few general results known.
The focus of this talk will be on a relatively new method for proving lower bounds on approximate degree:
specifying \emph{dual polynomials}, which are dual solutions to a certain linear program capturing the
approximate degree of any function. After surveying earlier work on approximate degree, I will describe
how the method of dual polynomials has recently enabled progress on several open problems.
Based on joint work with Mark Bun.
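The linear program mentioned above is small enough to solve directly for tiny functions. Here is a minimal sketch (my own illustration, not from the talk) that computes the least error achievable by a degree-d multilinear polynomial approximating OR on 3 bits, using scipy; a dual solution of this LP is exactly a dual polynomial.

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    def approx_degree_error(f, n, d):
        """Least eps such that some degree-d multilinear polynomial
        point-wise approximates f: {0,1}^n -> {0,1} within eps.

        LP variables: eps and the coefficients c_S of the monomials
        prod_{i in S} x_i with |S| <= d. Constraints: |p(x) - f(x)| <= eps
        at every input x.
        """
        monomials = [S for r in range(d + 1)
                     for S in itertools.combinations(range(n), r)]
        A_ub, b_ub = [], []
        for x in itertools.product([0, 1], repeat=n):
            row = [float(np.prod([x[i] for i in S])) for S in monomials]
            fx = float(f(x))
            A_ub.append([-1.0] + row)                 # p(x) - eps <= f(x)
            b_ub.append(fx)
            A_ub.append([-1.0] + [-v for v in row])   # -p(x) - eps <= -f(x)
            b_ub.append(-fx)
        c = [1.0] + [0.0] * len(monomials)            # minimize eps
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] * (len(monomials) + 1))
        return res.fun

    OR = lambda x: int(any(x))
    for d in range(1, 4):                     # eps reaches 0 at d = 3,
        print(d, approx_degree_error(OR, 3, d))  # where OR on 3 bits is exact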
Friday March 27
2:15PM
WWH 1314
Michael Forbes (IAS)
Dimension Expanders via Rank Condensers
Abstract: Expander graphs are sparse graphs with good connectivity properties, and they have
become ubiquitous in theoretical computer science. Dimension expanders are a linear-algebraic
variant in which we ask for a constant number of linear maps that expand subspaces of a vector space
(instead of subsets of vertices). Since their definition 10 years ago by Barak, Impagliazzo, Shpilka
and Wigderson, two constructions of constant-degree dimension expanders have appeared, both of which
suggest that dimension expanders are more complicated than expander graphs.
In this work, we give a new construction of constant-degree dimension expanders (over large fields)
which is quite simple. It follows from an emerging theory of linear-algebraic pseudorandomness, where
the rank of a subspace plays the role of the min-entropy of a random variable. In particular, we use
the recent near-optimal construction of subspace designs by Guruswami and Kopparty (based on Wronskians)
to construct a near-optimal "lossy rank condenser". This condenser, together with a tensoring operation,
yields the desired dimension expanders.
Joint work with Venkatesan Guruswami.
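The definition is easy to probe numerically. The sketch below (my own toy, over the reals rather than the large finite fields of the construction) samples random subspaces and measures how much a few random maps expand them; note that a genuine dimension expander must expand every subspace, which random sampling cannot certify.

    import numpy as np

    rng = np.random.default_rng(1)

    def expanded_dim(maps, V):
        """dim(V + sum_i A_i V) for a subspace V given by basis columns."""
        images = np.hstack([V] + [A @ V for A in maps])
        return np.linalg.matrix_rank(images)

    # d random maps and random subspaces V of dimension k <= n/2; a dimension
    # expander asks that dim(V + sum A_i V) >= (1 + eps) * dim(V) for ALL V.
    n, d, k = 20, 3, 8
    maps = [rng.normal(size=(n, n)) for _ in range(d)]
    for _ in range(5):
        V = rng.normal(size=(n, k))
        print(expanded_dim(maps, V))   # random maps expand generically:
                                       # prints n = 20 here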
Thursday April 16
3:00PM
WWH 201
Ankur Moitra (MIT)
Tensor Prediction, Rademacher Complexity and Random 3-XOR
Abstract: Here we study the tensor prediction problem, where the goal is to accurately predict the entries of a low-rank,
third-order tensor (with noise) given as few observations as possible. We give algorithms based on the sixth level of the
sum-of-squares hierarchy that work with roughly m = n^{3/2} observations, and we complement our result by showing that any
attempt to solve tensor prediction with fewer observations through the sum-of-squares hierarchy would run in moderately
exponential time. In contrast, information-theoretically, roughly m = n observations suffice.
This work is part of a broader agenda of studying computational vs. statistical tradeoffs through the sum-of-squares hierarchy.
In particular, for linear inverse problems (such as tensor prediction) the natural sum-of-squares relaxation gives rise to a
sequence of norms. Our approach is to characterize their Rademacher complexity. Moreover, both our upper and lower bounds are
based on connections between this complexity, the task of strongly refuting random 3-XOR formulas, and the resolution proof system.
This talk is based on joint work with Boaz Barak.
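As a small illustration of the random 3-XOR objects the bounds connect to (my own sketch, nothing here reflects the sum-of-squares algorithm):

    import numpy as np

    rng = np.random.default_rng(2)
    n, m = 50, 500   # m ~ n^{3/2} is the interesting regime in the talk

    # A random 3-XOR instance: m clauses x_i + x_j + x_k = b (mod 2).
    idx = np.array([rng.choice(n, size=3, replace=False) for _ in range(m)])
    rhs = rng.integers(0, 2, size=m)

    def frac_satisfied(x):
        """Fraction of clauses satisfied by an assignment x in {0,1}^n."""
        return np.mean((x[idx].sum(axis=1) % 2) == rhs)

    # Any fixed assignment satisfies about half the clauses of a random
    # instance; strongly refuting means CERTIFYING that no assignment does
    # much better, which is where the sum-of-squares hierarchy enters.
    print(max(frac_satisfied(rng.integers(0, 2, size=n)) for _ in range(100)))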
Friday April 24
2:00PM
WWH 512
Yuval Filmus (IAS)
On the Coppersmith-Winograd approach to matrix multiplication
Abstract:
Ever since Strassen's O(n^{2.81}) matrix multiplication algorithm stunned the mathematical community,
the quest for fast matrix multiplication algorithms has been a holy grail in computer science.
At first progress was fast, culminating in Coppersmith and Winograd's O(n^{2.376}) algorithm of 1987.
Recently, interest in the area has reawakened due to work of Stothers, Vassilevska Williams and Le Gall,
who managed to improve the exponent slightly, from 2.376 to 2.373.
Roughly speaking, Coppersmith and Winograd constructed an edifice turning an "identity"
(some kind of mathematical object) into a fast matrix multiplication algorithm.
Applying their method to the identity T_{CW}, they obtained an O(n^{2.388}) algorithm.
Applying it to T_{CW}^2 (identities can be squared!), they obtained their O(n^{2.376}) algorithm.
They stopped there, since computing was a lot more cumbersome in the 1980s.
Using modern computers, Stothers, Vassilevska Williams and Le Gall were able to analyze
higher powers of T_{CW} (up to T_{CW}^{32}), and so reduced the exponent to 2.373.
Our talk answers the following question:
What is the limit of this approach?
We show that this approach cannot yield an exponent smaller than 2.372.
No prior knowledge will be assumed.
Joint work with Andris Ambainis (University of Latvia) and Francois Le Gall (University of Tokyo).
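For context, Strassen's original algorithm is short enough to state in full. The sketch below is standard textbook material (not part of the talk): seven recursive products instead of eight give the exponent log2(7) ~ 2.81, and the Coppersmith-Winograd line of work lowers the exponent further via far more intricate identities.

    import numpy as np

    def strassen(A, B, cutoff=64):
        """Strassen multiplication for n-by-n arrays, n a power of 2."""
        n = A.shape[0]
        if n <= cutoff:
            return A @ B                       # fall back to ordinary product
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22, cutoff)
        M2 = strassen(A21 + A22, B11, cutoff)
        M3 = strassen(A11, B12 - B22, cutoff)
        M4 = strassen(A22, B21 - B11, cutoff)
        M5 = strassen(A11 + A12, B22, cutoff)
        M6 = strassen(A21 - A11, B11 + B12, cutoff)
        M7 = strassen(A12 - A22, B21 + B22, cutoff)
        C = np.empty_like(A)
        C[:h, :h] = M1 + M4 - M5 + M7
        C[:h, h:] = M3 + M5
        C[h:, :h] = M2 + M4
        C[h:, h:] = M1 - M2 + M3 + M6
        return C

    A = np.random.rand(128, 128); B = np.random.rand(128, 128)
    print(np.allclose(strassen(A, B), A @ B))  # True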
Friday May 1
2:15PM
WWH 1314
Ilya Razenshteyn (MIT CSAIL)
Optimal Data-Dependent Hashing for Approximate Near Neighbors
Abstract: The classic Approximate Near Neighbor problem (ANN) can be formulated as
follows: given a dataset of points, preprocess it so that, given a query
point close to some data point, one can quickly return a data point that
is not too far from the query. If one insists on the space being
subquadratic in the number of points (the most important regime in
practice), the only known technique for tackling ANN until recently was
Locality-Sensitive Hashing (LSH), introduced by Indyk and Motwani in
1998. Since then, LSH has proved very influential both in theory
and in practice.
During the talk, I will introduce LSH and state certain fundamental
limitations of it. Then, I will talk about a recent line of work that
overcomes them: I will show the first data structures for ANN that
are provably better than the best possible LSH-based ones. In
particular, for the first time, we are able to improve upon
[Indyk-Motwani 1998] for the Hamming distance, and upon [Andoni-Indyk 2006]
for the Euclidean distance.
The key technique is *data-dependent hashing*: we tailor the hash family
to a given set of data points; this approach has strong parallels with
the practice of ANN. Our hashing schemes are *optimal* in the class of
data-dependent hashing. The key technical tool is a regularity-type
decomposition: we show how to partition a *worst-case* dataset into
several parts that look *random*.
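For contrast with the data-dependent schemes of the talk, here is a minimal sketch of the classical data-independent family for Hamming space, the bit-sampling LSH of Indyk and Motwani (the class and parameter names are my own):

    import numpy as np

    rng = np.random.default_rng(3)

    class BitSamplingLSH:
        """Bit-sampling LSH: hash a binary vector to the values of a few
        random coordinates. Close points collide often; far points rarely.
        Data-dependent schemes instead pick the hash family after seeing
        the data, which is the subject of the talk."""

        def __init__(self, dim, bits_per_table, num_tables):
            self.coords = [rng.choice(dim, size=bits_per_table, replace=False)
                           for _ in range(num_tables)]
            self.tables = [dict() for _ in range(num_tables)]

        def insert(self, key, v):
            for coords, table in zip(self.coords, self.tables):
                table.setdefault(tuple(v[coords]), []).append(key)

        def query(self, v):
            out = set()
            for coords, table in zip(self.coords, self.tables):
                out.update(table.get(tuple(v[coords]), []))
            return out

    d = 100
    lsh = BitSamplingLSH(d, bits_per_table=12, num_tables=8)
    data = rng.integers(0, 2, size=(50, d))
    for i, p in enumerate(data):
        lsh.insert(i, p)
    q = data[0].copy(); q[:5] ^= 1   # a query 5 bit-flips from data[0]
    print(0 in lsh.query(q))         # True with high probability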
Abstract: Many machine learning tasks can be posed as structured prediction,
where the goal is to predict a labeling or structured object. For
example, the input may be an image or a sentence, and the output is a
labeling such as an assignment of each pixel in the image to
foreground or background, or the parse tree for the sentence. Although
marginal and MAP inference are NP-hard in the worst case for many of
these models, approximate inference algorithms are remarkably
successful, and as a result structured prediction is widely used.
What makes these real-world instances different from worst-case
instances? One key difference is that in all of these applications,
there is an underlying "ground truth" which structured prediction is
aiming to find. In this talk, I will introduce a new theoretical
framework for analyzing structured prediction algorithms in terms of
their ability to achieve small Hamming error. We study the
computational and statistical trade-offs that arise in this setting,
and illustrate a setting where polynomial-time algorithms can perform
optimal prediction, despite the corresponding MAP inference task being
NP-hard.
Joint work with Amir Globerson, Tim Roughgarden, and Cafer Yildirim.
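As a small illustration of a tractable special case (my own example, not from the talk): on chain-structured models, MAP inference is solvable exactly in polynomial time by dynamic programming (the Viterbi algorithm), in O(T K^2) time for T positions and K labels.

    import numpy as np

    def chain_map(node_scores, edge_scores):
        """Exact MAP inference on a chain by dynamic programming.

        node_scores: (T, K) local evidence for K labels at T positions.
        edge_scores: (K, K) pairwise compatibility of adjacent labels.
        """
        T, K = node_scores.shape
        score = node_scores[0].copy()        # best score ending in each label
        back = np.zeros((T, K), dtype=int)   # backpointers
        for t in range(1, T):
            cand = score[:, None] + edge_scores           # (prev, cur)
            back[t] = np.argmax(cand, axis=0)
            score = cand[back[t], np.arange(K)] + node_scores[t]
        labels = [int(np.argmax(score))]
        for t in range(T - 1, 0, -1):        # trace the backpointers
            labels.append(int(back[t][labels[-1]]))
        return labels[::-1]

    # A smoothness-favoring chain: adjacent labels prefer to agree.
    node = np.array([[2., 0.], [0., 1.], [0., 2.], [1., 0.]])
    edge = np.array([[1., 0.], [0., 1.]])
    print(chain_map(node, edge))   # [0, 1, 1, 0]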
Thursday June 11
2:15PM
WWH 1314
Moshe Lewenstein (Bar Ilan University)
(Min,+)-Convolutions, 3SUM and Additive Combinatorics
Abstract: We present a collection of new results on problems related to 3SUM, including:

1. The first truly subquadratic algorithms for
   (a) computing the (min,+) convolution for monotone increasing sequences with integer values bounded by $O(n)$,
   (b) solving 3SUM for monotone sets in 2D with integer coordinates bounded by $O(n)$, and
   (c) preprocessing a binary string for histogram indexing (also called jumbled indexing).
   The running time is $O(n^{1.859})$ with randomization, or $O(n^{1.864})$ deterministically.
   This greatly improves the previous $n^2/2^{\Omega(\sqrt{\log n})}$ time bound obtained from
   Williams' recent result on all-pairs shortest paths [STOC'14], and answers an open question
   raised by several researchers studying the histogram indexing problem.

2. The first algorithm for histogram indexing for any constant alphabet size that achieves
   truly subquadratic preprocessing time and truly sublinear query time.

3. A truly subquadratic algorithm for integer 3SUM in the case when the given set can be
   partitioned into $n^{1-\delta}$ clusters, each covered by an interval of length $n$, for any constant $\delta>0$.

4. An algorithm to preprocess any set of $n$ integers so that subsequently 3SUM on any given
   subset can be solved in $\tilde{O}(n^{13/7})$ time.

All these results are obtained by a surprising new technique, based on the Balog-Szemerédi-Gowers
Theorem from additive combinatorics.
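For reference, the quadratic-time baseline that "truly subquadratic" results improve upon is the textbook sort-plus-two-pointers 3SUM algorithm:

    def threesum(S):
        """Classic O(n^2) 3SUM: find a, b, c in S with a + b + c = 0, else None.

        The results above beat this bound (i.e., achieve O(n^{2 - delta}))
        for structured inputs such as clustered or monotone sets; no truly
        subquadratic algorithm is known for the general problem.
        """
        A = sorted(S)
        n = len(A)
        for i in range(n - 2):
            lo, hi = i + 1, n - 1
            while lo < hi:                    # scan the tail with two pointers
                s = A[i] + A[lo] + A[hi]
                if s == 0:
                    return (A[i], A[lo], A[hi])
                if s < 0:
                    lo += 1
                else:
                    hi -= 1
        return None

    print(threesum([-5, 1, 4, 2, 9]))   # (-5, 1, 4)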