# Colloquium Details

## New York Area Theory Day

**Speaker:**
IBM/NYU/Columbia

**Location:**
Warren Weaver Hall 109

**Date:**
December 1, 2017, 9:30 a.m.

**Host:** New York University, New York

**Synopsis:**

The New York Area Theory Day is a semi-annual conference, aimed to bring together people in the New York metropolitan area for one day of interaction and discussion. The meeting is free and open to everyone; in particular, students are encouraged to attend.

For directions, please see here.

To subscribe to our mailing list, follow the instructions here.

Organizers:

## Program

9:30 - 10:00 | Coffee and bagels |

10:00 - 10:55 | Professor Christos Papadimitriou A Computer Scientist Thinks about the Brain |

10:55 - 11:10 | Coffee break |

11:10 - 12:10 | Short talks: Shay Solomon, Ilya Razenshteyn, Sepideh Mahabadi |

12:10 - 2:00 | Lunch break |

2:00 - 2:55 | Dr. Ilan Komargodski White-Box vs. Black-Box Complexity of Search Problems: Ramsey and Graph Property Testing |

2:55 - 3:15 | Short talk: Sandro Coretti |

3:15 - 3:35 | Coffee break |

3:35 - 3:55 | Short talk: Greg Bodwin |

3:55 - 4:50 |
Professor Virginia Vassilevska Williams |

5:00 - 7:00 | Follow-up social |

## Long Talks

**Christos H. Papadimitriou** (Columbia University)

*A Computer Scientist Thinks about the Brain*

Understanding the ways in which the Brain gives rise to the Mind (memory, behavior, cognition, intelligence, language) is the most challenging problem facing science today. As the answer seems likely to be at least partly computational, computer scientists should work on this problem. Using a simplified model of Hebbian plasticity and employing random graph theory and Bernoulli approximations we reproduce a form of association between memories recently observed experimentally in humans. We discuss several implications and extensions. (Joint work with W. Maass and S. Vempala).

**Ilan Komargodski** (Cornell Tech)

*White-Box vs. Black-Box Complexity of Search Problems: Ramsey and Graph Property Testing*

Ramsey theory assures us that in any graph there is a clique or independent set of a certain size, roughly logarithmic in the graph size. But how difficult is it to find the clique or independent set? This problem is in TFNP, the class of search problems with guaranteed solutions. If the graph is given explicitly, then it is possible to do so while examining a linear number of edges. If the graph is given by a black-box, where to figure out whether a certain edge exists the box should be queried, then a large number of queries must be issued.

- What if one is given a program or circuit ("white-box") for computing the existence of an edge. Does the search problem remain hard?
- Can we generically translate all TFNP black-box hardness into white-box hardness?
- Does the problem remain hard if the black-box instance is small?

We will answer all of these questions and discuss related questions in the setting of property testing.

Joint work with Moni Naor and Eylon Yogev.

**Virginia Vassilevska Williams **(MIT)

*Fine-Grained Complexity of Problems in P*

A central goal of algorithmic research is to determine how fast computational problems can be solved in the worst case. Theorems from complexity theory state that there are problems that, on inputs of size n, can be solved in t(n) time but not in t(n)^{1-eps} time for eps>0. The main challenge is to determine where in this hierarchy various natural and important problems lie. Throughout the years, many ingenious algorithmic techniques have been developed and applied to obtain blazingly fast algorithms for many problems. Nevertheless, for many other central problems, the best known running times are essentially those of their classical algorithms from the 1950s and 1960s.

Unconditional lower bounds seem very difficult to obtain, and so practically all known time lower bounds are conditional. For years, the main tool for proving hardness of computational problems have been NP-hardness reductions, basing hardness on P\neq NP. However, when we care about the exact running time (as opposed to merely polynomial vs non-polynomial), NP-hardness is not applicable, especially if the problem is already solvable in polynomial time. In recent years, a new theory has been developed, based on "fine-grained reductions" that focus on exact running times. Mimicking NP-hardness, the approach is to (1) select a key problem X that is conjectured to require essentially t(n) time for some t, and (2) reduce X in a fine-grained way to many important problems. This approach has led to the discovery of many meaningful relationships between problems, and even sometimes to equivalence classes.

The main key problems used to base hardness on have been: the 3SUM problem, the CNF-SAT problem (based on the Strong Exponential TIme Hypothesis (SETH)) and the All Pairs Shortest Paths Problem. Research on SETH-based lower bounds has flourished in particular in recent years showing that the classical algorithms are optimal for problems such as Approximate Diameter, Edit Distance, Frechet Distance, Longest Common Subsequence etc.

In this talk I will give an overview of the current progress in this area of study, and will highlight some exciting new developments.

## Short Talks

**Shay Solomon** (IBM Research)

*Fully Dynamic Maximal Matching*

Graph matching is one of the most well-studied problems in combinatorial optimization, with a reach plethora of applications. Nevertheless, until recently little was known about this problem in real-life dynamic networks, which aim to model the constantly changing physical world.

Focusing on the maximal matching problem, there is a trivial linear time algorithm for computing such a matching in static networks, which naturally translates into an algorithm with a constant amortized cost in the incremental case. In this talk I will discuss some of the challenges that we overcame to obtain a constant amortized cost algorithm in the fully dynamic case.

**Ilya Razenshteyn** (Columbia)

*Practical Data-Dependent Metric Compression with Provable Guarantees*

How well can one compress a dataset of points from a high-dimensional space while preserving pairwise distances? Indyk and Wagner have recently obtained almost optimal bounds for this problem, but their construction (based on hierarchical clustering) is not practical. In this talk, I will show a new practical, quadtree-based compression scheme, whose provable performance essentially matches that of the result of Indyk and Wagner.

In addition to the theoretical results, we will see experimental comparison of the new scheme and Product Quantization (PQ)--one of the most popular heuristics for distance-preserving compression--on several datasets. Unlike PQ and other heuristics that rely on the clusterability of the dataset, the new algorithm ends up being more robust.

The talk is based on a joint work with Piotr Indyk and Tal Wagner.

**Sepideh Mahabadi **(Columbia)

*Set Cover in Sub-Linear Time*

We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of $m$ sets over $n$ elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities.

On one hand, first we show that an adaptation of streaming algorithms to the sub-linear query model returns an $\alpha$-approximate cover using $\tilde O(m(n/k)^{1/(\alpha-1)} + nk)$ queries to the input, where $k$ denotes the value of a minimum set cover. We then complement this upper bound by proving that for lower values of $k$, the required number of queries is $\tilde Omega(m(n/k)^{1/(2\alpha)})$, even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements would require $\Omega(nk)$ queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter $k$. On the other hand, we show that this bound is not optimal for larger values of the parameter $k$, as there exists a $(1+\eps)$-approximation algorithm with $\tilde O(mn/k\eps^2)$ queries. We show that this bound is essentially tight for sufficiently small constant $\eps$, by establishing a lower bound of $\tilde Omega(mn/k)$ query complexity.

This is a joint work with Piotr Indyk, Ronitt Rubinfeld, Ali Vakilian, and Anak Yodpinyanee.

**Sandro Coretti **(New York University)

*Random Oracles and Non-Uniformity*

Hash functions are ubiquitous in cryptography. They are widely used in practice to build one-way functions (OWFs), collision-resistant hash functions (CRHFs), pseudorandom functions/generators (PRFs/PRGs), message authentication codes (MACs), etc. Moreover, they are often used together with other computational assumptions to prove the security of higher-level applications, such as Fiat-Shamir heuristics for signature schemes, full-domain-hash signatures, or OAEP encryption, among many others.

The security of such schemes is commonly analyzed in the random-oracle model (ROM), in which the hash function is modeled as a truly random function that can be queried by an attacker at a bounded number of points. The traditional ROM, however, does not account for the fact that a non-uniform attacker may obtain arbitrary advice about the hash function H before attacking a construction. As a result, bounds derived in the ROM do not accurately reflect security against such attackers.

To remedy this situation, Unruh (CRYPTO ’07) considered the auxiliary-input ROM (AI-ROM), in which an attacker may obtain (a bounded amount of) arbitrary leakage about the random oracle before attacking a cryptographic scheme. In this talk we discuss significant improvements to Unruh's presampling technique, which, intuitively, allows a leaky random oracle to be replaced by one that is adversarially fixed on a subset of the coordinates but uniformly random otherwise. We also show how this improved presampling leads to better and easily derivable AI-ROM security bounds for a number of cryptographic applications. We conclude by noting that in several cases (such as OWFs, PRGs, and CRHFs) there are still significant gaps between the derived bounds and the best attacks, leaving interesting open problems.

**Greg Bodwin **(MIT)

* Compressing Graphs While Preserving Their Distances*

Abstract TBA