# Colloquium Details

## The IBM Research / NYU / Columbia Theory Day

**Speaker:**
The IBM Research / NYU / Columbia Theory Day

**Location:**
Schapiro (CEPSR) Building, Columbia University 412

**Date:**
May 7, 2010, 9:30 a.m.

**Host:** Columbia University, New York

**Synopsis:**

Program

9:30 - 10:00 Coffee and bagels

10:00 - 10:55 Dr. Shai Halevi (IBM Research) On Homomorphic Encryption and Secure Computation

10:55 - 11:05 Short break

11:05 - 12:00 Prof. Claire Mathieu (Brown University) Recognizing Well-Parenthesized Expressions in the Streaming Model

12:00 - 2:00 Lunch break

2:00 - 2:55 Dr. Alex Andoni (Princeton) Polylogarithmic Approximation to Edit Distance (or, the Asymmetric Query Complexity)

2:55 - 3:15 Coffee break

3:15 - 4:10 Dr. John Langford (Yahoo Research) The Foundations of Learning from Exploration Data

For directions, please see http://www.cs.columbia.edu/theory/directions.html

To subscribe to our mailing list, follow instructions at http://www.cs.nyu.edu/mailman/listinfo/theory-ny

Organizers: Yevgeniy Dodis dodis@cs.nyu.edu Cliff Stein stein.cliff@gmail.com Tal Rabin talr@us.ibm.com Baruch Schieber sbar@us.ibm.com

==================================================================

Abstracts

On Homomorphic Encryption and Secure Computation

Dr. Shai Halevi (IBM Research)

A homomorphic encryption scheme is one that allows computing on encrypted data, such that the result of the computation can still be decrypted. I will talk about recent developments in constructing (fully) homomorphic encryption schemes, and relations with protocols for secure function evaluation. Specifically, I will:

* Describe the approach that underlies Gentry's recent homomorphic encryption scheme and all its variants, and illustrate a few different instantiations of this approach.

* Survey some easy (but often ignored) connections between homomorphic encryption schemes and protocols for two-party secure function evaluation.

* Show how to extend homomorphic encryption schemes to a "multi hop" setting: In this setting, several parties sequentially compute on the same encrypted data, and we want to allow each party to use not only the original encrypted data but also the results of prior computations.

==================================================================

Recognizing Well-Parenthesized Expressions in the Streaming Model

Dr. Claire Mathieu (Brown University)

Motivated by a concrete problem and with the goal of understanding the sense in which the complexity of streaming algorithms is related to the complexity of formal languages, we investigate the problem Dyck(s) of checking matching parentheses, with s different types of parenthesis. We present a one-pass randomized streaming algorithm for Dyck(2) with space O( sqrt{n} log n), time per letter polylog(n), and one-sided error. We prove that this one-pass algorithm is optimal, up to a polylog(n) factor, even when two-sided error is allowed. For the lower bound, we prove a direct sum result on hard instances by following the "information cost" approach, but with a few twists. Indeed, we play a subtle game between public and private coins. This mixture between public and private coins results from a balancing act between the direct sum result and a combinatorial lower bound for the base case. Surprisingly, the space requirement shrinks drastically if we have access to the input stream in reverse. We present a two-pass randomized streaming algorithm for Dyck(2) with space O((log n)^2), time polylog(n) and one-sided error, where the second pass is in the reverse direction. Both algorithms can be extended to Dyck(s) since this problem is reducible to Dyck(2) for a suitable notion of reduction in the streaming model.

This is joint work with Frederic Magniez and Ashwin Nayak.

==================================================================

Polylogarithmic Approximation to Edit Distance (or, the Asymmetric Query Complexity)

Dr. Alex Andoni (Princeton)

We present a near-linear time algorithm that approximates the edit distance between two strings within a polylogarithmic factor. This is an exponential improvement over the previously known bounds.

Our new algorithm emerges from the investigation of the edit distance within a new framework, namely a model of asymmetric queries. Within this framework, we are able to maintain a parallel view of the upper and lower bounds, leading to near-tight query complexity bounds. Our investigation also yields the first rigorous separation between the edit distance and the Ulam distance (edit distance on permutations), thus uncovering further hardness phenomena inherent to the edit distance, beyond what the previous analyses have revealed.

I will talk about some arising open questions.

Joint work with Robert Krauthgamer and Krzysztof Onak.

==================================================================

The Foundations of Learning from Exploration Data

Dr. John Langford (Yahoo Research)

It is very natural to wish to apply machine learning to the large amounts of data generated by user interactions, but the process turns out to be delicate. For example, if a news story snippet is shown to a user, and the user clicks on (and reads) it, this event is fundamentally _not_ equivalent to a multiclass label for several reasons. For example, the user might easily have been interested in other (unseen) stories as well. To deal with these issues, we propose using the contextual bandit setting, where on each round information is used to choose an action, and feedback about just this action is observed. This setting has the great virtue that it's tractable, with algorithms enjoying regret and sample complexity guarantees entirely comparable to what's possible in a more familiar supervised learning setting.

**Notes:**

In-person attendance only available to those with active NYU ID cards.