Computer Science Department

Computer Science Colloquium

THEORY DAY - Friday, December 13, 2002

New York University
Room 109 Warren Weaver Hall
251 Mercer Street
New York, NY 10012-1185

The IBM Research | NYU | Columbia Theory Day is a semi-annual conference aimed to bring together people in New York Metropolitan area for one day of interaction and discussion. The Theory Day features several (usually 4-5) hour-long presentations by leading theoretical computer scientists about state-of-the-art advances in various areas. Some presentations give a survey of the latest advances in some area, while others may concentrate on a particular (often ground-breaking!) result. The meeting is free and open to everyone; in particular, students are encouraged to attend.


9:30 - 10:00 Coffee and Bagels
10:00 - 10:55 Dr. Andrei Broder (IBM Research)
      Modern Applications of Bloom Filters
10:55 - 11:05 Short Break
11:05 - 12:00 Dr. Ronitt Rubinfield (NEC Labs)
      What Can We Do in Sublinear Time?
12:00 - 2:00 Lunch Break
2:00 - 2:55 Prof. Moses Charikar (Princeton)
      Compact Representations for Data
2:55 - 3:15 Coffee Break
3:15 - 4:10 Prof. Erik Demaine (MIT)
      Folding and Unfolding in Computational Geometry
4:15 - 5:15 Wine and Cheese Reception (13th floor lounge)


The Theory Day is held at Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, Auditorium 109, New York.

For directions, please see and (building 46)

To subscribe to our mailing list, follow instructions at

Yevgeniy Dodis, (212) 998-3084
Tal Rabin
Baruch Schieber

Colloquium Information:


Modern Applications of Bloom Filters

Dr. Andrei Broder
IBM Research

A Bloom filter is an ingenious randomized data-structure for concisely representing a set in order to support approximate membership queries. The space efficiency is achieved at the cost of a small probability of false positives. It was invented by Burton Bloom in 1970 for the purpose of spell checking and for many years it was seldom mentioned in unrelated contexts, except for database optimization. Nevertheless, Bloom's beautiful approach has seen a sudden resurgence in a variety of large-scale applications such as shared web caches, query routing, replica location, and stream data analysis. This talk presents a plethora of recent uses of this old data structure, its modern variants, and the mathematical basis behind them, with the aim of making these ideas available to a wider community and the hope of inspiring new applications.

This is an "algorithm engineering" talk, targeted at both theoreticians and practioners and only a general technical background is needed. It is based on a survey being jointly prepared with Michael Mitzenmacher (Harvard).

What Can We Do in Sublinear Time?

Dr. Ronitt Rubinfeld
NEC Laboratories America

We have long considered showing the existence of a linear time algorithm for a problem to be the gold standard of achievement. Indeed, it is hard to imagine doing any better than that, since for any nontrivial problem, any algorithm must consider all of the input in order to make a decision. However, as extremely large data sets grow more prevalent in a wide variety of settings, it is natural to wonder what one can do in *sublinear* time. In fact, there has been a lot of recent interest in this direction.

This talk will contain a brief, nonexhaustive survey of the types of problems that can be solved in sublinear time. Since any sublinear time algorithm can only view a small portion of the input, for most natural problems the answer must necessarily be approximate in some sense. We concentrate on such algorithms. We will see that there are classical optimization problems whose values can be approximated in sublinear time. In addition, property testing, an alternative notion of approximation for decision problems, has been applied to give sublinear algorithms for a wide variety of problems.

I will spend the rest of the talk describing joint work with Bernard Chazelle and Luca Trevisan, in which we consider sublinear time algorithms for approximating the weight of a minimum spanning tree. We give a probabilistic algorithm for estimating the weight of a minimum spanning tree in time O(dw log(w)), where d is the average degree of the graph and w is a bound on the ratio of the maximum weight edge to the minimum weight edge. In particular, for constant degree graphs in which the ratio w is constant, our running time will also be constant. We also show that our running time is nearly tight.

Compact Representations for Data

Prof. Moses Charikar
Princeton University

Several algorithmic techniques have been devised recently to deal with large volumes of data. At the heart of many of these techniques are ingenious schemes to represent data compactly. This talk will present some constructions of such compact representation schemes. We will see an application to identifying similar objects (e.g. text documents) in a large collection. We will also see how such schemes lead to efficient, one-pass algorithms for processing large volumes of data (streaming algorithms).

Folding and Unfolding in Computational Geometry

Prof. Erik Demaine

When can a linkage of rigid bars be untangled or folded into a desired configuration? What polyhedra can be cut along their surface and unfolded into a flat piece of paper without overlap? What shapes can result by folding a piece of paper flat and making one complete straight cut? Folding and unfolding is a branch of discrete and computational geometry that addresses these and many other intriguing questions. I will give a taste of the many results that have been proved in the past few years, as well as the several exciting open problems that remain open. Many folding problems have applications in areas including manufacturing, robotics, graphics, and protein folding.