Research Summary

Goals

Ok, so I work on quite a few different projects, not all of which are listed here. A general pattern? I like puzzles.

Our goal is to make it possible to discover patterns in strings, trees, and graphs given a pattern metric. On the way we have worked out or borrowed algorithms to match a pattern against data and find a distance between the pattern and the data. We are really good at trees, about at the state of the art for strings, and starting to do well in graphs especially acyclic ones. We have software available by anonymous ftp. Collaborators: Kaizhong Zhang (U of Western Ontario), Tsong-Li Wang (NJ Institute of Technology), Tom Marr (Cold Spring Harbor), and Bruce Shapiro (National Cancer Institute). Typical papers:
1. A review of approximate tree matching algorithms by (to appear in a book by Apostolico and Galil) postscript.
2. Approximate graph matching for acyclic graphs (submitted) postscript.
3. Discovering patterns in protein sequences (appeared in Sigmod 94) postscript.

Our Persistent Linda project extends the Linda system developed primarily by Dave Gelernter and Nick Carriero at Yale. We use a slightly weakened form of transaction combined with checkpoints to support fault tolerance (appeared in Proc of 13th Symp on Fault Tolerant Distributed Systems) postscript. It should also enable us to support an extended worm like computational model similar to Piranha. Collaborators: Brian Anderson, Karp Jeong, Suren Talla, Peter Wyckoff, Bin Li, all students or former students at NYU

We have developed algorithms for scheduling overloaded real-time systems in a uniprocessor setting (to appear in Siam J. comp) postscript and in a multi-processor setting (appeared in Theoretical Computer Science, July 1994) postscript. Collaborator: Gilad Koren

A tool to tailor information flow for readers of complex (or boring) documents such as laws and a problem-solving tool generalizing spreadsheets. The tool integrates spreadsheets, rules, databases, and hypermedia. Collaborators: Roman Yangarber, Peter Piatko, Daoi Lin, Minna Cha, Dave Tanzer, Alex Shenker, Mike Leder, Julia Tolpin, Mirella Shannon, and Chris Jones, all students at NYU. Only writeup so far is a patent app.

Buffer paging algorithms that beat LRU in several application areas. Collaborators: Ted Johnson (U. of Florida), industrial collaborators notably at database companies (appeared in VLDB 94) postscript. By the way, part of Johnson's thesis showed that it is a good idea to be lazy when you're designing B-trees: as long as there are more inserts than deletes, free-at-empty is a better strategy than merge-at-half (appeared in Journal of Computer Science and Systems, Aug. 1993) postscript.

Database tuning is the activity of making your database system run faster. Though each vendor will tell you a different story about the subject, it turns out that the underlying principles are the same. If you are interested in notes for Database Tuning: a principled approach published by Prentice-Hall in 1992, then please let me know at shasha@cs.nyu.edu. I can send some to you in postscript.

Dr. Ecco, a mathematical detective cracks mysteries by solving puzzles. Some are combinatoric, e.g. what is the smallest number of people who could be at a party in which everyone has shaken hands with three other people, except one person who has only shaken hands with one other person? Others involve algorithmic aspects, including the simplest zero knowledge protocols known to (wo)man. The Puzzling Adventures of Dr. Ecco, 1988. Codes, Puzzles, and Conspiracy, 1992. Both published by W. H. Freeman, (1-)212-576-9400.