Research Summary
Goals
Ok, so I work on quite a few different projects,
not all of which are listed here.
A general pattern? I like puzzles.
Biological Computing
-
Our goal is to make it possible to discover patterns in strings,
trees, and graphs given a pattern metric.
On the way we have worked out or borrowed algorithms
to match a pattern against data and find a distance between the pattern
and the data.
We are really good at trees, about at the state of the art for strings,
and starting to do well in graphs especially acyclic ones.
We have software available by anonymous ftp.
Collaborators: Kaizhong Zhang (U of Western Ontario),
Tsong-Li Wang (NJ Institute of Technology), Tom Marr (Cold Spring Harbor),
and Bruce Shapiro (National Cancer Institute).
Typical papers:
1. A review of approximate tree matching algorithms by
(to appear in a book by Apostolico
and Galil)
postscript.
2. Approximate graph matching for acyclic graphs (submitted)
postscript.
3. Discovering patterns in protein sequences (appeared in Sigmod 94)
postscript.
Fault Tolerant Parallel Programming
-
Our Persistent Linda project extends the Linda system
developed primarily by Dave Gelernter and Nick Carriero at Yale.
We use a slightly weakened form of transaction combined with
checkpoints to support fault tolerance (appeared in Proc of 13th
Symp on Fault Tolerant Distributed Systems)
postscript.
It should also enable us to support an extended worm like
computational model similar to Piranha.
Collaborators: Brian Anderson, Karp Jeong, Suren Talla, Peter Wyckoff, Bin Li,
all students or former students at NYU
Real-Time Scheduling
-
We have developed algorithms for scheduling overloaded
real-time systems in a uniprocessor setting (to appear in Siam J. comp)
postscript
and in a multi-processor setting (appeared
in Theoretical Computer Science, July 1994)
postscript.
Collaborator: Gilad Koren
Thinksheet
-
A tool to tailor information flow for readers of complex (or boring) documents
such as laws and a problem-solving tool generalizing spreadsheets.
The tool integrates spreadsheets, rules, databases, and hypermedia.
Collaborators: Roman Yangarber, Peter Piatko, Daoi Lin, Minna Cha,
Dave Tanzer, Alex Shenker, Mike Leder, Julia Tolpin, Mirella Shannon,
and Chris Jones,
all students at NYU.
Only writeup so far is a patent app.
Database Internals Work
-
Buffer paging algorithms that beat LRU in several application areas.
Collaborators: Ted Johnson (U. of Florida), industrial collaborators
notably at database companies (appeared in VLDB 94)
postscript.
By the way, part of Johnson's thesis showed that it is a good idea
to be lazy when you're designing B-trees: as long as there are more
inserts than deletes, free-at-empty is a better strategy
than merge-at-half (appeared in
Journal of Computer Science and Systems, Aug. 1993)
postscript.
Reference Book: Database Tuning
-
Database tuning is the activity of making your database system run faster.
Though each vendor will tell you a different story about the subject,
it turns out that the underlying principles are the same.
If you are interested in
notes for Database Tuning: a principled approach
published by Prentice-Hall in 1992, then please let me know
at shasha@cs.nyu.edu.
I can send some to you in postscript.
Fun Books
-
Dr. Ecco, a mathematical detective cracks
mysteries by solving puzzles.
Some are combinatoric, e.g. what is the smallest number of people
who could be at a party in which everyone has shaken hands with three
other people, except one person who has only shaken hands with one other
person?
Others involve algorithmic aspects, including the
simplest zero knowledge protocols known to (wo)man.
The Puzzling Adventures of Dr. Ecco, 1988.
Codes, Puzzles, and Conspiracy, 1992.
Both published by W. H. Freeman, (1-)212-576-9400.