G22.2433

G22.2433 DATABASE SYSTEMS


All projects must be approved by Prof. Libkin before March 26!

Projects can be roughly divided into two groups: Practical Projects (you can think of those as large programming assignments) and Theoretical Projects. A Theoretical Project can involve some programming work, too. For example, you can choose a nice data structure used for indexing purposes, and implement it together with a library of its most commonly used functions. As an example of a Practical Project, you can design a database for some real (or semi-real) application, implement it using a commercial package, and provide a library of useful queries. Some suggested topics (the list is not exhaustive) are:

Database interface with Java using JDBC (e.g. writing a driver to access some obscure database and manage it using Java objects.)

OLAP: How to extract a representative sample of a database?

Web and databases: Is the Web a database? Can it be queried using database techniques?

Geographical Information Systems: how can we store and query information about geographical objects?

Indexing techniques - we won't cover them in depth in this course; but you're welcome to learn more about this important subject.

Constraint Databases: how to store infinite data?

Heterogeneous Databases: How to query multiple databases?

Replication: You often need more than one copy of your data (disks do crash!), but what are the costs?

Temporal Databases: How to incorporate time? Are time stamps and time intervals equivalent?

Query optimization beyond conjunctive queries (or relational algebra): how to optimize aggregation, datalog queries, etc.?

Similarity detection: You have two huge datasets; are they similar?

Data mining: You have a huge dataset; try to extract some meaningful information from it.

Incremental recomputation: You spent hours and hours computing the result of a query on a huge database, and now someone tells you a new record was inserted. What do you do? You probably can do better than recomputing your query from scratch, but how?

A variation on incremental recomputation: Suppose you learn about the new tuple some time after it was inserted - now the old database is no longer available! How do you adjust your incremental recomputation algorithms?

Object-oriented databases: How do we design them? How do we query them?
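
To make the incremental recomputation topic above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a prescribed approach: the class name IncrementalAverage, the function recompute_from_scratch, and the sample records are all hypothetical. It maintains a per-group average by storing a (count, sum) pair for each group, so a newly inserted record can be absorbed in constant time, and without looking at the old database at all.

```python
from collections import defaultdict


class IncrementalAverage:
    """Per-group COUNT, SUM and AVG maintained under insertions.

    Hypothetical illustration: only the (count, sum) pairs are stored,
    so absorbing a new record needs no access to the old records.
    """

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def insert(self, group, value):
        # Constant-time update instead of rescanning every record.
        self.count[group] += 1
        self.total[group] += value

    def average(self, group):
        n = self.count.get(group, 0)
        if n == 0:
            raise KeyError(group)
        return self.total[group] / n


def recompute_from_scratch(records):
    """Baseline: one full scan over all (group, value) records."""
    view = IncrementalAverage()
    for group, value in records:
        view.insert(group, value)
    return view


# The expensive initial computation over the existing (made-up) data.
records = [("nyc", 10.0), ("nyc", 20.0), ("boston", 5.0)]
view = recompute_from_scratch(records)

# A new record arrives: adjust the stored result instead of rescanning.
view.insert("nyc", 30.0)
nyc_avg = view.average("nyc")  # (10 + 20 + 30) / 3 = 20.0
```

Averages under insertion are an easy case because the stored state is enough to compute the update. A project on this topic would look at queries where that is not true, for example MAX when records can be deleted, or joins, where the adjustment may need more than the stored result.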