Design and Implementation of a Versatile MAML-compatible Microarray Database

Marc C. Rejali, Marco Antoniotti, Caroline Leventhal and Bud Mishra

Abstract

Microarray technology has proven useful in many areas of biology due to its versatility and inherent parallelism. This widespread use, in turn, poses significant "Data Management" problems.

The NYU Bioinformatics Group is involved in developing new microarray-based approaches in several areas of genomics:

- Genome Wide BAC Mapping.
- Nitrogen Pathway in Arabidopsis.
- Hallucinogen action on the 5HT-2A receptor.
- Cancer cell signalling studies based on the cocultivation of cells.

We developed the NYU MicroArray Database (NYUMAD) to address many specific needs of this wide array of approaches, e.g. "session management" and "cross-experiment" data comparisons.

The underlying DB schema follows the MGED group specifications (http://www.ebi.ac.uk/microarray/MGED), especially regarding the XML-based MAML (Microarray Markup Language) exchange format.

An RDBMS (PostgreSQL on a Linux cluster) is the NYUMAD core and uses a schema based on the MAML specification. This "Back Tier" system is accessed via JDBC.
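
As a rough illustration of back tier access, the following minimal sketch assumes the access layer is standard JDBC (`java.sql`) with the PostgreSQL driver on the classpath. The class name, the table and column names (`experiment`, `owner`, `experiment_id`), the database name `nyumad`, and the host and port are hypothetical and are not taken from the NYUMAD schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical back tier accessor; all schema names are illustrative only. */
class BackTierSketch {
    // Hypothetical query: NOT the actual NYUMAD schema.
    static final String EXPERIMENTS_BY_OWNER =
        "SELECT experiment_id FROM experiment WHERE owner = ? ORDER BY experiment_id";

    /** Builds a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    /** Lists experiment ids for one owner, using a parameterized query. */
    static List<String> experimentIds(Connection conn, String owner) throws SQLException {
        List<String> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(EXPERIMENTS_BY_OWNER)) {
            ps.setString(1, owner);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getString("experiment_id"));
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            // Without credentials, only show the URL that would be used.
            System.out.println("usage: BackTierSketch <db-user> <db-password> <owner>");
            System.out.println("example URL: " + jdbcUrl("localhost", 5432, "nyumad"));
            return;
        }
        // Connecting requires a reachable server and the PostgreSQL JDBC driver.
        String url = jdbcUrl("localhost", 5432, "nyumad");
        try (Connection conn = DriverManager.getConnection(url, args[0], args[1])) {
            for (String id : experimentIds(conn, args[2])) {
                System.out.println(id);
            }
        }
    }
}
```

The parameterized `PreparedStatement` keeps user-supplied values out of the SQL text, which suits a database that is reached indirectly from web clients.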

The "Front Tier" comprises "clients" that manipulate data via the WWW (as MAML files) or the Java Serialization Protocol. Clients are Java applets (e.g. our NYUMAD Applet), custom-built user programs, or HTML/HTTP forms connecting to the "Middle Tier".
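
The serialization path between a Java client and the server can be sketched roughly as follows. This is only an illustration of standard Java object serialization (`java.io.ObjectOutputStream` and `ObjectInputStream`); the `ExperimentSummary` class and its fields are hypothetical stand-ins, not NYUMAD types, and an in-memory byte array takes the place of the network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationSketch {
    /** Hypothetical payload; the real NYUMAD data classes are not shown in the source. */
    static class ExperimentSummary implements Serializable {
        private static final long serialVersionUID = 1L;
        final String experimentId;
        final int spotCount;

        ExperimentSummary(String experimentId, int spotCount) {
            this.experimentId = experimentId;
            this.spotCount = spotCount;
        }
    }

    /** Serializes a value to bytes, as a client would before sending it. */
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Deserializes bytes back into an object, as the receiving side would. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("payload class missing on receiver", e);
        }
    }

    public static void main(String[] args) {
        ExperimentSummary sent = new ExperimentSummary("exp-001", 9600);
        ExperimentSummary received = (ExperimentSummary) fromBytes(toBytes(sent));
        System.out.println(received.experimentId + " " + received.spotCount);
    }
}
```

Both ends must share compatible class definitions for this to work, which is one reason a format such as MAML is useful for clients that are not written in Java.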

The "Middle Tier" comprises the servlets handling all requests and submissions, while ensuring data integrity and observance of constraints and security restrictions. Multiple back tier databases are accessible, allowing for distribution and scalability.
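
One way to picture the middle tier's role is a small router that checks each request before choosing a back tier database. The sketch below is an assumption-laden simplification: it uses plain Java rather than the servlet API, and the class name, the per-database user lists, and the database names and URLs are all hypothetical. The source does not describe how NYUMAD actually enforces its security restrictions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical request router standing in for the servlet layer. */
class MiddleTierSketch {
    // Logical database name -> JDBC URL of a back tier instance (illustrative values).
    private final Map<String, String> backendUrls = new HashMap<>();
    // Logical database name -> users allowed to query it (illustrative access model).
    private final Map<String, Set<String>> allowedUsers = new HashMap<>();

    /** Registers one back tier database and the users permitted to use it. */
    void registerBackend(String name, String jdbcUrl, String... users) {
        backendUrls.put(name, jdbcUrl);
        allowedUsers.put(name, new HashSet<>(Arrays.asList(users)));
    }

    /** Returns the JDBC URL to use, or throws if the request must be rejected. */
    String route(String user, String database) {
        String url = backendUrls.get(database);
        if (url == null) {
            throw new IllegalArgumentException("unknown database: " + database);
        }
        if (!allowedUsers.get(database).contains(user)) {
            throw new SecurityException(user + " may not access " + database);
        }
        return url;
    }

    public static void main(String[] args) {
        MiddleTierSketch tier = new MiddleTierSketch();
        tier.registerBackend("bac_mapping", "jdbc:postgresql://db1:5432/bac_mapping", "alice");
        tier.registerBackend("arabidopsis", "jdbc:postgresql://db2:5432/arabidopsis", "alice", "bob");
        System.out.println(tier.route("bob", "arabidopsis"));
    }
}
```

Keeping the routing table in the middle tier means clients never see back tier addresses, so databases can be added or moved without changing any client.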