Bayesian Inference for Cluster Analysis of Microarray Data

Vera Cherepinsky(1), Marc Rejali(2) and Bud Mishra(3, 4)

(1) Mathematics Department, Courant Institute
(2) Bioinformatics Group, Courant Institute
(3) Professor of Computer Science & Mathematics, Courant Institute, NYU
(4) Professor, Watson School, CSHL

Abstract

We give a mathematically rigorous derivation of a correlation coefficient for two gene expression vectors, based on James-Stein shrinkage estimators. We initially assume that the mean of a gene expression vector is itself a zero-mean Normal random variable, use Bayesian analysis to obtain its a posteriori distribution in terms of the data, and then estimate the parameters of the distributions involved. We also present the results of hierarchical cluster analysis of the data set from [1] using our correlation coefficient.
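
As an illustration only, the shrinkage idea can be sketched in Python under simplifying assumptions that the abstract does not state: the noise variance and the prior variance are treated as known inputs rather than estimated from the data, and the function names (shrinkage_factor, shrunk_correlation) and the symbol gamma are hypothetical. For a zero-mean Normal prior on the true mean, the standard conjugate result is that the posterior mean equals the sample mean multiplied by a factor between 0 and 1; the sketch subtracts that shrunken mean before correlating. It is not necessarily the estimator derived in this work.

```python
import math


def shrinkage_factor(n, noise_var, prior_var):
    """Weight on the sample mean in the posterior mean of the true mean.

    Assumes a N(0, prior_var) prior on the true mean and n independent
    observations with known variance noise_var (illustrative assumption).
    """
    return prior_var / (prior_var + noise_var / n)


def shrunk_correlation(x, y, gamma):
    """Correlation of x and y after subtracting gamma times each sample mean.

    gamma = 1 subtracts the full sample mean (Pearson correlation);
    gamma = 0 subtracts nothing (uncentered correlation).
    """
    n = len(x)
    offset_x = gamma * sum(x) / n
    offset_y = gamma * sum(y) / n
    dx = [a - offset_x for a in x]
    dy = [b - offset_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical inputs: two short expression vectors and made-up variances.
gamma = shrinkage_factor(n=3, noise_var=1.0, prior_var=0.5)
similarity = shrunk_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], gamma)
```

In this sketch, a large prior variance relative to the noise pushes gamma toward 1 and recovers the Pearson correlation, while a small one pushes gamma toward 0 and recovers the uncentered correlation.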

[1] Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998) Proc. Natl. Acad. Sci. USA, 95: 14863-14868.