Isaac Newton Institute for Mathematical Sciences

Mathematical and Statistical Aspects of Molecular Biology

A fully Bayesian method for estimating gene expression levels from Affymetrix GeneChip arrays

Authors: Anne-Mette Hein (Imperial College, St. Mary's), Sylvia Richardson (Imperial College, St. Mary's), Helen Causton (Imperial College, Hammersmith), Graeme Ambler and Peter Green (Bristol University)


We present a Bayesian hierarchical model for calculating gene expression indices from Affymetrix GeneChip arrays. The model uses both perfect match and mismatch probes, and allows for additive as well as multiplicative error. Because non-specific hybridization is accounted for explicitly in the model, background correction and normalization become an integral part of the analysis. Furthermore, we allow a fraction of the perfect match cRNA population to bind to the mismatch oligos; this fraction is estimated under the model. We first define a model for estimating gene expression levels from a single GeneChip array. Using MCMC methods, we obtain the full posterior distributions of all the parameters in the model, and of functions thereof. Point estimates and credibility intervals for the gene expression indices are thus available. The model extends readily to situations where different conditions are considered and replicate arrays are available for some or all of the conditions. In this case, the posterior distributions of the condition-specific gene expression indices are estimated directly by considering the replicate probe sets simultaneously, which avoids averaging over estimates obtained from individual replicate arrays. Posterior distributions of the ranks of genes with respect to degree of differential expression under two conditions, and thus credibility intervals for these ranks, can also be obtained.
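
The last point, posterior distributions and credibility intervals for gene ranks, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes MCMC output is already available as a list of draws, each holding one differential expression value per gene, and it substitutes simulated Gaussian draws for real sampler output. The function names (`rank_draw`, `quantile`, `rank_intervals`), the nearest-rank quantile rule, and the convention that rank 1 is the most up-regulated gene are all assumptions made for illustration.

```python
import random


def rank_draw(values):
    """Rank genes within one posterior draw; rank 1 is the largest value."""
    order = sorted(range(len(values)), key=lambda g: values[g], reverse=True)
    ranks = [0] * len(values)
    for position, gene in enumerate(order, start=1):
        ranks[gene] = position
    return ranks


def quantile(sorted_values, q):
    """Empirical quantile of a sorted list (nearest-rank rule, an assumption)."""
    last = len(sorted_values) - 1
    index = min(last, max(0, int(round(q * last))))
    return sorted_values[index]


def rank_intervals(draws, level=0.95):
    """Summarise the posterior of each gene's rank.

    draws[t][g] is the differential expression of gene g in draw t.
    Returns one (lower, median, upper) tuple of ranks per gene.
    """
    n_genes = len(draws[0])
    ranks_per_gene = [[] for _ in range(n_genes)]
    # Ranking inside each draw turns samples of expression differences
    # into samples from the posterior distribution of the ranks.
    for draw in draws:
        for gene, rank in enumerate(rank_draw(draw)):
            ranks_per_gene[gene].append(rank)
    alpha = (1.0 - level) / 2.0
    summary = []
    for ranks in ranks_per_gene:
        ranks.sort()
        summary.append(
            (quantile(ranks, alpha), quantile(ranks, 0.5), quantile(ranks, 1.0 - alpha))
        )
    return summary


# Hypothetical stand-in for MCMC output: 2000 draws for 4 genes, simulated
# as independent Gaussians around made-up "true" differences.
rng = random.Random(0)
true_diff = [2.0, 0.5, 0.0, -1.0]
draws = [[rng.gauss(mean, 0.3) for mean in true_diff] for _ in range(2000)]

intervals = rank_intervals(draws)
for gene, (lower, median, upper) in enumerate(intervals):
    print(f"gene {gene}: median rank {median}, 95% interval [{lower}, {upper}]")
```

Because the ranking is done within each draw, the uncertainty in the expression indices carries through to the ranks: genes whose differences are close, such as the second and third in this simulated example, exchange ranks across draws and so receive wider rank intervals.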