Multilevel modelling of proteomic mass-spectrometry data
Seminar Room 2, Newton Institute Gatehouse
Statistical methodology for the analysis of proteomic mass-spectrometry data is proposed using multilevel modelling. Each high-dimensional spectrum is represented using a near-orthogonal, low-dimensional basis of Gaussian functions. Multivariate mixed-effects models are proposed in the lower-dimensional space. In particular, differences between groups are investigated using fixed-effect parameters, and individual variability of spectra is modelled using random effects. A deterministic peak-fitting algorithm provides initial estimates of the near-orthogonal Gaussian basis, and the estimates are updated using a two-stage iterative method. The multilevel model is fitted using a parallel procedure for computational convenience. The methodology is applied to proteomic mass-spectrometry data from serum samples from melanoma patients categorized as Stage I or Stage IV, and significant locations of peaks are identified. Finally, comparisons are made with other methods, including simple feature-based statistics and more complicated Bayesian Markov chain Monte Carlo inference. This is joint work with William Browne (University of Bristol) and Kelly Handley (University of Birmingham).
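
The idea of reducing each spectrum to a few Gaussian peak heights and then comparing groups in that lower-dimensional space can be illustrated with a minimal sketch. It is not the method presented in the talk: the peak locations and widths are assumed known rather than estimated by peak fitting, the data are simulated, and a per-peak Welch t statistic stands in for the multivariate mixed-effects model. All function names (gaussian_basis, project, welch_t, simulate) and all numerical settings are hypothetical.

```python
import numpy as np


def gaussian_basis(mz, centers, widths):
    """Return an (n_points, n_peaks) matrix whose columns are Gaussian peaks."""
    mz = np.asarray(mz, dtype=float)[:, None]
    centers = np.asarray(centers, dtype=float)[None, :]
    widths = np.asarray(widths, dtype=float)[None, :]
    return np.exp(-0.5 * ((mz - centers) / widths) ** 2)


def project(spectra, basis):
    """Least-squares peak heights for spectra of shape (n_spectra, n_points)."""
    coef, *_ = np.linalg.lstsq(basis, np.asarray(spectra, dtype=float).T, rcond=None)
    return coef.T  # shape (n_spectra, n_peaks)


def welch_t(a, b):
    """Per-peak Welch t statistic (a simple stand-in for a fixed-effect test)."""
    var_a = a.var(axis=0, ddof=1) / a.shape[0]
    var_b = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(var_a + var_b)


# Assumed (hypothetical) peak locations and widths on an arbitrary m/z grid.
rng = np.random.default_rng(0)
mz = np.linspace(0.0, 100.0, 500)
centers = np.array([20.0, 50.0, 80.0])
widths = np.array([2.0, 2.0, 2.0])
basis = gaussian_basis(mz, centers, widths)


def simulate(n_spectra, mean_heights):
    """Simulate spectra: group mean heights + per-spectrum deviations + noise."""
    heights = mean_heights + rng.normal(0.0, 0.3, size=(n_spectra, centers.size))
    noise = rng.normal(0.0, 0.05, size=(n_spectra, mz.size))
    return heights @ basis.T + noise


# Two simulated groups that differ only in the height of the second peak.
group_a = simulate(20, np.array([5.0, 3.0, 4.0]))
group_b = simulate(20, np.array([5.0, 6.0, 4.0]))

coef_a = project(group_a, basis)
coef_b = project(group_b, basis)
t_stats = welch_t(coef_a, coef_b)
print("per-peak t statistics:", np.round(t_stats, 2))
```

Because the assumed peaks are well separated, the basis columns are close to orthogonal and each fitted height depends almost entirely on its own peak. A mixed-effects model would additionally treat the per-spectrum deviations as random effects rather than folding them into a two-sample test.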