Proteomics data analysis
Seminar Room 2, Newton Institute Gatehouse
Within the context of expression proteomics, we developed a novel approach to identify and assess meaningful differences in functional datasets. Given multiple proteomic profiles, generated by a Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF) mass spectrometer, from subjects who each belonged to one of two treatment groups, we extracted and classified biologically relevant information using Bayesian nonparametric methods. We modelled f(t), the mean ion abundance per spectrum, via an adaptive kernel regression approach, and relied on an underlying Lévy random field to control model complexity. We began by implementing a Lévy random field model for an individual spectrum, and then extended it hierarchically to include data from multiple spectra. To make the extension, we asserted that each multi-modal spectrum depended on a single time- and resolution-dependent marked Gamma process, but differed from the others for reasons including random, biological, or measurement error. After eliciting prior distributions for the parameters, we designed a Markov chain Monte Carlo algorithm that enabled exploration of a trans-dimensional model space and posterior prediction of experimental-group status.
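
As a rough illustration of the kernel representation described above, the mean abundance can be sketched as a sum of peaks, f(t) = sum_j gamma_j k(t; tau_j, omega_j), where the number of peaks is itself random. The sketch below is not the model from the talk: the Gaussian kernel, the uniform draws for peak location and width, and the names gaussian_kernel, mean_abundance and draw_peaks are all assumptions made for illustration, and the finite Poisson-plus-Gamma draw is only a crude stand-in for a marked Gamma process.

```python
import math
import random


def gaussian_kernel(t, location, width):
    """Unnormalised Gaussian peak centred at `location` (assumed kernel shape)."""
    return math.exp(-0.5 * ((t - location) / width) ** 2)


def mean_abundance(t, peaks, baseline=0.0):
    """Evaluate f(t) = baseline + sum_j gamma_j * k(t; tau_j, omega_j).

    `peaks` is a list of (tau, omega, gamma) tuples: location, width, height.
    """
    return baseline + sum(
        gamma * gaussian_kernel(t, tau, omega) for (tau, omega, gamma) in peaks
    )


def draw_peaks(rng, rate, t_min, t_max, shape=2.0, scale=1.0):
    """Draw a random-length list of peaks (hypothetical finite approximation).

    The peak count is Poisson(rate), so the dimension of the model varies
    from draw to draw; peak heights are Gamma(shape, scale) distributed.
    """
    # Poisson count: number of unit-rate exponential arrivals before `rate`.
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < rate:
        count += 1
        elapsed += rng.expovariate(1.0)

    peaks = []
    for _ in range(count):
        tau = rng.uniform(t_min, t_max)        # peak location (assumed uniform)
        omega = rng.uniform(0.5, 2.0)          # peak width (assumed range)
        gamma = rng.gammavariate(shape, scale)  # positive peak height
        peaks.append((tau, omega, gamma))
    return peaks


if __name__ == "__main__":
    rng = random.Random(0)
    peaks = draw_peaks(rng, rate=5.0, t_min=0.0, t_max=100.0)
    grid = [float(t) for t in range(0, 101, 10)]
    print(len(peaks), [round(mean_abundance(t, peaks), 3) for t in grid])
```

Because the number of peaks changes between draws, any sampler over such a representation has to move between parameter spaces of different dimension, which is the role the trans-dimensional Markov chain Monte Carlo algorithm plays in the abstract.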