Functional data analysis using a Lévy random fields model for multi-spectra peak identification and classification
Seminar Room 1, Newton Institute
We developed a novel approach for assessing proteomic differences between subjects in two treatment groups. Given multiple high-dimensional proteomic profiles generated by matrix-assisted laser desorption/ionization time-of-flight mass spectrometers (MALDI-TOF MS), we used Bayesian nonparametric methods to reduce the data to only the biologically relevant information, on which we then based classification. We began by implementing a Lévy random fields model that extracted pertinent features from individual spectra, and then extended this single-spectrum model to incorporate data from multiple spectra. Specifically, we assert that a single marked gamma process, dependent on m/z and resolution, underlies every multi-modal spectrum within a population, and we expect random error, whether biological or measurement-related, to cause individual spectra to deviate from the process parameters. Under this assertion, a Bayesian hierarchical approach naturally models data quality control variables and peak parameters while yielding posterior predictions of experimental-group status.
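As a rough illustration of the kind of single-spectrum structure described above, the following sketch simulates a spectrum as a sum of peak-shaped kernels with gamma-distributed heights. It is not the model from the talk: the Poisson number of peaks, the Gaussian kernel, the uniform peak locations, and every name and default value (`simulate_spectrum`, `rate`, `shape`, `scale`, `resolution`, `noise_sd`) are assumptions chosen for illustration. The only standard fact used is the definition of resolving power, R = m / Δm with Δm the full width at half maximum, which makes peak width grow with m/z.

```python
import numpy as np

def simulate_spectrum(mz, rng, rate=8.0, shape=2.0, scale=1.0,
                      resolution=200.0, noise_sd=0.05):
    """Simulate one spectrum as a finite sum of gamma-weighted Gaussian peaks.

    Illustrative assumption only: a compound-Poisson stand-in for a marked
    gamma process, not the model presented in the talk.
    """
    # Number of peaks: Poisson draw (hypothetical prior on peak count).
    n_peaks = rng.poisson(rate)
    # Peak locations: uniform over the observed m/z range (assumption).
    locations = rng.uniform(mz.min(), mz.max(), size=n_peaks)
    # Peak heights: independent gamma draws, so every peak is positive.
    heights = rng.gamma(shape, scale, size=n_peaks)
    # Resolving power R = m / FWHM, so FWHM = m / R; convert FWHM to a
    # Gaussian standard deviation.
    fwhm = locations / resolution
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    # Kernel matrix: one column per peak, evaluated on the m/z grid.
    kernels = np.exp(-0.5 * ((mz[:, None] - locations[None, :]) / sigma[None, :]) ** 2)
    signal = kernels @ heights
    # Additive Gaussian measurement noise (assumption).
    observed = signal + rng.normal(0.0, noise_sd, size=mz.shape)
    return locations, heights, signal, observed

rng = np.random.default_rng(0)
mz = np.linspace(2000.0, 20000.0, 500)
locations, heights, signal, observed = simulate_spectrum(mz, rng)
```

In a multi-spectrum setting along the lines the abstract describes, one might share the peak locations and widths across all spectra in a group and let each spectrum's heights vary around group-level values; that extension is likewise only a sketch of the idea, not the authors' hierarchy.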