Identification of enriched regions in ChIP-seq and whole-genome sequencing data
Seminar Room 1, Newton Institute
An important statistical problem in the analysis of ChIP-seq data is the robust identification of both sharp peaks (e.g., for transcription factors) and broad regions (e.g., for many histone modifications) in the enrichment profile. I will describe a method based on model selection by maximum likelihood. The method is intuitive and fast, extends easily to multi-sample cases, and also applies to the detection of copy number variations from whole-genome sequencing data. I will illustrate its applications with examples from the Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA).
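
To give a flavour of likelihood-based model selection on binned read counts, here is a minimal sketch. It is an illustration only and not the method presented in the talk: it assumes Poisson-distributed counts per bin, compares a single-rate model against the best two-segment model, and uses a BIC-style penalty to decide between them. The function names (`poisson_loglik`, `best_single_split`) and the choice of penalty are assumptions made for this example.

```python
import math


def poisson_loglik(counts):
    """Maximized Poisson log-likelihood of `counts` under one shared rate."""
    n = len(counts)
    if n == 0:
        return 0.0
    rate = sum(counts) / n  # maximum-likelihood estimate of the rate
    loglik = 0.0
    for c in counts:
        if rate > 0:
            loglik += c * math.log(rate) - rate
        # If rate == 0, every count is 0 and contributes nothing here.
        loglik -= math.lgamma(c + 1)  # log(c!)
    return loglik


def best_single_split(counts):
    """Return the index of the best change point, or None if the
    single-rate model is preferred.

    Assumption for this sketch: the two-segment model is charged two
    extra parameters (one rate, one breakpoint) at log(n) each (BIC-style).
    """
    n = len(counts)
    if n < 2:
        return None
    null_loglik = poisson_loglik(counts)

    best_k, best_loglik = None, -math.inf
    for k in range(1, n):  # both segments are non-empty
        loglik = poisson_loglik(counts[:k]) + poisson_loglik(counts[k:])
        if loglik > best_loglik:
            best_k, best_loglik = k, loglik

    # Penalized likelihood-ratio gain of the two-segment model.
    gain = 2.0 * (best_loglik - null_loglik) - 2.0 * math.log(n)
    return best_k if gain > 0 else None


if __name__ == "__main__":
    background_then_enriched = [2, 1, 3, 2, 2, 1, 20, 22, 19, 25, 21, 18]
    print(best_single_split(background_then_enriched))
```

Applying such a split recursively to each resulting segment would yield regions of arbitrary width, which is one way a single criterion could cover both sharp peaks and broad domains; whether the talk's method works this way is not stated in the abstract.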