An Isaac Newton Institute Workshop

Recent Advances in Statistical Genetics and Bioinformatics

TECHNIQUES FOR THE DETECTION OF COPY NUMBER VARIATION USING SNP GENOTYPING ARRAYS

Author: Chris Barnes (Sanger Institute)

Abstract

A key goal of medical genetics is to find the genetic variation responsible for disease. A major focus is the use of single nucleotide polymorphism (SNP) genotyping arrays for genome-wide association studies. However, recent studies suggest that copy number variation (CNV) accounts for a significant fraction of the total variation in the human genome. Susceptibility to a number of diseases, including HIV infection, is already known to be associated with copy number variants, but the functional and phenotypic impact of CNVs is not yet fully understood. In order to search for CNVs using SNP genotyping platforms, we have developed a number of normalization schemes. These incorporate allele-specific corrections, quantile normalization, and corrections for the source, GC content and length of the PCR products. We have also developed methods for locating and categorizing CNVs using both existing algorithms, such as SWArray and CBS, and novel tools. These are implemented within a high-throughput framework, which is essential for processing the large datasets already available and those expected from future projects. Here we present a map of common CNVs based on studies of a large set of healthy individuals.
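
As an illustration of one of the normalization steps named above, the following is a minimal sketch of generic quantile normalization across samples. It is not the authors' implementation: the function name `quantile_normalize`, the probes-by-samples matrix layout and the toy intensity values are all assumptions made for the example, and the allele-specific, GC-content and fragment-length corrections mentioned in the abstract are not shown.

```python
import numpy as np


def quantile_normalize(intensities):
    """Quantile-normalize the columns of a probes x samples matrix.

    Hypothetical helper for illustration only: every sample (column) is
    forced onto the same empirical distribution, namely the mean of the
    sorted columns.
    """
    x = np.asarray(intensities, dtype=float)

    # Rank of each value within its own column (0-based). Ties are broken
    # by row order because the sort is stable.
    order = np.argsort(x, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")

    # Reference distribution: mean across samples of the sorted intensities.
    reference = np.sort(x, axis=0).mean(axis=1)

    # Replace each value by the reference value at its rank.
    return reference[ranks]


# Toy data (assumed): 4 probes measured in 3 samples.
raw = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
])
normalized = quantile_normalize(raw)
```

After this step every column of `normalized` contains the same set of values, so between-sample differences in overall intensity distribution are removed before copy number is inferred. Tied values within a sample are given distinct ranks here; averaging over ties is a common alternative.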