Genetic association studies in the presence of population structure and admixture
Seminar Room 1, Newton Institute
There has been considerable discussion among genetics researchers about the impact of population structure on association studies. When one samples from a population made up of subpopulations that differ in disease risk and allele frequencies, estimates of disease association with a candidate locus can be exaggerated or attenuated. For example, any allele that occurs more frequently in a subpopulation with higher disease prevalence can show a statistical association with the disease phenotype even if it is not linked to a causative locus. Several approaches have been proposed to address this issue. Among the most attractive is that of Pritchard and coauthors (2000, 2001), who proposed modeling the subpopulations, classifying individuals accordingly, and essentially pooling the resulting stratified inferences. The initial methods adopting this approach, and most current ones, proceed sequentially and/or parametrically through clustering, classification, and inference. We consider a unified semiparametric regression that models population structure appropriately and integrates it out when making association inferences, all within a cohesive Bayesian framework. While this approach is also feasible in other settings (e.g., case-control studies), here we focus on quantitative traits. The effectiveness of the proposed model and the related Markov chain Monte Carlo computations is demonstrated on simulated data.
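
The confounding mechanism described above can be illustrated with a small simulation. The sketch below is not the semiparametric Bayesian model of the talk. It is a plain least-squares toy example in which the subpopulation label is assumed to be known, whereas in the setting of the abstract it is latent and must be inferred. The allele frequencies, trait means, sample size, and function names are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_per_pop=2000, seed=0):
    """Simulate a quantitative trait in two subpopulations.

    Hypothetical setup: the marker has NO causal effect on the trait,
    but both its allele frequency and the trait mean differ by subpopulation.
    """
    rng = np.random.default_rng(seed)
    freqs = np.array([0.2, 0.7])   # assumed allele frequency per subpopulation
    means = np.array([0.0, 1.0])   # assumed trait mean per subpopulation
    pop = np.repeat([0, 1], n_per_pop)
    genotype = rng.binomial(2, freqs[pop])  # allele count: 0, 1, or 2
    trait = means[pop] + rng.normal(0.0, 1.0, size=pop.size)
    return pop, genotype, trait


def ols_slope(columns, y):
    """Least-squares coefficient of the first column, with an intercept."""
    X = np.column_stack([np.ones(len(y))] + columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


pop, genotype, trait = simulate()

# Pooled analysis that ignores structure: the null marker appears associated.
naive_slope = ols_slope([genotype], trait)

# Adjusting for the (here known) subpopulation label removes the artifact.
adjusted_slope = ols_slope([genotype, pop], trait)

print(f"naive slope:    {naive_slope:.3f}")
print(f"adjusted slope: {adjusted_slope:.3f}")
```

Under these assumed parameters the pooled slope is clearly positive although the marker is null, while the slope adjusted for subpopulation is close to zero. When subpopulation membership is unobserved, this simple adjustment is not available, which is the situation that approaches modeling the population structure are meant to handle.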