
The evolution of promoter sequence

Monday 31st March 2008 - 10:00 to 11:00
INI Seminar Room 1

We have produced an evolutionary model for promoters (and more generally for genomic regulatory sequence) analogous to the commonly used synonymous/nonsynonymous mutation models for protein-coding sequence. Although our model, called Sunflower, relies on some simple assumptions, it captures enough of the biology of transcription factor action to show clear correlation with other biological features. Sunflower predicts a binding profile of transcription factors to DNA sequence, in which different factors compete for the same potential binding sites. Sunflower can also model cooperative binding. We can control the apparent concentration of the factors by setting parameters uniformly or from gene expression data. The parameterized model estimates, for all factors simultaneously, a continuous measure of binding occupancy across the genomic sequence. We can then introduce either a localized mutation (such as a SNP) or a coordinated set of mutations (for example, from a haplotype or another species), rerun the binding model and record the difference between the two binding profiles as their relative entropy. Because potential binding sites overlap, a single mutation can alter interactions both upstream and downstream of its position, and our statistic captures this domino effect.
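
The comparison step can be illustrated with a toy calculation. The sketch below is not Sunflower's implementation. It assumes a binding profile is stored as one probability distribution per sequence position over the states (factor A, factor B, unbound), and it sums the per-position relative entropy between a profile before mutation and a profile after. The function name `relative_entropy` and the occupancy values are hypothetical, chosen only for illustration.

```python
import math


def relative_entropy(reference, mutated, eps=1e-12):
    """Sum over positions of KL(reference || mutated), in bits.

    Each profile is a list of rows, one row per sequence position.
    Each row is a probability distribution over binding states.
    """
    total = 0.0
    for p_row, q_row in zip(reference, mutated):
        for p, q in zip(p_row, q_row):
            if p > 0.0:
                # eps guards against division by zero (an assumption of this sketch)
                total += p * math.log2(p / max(q, eps))
    return total


# Hypothetical occupancy profiles over three positions.
# Columns: factor A, factor B, unbound. Each row sums to 1.
reference = [
    [0.70, 0.10, 0.20],
    [0.60, 0.20, 0.20],
    [0.10, 0.10, 0.80],
]
# After a mutation, factor B displaces factor A at the first two positions.
mutated = [
    [0.20, 0.50, 0.30],
    [0.20, 0.60, 0.20],
    [0.10, 0.10, 0.80],
]

score = relative_entropy(reference, mutated)
print(f"relative entropy: {score:.3f} bits")
```

Identical profiles score zero, and the score grows as the mutation shifts occupancy between factors, including at positions other than the mutated base.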

Results from Sunflower show many features in agreement with known biology. For example, the overall binding occupancy rises over transcription start sites, and CpG desert promoters show sharper localization signals relative to the transcription start site. More interesting are correlates with variation both between species and within them. Over evolutionary time, we observe a clear excess of low-scoring mutations fixed in promoters, consistent with most changes being neutral. However, this is not consistent across all promoters, and some promoters show more rapid divergence. This divergence often occurs in the presence of relatively constant protein-coding divergence. Interestingly, different classes of promoters show different sensitivity to mutations, with developmental and immunological genes having promoters inherently more sensitive to mutations than housekeeping genes.
