
Statistical challenges in using comparative genomics for the identification of functional sequences

Monday 31st March 2008 - 16:30 to 17:30
INI Seminar Room 1

Two main aspects of comparative sequence analysis rely on high-dimensional statistical approaches: identifying evolutionarily constrained regions and determining the significance of their overlap with functional sequences. The identification of constrained sequences relies largely on our understanding of evolutionary models and on applying those models to multi-sequence alignments. However, our understanding of evolutionary processes is incomplete, and our ability to generate “perfect” multi-sequence alignments is hampered by incomplete sequence datasets and general uncertainty in the alignment process; these factors can lead to multiple equally plausible alignments, only one of which is typically represented in downstream analyses. To mitigate some of these issues, we have been developing new comparative genomics approaches that take into account the biochemical and physical properties of DNA, so that we can understand which substitutions are more “tolerable” with respect to the three-dimensional structure of DNA, and thus more “neutral” in evolution. We also plan to start incorporating alignment uncertainty into our predictions of constrained sequences. Assessing our improved sequence constraint methods relies on a new statistical approach for determining the significance of their overlap with known functional annotations. This new method, devised by Peter Bickel and colleagues, was applied to analyses performed within the ENCODE consortium and provides the basis for newer methods that will be discussed later in this meeting.
