skip to content

Modelling the boundaries of highly conserved non-coding DNA sequences in vertebrates

Presented by: 
K Walter [MRC Biostatistics Unit, Cambridge]
Tuesday 12th December 2006 - 10:00 to 10:45
Center for Mathematical Sciences

A comparison of the human and fish genomes produced more than 1000 highly conserved non-coding elements (CNEs), sequences of DNA that show a remarkable degree of similarity between human and fish despite an evolutionary distance of about 900 million years.

The high sequence conservation suggests that these CNEs possess some kind of function, though neither their function nor which part of their sequence is functional have been well defined yet. Since each CNE was defined by a pairwise sequence alignment, its boundary might not be accurate enough to design biological experiments to help identify its role in the genome. In a first step an examination of the CNE's nucleotide composition revealed a striking A+T pattern at the CNE boundary in fish as well as human.

In a step further we propose a probabilistic model that takes into consideration not only nucleotide composition but also phylogenetic information, and that aims to define the functional part of CNEs by using multiple sequence alignments of human, mouse, chicken, frog and fish.

University of Cambridge Research Councils UK
    Clay Mathematics Institute London Mathematical Society NM Rothschild and Sons