Lecture Notes in Computer Science, 2002, Volume 2452/2002, 220-234, DOI: 10.1007/3-540-45784-4_17

Comparative Methods for Gene Structure Prediction in Homologous Sequences

Christian N.S. Pedersen and Tejs Scharling

Abstract

The increasing number of sequenced genomes motivates the use of evolutionary patterns to detect genes. We present a series of comparative methods for gene finding in homologous prokaryotic or eukaryotic sequences. Based on a model of legal genes and a similarity measure between genes, we find the pair of legal genes of maximum similarity. We develop methods based on gene models and alignment-based similarity measures of increasing complexity, which take into account many details of real gene structures, e.g. the similarity of the proteins encoded by the exons. When using a similarity measure based on an existing alignment, the methods run in linear time. When integrating the alignment and prediction process, which allows for more fine-grained similarity measures, the methods run in quadratic time. We evaluate the methods in a series of experiments on synthetic and real sequence data, which show that all methods are competitive but that taking the similarity of the encoded proteins into account really boosts the performance.
Partially supported by the Future and Emerging Technologies programme of the EU under contract number IST-1999-14186 (ALCOM-FT).
Bioinformatics Research Center (BiRC), www.birc.dk, funded by Aarhus University Research Foundation.
