As far as protein-coding genes are concerned, there is a non-zero probability that at least one of the five possible overlapping
sequences of any gene will contain an open reading frame (ORF) of a length that may be suitable for encoding a functional protein.
It is, however, very difficult to determine whether or not such an ORF is functional. Recently, we proposed a method that
predicts functionality of an overlapping ORF if it can be shown that it has been subject to purifying selection during its
evolution. Here, we use simulation to test this method under several conditions and compare it with the method of Firth and
Brown. We found that under most conditions, our method detects functional overlapping genes with higher sensitivity than Firth
and Brown’s method, while maintaining high specificity. Further, we tested the hypothesis that the two aminoacyl tRNA synthetase
classes have originated from a pair of overlapping genes. A central piece of evidence ostensibly supporting this hypothesis
is the assertion that an overlapping ORF of a heat-shock protein-70 gene, which exhibits some similarity to class 2 aminoacyl
tRNA synthetases, is functional. We found a signature of purifying selection only in highly divergent sequences, suggesting
that the method yields false positives at high sequence divergence and that the overlapping ORF is not a functional gene.
Finally, we examined three cases of overlap in the human genome. We found varying signatures of purifying selection acting
on these overlaps, raising the possibility that two of the overlapping genes may not be functional.
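
As an illustration of the "five possible overlapping sequences" mentioned at the start of the abstract, the following minimal sketch enumerates the five alternative reading frames of a coding sequence (two on the same strand, three on the reverse complement) and reports the longest stop-free stretch in each. It is not the selection-based method tested in the paper. The function names, the frame labels, and the choice to treat any stop-free run as a candidate ORF (without requiring a start codon) are assumptions made for this example, and the standard genetic code is assumed.

```python
# Stop codons of the standard genetic code (assumption: standard code applies).
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_stop_free_run(seq, offset):
    """Longest run of consecutive non-stop codons, reading seq from offset."""
    best = current = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0          # a stop codon ends the current candidate ORF
        else:
            current += 1
            best = max(best, current)
    return best


def alternative_frame_orfs(cds):
    """Map each of the five alternative frames to its longest stop-free run.

    The frame labels ("+1", "+2", "-0", "-1", "-2") are hypothetical names
    used only in this sketch; the annotated frame ("+0") is excluded.
    """
    cds = cds.upper()
    rc = reverse_complement(cds)
    frames = {
        "+1": (cds, 1),
        "+2": (cds, 2),
        "-0": (rc, 0),
        "-1": (rc, 1),
        "-2": (rc, 2),
    }
    return {name: longest_stop_free_run(s, off) for name, (s, off) in frames.items()}


if __name__ == "__main__":
    # Toy sequence for demonstration only; lengths are reported in codons.
    for frame, length in alternative_frame_orfs("ATGAAACCCGGGTTTTAA").items():
        print(frame, length)
```

A long stop-free stretch found this way only shows that an overlapping ORF exists; as the abstract stresses, deciding whether such an ORF is functional requires further evidence, such as a signature of purifying selection.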
Keywords: Overlapping genes · Purifying selection · Annotation
An abstract of this paper accompanied a poster presentation at the 13th Annual International Conference on Computational Molecular
Biology (RECOMB) in May 2009.