Cytoplasmic post-transcriptional modification of mRNA transcripts by polyadenylation, the addition of poly(A) tails, plays a
key role in their translational control. The timing and degree of polyadenylation have been shown to depend in part on a
consensus nucleotide sequence, the cytoplasmic polyadenylation element (CPE), which is recognized by the cytoplasmic
polyadenylation element binding protein (CPEB). An individual mRNA transcript controlled by CPEB may contain one or more CPE
sites upstream of a consensus hexamer poly(A) signal. A probabilistic model, CPEDetector, is presented for predicting whether
a gene’s translation is mediated by CPEB. CPEDetector takes into account detected CPE sites, poly(A) sites, and distance
metrics between the detected locations. This approach is tested against the 3’ untranslated regions (UTRs) of known genes
from the UTRdb database.
Keywords CPE - CPEB - bioinformatics - hidden Markov model - context free grammar - untranslated region
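
The three kinds of evidence named in the abstract (CPE sites, poly(A) sites, and the distances between them) can be
illustrated with a minimal motif scan. This is only a sketch under stated assumptions, not the CPEDetector model itself: it
does plain pattern matching rather than probabilistic scoring, and it assumes the commonly cited consensus motifs UUUUAU /
UUUUAAU for the CPE and AAUAAA for the hexamer. The function name scan_utr and the tuple layout are hypothetical.

```python
import re

# Assumed consensus motifs, written in the DNA alphabet (T instead of U).
# Lookaheads are used so that overlapping occurrences are all reported.
CPE_PATTERN = re.compile(r"(?=(TTTTA{1,2}T))")  # UUUUAU or UUUUAAU (assumption)
PAS_PATTERN = re.compile(r"(?=(AATAAA))")       # canonical hexamer (assumption)


def scan_utr(utr):
    """Return (cpe_start, pas_start, gap) for every CPE upstream of a hexamer.

    gap is the number of nucleotides between the end of the CPE and the
    start of the hexamer. Positions are 0-based offsets into the UTR.
    """
    seq = utr.upper().replace("U", "T")  # accept RNA or DNA input
    cpes = [(m.start(), len(m.group(1))) for m in CPE_PATTERN.finditer(seq)]
    pas_starts = [m.start() for m in PAS_PATTERN.finditer(seq)]

    features = []
    for pas in pas_starts:
        for cpe_start, cpe_len in cpes:
            cpe_end = cpe_start + cpe_len
            if cpe_end <= pas:  # keep only CPEs fully upstream of the hexamer
                features.append((cpe_start, pas, pas - cpe_end))
    return features


if __name__ == "__main__":
    print(scan_utr("GCTTTTATGGCCAATAAAGC"))
```

A probabilistic model such as the one described would presumably weight these positions and gaps rather than treat every
match as equally informative; the scan above only shows where such features come from.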