We describe a Bayesian inference method for the identification of protein coding regions (active or residual) in DNA or RNA
sequences. Its main feature is the computation of the conditional and a priori probabilities required in Bayes’s formula by factoring each event (possible annotation) for a nucleotide string into the
concatenation of shorter events, believed to be independent. The factoring allows us to obtain fast but reliable estimates
for these parameters from readily available databases, whereas estimating the probabilities of unfactored events would require
databases and tables of astronomical size. Promising results were obtained in tests with natural and artificial genomes.
Keywords: coding regions; ab initio DNA tagging; Bayesian inference
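
The factoring idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes only two candidate annotations ("coding" and "noncoding"), assumes the string is factored into consecutive codon-sized pieces treated as independent, and uses made-up probability tables (CODON_PROBS, PRIORS, DEFAULT_PROB are hypothetical names and values chosen for illustration). Under those assumptions, the likelihood of a string is the product of the per-piece probabilities, and Bayes's formula turns it into a posterior probability for each annotation.

```python
import math

# Hypothetical per-codon probabilities under each annotation.
# Illustrative numbers only; real tables would be estimated from databases.
CODON_PROBS = {
    "coding": {"ATG": 0.04, "GCC": 0.03, "TAA": 0.001},
    "noncoding": {"ATG": 0.015, "GCC": 0.015, "TAA": 0.02},
}
DEFAULT_PROB = 1.0 / 64  # assumed fallback for codons missing from the toy tables
PRIORS = {"coding": 0.5, "noncoding": 0.5}  # assumed a priori probabilities


def log_likelihood(seq, label):
    """Log-probability of seq under label, factored into independent codons."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    table = CODON_PROBS[label]
    return sum(math.log(table.get(codon, DEFAULT_PROB)) for codon in codons)


def posterior(seq):
    """Posterior probability of each annotation for seq, via Bayes's formula."""
    log_joint = {
        label: math.log(PRIORS[label]) + log_likelihood(seq, label)
        for label in PRIORS
    }
    # Subtract the maximum before exponentiating to avoid underflow on long strings.
    top = max(log_joint.values())
    weights = {label: math.exp(value - top) for label, value in log_joint.items()}
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


if __name__ == "__main__":
    print(posterior("ATGGCC"))
```

Because each factor is looked up in a small table, the cost grows linearly with the length of the string, and each table has at most 64 entries per annotation in this toy setting, whereas a table indexed by whole strings would grow exponentially with their length.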