A major computational problem in biology is the reconstruction of evolutionary (or “phylogenetic”) trees from biomolecular
sequences. Most polynomial-time phylogenetic reconstruction methods are distance-based and take as input an estimate of the evolutionary distance between every pair of biomolecular sequences in the dataset.
The estimation of evolutionary distances is standardized except when the set of biomolecular sequences is “saturated”, meaning that
it contains a pair of sequences that are no more similar than two random sequences. In this case, the standard statistical
techniques for estimating evolutionary distances cannot be used. In this study, we explore the performance of three important
distance-based phylogenetic reconstruction methods under the various techniques that have been proposed for estimating evolutionary
distances when the dataset is saturated.
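
To illustrate why saturation defeats a standard estimator, the sketch below uses the Jukes-Cantor correction for DNA sequences. This is an assumed example: the text above does not name the specific estimators under study, and the function name is hypothetical. Under this model the distance is d = -(3/4) ln(1 - 4p/3), where p is the fraction of differing sites. Two random sequences with uniform base frequencies are expected to differ at about 3/4 of their sites, and at p >= 3/4 the logarithm is undefined, so no finite distance can be returned.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned DNA sequences.

    Returns None when the pair is saturated, i.e. when the observed
    fraction of differing sites p is at least 3/4, since the correction
    -(3/4) * ln(1 - 4p/3) is undefined there.
    Illustrative sketch only; not the method of the study.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be aligned and non-empty")

    # Observed fraction of sites at which the two sequences differ.
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = mismatches / len(seq_a)

    # Saturated pair: no more similar than two random sequences.
    if p >= 0.75:
        return None

    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A caller receiving `None` for any pair would then have to fall back on one of the alternative techniques for saturated datasets before running a distance-based reconstruction method.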