In this paper, we apply new geometric and combinatorial methods to the study of phylogenetic mixtures. The geometric
approach describes phylogenetic mixture distributions for the two-state random cluster model, which is
a generalization of the two-state symmetric (CFN) model. In particular, we show that the set of mixture distributions forms
a convex polytope and we calculate its dimension; corollaries include a simple criterion for when a mixture of branch lengths
on the star tree can mimic the site pattern frequency vector of a resolved quartet tree. Furthermore, by computing volumes
of polytopes, we clarify how “common” non-identifiable mixtures are under the CFN model. We also present a new combinatorial
result that extends any identifiability result for a specific pair of trees of size six to arbitrary pairs of trees. Next,
we present a positive result showing identifiability of rates-across-sites models. Finally, we answer a question raised in
a previous paper concerning “mixed branch repulsion” on trees larger than quartet trees under the CFN model.
Keywords: Phylogenetics · Model identifiability · Mixture model · Polytope · Discrete Fourier analysis
F.A. Matsen’s and M. Steel’s research was supported by the Allan Wilson Centre for Molecular Ecology and Evolution.
E. Mossel’s research was supported by a Sloan fellowship in Mathematics, NSF awards DMS 0528488 and DMS 0548249 (CAREER) and
by ONR grant N0014-07-1-05-06.