A new tree using additional genomes is currently in progress.

Until then, here’s the most recent one (published alongside the description and genome of C. monodelphis (formerly C. sp. 1) in BMC Zoology):


Overview of phylogenomic analysis

We collected all predicted proteins for each species and kept only the longest isoform of each gene, so that each gene was represented by a single protein. We then clustered the proteins into orthologous groups using OrthoFinder and selected 303 clusters representing single-copy genes (allowing a proportion of species to have missing data). The proteins in each cluster were aligned, and the alignments were trimmed and concatenated to yield a supermatrix. This supermatrix was used as input to both RAxML and PhyloBayes, which yielded identical topologies. Bootstrap values below 100 are noted.
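
To make the cluster-selection and concatenation steps concrete, here is a minimal Python sketch. It is not the script used for the published tree. The function names, the input data structures, the 0.8 occupancy threshold, and the use of "-" to pad missing species are all assumptions made for illustration; the actual proportion of missing data allowed is not stated above.

```python
from collections import Counter


def select_single_copy(orthogroups, all_species, min_occupancy=0.8):
    """Keep clusters in which no species has more than one protein.

    orthogroups: dict mapping cluster name -> list of (species, protein_id).
    min_occupancy: assumed threshold; fraction of species that must be present.
    """
    selected = {}
    for name, members in orthogroups.items():
        counts = Counter(species for species, _ in members)
        # Reject any cluster containing a duplicated (multi-copy) species.
        if any(n > 1 for n in counts.values()):
            continue
        # Allow missing species, but only down to the occupancy threshold.
        if len(counts) / len(all_species) >= min_occupancy:
            selected[name] = dict(members)
    return selected


def concatenate(alignments, all_species):
    """Join per-cluster alignments into one supermatrix row per species.

    alignments: dict mapping cluster name -> {species: aligned sequence}.
    Species absent from a cluster are padded with gap characters (assumption).
    """
    rows = {species: [] for species in all_species}
    for name in sorted(alignments):
        aln = alignments[name]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{name}: sequences are not aligned to equal length")
        width = lengths.pop()
        for species in all_species:
            rows[species].append(aln.get(species, "-" * width))
    return {species: "".join(parts) for species, parts in rows.items()}
```

In this sketch, alignment and trimming would happen between the two functions, using external tools of your choice. The resulting supermatrix rows can then be written out in whatever format the tree-building program expects.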

Don’t see the species you’re interested in?

Karin Kiontke (New York University) has been collecting the sequences of rRNA genes and a handful of protein-coding genes for many more species than those with sequenced genomes.

Don’t agree with the topology?

Tell us! Simply having whole-genome data does not guarantee an accurate topology. For example, C. sp. 31 was believed to be a member of the Drosophilae supergroup, but our tree placed it as sister to the Elegans supergroup. Colleagues have since told us that placing C. sp. 31 within the Drosophilae supergroup fits better with their independent analyses of these genomes. This kind of information is invaluable as we try to pin down exactly how all of these species are related.