Grand Challenges in Phylogenomics

Date: Friday, October 16, 2015, 4:00pm

Location: Conference Room, Program for Evolutionary Dynamics, One Brattle Square, Suite 6

PED Seminar Series Presents

Grand Challenges in Phylogenomics

Dr. Tandy Warnow

Estimating the Tree of Life will likely involve a two-step procedure: first, trees are estimated on many genes, and then the gene trees are combined into a tree on all the taxa. However, the true gene trees may not agree with the species tree due to biological processes such as deep coalescence, gene duplication and loss, and horizontal gene transfer. Statistically consistent methods based on the multi-species coalescent model have been developed to estimate species trees in the presence of incomplete lineage sorting (ILS), but the relative accuracy of these methods compared to the usual "concatenation" approach is a matter of substantial debate within the research community.

I will present results showing that coalescent-based estimation methods are sensitive to gene tree estimation error and can therefore be less accurate than concatenation in many cases. I will also present two new methods, ASTRAL and statistical binning, that estimate species trees in the presence of gene tree conflict due to ILS more accurately than current methods. Key to both methods is addressing gene tree estimation error more effectively. Finally, I will present open problems in this area.

ASTRAL (Mirarab et al., Bioinformatics 2014) was used in the Thousand Plant Transcriptome Project (Wickett et al., PNAS 2014), and statistical binning (Mirarab et al., Science 2014) was used in the Avian Phylogenomics Project (Jarvis, Mirarab, et al., Science 2014).