
Summary methods are widely used to reconstruct species trees from gene trees while accommodating discordance from incomplete lineage sorting; however, it is increasingly recognized that their accuracy can be negatively impacted by incomplete and/or error-ridden gene trees. To address the latter, Zhang and Mirarab (2022) updated the popular summary method ASTRAL so that it weights quartets based on gene tree branch lengths and support values. Implementing these weighting schemes presented computational challenges, leading Zhang and Mirarab (2022) to replace ASTRAL's original algorithm (i.e., computing an exact solution within a constrained search space) with search heuristics based on phylogenetic placement. Here, we show that these weighting schemes can be effectively leveraged within the Quartet Max Cut framework of Snir and Rao (2010), introducing weighted TREE-QMC. Incorporating the weighting schemes into TREE-QMC required only a small increase in time complexity compared to the unweighted algorithm; the increase in runtime was also small, behaving more like a constant factor in our simulation study. Moreover, weighted TREE-QMC was fast and highly competitive with weighted ASTRAL, even outperforming it in terms of species tree accuracy under some challenging simulation conditions, such as large numbers of taxa. In reanalyzing two avian data sets, we found that weighting quartets by gene tree branch lengths can improve robustness to systematic homology errors and can be as effective as removing the impacted taxa from individual gene trees or removing the impacted gene trees entirely. Lastly, our study revealed that TREE-QMC was robust to extreme rates of missing taxa, suggesting its utility as a supertree method.
Funding provided by: State of Maryland (ROR ID: https://ror.org/04ja8je85)
missing data, species trees, summary methods, homology error, gene tree error, quartets
