
pmid: 28177088
Several well-documented evolutionary processes are known to cause conflict between species-level phylogenies and gene-level phylogenies. Three of the most challenging processes for species tree inference are incomplete lineage sorting, hybridization and gene duplication, which may result in unwarranted comparisons of paralogous genes. Several existing methods have dealt with these processes but none has yet been able to untangle all three at once. Here, we propose a stepwise method by which these processes can be discerned using information on genomic location coupled with coalescent simulations. In the first step, highly discordant genes within genomic blocks (putative paralogs) are identified and excluded from the data set and, in the second step, blocks of linked genes are grouped according to their hybrid history. Existing multispecies coalescent software can then be applied to recover the principal tree(s) that make up the species tree/network without violating the underlying model. The potential of the approach is evaluated on simulated data derived from a species network composed of nine species, of which one is of hybrid origin, and displaying a single-gene duplication that leads to paralogous comparisons. We apply our method to an empirical set of 12 genes from 7 species sampled in the plant genus Medicago that display phylogenetic discordance. We identify the causes of the discordance and demonstrate that the Medicago orbicularis lineage experienced an episode of ancient hybridization. Our results show promise as a new way to explore phylogenetic sequence data that can significantly improve species tree inference in presence of hybridization and undetected paralogy or other causes leading to extremely discordant gene trees. [Coalescent simulation; gene tree; genomic location; hybridization; incomplete lineage sorting; paralogy; phylogenetic incongruence; principal tree; species tree.].
Models, Genetic, Medicago, Hybridization, Genetic, Computer Simulation, Genome, Plant, Phylogeny, Software
Models, Genetic, Medicago, Hybridization, Genetic, Computer Simulation, Genome, Plant, Phylogeny, Software
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 13 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
