
Abstract The rooting of the SARS-CoV-2 phylogeny is important for understanding the origin and early spread of the virus. Previously published phylogenies have used different rootings that do not always provide consistent results. We investigate several different strategies for rooting the SARS-CoV-2 tree and provide measures of statistical uncertainty for all methods. We show that methods based on the molecular clock tend to place the root in the B clade, whereas methods based on outgroup rooting tend to place the root in the A clade. The results from the two approaches are statistically incompatible, possibly as a consequence of deviations from a molecular clock or excess back-mutations. We also show that none of the methods provide strong statistical support for the placement of the root in any particular edge of the tree. These results suggest that phylogenetic evidence alone is unlikely to identify the origin of the SARS-CoV-2 virus and we caution against strong inferences regarding the early spread of the virus based solely on such evidence.
570, Coronaviruses, Evolution, Mutation, Missense, Evolutionary biology, Genome, Viral, SARS-CoV-2 phylogeny, Evolution, Molecular, Genetic, Models, Genetics, outgroup rooting, Animals, Humans, Viral, Molecular Biology, Ecology, Evolution, Behavior and Systematics, Discoveries, Phylogeny, Evolutionary Biology, Likelihood Functions, Genome, Models, Statistical, Models, Genetic, SARS-CoV-2, Uncertainty, Molecular, COVID-19, Bayes Theorem, Biological Sciences, Statistical, molecular clock rooting, Markov Chains, Infectious Diseases, Emerging Infectious Diseases, Biochemistry and cell biology, Mutation, RNA, RNA, Viral, Biochemistry and Cell Biology, Missense, Monte Carlo Method, Algorithms
