
The recent reclassification of the Riboviria, and the introduction of multiple new taxonomic categories including both subfamilies and subgenera for coronaviruses (family Coronaviridae, subfamily Orthocoronavirinae), represent a major shift in how official classifications are used to designate specific viral lineages. While the newly defined subgenera provide much-needed standardization for commonly cited viruses of public health importance, no method has been proposed for assigning subgenus based on partial sequence data, or for sequences that are divergent from the designated holotype reference genomes. Here, we describe the genetic variation of a 387 nt region of the coronavirus RNA-dependent RNA polymerase (RdRp), which is one of the most commonly used partial sequence loci for both detection and classification of coronaviruses in molecular epidemiology. We infer Bayesian phylogenies from more than 7000 publicly available coronavirus sequences and examine clade groupings relative to all subgenus holotype sequences. Our phylogenetic analyses are largely coherent with whole-genome analyses based on designated holotype members for each subgenus. Pairwise distances between sequences form discrete clusters that separate taxa, offering logical threshold boundaries that can be used, both accurately and robustly, to attribute subgenus or to flag sequences that are likely to belong to unclassified subgenera. We thus propose that partial RdRp sequence data of coronaviruses are sufficient for the attribution of subgenus-level taxonomic classifications, and we supply the R package MyCoV, which provides a method for attributing subgenus and assessing the reliability of the attribution.
Gene Expression Regulation, Viral, RdRp, coronavirus, bats, bat, severe acute respiratory syndrome, RNA-dependent RNA polymerase, Middle East respiratory syndrome, Viral Proteins, MERS, Virology, Chiroptera, Animalia, Chordata, Biology, Phylogeny, SARS, [SDV.MP.VIR] Life Sciences [q-bio]/Microbiology and Parasitology/Virology, Recombination, Genetic, Base Sequence, Biodiversity, RNA-Dependent RNA Polymerase, taxonomy. Abbreviations: CoV, phylogenetics, Coronavirus, Mammalia, Human medicine, Engineering sciences. Technology, Research Article
