
Abstract Rapidly evolving RNA viruses continuously produce minority haplotypes that can become dominant if they are drug-resistant or can better evade the immune system. Therefore, early detection and identification of minority viral haplotypes may help to promptly adjust a patient’s treatment plan and prevent potential disease complications. Minority haplotypes can be identified using next-generation sequencing, but sequencing noise hinders accurate identification. Eliminating sequencing noise is a non-trivial task that remains an open problem. Here we propose CliqueSNV, a method based on extracting pairs of statistically linked mutations from noisy reads. This effectively reduces sequencing noise and enables the identification of minority haplotypes with frequencies below the sequencing error rate. We comparatively assess the performance of CliqueSNV using an in vitro mixture of nine haplotypes that were derived from the mutation profile of an existing HIV patient. We show that CliqueSNV can accurately assemble viral haplotypes with frequencies as low as 0.1% and maintains consistent performance across short- and long-read sequencing platforms.
SARS-CoV-2, COVID-19, Computational Biology, High-Throughput Nucleotide Sequencing, Reproducibility of Results, HIV Infections, Polymorphism, Single Nucleotide, Sensitivity and Specificity, RNA Virus Infections, Gene Frequency, Haplotypes, Mutation, HIV-1, Methods Online, Humans, RNA Viruses, Algorithms
