
pmid: 37601254
pmc: PMC10439718
handle: 2268/303637, 11480/16567, 20.500.12483/8195, 20.500.11850/629727
Recent developments in high-throughput sequencing (HTS) technologies and bioinformatics have drastically changed research in virology, especially for virus discovery. Indeed, proper monitoring of the viral population requires information on the different isolates circulating in the studied area. For this purpose, HTS has greatly facilitated the sequencing of new genomes of detected viruses and their comparison. However, the bioinformatics analyses that allow reconstruction of genome sequences and detection of single nucleotide polymorphisms (SNPs) can introduce bias, and this issue has not been widely addressed so far. Therefore, more knowledge is required on the limitations of predicting SNPs based on HTS-generated sequence samples. To address this issue, we compared the ability of 14 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 21 variants of pepino mosaic virus (PepMV) in three samples through large-scale performance testing (PT) using three artificially designed datasets. To evaluate the impact of the bioinformatics analyses, they were divided into three key steps: read pre-processing, virus-isolate identification, and variant calling. Each step was evaluated independently through an original PT design that included discussion and validation between participants at each step. Overall, this work underlines key parameters influencing SNP detection and proposes recommendations for reliable variant calling for plant viruses. Identification of the closest reference, mapping parameters, and manual validation of the detections were recognized as the analysis steps with the greatest impact on successful SNP detection. Strategies to improve the prediction of SNPs are also discussed.
570, QH301-705.5, Bioinformatics, Genetics & genetic processes, [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, virus, Genome, Viral, Polymorphism, Single Nucleotide, 630, Génétique & processus génétiques, Agriculture & agronomie, 616, [SDV.BV]Life Sciences [q-bio]/Vegetal Biology, Humans, [SDV.BV] Life Sciences [q-bio]/Vegetal Biology, Biology (General), Variant, [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM], Bioinformatic, Molecular Biology/Genomics [q-bio.GN], Bioinformatic; Genomic; Virus; Plant; Variant, bioinformatic, ta1183, R, High-Throughput Nucleotide Sequencing, Computational Biology, Plant, Agriculture & agronomy, Life sciences, Virus, Knowledge, variant, Sciences du vivant, Genomic, [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN], Medicine, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], SNPs
| Indicator | Value |
| --- | --- |
| selected citations | 5 |
| popularity | Top 10% |
| influence | Average |
| impulse | Top 10% |
