Benchmarking variant callers in next-generation and third-generation sequencing analysis

Publication: Article, 23 Jul 2020, English
Publisher: Oxford University Press (OUP)
Journal: Briefings in Bioinformatics, volume 22 (issn: 1467-5463, eissn: 1477-4054)

Authors: Surui Pei; Tao Liu; Xue Ren; Weizhong Li; Chongjian Chen; Zhi Xie

doi: 10.1093/bib/bbaa148

pmid: 32698196


Abstract

DNA variants represent an important source of genetic variation among individuals. Next-generation sequencing (NGS) is the most popular technology for genome-wide variant calling. Third-generation sequencing (TGS) has also recently been used in genetic studies. Although many variant callers are available, no single caller can call both types of variants (SNPs and InDels) on NGS or TGS data with high sensitivity and specificity. In this study, we systematically evaluated 11 variant callers on 12 NGS and TGS datasets. For germline variant calling on NGS data, we tested DNAseq and DNAscope modes from Sentieon, HaplotypeCaller mode from GATK and WGS mode from DeepVariant. All four callers had comparable performance on NGS data, and 30× coverage of WGS data was recommended. For germline variant calling on TGS data, we tested DNAseq mode from Sentieon, HaplotypeCaller mode from GATK and PACBIO mode from DeepVariant. All three callers had similar performance in SNP calling, while DeepVariant outperformed the others in InDel calling. TGS detected more variants than NGS, particularly in complex and repetitive regions. For somatic variant calling on NGS data, we tested TNscope and TNseq modes from Sentieon, Mutect2 mode from GATK, NeuSomatic, VarScan2 and Strelka2. TNscope and Mutect2 outperformed the other callers. Higher tumor sample purity (from 10 to 20%) significantly increased the recall of calling. Finally, the computational costs of the callers were compared, and Sentieon required the least computational cost. These results suggest that careful selection of a tool and parameters is needed for accurate SNP or InDel calling under different scenarios.
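The abstract compares callers by sensitivity (recall) and related measures, reported separately for SNPs and InDels. As a rough illustration of how such figures can be derived, the sketch below scores a call set against a truth set by exact matching on (chromosome, position, ref, alt). This is not the evaluation pipeline used in the study, which the abstract does not describe; the names `Variant`, `Metrics`, `benchmark` and `is_indel`, as well as the toy variants, are assumptions made for this example. Exact matching is also a simplification, since the same InDel can be written in more than one way and real benchmarks usually normalize variant representation first.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical minimal variant record: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float
    f1: float


def is_indel(variant: Variant) -> bool:
    """Treat a variant as an InDel when ref and alt alleles differ in length."""
    _, _, ref, alt = variant
    return len(ref) != len(alt)


def benchmark(called: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Score a call set against a truth set by exact variant matching."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(tp, fp, fn, precision, recall, f1)


# Toy data, invented for illustration only.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr1", 300, "AT", "A"),
    ("chr2", 50, "G", "GA"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr1", 300, "AT", "A"),
    ("chr2", 75, "T", "C"),
}

overall = benchmark(called, truth)

# Stratify by variant type, mirroring the SNP / InDel split in the abstract.
snp = benchmark({v for v in called if not is_indel(v)},
                {v for v in truth if not is_indel(v)})
indel = benchmark({v for v in called if is_indel(v)},
                  {v for v in truth if is_indel(v)})

for label, m in (("all", overall), ("SNP", snp), ("InDel", indel)):
    print(f"{label}: precision={m.precision:.2f} recall={m.recall:.2f} f1={m.f1:.2f}")
```

Splitting the sets before scoring keeps the SNP and InDel figures independent, which is why a caller can look comparable to others on SNPs while differing on InDels.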

Related Organizations

Zhongshan Ophthalmic Center, Sun Yat-sen University
China (People's Republic of)
Sun Yat-sen University
China (People's Republic of)
Annoroad Gene Technology (China)
China (People's Republic of)

Keywords

Benchmarking, INDEL Mutation, Genome, Human, Computational Biology, High-Throughput Nucleotide Sequencing, Humans, Female, Databases, Nucleic Acid, Cell Line

Impact by BIP!

- selected citations (derived from selected sources): 84
- popularity (current impact, based on the underlying citation network): Top 1%
- influence (overall impact, based on the underlying citation network): Top 10%
- impulse (initial momentum directly after publication): Top 1%


hybrid

Fields of Science

medical and health sciences

basic medicine
