A statistical method for the detection of variants from next-generation resequencing of DNA pools

Publication: Article, 01 Jun 2010, English
Publisher: Oxford University Press (OUP)
Journal: Bioinformatics, volume 32, page 3213 (ISSN: 1367-4803, eISSN: 1367-4811)

Authors: Vikas Bansal; Ondrej Libiger

doi: 10.1093/bioinformatics/btw520, 10.1093/bioinformatics/btq214

pmid: 20529923, 27578802

pmc: PMC2881398, PMC5048073

Abstract

Motivation: Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing.

Results: We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing], that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80–85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3–5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP.

Availability: Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/

Contact: vbansal@scripps.edu
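The two ideas named in the Results can be illustrated with a small sketch. This is not the CRISP implementation, and the abstract does not specify its exact tests. The sketch assumes a Pearson chi-square statistic on a 2 x k table of reference and alternate read counts for idea (i), and a binomial tail probability with a fixed per-base error rate for idea (ii). The function names, the example counts, and the error rate of 0.01 are all hypothetical.

```python
from math import comb


def binomial_tail(n_reads, n_alt, error_rate):
    """P(X >= n_alt) for X ~ Binomial(n_reads, error_rate).

    Illustrates idea (ii): how likely n_alt or more non-reference
    base calls are if sequencing errors alone produced them.
    """
    return sum(
        comb(n_reads, k) * error_rate**k * (1 - error_rate) ** (n_reads - k)
        for k in range(n_alt, n_reads + 1)
    )


def chi_square_statistic(ref_counts, alt_counts):
    """Pearson chi-square statistic for a 2 x k table of read counts.

    Illustrates idea (i): a large value suggests the alternate allele
    is distributed unevenly across the k pools.
    """
    total_ref = sum(ref_counts)
    total_alt = sum(alt_counts)
    total = total_ref + total_alt
    if total_ref == 0 or total_alt == 0:
        return 0.0  # only one allele observed: nothing to compare
    stat = 0.0
    for ref, alt in zip(ref_counts, alt_counts):
        pool_total = ref + alt
        if pool_total == 0:
            continue  # pool with no coverage contributes nothing
        for observed, allele_total in ((ref, total_ref), (alt, total_alt)):
            expected = pool_total * allele_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical read counts at one site across four pools.
ref_counts = [100, 98, 100, 90]
alt_counts = [0, 2, 0, 10]
ASSUMED_ERROR_RATE = 0.01  # assumption, not a value from the paper

print("chi-square statistic:", chi_square_statistic(ref_counts, alt_counts))
for ref, alt in zip(ref_counts, alt_counts):
    tail = binomial_tail(ref + alt, alt, ASSUMED_ERROR_RATE)
    print(f"pool with {alt} alt reads: P(errors alone) = {tail:.3g}")
```

In this toy setting, a pool with many alternate reads gives a small tail probability, while pools with few or none do not. The strand and pool-size filters mentioned in the abstract are not modelled here.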

Related Organizations

Scripps Health
United States

Keywords

Genome; Base Sequence; Data Interpretation, Statistical; Genetic Variation; Sequence Analysis, DNA; Corrigendum; ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA; Polymorphism, Single Nucleotide; Algorithms

Impact (by BIP!)

selected citations: 150
popularity: Top 1%
influence: Top 1%
impulse: Top 1%

Fields of Science

medical and health sciences

basic medicine