Publication: Article, Other literature type. 16 Jun 2022. Publisher: Cold Spring Harbor Laboratory. Journal: Bioinformatics, volume 39 (eissn: 1367-4811,

Authors: Xiyu Peng; Karin S Dorman;

doi: 10.1101/2022.06.12.495839 , 10.1093/bioinformatics/btad002

pmid: 36610988

pmc: PMC9891248

Accurate Estimation of Molecular Counts from Amplicon Sequence Data with Unique Molecular Identifiers


Abstract

Motivation: Amplicon sequencing is widely applied to explore heterogeneity and rare variants in genetic populations. Resolving true biological variants and quantifying their abundance is crucial for downstream analyses, but measured abundances are distorted by stochasticity and bias in amplification, plus errors during Polymerase Chain Reaction (PCR) and sequencing. One solution attaches Unique Molecular Identifiers (UMIs) to sample sequences before amplification, eliminating amplification bias by clustering reads on UMI and counting clusters to quantify abundance. While modern methods improve over naïve clustering by UMI identity, most do not account for UMI reuse, or collision, and they do not adequately model PCR and sequencing errors in the UMIs and sample sequences.

Results: We introduce Deduplication and accurate Abundance estimation with UMIs (DAUMI), a probabilistic framework to detect true biological sequences and accurately estimate their deduplicated abundance from amplicon sequence data. DAUMI recognizes UMI collision, even on highly similar sequences, and detects and corrects most PCR and sequencing errors in the UMI and sampled sequences. DAUMI performs better on simulated and real data compared to other UMI-aware clustering methods.

Availability: Source code is available at https://github.com/xiyupeng/AmpliCI-UMI.
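To make the baseline concrete, the following is a minimal sketch of the naïve approach the abstract contrasts against: reads are grouped by exact UMI identity, and each UMI group is counted as one molecule of its most frequent sequence. This is not DAUMI and is not taken from the AmpliCI-UMI code; the function name `naive_umi_counts` and the toy reads are hypothetical, chosen only to show how UMI collision and UMI errors can distort the counts.

```python
from collections import Counter, defaultdict


def naive_umi_counts(reads):
    """Estimate molecule counts by clustering reads on exact UMI identity.

    `reads` is an iterable of (umi, sequence) pairs. Each distinct UMI is
    counted as one molecule of the most frequent sequence observed with it.
    Illustrative baseline only; it models neither collisions nor errors.
    """
    sequences_by_umi = defaultdict(Counter)
    for umi, seq in reads:
        sequences_by_umi[umi][seq] += 1

    abundance = Counter()
    for seq_counts in sequences_by_umi.values():
        majority_seq, _ = seq_counts.most_common(1)[0]
        abundance[majority_seq] += 1  # one molecule per UMI cluster
    return dict(abundance)


# Hypothetical toy data: (UMI, sample sequence) pairs.
reads = [
    ("AAAC", "ACGT"), ("AAAC", "ACGT"), ("AAAC", "ACGT"),
    ("GGTA", "ACGT"),
    ("CCTG", "ACGA"), ("CCTG", "ACGA"),
    # Assumed collision: UMI TTAG was attached to two different molecules.
    ("TTAG", "ACGT"), ("TTAG", "ACGT"), ("TTAG", "ACGA"),
    # Assumed sequencing error in the UMI: AAAT is a misread of AAAC.
    ("AAAT", "ACGT"),
]

counts = naive_umi_counts(reads)
print(counts)
```

Under the assumptions written in the comments, the toy sample would contain three ACGT molecules and two ACGA molecules. The naïve count reports four and one: the erroneous UMI AAAT creates a spurious extra cluster, and the collision on TTAG hides one ACGA molecule. These are the two failure modes the abstract says a probabilistic model should address.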

Related Organizations

Iowa State University
United States
Memorial Sloan Kettering Cancer Center
United States

Keywords

Original Paper, High-Throughput Nucleotide Sequencing, Cluster Analysis, Sequence Analysis, DNA, Polymerase Chain Reaction, Software

Related research: AmpliCI software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 15 (citations derived from selected sources; an alternative to the "influence" indicator)
popularity: Top 10% (the "current" impact or attention of the article in the research community at large, based on the underlying citation network)
influence: Average (the overall impact of the article in the research community at large, based on the underlying citation network, diachronically)
impulse: Top 10% (the initial momentum of the article directly after its publication, based on the underlying citation network)