
Abstract Summary Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response. Availability and implementation NCRF is implemented in C, supported by several python scripts, and is available in bioconda and at https://github.com/makovalab-psu/NoiseCancellingRepeatFinder. Supplementary information Supplementary data are available at Bioinformatics online.
570, Genome, Genome, Human, High-Throughput Nucleotide Sequencing, DNA, Sequence Analysis, DNA, Applications Notes, 004, Nanopores, Tandem Repeat Sequences, Humans, Sequence Analysis, Software, Human
570, Genome, Genome, Human, High-Throughput Nucleotide Sequencing, DNA, Sequence Analysis, DNA, Applications Notes, 004, Nanopores, Tandem Repeat Sequences, Humans, Sequence Analysis, Software, Human
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 41 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
