Read alignment using deep neural networks

Shrestha, Akash, author; Chitsaz, Hamidreza, advisor; Ben-Hur, Asa, committee member; Abdo, Zaid, committee member

Other literature type . 2019

Data sources: Datacite

Publication type: Other literature type, Article

Date: 01 Jan 2019

Embargo end date: 14 Jun 2019

Country: United States

Language: English

Publisher: Colorado State University. Libraries

doi: 10.25675/3.018912

handle: 10217/195341

Abstract

Read alignment is the process of mapping short DNA sequences to a reference genome. With the advent of continually evolving "next generation" sequencing technologies, the need for sequence alignment tools arose. Scientific communities and the companies marketing the sequencing technologies have developed a whole spectrum of read aligners/mappers for different error profiles and read length characteristics. Among the most recent successfully marketed sequencing technologies are Oxford Nanopore and PacBio SMRT sequencing, which are considered top players because of their extremely long reads and low cost. However, the reads may contain up to 20% errors, which are generally not uniformly distributed. To deal with that error rate and read length, proximity-preserving hashing techniques, such as Minhash and Minimizers, have been used to quickly map a read to the target region of the reference sequence. A variant of global or local alignment dynamic programming is then used to produce the final alignment. In this research work, we train a Deep Neural Network (DNN) to yield a hashing scheme for highly erroneous long reads, which is deemed superior to Minhash for mapping the reads. We implemented that idea in a read alignment tool, DNNAligner. We evaluated the performance of our aligner against read aligners currently popular in the bioinformatics community: minimap2, bwa-mem and graphmap. Our results show that the performance of DNNAligner is comparable to that of the other tools, without any code optimization or integration of other advanced features. Moreover, the DNN exhibits superior performance in comparison with Minhash on neighborhood classification.
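
To make the Minhash baseline named in the abstract concrete, the following is a minimal Python sketch of k-mer Minhash similarity between a noisy read and candidate reference regions. It is an illustration only, not the DNNAligner code or the thesis's evaluation setup. The function names, the k-mer size (8), the number of hash functions (128), and the substitution-only error model in `mutate` are all assumptions chosen for this example; real long-read errors also include insertions and deletions.

```python
import hashlib
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    # Stable 64-bit hash; the seed (used as a salt) selects the hash function.
    digest = hashlib.blake2b(
        kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(seq, k=8, num_hashes=128):
    """Minhash signature: the minimum hash value under each seeded hash.

    Assumes len(seq) >= k so that at least one k-mer exists.
    """
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of matching signature positions estimates k-mer Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def mutate(seq, rate, rng):
    """Toy error model (assumption): substitute each base with probability `rate`."""
    bases = "ACGT"
    return "".join(
        rng.choice(bases.replace(b, "")) if rng.random() < rate else b
        for b in seq
    )


if __name__ == "__main__":
    rng = random.Random(0)
    region_a = "".join(rng.choice("ACGT") for _ in range(500))
    region_b = "".join(rng.choice("ACGT") for _ in range(500))
    read = mutate(region_a, 0.1, rng)  # noisy read drawn from region_a

    read_sketch = minhash_sketch(read)
    print("read vs region_a:", estimate_jaccard(read_sketch, minhash_sketch(region_a)))
    print("read vs region_b:", estimate_jaccard(read_sketch, minhash_sketch(region_b)))
```

A mapper built on this idea would assign the read to the region with the highest estimated similarity before running dynamic-programming alignment. As the error rate rises, fewer k-mers survive intact and the estimate for the true region shrinks, which is the regime the abstract's learned hashing scheme targets.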

Country

United States

Related Organizations

Colorado State University Pueblo
United States
University of Colorado Anschutz Medical Campus
United States
University of Colorado Health
United States
Colorado School of Mines
United States
University of Colorado Colorado Springs
United States

Keywords

pattern discovery, neural network, sequence alignment, Minhash, DNA

Impact by BIP!

- selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Green

Related to Research communities

UArctic