Correcting Illumina data

Publication: Article, 01 Sep 2014, English

Publisher: Oxford University Press (OUP)

Journal: Briefings in Bioinformatics, volume 16, pages 588-599 (ISSN: 1467-5463, eISSN: 1477-4054)

Authors: Michael Molnar; Lucian Ilie

doi: 10.1093/bib/bbu029

pmid: 25183248

Abstract

Next-generation sequencing technologies revolutionized the ways in which genetic information is obtained and have opened the door for many essential applications in biomedical sciences. Hundreds of gigabytes of data are being produced, and all applications are affected by the errors in the data. Many programs have been designed to correct these errors, most of them targeting the data produced by the dominant technology of Illumina. We present a thorough comparison of these programs. Both HiSeq and MiSeq types of Illumina data are analyzed, and correcting performance is evaluated as the gain in depth and breadth of coverage, as given by correct reads and k-mers. Time and memory requirements, scalability and parallelism are considered as well. Practical guidelines are provided for the effective use of these tools. We also evaluate the efficiency of the current state-of-the-art programs for correcting Illumina data and provide research directions for further improvement.

Related Organizations

Western University
Canada

Keywords

Data Interpretation, Statistical; Sequence Analysis, DNA

Impact by BIP!

- Selected citations: 28. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.