publication . Article . Other literature type . Conference object . 2018

Strand-seq enables reliable separation of long reads by chromosome via expectation maximization

Tobias Marschall; Jan O. Korbel; Ashley D. Sanders; Evan E. Eichler; David Porubský; Maryam Ghareghani; Sascha Meiers
Open Access English
  • Published: 19 Mar 2018
  • Publisher: Oxford University Press
Abstract
Motivation: Current sequencing technologies are able to produce reads orders of magnitude longer than ever possible before. Such long reads have sparked a new interest in de novo genome assembly, which removes reference biases inherent to re-sequencing approaches and allows for a direct characterization of complex genomic variants. However, even with latest algorithmic advances, assembling a mammalian genome from long error-prone reads incurs a significant computational burden and does not preclude occasional misassemblies. Both problems could potentially be mitigated if assembly could commence for each chromosome separately. Results: To address this, we ...
Subjects
free text keywords: Cancer Research, ISMB 2018–Intelligent Systems for Molecular Biology Proceedings, Comparative and Functional Genomics, de novo, genome, assembly, long read, sequencing, Strand-seq, Statistics and Probability, Computational Theory and Mathematics, Biochemistry, Molecular Biology, Computational Mathematics, Computer Science Applications, Posterior probability, Latent variable model, Expectation–maximization algorithm, Computational biology, Coding strand, Computer science, In silico, Mammalian genome, Chromosome, Sequence assembly
Funded by
NIH| An Integrative Analysis of Structural Variation for the 1000 Genomes Project
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 3U41HG007497-04S1
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE