publication . Other literature type . Preprint . 2019

A fully phased accurate assembly of an individual human genome

Porubsky, David; Ebert, Peter; Audano, Peter A.; Vollger, Mitchell R.; Harvey, William T.; Munson, Katherine M.; Sorensen, Melanie; Sulovari, Arvis; Haukness, Marina; Ghareghani, Maryam; ...
Open Access
  • Published: 26 Nov 2019
  • Publisher: bioRxiv
Abstract
The prevailing genome assembly paradigm is to produce consensus sequences that “collapse” parental haplotypes into a consensus sequence. Here, we leverage the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing (Strand-seq) [1,2] and combine them with high-fidelity (HiFi) long sequencing reads [3], in a novel reference-free workflow for diploid de novo genome assembly. Employing this strategy, we produce completely phased de novo genome assemblies separately for each haplotype of a single individual of Puerto Rican origin (HG00733) ...
Funded by
NIH| University of Washington PhD Training in Big Data for Genomics and Neuroscience
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5T32LM012419-04
  • Funding stream: NATIONAL LIBRARY OF MEDICINE
NIH| Sequence and Assembly of Segmental Duplications
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01HG002385-15
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
NIH| Sequence-resolved structural variation of human genomes
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1R01HG010169-01
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
NIH| Interdisciplinary Training in Genome Sciences
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5T32HG000035-23
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
31 references, page 1 of 3

1. Falconer, E. et al. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat. Methods 9, 1107-1112 (2012).

2. Sanders, A. D., Falconer, E., Hills, M., Spierings, D. C. J. & Lansdorp, P. M. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs. Nat. Protoc. 12, 1151-1176 (2017).

3. Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155-1162 (2019).

4. Levy, S. et al. The diploid genome sequence of an individual human. PLoS Biol. 5, e254 (2007).

5. Schneider, V. A. et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 27, 849-864 (2017).

6. Kronenberg, Z. N. et al. High-resolution comparative analysis of great ape genomes. Science 360, (2018).

7. Miga, K. H. et al. Telomere-to-telomere assembly of a complete human X chromosome. bioRxiv 735928 (2019). doi:10.1101/735928

8. Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050-1054 (2016).

9. Weisenfeld, N. I., Kumar, V., Shah, P., Church, D. M. & Jaffe, D. B. Direct determination of diploid genome sequences. Genome Res. 27, 757-767 (2017).

10. Koren, S. et al. De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol. (2018). doi:10.1038/nbt.4277

11. Kronenberg, Z. N. et al. Extended haplotype phasing of de novo genome assemblies with FALCON-Phase. bioRxiv 327064 (2019). doi:10.1101/327064

12. Garg, S. et al. Efficient chromosome-scale haplotype-resolved assembly of human genomes. bioRxiv 810341 (2019). doi:10.1101/810341

13. Hills, M., O'Neill, K., Falconer, E., Brinkman, R. & Lansdorp, P. M. BAIT: Organizing genomes and mapping rearrangements in single cells. Genome Med. 5, 82 (2013).

14. O'Neill, K. et al. Assembling draft genomes using contiBAIT. Bioinformatics 33, 2737-2739 (2017).

15. Ghareghani, M. et al. Strand-seq enables reliable separation of long reads by chromosome via expectation maximization. Bioinformatics 34, i115-i123 (2018).
