publication . Article . Preprint . Other literature type . 2013

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

Matthias Haimel; Jared T. Simpson; Martin Hunt; Cristian Del Fabbro; Sergey Koren; Octávio S. Paulo; Jacques Corbeil; Pavel Fedotov; Sergey Melnikov; Jun Wang; ...
Open Access English
  • Published: 22 Jul 2013
  • Publisher: HAL CCSD
Abstract
Comment: Additional files are available at http://korflab.ucdavis.edu/Datasets/Assemblathon/Assemblathon2/Additional_files/. Major changes: 1. Accessions for the 3 read data sets have now been included. 2. New file: a spreadsheet containing details of all Study, Sample, Run, and Experiment identifiers. 3. Miscellaneous changes made to address reviewers' comments. DOIs added to GigaDB datasets.
Subjects
free text keywords: [INFO.INFO-AR] Computer Science [cs]/Hardware Architecture [cs.AR], COMPASS, Assessment, N50, Research, Genome assembly, Scaffolds, Quantitative Biology - Genomics, Heterozygosity, GENOMES, De novo Assembly, Computational biology, Fosmid, Vertebrate, biology.animal, biology, Data sequences, Gene, Genome, Scalability, Whole genome sequencing, Computer science, Sequence assembly
Funded by
WT| Wellcome Trust Sanger Institute - generic account for deposition of all core-funded research papers
Project
  • Funder: Wellcome Trust (WT)
  • Project Code: 098051
  • Funding stream: Cellular and Molecular Neuroscience
NIH| Using genomics to understand physiologic water conservation in desert rodents
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5F32DK093227-02
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES
NSF| XSEDE: eXtreme Science and Engineering Discovery Environment
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1053575
  • Funding stream: Directorate for Computer & Information Science & Engineering | Division of Advanced Cyberinfrastructure
FCT| PTDC/EIA-EIA/121686/2010
Project
PTDC/EIA-EIA/121686/2010
ADE - Adverse Drug Effects Detection
  • Funder: Fundação para a Ciência e a Tecnologia, I.P. (FCT)
  • Project Code: 121686
  • Funding stream: COMPETE
NIH| Massively Parallel Contiguity Mapping
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01HG006283-05
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE