
HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment.

Rafael Medina; Norman Wickett; Matthew G Johnson; Nyree Zerega
Open Access English
  • Published: 01 Jul 2016 Journal: Applications in Plant Sciences, volume 4, issue 7 (eISSN: 2168-0450)
  • Publisher: Botanical Society of America
Abstract
Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extract...
Subjects
free text keywords: Software Note, bioinformatics, Hyb-Seq, phylogenomics, sequence assembly
Funded by
NSF| Collaborative Research: AToL: Assembling the Pleurocarp Tree of Life: Resolving the rapid radiation using genomics and transcriptomics
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1239992
  • Funding stream: Directorate for Biological Sciences | Division of Environmental Biology
NSF| REVSYS: Phylogeny and Revision of Artocarpus (Moraceae) with a Focus on Understanding the Origins and Diversity of Cultivated Members of the Genus
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 0919119
  • Funding stream: Directorate for Biological Sciences | Division of Environmental Biology
NSF| Collaborative Research: AToL: Assembling the Pleurocarp Tree of Life: Resolving the rapid radiation using genomics and transcriptomics
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1240045
  • Funding stream: Directorate for Biological Sciences | Division of Environmental Biology
NSF| Collaborative Research: AToL: Assembling the Pleurocarp Tree of Life: Resolving the rapid radiation using genomics and transcriptomics
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1239980
  • Funding stream: Directorate for Biological Sciences | Division of Environmental Biology
NSF| Rapid radiation and sporophyte evolution in the Funariaceae: inferences from phylogenomics and cross generational cuticle development studies
Project
  • Funder: National Science Foundation (NSF)
  • Project Code: 1146295
  • Funding stream: Directorate for Biological Sciences | Division of Environmental Biology