
Data from: A chromosomal-scale genome assembly of Tectona grandis reveals the importance of tandem gene duplication and enables discovery of genes in natural product biosynthetic pathways

Authors: Zhao, Dongyan; Hamilton, John P.; Bhat, Wajid Waheed; Johnson, Sean R.; Godden, Grant T.; Kinser, Taliesin J.; Boachon, Benoît; +5 Authors


Abstract

Background: Teak, a member of the Lamiaceae family, produces one of the most expensive hardwoods in the world. High demand coupled with deforestation has caused a decrease in natural teak forests, and future supplies will be reliant on teak plantations. Hence, selection of teak tree varieties for clonal propagation with superior growth performance is of great importance, and access to high-quality genetic and genomic resources can accelerate the selection process by identifying genes underlying desired traits.

Findings: To facilitate teak research and variety improvement, we generated a highly contiguous, chromosomal-scale genome assembly using high-coverage PacBio long reads coupled with high-throughput chromatin conformation capture. Of the 18 teak chromosomes, we generated 17 near-complete pseudomolecules with one chromosome present as two chromosome arm scaffolds. Genome annotation yielded 31,168 genes encoding 46,826 gene models, of which 39,930 and 41,155 had Pfam domain and expression evidence, respectively. We identified 14 clusters of tandem-duplicated terpene synthases (TPSs), genes central to the biosynthesis of terpenes, which are involved in plant defense and pollinator attraction. Transcriptome analysis revealed 10 TPSs highly expressed in woody tissues, of which 8 were in tandem, revealing the importance of resolving tandemly duplicated genes and the quality of the assembly and annotation. We also validated the enzymatic activity of four TPSs to demonstrate the function of key TPSs.

Conclusions: In summary, this high-quality chromosomal-scale assembly and functional annotation of the teak genome will facilitate the discovery of candidate genes related to traits critical for sustainable production of teak and for anti-insecticidal natural products.

  • teak_tectona_grandis_26Jun2018_7GlFM_fmt_tp.fa: fasta sequences of the assembly
  • teak.working_models_HiC.cdna_con_sorted.fa: cDNA sequences of all isoforms of the working gene set
  • teak.working_models_HiC.cds_con_sorted.fa: CDS of all isoforms of the working gene set
  • teak.working_models_HiC.pep_con_sorted.fa: Peptide sequence of all isoforms of the working gene set
  • teak.working_models_HiC_fmtDes_con_sorted.gff: GFF of all isoforms of the working gene set
  • teak_hc_models_HiC_con_sorted_modiGeneID.gff: GFF of all high-confidence gene models
  • teak_hc_models_HiC.cdna_con_sorted_modiGeneID.fa: cDNA sequences of all high-confidence gene models
  • teak_hc_models_HiC.cds_con_sorted_modiGeneID.fa: CDS of all high-confidence gene models
  • teak_hc_models_HiC.pep_con_sorted_modiGeneID.fa: Peptide sequences of all high-confidence gene models
  • teak_repr_hc_models_HiC_con_sorted_modiGeneID.gff: GFF of representative high-confidence gene models
  • teak_repr_hc_models_HiC.cdna_con_sorted_modiGeneID.fa: cDNA sequences of representative high-confidence gene models
  • teak_repr_hc_models_HiC.cds_con_sorted_modiGeneID.fa: CDS of representative high-confidence gene models
  • teak_repr_hc_models_HiC.pep_con_sorted_modiGeneID.fa: Peptide sequences of representative high-confidence gene models
  • teak_working_gene_fpkm_matrix_con_sorted.txt: Expression abundances of the working gene set were estimated using cufflinks RNAseq experiment atlas from NCBI SRA BioProject PRJNA287604
  • 2018.10.23-teak-data-readme.docx: Readme
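The FASTA files listed above can be inspected without any bioinformatics library. The following is a minimal sketch, not the authors' pipeline: it assumes the files follow the usual FASTA layout (a `>` header line followed by one or more sequence lines), and the function names `read_fasta` and `sequence_lengths` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines.

    Assumption: each record starts with a '>' header and the record ID is
    the text up to the first whitespace in that header.
    """
    record_id: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line)  # sequence may span several lines
    if record_id is not None:
        yield record_id, "".join(chunks)


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each record ID to the length of its sequence."""
    return {record_id: len(seq) for record_id, seq in read_fasta(lines)}


# Toy records (made up for illustration, not taken from the teak data).
toy_fasta = [">geneA.1 toy record", "MKT", "LLV", "", ">geneB.1", "MA"]
toy_lengths = sequence_lengths(toy_fasta)
```

To apply the same helper to one of the downloaded files, pass an open file handle, for example `with open("teak_repr_hc_models_HiC.pep_con_sorted_modiGeneID.fa") as handle: lengths = sequence_lengths(handle)`. Reading line by line keeps memory use modest for the gene model files; the whole-genome assembly file would still hold each pseudomolecule in memory while it is being joined.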

Keywords

chromosomal-scale assembly, terpene synthases, teak, tandem-duplicated genes

  • Selected citations (BIP!): 0
  • Popularity (BIP!): Average
  • Influence (BIP!): Average
  • Impulse (BIP!): Average
  • Views (OpenAIRE UsageCounts): 28
  • Downloads (OpenAIRE UsageCounts): 6