ZENODO
Dataset . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite

DiazoTIME Database: a metabolically-resolved reference database of nitrogen-fixing microbial genomes

Authors: Damashek, Julian; Sheik, Cody; Petro, Caitlin; Furbo Reeder, Christian; Chowdhury, Subhadeep; Kramer, Benjamin; DeVilbiss, Stephen; +8 Authors

Abstract

Database assembly, curation, and taxonomic annotation of genomes

To create a reference database for classifying environmental sequences, we constructed a database of putatively diazotrophic genomes from the Genome Taxonomy Database (r214; Parks et al. 2022). Species-representative genomes containing any of nifH, nifD, or nifK were identified with AnnoTree (Mendler et al. 2019). Because many genomes with nifH (or homologous genes) contain no other nif genes (Mise et al. 2021), genomes lacking any of the three nifHDK genes were assumed not to be “true” diazotrophs and were discarded. This left 2798 genomes (3.3% of GTDB representative genomes) with the full suite of nifHDK genes, which were assumed to be capable of N2-fixation and which make up the “DiazoTIME” database.

To assess the metabolic capabilities of these diazotrophs, we used METABOLIC v4 (Zhou et al. 2022) to annotate the metabolic genes of each diazotroph genome. METABOLIC identifies key functional pathways by aggregating results from genome searches using Hidden Markov Models (HMMs) from KOFam (Kanehisa et al. 2023), TIGR (Li et al. 2021), and select custom models. These gene annotations were used to assign genomes to broad metabolic categories based on energy production and carbon sources.

References

Kanehisa M, Furumichi M, Sato Y, Kawashima M, Ishiguro-Watanabe M. 2023. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Research 51:D587–D592.

Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire MK, Durkin AS, Gonzales NR, Gwadz M, Lanczycki CJ, Song JS, Thanki N, Wang J, Yamashita RA, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. 2021. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation. Nucleic Acids Research 49:D1020–D1028.

Mendler K, Chen H, Parks DH, Lobb B, Hug LA, Doxey AC. 2019. AnnoTree: visualization and exploration of a functionally annotated microbial tree of life. Nucleic Acids Research 47:4442–4448. doi: 10.1093/nar/gkz246.

Mise K, Masuda Y, Senoo K, Itoh H. 2021. Undervalued pseudo-nifH sequences in public databases distort metagenomic insights into biological nitrogen fixers. mSphere 6:e00785-21.

Parks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil P-A, Hugenholtz P. 2022. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Research 50:D785–D794.

Zhou Z, Tran PQ, Breister AM, Liu Y, Kieft K, Cowley ES, Karaoz U, Anantharaman K. 2022. METABOLIC: high-throughput profiling of microbial genomes for functional traits, metabolism, biogeochemistry, and community-scale functional networks. Microbiome 10:33.
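The genome-selection rule described in the abstract (keep a genome only if nifH, nifD, and nifK are all present) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `select_diazotrophs`, the (genome, gene) pair input format, and the toy accessions are all assumptions made for illustration, and real AnnoTree output would first need to be converted into such pairs.

```python
from collections import defaultdict

# The three nitrogenase genes required by the rule in the abstract.
REQUIRED_GENES = frozenset({"nifH", "nifD", "nifK"})


def select_diazotrophs(hits):
    """Return the sorted accessions of genomes that carry all required genes.

    `hits` is an iterable of (genome_accession, gene_name) pairs; this
    input layout is a hypothetical simplification, not an AnnoTree format.
    """
    genes_by_genome = defaultdict(set)
    for genome, gene in hits:
        genes_by_genome[genome].add(gene)
    # Keep a genome only when the required set is a subset of its genes.
    return sorted(
        genome
        for genome, genes in genes_by_genome.items()
        if REQUIRED_GENES <= genes
    )


# Toy data with made-up accessions (not real GTDB genomes).
hits = [
    ("genome_A", "nifH"), ("genome_A", "nifD"), ("genome_A", "nifK"),
    ("genome_B", "nifH"),                        # nifH only: discarded
    ("genome_C", "nifD"), ("genome_C", "nifK"),  # no nifH: discarded
]
kept = select_diazotrophs(hits)
print(kept)  # ['genome_A']
```

Using a set-subset test keeps the rule explicit and makes it easy to change the required gene list in one place.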

| Description | File Name |
| --- | --- |
| Genome metadata (accession number, taxonomy, metabolic prediction) | DiazoTIME_GTDBr214_taxonomy_and_METABOLIC.xlsx |
| List of genomes from GTDB r214 with all 3 nif genes (nifH, nifD, nifK) | GTDB_r214_AnnoTree_genome_Nifs_N2fixation_potential.xlsx |
| METABOLIC program output | METABOLIC_raw_outputs.xlsx |
| nifH, nifD, nifK nucleotide sequences | gtdb_r214_nifHDK_with_tax.fna.zip |
| nifH, nifD, nifK amino acid sequences | gtdb_r214_nifHDK_with_tax.faa.zip |
| Full genomes nucleotide sequences | gtdb_diazotroph_genome_full_fnas.tar.gz |
| Dictionary linking NCBI and GTDB accessions | combined_gtdb_r214_genome_contigs_dict.txt |
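For readers who want to load the zipped sequence files, the following is a minimal sketch using only the Python standard library. The helper names `read_fasta` and `read_fasta_from_zip` are hypothetical, and the sketch assumes the archives contain plain-text FASTA files. The header layout of the real files is not documented here, so headers are returned unparsed.

```python
import io
import zipfile


def read_fasta(handle):
    """Yield (header, sequence) tuples from a text-mode FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fasta_from_zip(zip_source):
    """Collect FASTA records from every file inside a zip archive.

    `zip_source` may be a path or a binary file-like object.
    """
    records = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8")
                records.extend(read_fasta(text))
    return records


# Demonstration on an in-memory archive with made-up sequences.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("toy.faa", ">seq1 example\nMAMR\nQCAI\n>seq2\nMTDE\n")
buffer.seek(0)
records = read_fasta_from_zip(buffer)
print(records)  # [('seq1 example', 'MAMRQCAI'), ('seq2', 'MTDE')]
```

With a downloaded copy of the dataset, the same function could be called as `read_fasta_from_zip("gtdb_r214_nifHDK_with_tax.faa.zip")`, assuming the file sits in the working directory.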

The Diazotroph Taxonomic Identity and MEtabolism (DiazoTIME) database contains annotated taxonomy and metabolic predictions for the 2798 genomes in the Genome Taxonomy Database (GTDB; r214; Parks et al. 2022) that contain nifH, nifD, and nifK. It serves as a reference for studies of diazotroph biodiversity, environmental distribution, and functional potential.

Keywords

microorganism, diazotroph, biological nitrogen fixation, genome

Related to Research communities
Italian National Biodiversity Future Center