ZENODO
Dataset . 2022
License: CC BY
Data sources: Datacite
Data sources: ZENODO

AuCoMe: inferring and comparing metabolisms across heterogeneous sets of annotated genomes

Authors: Belcour, Arnaud; Got, Jeanne; Aite, Méziane; Delage, Ludovic; Collen, Jonas; Frioux, Clémence; Leblanc, Catherine; +4 Authors

Abstract

CONTENT OF THIS ARCHIVE

The Zenodo archive is composed of one file and two main directories:

* analyses: This directory contains all tabulated files used to create the figures and results of the paper.
* datasets: This directory gathers all datasets on which AuCoMe was run: the bacterial, fungal, and algal datasets, and the 32 synthetic datasets, which contain an E. coli K–12 MG1655 genome to which various degradations were applied, together with 28 other bacterial genomes.
* metacyc 23.5.padmet: This is version 23.5 of the MetaCyc database (https://metacyc.org/) in the PADMET format. It was used by AuCoMe to reconstruct all the metabolic networks. Hence metacyc 23.5.padmet is required to reproduce our work.

1/ Content of the analyses subdirectory

* figure_2_bacterial_nb_reactions.tsv: For each species of the bacterial dataset, this file gives the number of reactions at each AuCoMe step. It was used to plot Fig. 2B of the paper.
* figure_2_fungal_nb_reactions.tsv: For each species of the fungal dataset, this file gives the number of reactions at each AuCoMe step. It was used to plot Fig. 2C of the paper.
* figure_2_algal_nb_reactions.tsv: For each species of the algal dataset, this file gives the number of reactions at each AuCoMe step. It was used to plot Fig. 2D of the paper.
* figure_3_nb_reactions_step.tsv: For each run on the 32 synthetic bacterial datasets, this file gives the number of reactions at each AuCoMe step. It was used to plot Fig. 3A of the paper.
* figure_3_fmeasure_steps.tsv: For each run on the 32 synthetic bacterial datasets, this file gives the F-measures obtained by comparing the GSMN recovered for each E. coli K–12 MG1655 genome replicate with the gold-standard network. It was used to plot Fig. 3B of the paper.
* figure_S1_Deepec_fungal.tsv: For each species of the fungal dataset, several measures were computed at each AuCoMe step (robust orthology, non-robust orthology, and annotation or orthology): the number of reactions, the number of ECs, the number of ECs validated by DeepEC, and the ratio of the number of ECs validated by DeepEC to the number of ECs. It was used to design Fig. S3(a) of the paper.
* figure_S1_Deepec_algal.tsv: For each species of the algal dataset, several measures were computed at each AuCoMe step (robust orthology, non-robust orthology, and annotation or orthology): the number of reactions, the number of ECs, the number of ECs validated by DeepEC, and the ratio of the number of ECs validated by DeepEC to the number of ECs. It was used to design Fig. S3(a) of the paper.
* SuplFile_o-Aminophenol_reactions.ods: This file contains three tables (S9, S10, and S11) with additional detail, such as the amino acid sequences in S11.

2/ Content of the datasets subdirectory

2.1/ Content of the algal, bacterial, and fungal directories

Each of these three directories contains 8 subdirectories:

* FASTA: It contains the proteome of each species as a FASTA file.
* cleaned_GBKs: For each species, it contains the annotated genome, with the protein sequences, as a GenBank file.
* dictionaries: For some species, genes needed to be renamed for compatibility reasons. In this case, a CSV file mapping the old gene names to the new ones is provided.
* annotated_DATs: It contains one subdirectory per species with all the output files from Pathway Tools v23.5, without any post-processing, in the DAT format.
* annotated_PADMETs: For each species, it contains the metabolic network produced by the draft reconstruction step of AuCoMe, in the PADMET format.
* final_SBMLs: For each species, it contains the metabolic network generated by the AuCoMe workflow, in the SBML format.
* final_PADMETs: For each species, it contains the metabolic network generated by the AuCoMe workflow, in the PADMET format.
* panmetabolism: It is composed of 7 files describing the final metabolic networks:
  – genes.tsv: This table contains, for each organism, the list of genes and the associated reactions.
  – metabolites.tsv: This table contains the list of metabolites present in the panmetabolism. For each metabolite and each organism, it lists the reactions that produce this compound and the reactions that consume it.
  – pathways.tsv: This table contains the list of pathways present in the panmetabolism. For each pathway and each organism, it indicates the number of reactions present in this pathway and the names of these reactions.
  – reactions.tsv: This table contains the list of reactions present in the panmetabolism. For each reaction, it indicates whether or not the reaction is present in each organism. If a reaction is found in a species, the genes associated with the reaction are also listed.
  – pvclust_reaction_dendrogram.png: This dendrogram is built from the presence/absence matrix of reactions across the species of the dataset: Jaccard distances are computed between species, and hierarchical clustering with complete linkage is applied to them. The R package pvclust is used to create the dendrogram, with multiscale bootstrap resampling. For each node, a p-value indicates how strongly the cluster is supported by the data. The dendrogram is provided as a PNG image.

2.2/ Content of the synthetic bacterial directory

The synthetic bacterial directory contains 32 subdirectories named Run 00, Run 01, ..., Run 31. Each subdirectory is composed of 9 files:

* K_12_MG1655.gbk: The annotated genome of E. coli K–12 MG1655 to which degradation of the functional and/or structural annotations was applied.
* annotated_K_12_MG1655.sbml: The metabolic network of E. coli K–12 MG1655 produced by the draft reconstruction step of AuCoMe, in the SBML format.
* annotated_K_12_MG1655.padmet: The metabolic network of E. coli K–12 MG1655 produced by the draft reconstruction step of AuCoMe, in the PADMET format.
* orthology_K_12_MG1655.sbml: The metabolic network of E. coli K–12 MG1655 produced by the orthology propagation step of AuCoMe, in the SBML format.
* orthology_K_12_MG1655.padmet: The metabolic network of E. coli K–12 MG1655 produced by the orthology propagation step of AuCoMe, in the PADMET format.
* structural_K_12_MG1655.sbml: The metabolic network of E. coli K–12 MG1655 produced by the structural verification step of AuCoMe, in the SBML format.
* structural_K_12_MG1655.padmet: The metabolic network of E. coli K–12 MG1655 produced by the structural verification step of AuCoMe, in the PADMET format.
* final_K_12_MG1655.sbml: The metabolic network of E. coli K–12 MG1655 produced by the AuCoMe workflow, in the SBML format.
* final_K_12_MG1655.padmet: The metabolic network of E. coli K–12 MG1655 produced by the AuCoMe workflow, in the PADMET format.
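
Two of the measures mentioned in this archive description compare sets of reactions: the Jaccard distances behind pvclust_reaction_dendrogram.png and the F-measures in figure_3_fmeasure_steps.tsv. The following is a minimal Python sketch of both, not the code used by AuCoMe. It assumes that each network is reduced to a set of reaction identifiers and that the F-measure is the usual harmonic mean of precision and recall (the description does not spell out the exact definition). The species names and reaction identifiers are hypothetical placeholders, not data from the archive.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def f_measure(predicted: set, gold: set) -> float:
    """Return the F1 score of `predicted` against `gold` (assumed definition)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reaction sets; the identifiers are placeholders, not archive data.
reactions_by_species = {
    "species_A": {"RXN-1", "RXN-2", "RXN-3"},
    "species_B": {"RXN-2", "RXN-3", "RXN-4"},
    "species_C": {"RXN-5"},
}

# Pairwise Jaccard distances, i.e. the input a hierarchical clustering would use.
distances = {
    (s1, s2): jaccard_distance(reactions_by_species[s1], reactions_by_species[s2])
    for s1, s2 in combinations(sorted(reactions_by_species), 2)
}

for pair, distance in distances.items():
    print(pair, round(distance, 3))
```

In practice the sets would be read from a presence/absence table such as reactions.tsv, whose exact column layout is not given here, so the parsing step is left out of the sketch.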

Keywords

metabolic evolution, genomics, systems biology, genomes, metabolism
