CONTENT OF THIS ARCHIVE

The Zenodo archive is composed of one file and four main directories:
* analyses: gathers three subdirectories: algae, bacteria, and fungi. It includes all files used to create the figures, supplemental figures, and results of the paper.
* code: contains all AuCoMe and PADMet code.
  – aucome_v0.5.1: the code of AuCoMe used to run the three datasets.
  – padmet_v5.0.1: the code of PADMet used to run AuCoMe.
* datasets: gathers all datasets on which AuCoMe was run: the bacterial, fungal, and algal datasets, and the 32 synthetic datasets, which contain an E. coli K-12 MG1655 genome to which various degradations were applied, together with 28 other bacterial genomes. It also includes version 23.5 of the MetaCyc database.
* scripts_analyses: contains several scripts to generate the figures and supplemental figures, and a script to degrade the E. coli K-12 MG1655 genome.

1/ Content of the analyses directory
It is composed of three subdirectories: algae, bacteria, and fungi.

1.1/ Content of the algae subdirectory
It encompasses 9 files.
* Figure_2_algal_nb_reactions.tsv: for each species of the algal dataset, this file gives the number of reactions at each AuCoMe step. It was used to create figure 2D.
* Figure_S10_Deepec_algal.tsv: for each species of the algal dataset and at each AuCoMe step (robust orthology, non-robust orthology, and annotation or orthology), this file gives several measures: the number of reactions, the number of ECs, the number of ECs validated by DeepEC, and the ratio of the number of ECs validated by DeepEC to the number of ECs. It was used to design figure S10(b).
* Table_S6_50_random_reactions_found.xlsx: manual validation of 50 randomly chosen reactions found in any of the species (Supplemental Table S6).
* Table_S7_50_random_reactions_absent.xlsx: manual validation of 50 randomly chosen reactions absent from a species (Supplemental Table S7).
* Table_S8_reactions_common_only_Cokamuranus_Sjaponica.xlsx: reactions common to Saccharina japonica and Cladosiphon okamuranus but not found in other brown algae (Supplemental Table S8).
* Table_S9_homologues_Esiliculosus_Sjaponica.xlsx: additional homologs in E. siliculosus found by BLASTP searches for sequences inferred to be present only in C. okamuranus and S. japonica (Supplemental Table S9).
* Table_S10_o-aminophenol_Esiliculosus_holomogues.xlsx: additional o-aminophenol oxidases from E. siliculosus and their homologs in other stramenopiles. It is Supplemental Table S10 with more detail (such as the amino acid sequences).
* Table_S11_reactions_cryptophytes_haptophytes_stramenopiles_archeplastida.xlsx: reactions distinguishing the cryptophyte, haptophyte, stramenopile, and archaeplastida groups (Supplemental Table S11).
* Table_S12_pathways_cryptophytes_haptophytes_stramenopiles_archeplastida.xlsx: the pathways shared between, and the pathways absent from, cryptophytes, haptophytes, stramenopiles, and archaeplastida (Supplemental Table S12).

1.2/ Content of the bacteria subdirectory
It gathers 12 files and 9 directories.
* aucome_final.tsv: output file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacterial metabolic networks produced with AuCoMe, this table contains the number of ECs, the number of unique ECs, the total number of reactions, the number of enzymatic reactions with genes, the number of enzymatic reactions without genes, and the number of spontaneous reactions.
* carveme_stat.tsv: output file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacterial metabolic networks produced with CarveMe, this table contains the number of ECs, the number of unique ECs, the total number of reactions, the number of enzymatic reactions with genes, the number of enzymatic reactions without genes, and the number of spontaneous reactions.
* ecocyc.padmet: the EcoCyc database version 23.5 in PADMet format. It is used to generate Supplemental Fig. S5.
* Figure_2_bacterial_nb_reactions.tsv: for each species of the bacterial dataset, this file gives the number of reactions at each AuCoMe step. It was used to create figure 2B.
* Figure_3_nb_reactions_step.tsv: for each of the 32 synthetic bacterial datasets, this file gives the number of reactions at each AuCoMe step. It was used to create figure 3A.
* Figure_3_fmeasure_steps.tsv: for each of the 32 synthetic bacterial datasets, this file gives the F-measures resulting from the comparison of the GSMNs recovered for each E. coli K-12 MG1655 genome replicate with the gold-standard network EcoCyc. It was used to create figure 3B.
* Figure_S4_output: contains 3 output files of the figure_S4_comparison_bacteria.py script:
  – Figure_S4_boxplot_networks.svg: Supplemental Figure S4 in high resolution.
  – Figure_S4_boxplot_networks.tsv: the number of reactions, the type of reactions (All, Reactions with genes, ...), and the software used. These data were produced and used by the figure_S4_comparison_bacteria.py script.
  – Figure_S4_barplot_time_networks.svg: for each software, shows the time (in seconds) required to reconstruct these bacterial metabolic networks.
* Figure_S5_output: contains 3 output files of the figure_S5_reference_catalog.py script:
  – Figure_S5_ec_union.svg: Supplemental Figure S5 in high resolution.
  – Figure_S5_ec_union_venn.svg: another visualisation of the results of Supplemental Fig. S5.
  – Figure_S5_refence_ec_catalog_K12MG1655.tsv: an EC catalog for E. coli K-12 MG1655 built from the BIGG, EcoCyc, KEGG, and ModelSEED databases. This file is used to produce Supplemental Figure S5.
* Figure_S6_output: contains 2 output files of the figure_S6.py script:
  – Figure_S6_comparison_all.svg: Supplemental Figure S6 in high resolution.
  – Figure_S6_comparison_all.tsv: the data used to produce Supplemental Figure S6.
* gapseq_stat.tsv: output file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacterial metabolic networks produced with gapseq, this table contains the number of ECs, the number of unique ECs, the total number of reactions, the number of enzymatic reactions with genes, the number of enzymatic reactions without genes, and the number of spontaneous reactions.
* jsons_bigg_todate: the five metabolic networks of E. coli K-12 MG1655 that can be found in BIGG, in JSON format. These files correspond to the BIGG reference metabolic network in Supplemental Figure S5.
* jsons_modelseed_todate: the metabolic network of E. coli K-12 MG1655 that can be found in ModelSEED, in JSON format. It is the ModelSEED reference metabolic network in Supplemental Figure S5.
* kegg_ecs.txt: input file of the figure_S5_reference_catalog.py script. It contains the correspondence between EC numbers and all the E. coli K-12 MG1655 entries in the KEGG database.
* mapping_modelseed_ec.tsv: input file of the figure_S4_comparison_bacteria.py script. It contains the correspondence between ModelSEED reactions and EC numbers.
* modelseed_stat.tsv: output file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacterial metabolic networks produced with ModelSEED, this table contains the number of ECs, the number of unique ECs, the total number of reactions, the number of enzymatic reactions with genes, the number of enzymatic reactions without genes, and the number of spontaneous reactions.
* networks_aucome: for each of the 29 bacteria, contains a metabolic network in PADMet format obtained with AuCoMe.
* networks_carveme: for each of the 29 bacteria, contains a metabolic network in SBML format obtained with CarveMe.
* networks_gapseq: is composed of 29 subdirectories (one for each bacterium). Each subdirectory contains 10 files about the metabolic network obtained with gapseq:
  – species-all-Pathways.tbl: data on pathways in TBL format.
  – species-all-Reactions.tbl: data on reactions in TBL format.
  – species-draft.RDS: a draft metabolic network in RDS (R Data Format).
  – species-draft.xml: a draft metabolic network in SBML format.
  – species-medium.csv: all the metabolites allowed in the default medium.
  – species.RDS: the final metabolic network in RDS (R Data Format).
  – species-rxnWeights.RDS: a temporary file needed by gapseq fill, in RDS (R Data Format).
  – species-rxnXgenes.RDS: a temporary file needed by gapseq fill, in RDS (R Data Format).
  – species-Transporter.tbl: data on transporters in TBL format.
  – species.xml: the final metabolic network in SBML format.
* networks_modelseed: includes two subdirectories:
  – sbml: for each of the 29 bacteria, contains a metabolic network in SBML format obtained with ModelSEED.
  – tsv: for each of the 29 bacteria, contains two TSV files:
    – genomeset__species.gbk_genome.fbamodel-compounds.tsv: data on compounds in TSV format.
    – genomeset__species.gbk_genome.fbamodel-reactions.tsv: data on reactions in TSV format.
* time_carveme.txt: input file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacteria, it stores the running time of CarveMe (in seconds) to reconstruct a metabolic network.
* time_gapseq.txt: input file of the figure_S4_comparison_bacteria.py script. For each of the 29 bacteria, it stores the running time of gapseq (in seconds) to reconstruct a metabolic network.
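
The F-measures listed in Figure_3_fmeasure_steps.tsv compare a reconstructed network with the EcoCyc gold standard. As a minimal sketch of such a comparison, assuming the F-measure is the usual harmonic mean of precision and recall computed over sets of reaction identifiers (the scripts in this archive may compute it differently), the calculation could look like this in Python. The function name f_measure and the reaction identifiers are hypothetical and serve only as illustration.

```python
def f_measure(predicted, reference):
    """Return the F-measure (F1) between two collections of reaction identifiers.

    Assumption: precision and recall are computed on sets of identifiers;
    this is an illustration, not the code used in the archive.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # No shared reaction: precision and recall are both zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reaction identifiers, for illustration only.
gold = {"RXN-A", "RXN-B", "RXN-C", "RXN-D"}
recovered = {"RXN-A", "RXN-B", "RXN-E"}

score = f_measure(recovered, gold)
print(f"F-measure: {score:.3f}")
```

With these toy sets, two of the three recovered reactions belong to the gold standard (precision 2/3) and two of the four gold-standard reactions are recovered (recall 1/2).
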

1.3/ Content of the fungi directory
It contains three files and five directories.
* All-pathways-of-S.-cerevisiae-S288c.txt: all the YeastCyc pathways.
* Figure_2_fungal_nb_reactions.tsv: for each species of the fungal dataset, this file gives the number of reactions at each AuCoMe step. It was used to create figure 2C.
* Figure_S7_output: contains 11 output files of the figure_S7_comparison_pathway_fungi.py script:
  – completion_pathway_species.svg: for each of the 5 fungi (L. bicolor, N. crassa, R. oryzae, S. cerevisiae S288C, and S. pombe), a subfigure of Supplemental Fig. S7.
  – fungi_stats.tsv: Supplemental Table S5.
  – pathway_venn_species.png: for each of the 5 fungi (L. bicolor, N. crassa, R. oryzae, S. cerevisiae S288C, and S. pombe), a Venn diagram of all the pathways found with the 3 software tools (AuCoMe, gapseq, and ModelSEED).
* Figures_S8_S9_output: contains 11 files. All of them compare the pathways of the S. cerevisiae S288C metabolic networks obtained with AuCoMe and gapseq to those of YeastCyc.
  – comparison_yeastcyc.png: a picture showing the number of true-positive, false-positive, and false-negative pathways found by each method (AuCoMe and gapseq).
  – completion_pathway_gapseq.svg: the number of pathways common or specific to YeastCyc and gapseq, with their completeness ratio predicted by gapseq.
  – Figure_S8_completion_pathway_aucome.svg: the number of pathways common or specific to YeastCyc and AuCoMe, with their completeness ratio predicted by AuCoMe. It is Supplemental Figure S8.
  – Figure_S9_venn_diagram_70_100.svg: Supplemental Figure S9. All pathways of AuCoMe, gapseq, and YeastCyc with a completion rate between 70% and 100% are compared.
  – venn_diagram.svg: all pathways are compared.
  – venn_diagram_50.svg: all pathways of AuCoMe, gapseq, and YeastCyc with a completion rate below 50% are compared.
  – venn_diagram_50_gapseq.svg: all pathways of gapseq, whatever their completion rate, are compared to the AuCoMe and YeastCyc pathways with a completion rate below 50%.
  – venn_diagram_50_70.svg: all pathways of AuCoMe, gapseq, and YeastCyc with a completion rate between 50% and 70% are compared.
  – venn_diagram_50_70_gapseq.svg: all pathways of gapseq, whatever their completion rate, are compared to the AuCoMe and YeastCyc pathways with a completion rate between 50% and 70%.
  – venn_diagram_70_100_gapseq.svg: all pathways of gapseq, whatever their completion rate, are compared to the AuCoMe and YeastCyc pathways with a completion rate between 70% and 100%.
  – yeast_cyc_comparison.tsv: the number of true-positive, false-positive, and false-negative pathways found by each method (AuCoMe and gapseq).
* Figure_S10_Deepec_fungal.tsv: for each species of the fungal dataset and at each AuCoMe step (robust orthology, non-robust orthology, and annotation or orthology), this file gives several measures: the number of reactions, the number of ECs, the number of ECs validated by DeepEC, and the ratio of the number of ECs validated by DeepEC to the number of ECs. It was used to design figure S10(a).
* networks_aucome: for each of the 5 fungi (L. bicolor, N. crassa, R. oryzae, S. cerevisiae S288C, and S. pombe), contains a metabolic network in PADMet format obtained with AuCoMe.
* networks_gapseq: is composed of 5 subdirectories (one for each fungus). Each subdirectory contains two files about the metabolic network obtained with gapseq:
  – species-all-Pathways.tbl: data on pathways in TBL format.
  – species-all-Reactions.tbl: data on reactions in TBL format.
* networks_modelseed: for each of the 5 fungi (L. bicolor, N. crassa, R. oryzae, S. cerevisiae S288C, and S. pombe), contains two TSV files:
  – species.gbk_genome.draftModel-compounds.tsv: data on compounds in TSV format.
  – species.gbk_genome.draftModel-reactions.tsv: data on reactions in TSV format.

2/ Content of the code directory
It gathers two directories: aucome_v0.5.1 and padmet_v5.0.1.

2.1/ Content of the aucome_v0.5.1 subdirectory
This directory contains a copy of the AuCoMe project from GitHub: https://github.com/AuReMe/aucome (downloaded on 15/11/2022). It is composed of two subdirectories and five files:
* LICENCE: licence of the AuCoMe software.
* README.rst: README of the AuCoMe software.
* requirements.txt: the list of required Python packages.
* setup.cfg: metadata about the AuCoMe package; it is used with setup.py to distribute AuCoMe.
* setup.py: various information relevant to the AuCoMe package, including options and metadata. It is used to distribute AuCoMe with PyPI and to create an entry point when installing it with pip.
* recipes: this subdirectory contains two files:
  – Dockerfile: instructions to run AuCoMe in a Docker environment.
  – Singularity: instructions to run AuCoMe in a Singularity container.
* aucome: this directory contains 11 Python files:
  – __init__.py: marks the directory as a Python package.
  – __main__.py: the functions implementing the command-line interface of AuCoMe.
  – analysis.py: the functions to analyse the AuCoMe results.
  – check.py: the functions to check the input files.
  – compare.py: the functions to compare the AuCoMe results between two distinct subgroups.
  – orthology.py: the functions to propagate reactions through orthology.
  – reconstruction.py: the functions to reconstruct draft GSMNs with Pathway Tools, run in parallel.
  – spontaneous.py: the functions to add spontaneous reactions to GSMNs when they complete a MetaCyc metabolic pathway.
  – structural.py: the functions to check that no reactions are m
metabolic evolution, genomics, systems biology, genomes, metabolism