
Dataset contents

This dataset includes files with the loci subsets, such as core loci subsets for cgMLST analysis, and the correspondence between the legacy loci IDs used by the discontinued Chewie-NS instance and the loci IDs used by the latest Chewie-NS instance. Each ZIP archive includes files for the schemas of a specific species. The contents of each ZIP archive are the following:

species1_Spyogenes.zip -- contains the files for the schemas of the species with ID=1 (Streptococcus pyogenes).
  species1_Spyogenes_schema1 -- contains the files for the schema with ID=1.
    species1_Spyogenes_schema1_loci_IDs_mapping.tsv -- contains the loci ID correspondence between the loci IDs used by the current instance of Chewie-NS and the original loci IDs.
    species1_Spyogenes_schema1_cgMLST95_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for the core loci defined based on a loci presence threshold of 95%.
    species1_Spyogenes_schema1_cgMLST95_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci, defined based on a loci presence threshold of 95%, used by the current instance of Chewie-NS and the original loci IDs.
    species1_Spyogenes_schema1_cgMLST99_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for the core loci defined based on a loci presence threshold of 99%.
    species1_Spyogenes_schema1_cgMLST99_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci, defined based on a loci presence threshold of 99%, used by the current instance of Chewie-NS and the original loci IDs.
    species1_Spyogenes_schema1_cgMLST100_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for the core loci defined based on a loci presence threshold of 100%.
    species1_Spyogenes_schema1_cgMLST100_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci, defined based on a loci presence threshold of 100%, used by the current instance of Chewie-NS and the original loci IDs.
    species1_Spyogenes_schema1_Transcriptional_Regulators_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for a set of transcriptional regulators.
    species1_Spyogenes_schema1_Transcriptional_Regulators_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the transcriptional regulators used by the current instance of Chewie-NS and the original loci IDs.
    species1_Spyogenes_schema1_Virulence_Factors_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for a set of virulence factors.
    species1_Spyogenes_schema1_Virulence_Factors_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the virulence factors used by the current instance of Chewie-NS and the original loci IDs.

species10_Ecoli.zip -- contains the files for the schemas of the species with ID=10 (Escherichia coli).
  species10_Ecoli_schema1 -- contains the files for the schema with ID=1 (more information about the schema creation process and the definition of the loci subsets is available here).
    species10_Ecoli_schema1_loci_IDs_mapping.tsv -- contains the loci ID correspondence between the loci IDs used by the current instance of Chewie-NS, the original loci IDs, and the loci IDs used in the first instance of Chewie-NS (discontinued in July 2025).
    species10_Ecoli_schema1_cgMLST99_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for the core loci defined based on a loci presence threshold of 99%.
    species10_Ecoli_schema1_cgMLST99_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci, defined based on a loci presence threshold of 99% as described here, used by the current instance of Chewie-NS, the original loci IDs, and the loci IDs used in the first instance of Chewie-NS (discontinued in July 2025).

species14_Senterica.zip -- contains the files for the schemas of the species with ID=14 (Salmonella enterica).
  species14_Senterica_schema1 -- contains the files for the schema with ID=1 (more information about the schema creation process and the definition of the loci subsets is available here).
    species14_Senterica_schema1_loci_IDs_mapping.tsv -- contains the loci ID correspondence between the loci IDs used by the current instance of Chewie-NS, the original loci IDs, and the loci IDs used in the first instance of Chewie-NS (discontinued in July 2025).
    species14_Senterica_schema1_cgMLST99_loci_IDs.txt -- contains the list of loci IDs used by the current instance of Chewie-NS for the core loci defined based on a loci presence threshold of 99%.
    species14_Senterica_schema1_cgMLST99_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci, defined based on a loci presence threshold of 99% as described here, used by the current instance of Chewie-NS, the original loci IDs, and the loci IDs used in the first instance of Chewie-NS (discontinued in July 2025).

species18_Lmonocytogenes.zip -- contains the files for the schemas of the species with ID=18 (Listeria monocytogenes).
  species18_Lmonocytogenes_schema1 -- contains the files for the schema with ID=1 (corresponding to the Institut Pasteur Listeria monocytogenes cgMLST schema described in Moura et al., 2016, available at https://bigsdb.pasteur.fr/listeria/).
    species18_Lmonocytogenes_schema1_loci_IDs_mapping.tsv -- contains the correspondence between the loci IDs for the core loci used by the current instance of Chewie-NS, the original loci IDs, and the loci IDs used in the first instance of Chewie-NS (discontinued in July 2025).

Converting legacy loci IDs to the latest loci IDs

Results files generated with schemas downloaded from the discontinued Chewie-NS instance contain legacy loci IDs. The convert_ids.py Python script included in this dataset converts those legacy loci IDs to the loci IDs used by the latest Chewie-NS instance. The script accepts a single results file (e.g., a file generated by chewBBACA's AlleleCall module, such as results_alleles.tsv or loci_summary_stats.tsv) and a TSV file with the loci ID correspondence. This dataset includes files with the loci ID correspondence between legacy and latest schemas for the following species:

  Escherichia coli (species10_Ecoli_schema1_loci_IDs_mapping.tsv)
  Salmonella enterica (species14_Senterica_schema1_loci_IDs_mapping.tsv)
  Listeria monocytogenes (species18_Lmonocytogenes_schema1_loci_IDs_mapping.tsv)

For example, consider a results file for E. coli, such as a results_alleles.tsv file containing allelic profiles, with the following contents:

  FILE     INNUENDO_wgMLST-00016024  INNUENDO_wgMLST-00016025  INNUENDO_wgMLST-00016026
  Genome1  1                         2                         1
  Genome2  2                         2                         2
  Genome3  1                         1                         1

To convert its legacy loci IDs, run the following command:

  python convert_ids.py -i results_alleles.tsv -it species10_Ecoli_schema1_loci_IDs_mapping.tsv

The script replaces all legacy loci IDs with the loci IDs used by the latest instance of Chewie-NS, resulting in the following file contents:

  FILE     wgMLST-00027274  wgMLST-00027275  wgMLST-00027276
  Genome1  1                2                1
  Genome2  2                2                2
  Genome3  1                1                1

The script can be used to convert loci IDs in any file that includes legacy loci IDs. It is also possible to convert back to the legacy loci IDs by providing the --invert option. To view the full usage instructions for the script, run the following command:

  python convert_ids.py -h
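
To make the idea behind the conversion concrete, the following is a minimal Python sketch of a header substitution driven by a mapping table. It is not the code of convert_ids.py: the function names load_mapping and convert_header are hypothetical, the column order of the mapping text is an assumption (inspect the real mapping file to find which columns hold the legacy and the current loci IDs), and only the header line of a tab-delimited file is rewritten. The toy ID pairs are taken from the E. coli example above.

```python
import csv
import io


def load_mapping(mapping_text, source_col, target_col):
    """Build a {source ID: target ID} dict from tab-delimited text.

    The column positions are assumptions made for this sketch; the real
    mapping files may order their columns differently.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(mapping_text), delimiter="\t"):
        # Skip rows that are too short to hold both columns.
        if len(row) > max(source_col, target_col):
            mapping[row[source_col]] = row[target_col]
    return mapping


def convert_header(results_text, mapping):
    """Replace loci IDs found in the header line; keep all other lines as-is."""
    header, sep, body = results_text.partition("\n")
    # IDs without an entry in the mapping (e.g. "FILE") are left unchanged.
    fields = [mapping.get(field, field) for field in header.split("\t")]
    return "\t".join(fields) + sep + body


# Toy mapping with a hypothetical layout: current ID first, legacy ID second.
mapping_text = (
    "wgMLST-00027274\tINNUENDO_wgMLST-00016024\n"
    "wgMLST-00027275\tINNUENDO_wgMLST-00016025\n"
    "wgMLST-00027276\tINNUENDO_wgMLST-00016026\n"
)
results_text = (
    "FILE\tINNUENDO_wgMLST-00016024\tINNUENDO_wgMLST-00016025\tINNUENDO_wgMLST-00016026\n"
    "Genome1\t1\t2\t1\n"
    "Genome2\t2\t2\t2\n"
    "Genome3\t1\t1\t1\n"
)

# Legacy -> current: read the legacy ID from column 1 and the current ID from column 0.
legacy_to_current = load_mapping(mapping_text, source_col=1, target_col=0)
converted = convert_header(results_text, legacy_to_current)
print(converted)
```

Swapping source_col and target_col builds the reverse dictionary, which is similar in spirit to the --invert option described above. For real conversions, the convert_ids.py script distributed with the dataset remains the reference tool.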
