
cgDist is an ultra-fast distance calculator for bacterial genomics that computes SNP and InDel-level distances directly from core genome Multi-Locus Sequence Typing (cgMLST) allelic profiles.

Key Innovation

While traditional cgMLST analysis treats all allelic differences as equivalent units, cgDist achieves nucleotide-level resolution by performing pairwise sequence alignment only on differing alleles. This approach bridges the gap between the computational efficiency of cgMLST and the genetic resolution of SNP-based methods.

Main Features

- Multi-mode distance calculations: SNPs-only, SNPs+InDel-events, SNPs+InDel-bases
- Unified cache architecture: Enables incremental surveillance where new samples are analyzed without re-aligning the entire dataset
- Allele caller agnostic: Compatible with any cgMLST schema (chewBBACA, BLAST, etc.)
- Integrated recombination detection: Identifies potential horizontal gene transfer events
- High performance: 94% time reduction with progressive performance gains as cache hit rates reach 88.3%

Use Cases

cgDist is designed for outbreak investigation and source attribution in bacterial genomics, particularly for foodborne pathogens (Salmonella, Listeria monocytogenes). It provides fine-scale genetic discrimination for epidemiological clustering while maintaining compatibility with existing cgMLST surveillance workflows.

Citation

If you use cgDist in your research, please cite our manuscript:
bioRxiv preprint: https://doi.org/10.1101/2025.10.16.682749

Implementation

Written in Rust for high performance. Source code includes comprehensive documentation, installation instructions, usage examples, and API reference.
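To make the idea under Key Innovation and Main Features more concrete, here is a minimal sketch of how the three distance modes and an alignment cache could fit together. It is not cgDist's actual code or API: the names `Diff`, `Mode`, `count_diff`, `Cache`, and `profile_distance` are hypothetical. The sketch assumes that a pairwise aligner is supplied by the caller as a closure returning two gapped strings, that an InDel event is a contiguous run of gap columns in one sequence, and that both profiles list loci in the same order. Missing alleles are not handled.

```rust
use std::collections::HashMap;

/// Difference counts for one aligned allele pair (hypothetical type).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Diff {
    snps: usize,
    indel_events: usize,
    indel_bases: usize,
}

/// The three distance modes named in the feature list.
#[derive(Clone, Copy)]
enum Mode {
    Snps,
    SnpsIndelEvents,
    SnpsIndelBases,
}

impl Diff {
    /// Collapse the counts into a single distance for the chosen mode.
    fn distance(self, mode: Mode) -> usize {
        match mode {
            Mode::Snps => self.snps,
            Mode::SnpsIndelEvents => self.snps + self.indel_events,
            Mode::SnpsIndelBases => self.snps + self.indel_bases,
        }
    }
}

/// Count differences between two already-aligned sequences ('-' marks a gap).
/// Assumption: a run of consecutive gap columns in the same sequence is one
/// InDel event, and no column has a gap in both sequences.
fn count_diff(a: &str, b: &str) -> Diff {
    assert_eq!(a.len(), b.len(), "aligned sequences must have equal length");
    let mut d = Diff::default();
    // 0 = previous column had no gap, 1 = gap in `a`, 2 = gap in `b`
    let mut prev_gap = 0u8;
    for (x, y) in a.bytes().zip(b.bytes()) {
        let gap = if x == b'-' { 1 } else if y == b'-' { 2 } else { 0 };
        if gap != 0 {
            d.indel_bases += 1;
            if gap != prev_gap {
                d.indel_events += 1; // a new gap run starts here
            }
        } else if x != y {
            d.snps += 1;
        }
        prev_gap = gap;
    }
    d
}

/// Cache keyed by (locus, smaller allele id, larger allele id).
type Cache = HashMap<(String, u32, u32), Diff>;

/// Sum distances over the loci where two profiles carry different alleles.
/// `align` stands in for a real pairwise aligner and is only called on a
/// cache miss, so each allele pair is aligned at most once.
fn profile_distance<F>(
    p: &[(&str, u32)],
    q: &[(&str, u32)],
    mode: Mode,
    cache: &mut Cache,
    mut align: F,
) -> usize
where
    F: FnMut(&str, u32, u32) -> (String, String),
{
    let mut total = 0;
    for (&(locus, a), &(locus_q, b)) in p.iter().zip(q.iter()) {
        assert_eq!(locus, locus_q, "profiles must list loci in the same order");
        if a == b {
            continue; // identical alleles contribute nothing and are never aligned
        }
        let (lo, hi) = (a.min(b), a.max(b));
        let diff = *cache
            .entry((locus.to_string(), lo, hi))
            .or_insert_with(|| {
                let (x, y) = align(locus, lo, hi);
                count_diff(&x, &y)
            });
        total += diff.distance(mode);
    }
    total
}
```

Because the cache stores all three counts per allele pair, the same entry can serve any distance mode, and a new sample only triggers alignments for allele pairs not seen before. This is one plausible reading of the incremental-surveillance feature, not a description of how cgDist stores its cache.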
bacterial genomics, genomics, cgmlst, bioinformatics
