
Abstract MLST (multi-locus sequence typing) is a classic technique for genotyping bacteria, widely applied for pathogen outbreak surveillance. Traditionally, MLST is based on identifying sequence types from a small number of housekeeping genes. With the increasing availability of whole-genome sequencing (WGS) data, MLST methods have evolved toward larger typing schemes, ranging from a few hundred genes (core genome MLST, cgMLST) to a few thousand genes (whole genome MLST, wgMLST). Such large-scale MLST schemes have been shown to provide finer resolution and are increasingly used in contexts such as hospital outbreaks and foodborne pathogen outbreaks. This methodological shift raises new computational challenges, especially given the large size of the schemes involved. Few available MLST callers can currently handle large MLST schemes. We introduce MentaLiST, a new MLST caller based on a k-mer voting algorithm and written in the Julia language, specifically designed and implemented to handle large typing schemes. We test it on real and simulated data to show that MentaLiST is faster than any other available MLST caller while providing the same or better accuracy, and that it can handle MLST schemes with up to thousands of genes while requiring limited computational resources. MentaLiST source code and easy installation instructions using a Conda package are available at https://github.com/WGS-TB/MentaLiST.
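To make the phrase "k-mer voting" concrete, the following is a minimal Python sketch of the general idea only: index the k-mers of every allele in a scheme, let each k-mer found in the reads vote for the alleles that contain it, and report the top-voted allele per locus. It is an illustration under stated assumptions, not MentaLiST's actual implementation (which is written in Julia and is not described in detail here). The function names, the toy scheme, and the reads are all hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(scheme, k):
    """Map each k-mer to the set of (locus, allele) pairs that contain it."""
    index = defaultdict(set)
    for locus, alleles in scheme.items():
        for allele_id, allele_seq in alleles.items():
            for kmer in kmers(allele_seq, k):
                index[kmer].add((locus, allele_id))
    return index


def call_alleles(reads, scheme, k):
    """Return, per locus, the allele that collects the most k-mer votes."""
    index = build_index(scheme, k)
    votes = {locus: defaultdict(int) for locus in scheme}
    for read in reads:
        for kmer in kmers(read, k):
            # Each occurrence of a k-mer in a read votes once for every
            # allele that contains that k-mer.
            for locus, allele_id in index.get(kmer, ()):
                votes[locus][allele_id] += 1
    calls = {}
    for locus, tally in votes.items():
        # None marks a locus that received no votes at all.
        calls[locus] = max(tally, key=tally.get) if tally else None
    return calls


# Toy scheme (hypothetical sequences): two loci, two alleles each.
scheme = {
    "locusA": {1: "ACGTACGTAA", 2: "ACGTTCGTAA"},
    "locusB": {1: "GGCATTGCCA", 2: "GGCATAGCCA"},
}
reads = ["ACGTTCGT", "TCGTAA", "GGCATTGC", "TTGCCA"]
calls = call_alleles(reads, scheme, k=4)
print(calls)
```

In this toy example the reads were drawn from allele 2 of `locusA` and allele 1 of `locusB`, so those alleles collect the most votes. A real caller would also need to handle reverse complements, sequencing errors, ties, and novel alleles, none of which this sketch attempts.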
0604 Genetics, Molecular Epidemiology, Genes, Essential, Time Factors, Bacteria, Whole Genome Sequencing, Methods Paper, Enterococcus faecium, 610, Mycobacterium tuberculosis, Bacterial Typing Techniques, Disease Outbreaks, Foodborne Diseases, Salmonella, 616, Epidemiological Monitoring, Humans, Genome, Bacterial, Software, 0605 Microbiology, Multilocus Sequence Typing
