
Integrative Computational Profiling of Antimicrobial Resistance Genes and Machine Learning-Based Prediction of Drug-Resistant Mycobacterium tuberculosis in Bangladesh

Sifatullah Bilal

Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), remains a critical public health emergency in Bangladesh, one of the 30 high-burden TB countries globally. This study presents the largest integrative whole-genome sequencing (WGS) and machine learning (ML) analysis of drug-resistant MTB in Bangladesh to date. It combines WGS of 250 clinical isolates collected from all eight administrative divisions (2015–2025) with systematic antimicrobial resistance (AMR) gene profiling, statistical analysis, and multi-model ML classification. The cohort was dominated by lineages L2 (East Asian/Beijing; 44%) and L4 (Euro-American; 35.6%), with a combined multidrug-resistant (MDR), pre-extensively drug-resistant (Pre-XDR), and extensively drug-resistant (XDR) burden of 23.2%. Mutations in rpoB were detected in 41.6% of isolates, followed by katG (36%), embB (22.8%), and pncA (22.4%). Strong resistance co-occurrence, measured by the phi coefficient (φ), was observed between rpoB–pncA (φ=0.61) and rpoB–embB (φ=0.60). Six ML classifiers were evaluated for binary MDR prediction; XGBoost achieved the highest performance (ROC-AUC: 0.974, F1: 0.959, Matthews correlation coefficient (MCC): 0.921). SHAP (SHapley Additive exPlanations) explainability analysis identified katG S315T, rpoB S450L, and rpoB H445Y as the dominant predictive biomarkers. Geographically, the Dhaka and Chittagong divisions showed disproportionately high MDR and Pre-XDR burdens. Novel findings include lineage-specific pncA enrichment in L2 isolates, a potential XDR micro-cluster in the Dhaka–Chittagong corridor, and the identification of retreatment case type as an independent ML predictor. These findings support the implementation of WGS-based genomic surveillance and explainable ML as a clinical framework for personalised TB treatment and targeted public health interventions in Bangladesh.
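
For readers unfamiliar with the co-occurrence statistic quoted above, the following is a minimal sketch of how a phi coefficient between two binary mutation indicators can be computed from a 2x2 contingency table. It is illustrative only and is not the study's pipeline: the function name `phi_coefficient` and the toy vectors `rpoB_mutated` and `pncA_mutated` are hypothetical, and the toy values are not study data. For binary classification, the MCC reported for the classifiers is the same quantity applied to true versus predicted labels.

```python
from math import sqrt


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length binary (0/1) sequences.

    Illustrative helper (hypothetical name), not taken from the study.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Cells of the 2x2 contingency table.
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)

    # Product of the four marginal totals.
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # A constant indicator carries no association information;
        # returning 0.0 here is a convention, not a derived value.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Toy indicators (made up for illustration): 1 = mutation present in isolate.
rpoB_mutated = [1, 1, 1, 0, 0, 0, 1, 0]
pncA_mutated = [1, 1, 0, 0, 0, 0, 1, 0]

phi = phi_coefficient(rpoB_mutated, pncA_mutated)
print(f"phi = {phi:.3f}")
```

Values near +1 indicate that the two mutations tend to appear in the same isolates, values near 0 indicate no association, and values near -1 indicate that they tend to be mutually exclusive.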
