
The rise of spam messages, including malware, phishing attacks, and unsolicited messages, poses a serious threat to internet users and security infrastructure. Conventional spam filters that rely solely on rigid rules and keyword lists struggle to keep pace with contemporary spammer tactics that mask malicious content. This study addresses the challenge with a hybrid machine learning methodology that combines Naive Bayes (NB) and a Support Vector Machine (SVM) into an ensemble for improved accuracy and resilience in spam detection. The technique uses the well-known SMS Spam Collection Dataset. It combines TF-IDF textual feature extraction with non-textual features such as message length, word capitalisation, and the frequency of predetermined keywords. The proposed system is evaluated using standard classification metrics (accuracy, precision, recall, and F1 score) to assess its reliability and validity. The findings indicate that the hybrid ensemble is effective at reducing false positives while coping more robustly with the challenges inherent in real-world spam data. The system is also of practical use: it is computationally efficient enough for most real-time deployments in automated anti-spam systems. This research contributes scalable, adaptive spam-detection mechanisms suitable for real-time messaging environments.
Machine Learning, Naive Bayes, Hybrid Algorithm, Spam Detection, Cybersecurity, Support Vector Machines, Ensemble Models, Data Analytics, NLP, Text Classification
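The non-textual features named in the abstract (message length, word capitalisation, and keyword frequency) can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the study's implementation: the function name `non_textual_features`, the keyword list `SPAM_KEYWORDS`, and the exact definition of each feature are hypothetical, since the abstract does not specify them. In a full pipeline, such values would presumably be appended to the TF-IDF vector before being passed to the NB and SVM classifiers.

```python
import re

# Hypothetical keyword list; the abstract does not enumerate its keywords.
SPAM_KEYWORDS = ("free", "win", "prize", "urgent", "claim")


def non_textual_features(message, keywords=SPAM_KEYWORDS):
    """Compute simple non-textual features for one message.

    The feature definitions below are assumptions made for illustration.
    """
    # Split the message into alphanumeric tokens.
    words = re.findall(r"[A-Za-z0-9']+", message)
    n_words = len(words)

    # Count fully upper-case tokens of at least two characters.
    caps = sum(1 for w in words if len(w) > 1 and w.isupper())

    # Count case-insensitive occurrences of each keyword.
    lowered = [w.lower() for w in words]
    hits = sum(lowered.count(k) for k in keywords)

    return {
        "length": len(message),                            # characters
        "word_count": n_words,
        "caps_ratio": caps / n_words if n_words else 0.0,  # share of shouted words
        "keyword_count": hits,
    }


if __name__ == "__main__":
    print(non_textual_features("WIN a FREE prize now"))
```

Using a ratio for capitalisation rather than a raw count keeps the feature comparable between short and long messages, which matters for SMS data where message length varies widely.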
