Powered by OpenAIRE graph
Found an issue? Give us feedback
Article
Data sources: ZENODO

Comprehensive Analysis of Hybrid Detection Spam Detection Models using Machine Learning

Authors: Sarode, Ashwini Janardhan; Bamnote, Prof. (Dr) Gajendra


Abstract

In the modern digital era, the rapid growth of electronic communication through email and SMS has led to a significant increase in unsolicited and harmful spam messages. These messages not only inconvenience users but can also lead to phishing attacks, fraud, and data breaches. To address this issue, the proposed work aims to develop an intelligent spam detection system using machine learning techniques. The system is designed as a binary classification model that categorizes incoming text messages as either spam or ham (not spam). It uses a publicly available dataset such as the SMS Spam Collection Dataset from Kaggle. The dataset undergoes several preprocessing steps, including text cleaning, tokenization, stopword removal, and optional stemming or lemmatization, to improve data quality. After preprocessing, the textual data is transformed into numerical feature vectors using techniques such as TF-IDF (Term Frequency–Inverse Document Frequency). These features are then used to train machine learning models such as Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM) to classify messages accurately. Model performance is evaluated using metrics such as precision, recall, accuracy, and F1-score.
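The pipeline described in the abstract (preprocessing, TF-IDF features, a classifier, and evaluation metrics) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The stopword list, the smoothed IDF formula, the helper names (`preprocess`, `fit_idf`, `tfidf`, `NaiveBayes`, `evaluate`), and the toy messages are all assumptions made for illustration. The messages are invented and are not drawn from the SMS Spam Collection Dataset. Only Naïve Bayes is shown. Logistic Regression and SVM would consume the same feature vectors.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; real systems use a fuller list).
STOPWORDS = {"a", "an", "the", "is", "to", "you", "your", "for",
             "of", "and", "in", "at", "are", "we", "i"}

def preprocess(text):
    """Text cleaning and tokenization: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(doc, idf):
    """Map a tokenized doc to {term: tf * idf}; terms unseen in training are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}

class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for v in vectors for t in v}
        self.log_prior, self.totals = {}, {}
        self.weights = {c: defaultdict(float) for c in self.classes}
        for c in self.classes:
            rows = [v for v, y in zip(vectors, labels) if y == c]
            self.log_prior[c] = math.log(len(rows) / len(vectors))
            for v in rows:
                for t, w in v.items():
                    self.weights[c][t] += w
            self.totals[c] = sum(self.weights[c].values())
        return self

    def predict(self, vector):
        def score(c):
            denom = self.totals[c] + len(self.vocab)
            return self.log_prior[c] + sum(
                w * math.log((self.weights[c].get(t, 0.0) + 1) / denom)
                for t, w in vector.items())
        return max(self.classes, key=score)

def evaluate(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Invented toy messages, for illustration only.
train = [
    ("Win a free prize now, claim your cash", "spam"),
    ("Free entry: text WIN to claim your reward", "spam"),
    ("Urgent! You have won cash, call now", "spam"),
    ("Are we still meeting for lunch today", "ham"),
    ("I will call you after the meeting", "ham"),
    ("Can you send me the lecture notes", "ham"),
]
test = [("Claim your free cash prize", "spam"),
        ("See you at lunch after the lecture", "ham")]

idf = fit_idf([preprocess(text) for text, _ in train])
model = NaiveBayes().fit([tfidf(preprocess(text), idf) for text, _ in train],
                         [label for _, label in train])
predictions = [model.predict(tfidf(preprocess(text), idf)) for text, _ in test]
metrics = evaluate([label for _, label in test], predictions)
print(predictions, metrics)
```

With only two held-out messages, the resulting scores say nothing about real-world performance. A meaningful evaluation would need a proper train/test split of a full dataset, and a production system would more likely rely on an established machine learning library than on hand-written components like these.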
