Cyberbullying Detection in Social Media Contents using Machine Learning Techniques

Amey Gujar; Akhilesh Ghorpade; Indrajeet Chougule; Vedant Gawas; Paras Gurjar; Himanshu Baboria; Prof. Vinod Khetade

Preprint

Data sources: ZENODO

Type: Preprint; Status: Under curation; Language: English; Publisher: Zenodo

doi: 10.5281/zenodo.20546687

Abstract

Cyberbullying is a serious problem in today's digital world: abusive messages and harsh language harm people's emotional well-being and mental health. Because social media produces a continuous flood of content, manual detection is slow and does not scale. We therefore propose a Machine Learning framework that detects cyberbullying automatically. It first cleans the text with Natural Language Processing techniques, including word normalization, tokenization, and emoji interpretation, and it handles English, Hindi, Marathi, and Hinglish text.

After preprocessing, the system converts the text into numerical features using TF-IDF and classifies it with a Linear Support Vector Machine, implemented with sklearn's svm.SVC with a linear kernel. During development, several SVM configurations were evaluated, and the linear one gave the best results in terms of accuracy and computational cost.

Our experiments show that the combination of TF-IDF and a Linear SVM classifies effectively while remaining efficient and light on resources. We evaluated it on a dataset of 31,183 social media text samples, comprising 23,820 bullying cases and 7,363 non-bullying ones. What distinguishes our system is its support for multiple languages and its handling of emojis, which lets it cope with the varied ways people communicate on social media. We also deployed it behind a Flask-based API so that it can be integrated into web applications, making it a practical tool for real-world content moderation and online safety.
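The preprocessing and TF-IDF steps described above can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the emoji lexicon `EMOJI_WORDS`, the rule that collapses elongated letters, and the function names `preprocess` and `tfidf` are all assumptions made for illustration, and the weighting shown is the textbook TF-IDF formula rather than whatever variant the paper used.

```python
import math
import re
import unicodedata
from collections import Counter

# Hypothetical emoji lexicon (assumption): the paper's actual mapping is not given.
EMOJI_WORDS = {
    "\U0001F621": "angry",      # pouting face
    "\U0001F602": "laugh",      # face with tears of joy
    "\U0001F44D": "thumbs_up",  # thumbs up
}

def preprocess(text):
    """Normalize a message and split it into tokens."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace each known emoji with a word token, padded so it splits cleanly.
    for emoji, word in EMOJI_WORDS.items():
        text = text.replace(emoji, f" {word} ")
    # Collapse runs of three or more identical letters ("sooo" -> "soo").
    text = re.sub(r"([^\W\d_])\1{2,}", r"\1\1", text)
    # Turn punctuation and symbols into spaces by Unicode category, keeping
    # underscores. Combining marks stay attached, so Devanagari words
    # (Hindi, Marathi) are not broken apart.
    cleaned = "".join(
        " " if ch != "_" and unicodedata.category(ch)[0] in "PS" else ch
        for ch in text
    )
    return cleaned.split()

def tfidf(docs):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency).

    `docs` is a list of token lists; returns one {term: weight} dict per doc.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors

messages = ["You are SOOO stuuupid!!! \U0001F621", "Tu pagal hai kya?"]
tokens = [preprocess(m) for m in messages]
vectors = tfidf(tokens)
```

In this sketch, terms that occur in every message receive weight zero, while terms specific to one message are weighted up. scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its values would differ from these. The resulting vectors are the kind of input a linear SVM such as the `svm.SVC` with a linear kernel named in the abstract would be trained on.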
