Name: Offensive Language Detection in Turkish Language by Using NLP
Keywords: cyberhate, cyberbullying, social media, natural language processing, classification algorithms, Machine Learning (Other), Cybersecurity and Privacy (Other), Engineering (General). Civil engineering (General), Chemistry

Publication: Article, 28 Feb 2025
Publisher: Sakarya University Journal of Science
Journal: Sakarya University Journal of Science, volume 29, pages 1-17 (eISSN: 2147-835X)

Authors: Bekir Furkan Kesgin; Rüştü Murat Demirer

doi: 10.16984/saufenbilder.1349956

Abstract

The growing use of social media has increased online harassment, cyberhate, and the use of offensive language, all of which are difficult to detect and address effectively. Natural Language Processing (NLP) has advanced considerably, but automatically identifying offensive language remains a complex task because user-generated content is ambiguous and informal and depends on the social context in which it occurs. In this thesis, our goal is to develop methods for the automatic detection of offensive language in social media. Multiple classification algorithms, including Multinomial Naive Bayes, Gaussian Naive Bayes, SVM, Logistic Regression, and LSTM, are implemented and evaluated using accuracy, F1 score, and AUC score. Results show that the Random Forest Classifier obtains an AUC score of 0.65 and an accuracy of 0.82 without word2vec, while LSTM achieves a higher AUC score of 0.78. These findings indicate how effective different algorithms are for offensive language detection. The research contributes tools and insights for Turkish language processing and for online safety, particularly in combating cyberbullying and fostering a tolerant online environment. The findings also open directions for future research in natural language processing and have practical implications for protecting individuals online.
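The three measures named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' code or data: the labels, scores, 0.5 threshold, and function names below are all hypothetical, chosen only to show how accuracy, F1 score, and AUC are computed and why they can diverge when offensive posts are a minority of the samples.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = offensive, 0 = not offensive (2 of 10 are offensive).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical classifier scores (estimated probability of "offensive").
scores = [0.9, 0.3, 0.4, 0.2, 0.1, 0.1, 0.2, 0.3, 0.6, 0.1]
# Hard predictions at an assumed decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.2f}  F1={f1:.2f}  AUC={auc:.3f}")
```

On this toy data the accuracy is 0.80 while the F1 score for the offensive class is only 0.50, because most samples are non-offensive and easy to classify. Accuracy and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank the samples, which is one reason the measures are reported together.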

Related Organizations

Bahçeşehir University
Turkey

Keywords

cyberhate, cyberbullying, social media, natural language processing, classification algorithms, Machine Learning (Other), Cybersecurity and Privacy (Other), Engineering (General). Civil engineering (General), TA1-2040, Chemistry, QD1-999

Impact by BIP!

- selected citations: 1. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

gold

Related to Research communities

Digital Humanities and Cultural Heritage