
The growing use of social media has brought an increase in online harassment, cyberhate, and offensive language, all of which are difficult to detect and address effectively. Although Natural Language Processing (NLP) has advanced considerably, automatically identifying offensive language remains a complex task because user-generated content is ambiguous and informal and depends on the social context in which it occurs. The goal of this thesis is to develop methods for the automatic detection of offensive language in social media. Multiple classification algorithms, including Multinomial Naive Bayes, Gaussian Naive Bayes, SVM, Logistic Regression, and LSTM, are implemented and evaluated using accuracy, F1 score, and AUC score. Results show that the Random Forest Classifier obtains an AUC score of 0.65 and an accuracy of 0.82 without word2vec, while the LSTM reaches a higher AUC score of 0.78. These findings provide insight into the effectiveness of different algorithms for offensive language detection. The research contributes tools and insights that support Turkish language processing and online safety, particularly in combating cyberbullying and fostering a tolerant online environment. The findings also point to directions for future research in natural language processing and have practical implications for protecting individuals online.
cyberhate, social media, natural language processing, classification algorithms, cyberbullying, Cybersecurity and Privacy (Other), Machine Learning (Other)
