
Social media platforms have become an indispensable part of our lives, offering ways to share news, thoughts, and updates, to connect with new friends, and to explore various fields of knowledge. However, these platforms can also inadvertently propagate hate speech and offensive content. Arabic, the sixth most spoken language globally and widely used in over 22 countries, requires special attention to control and prevent the spread of hate speech. The core objective of this study is to develop an improved Arabic model for classifying offensive content by merging multiple Arabic hate and offensive datasets, including Iraqi offensive samples. Three distinct approaches were used: a support vector machine (SVM), long short-term memory (LSTM), and the AraBERT transformer model. The models were evaluated using recall, precision, F1-score, and accuracy. The transformer model consistently outperformed the others across all four metrics. Moreover, each dataset was also assessed separately using the three models, and the transformer was again consistently the most effective.
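The four evaluation metrics named above can be illustrated with a minimal sketch for a binary offensive/not-offensive setting. This is not the study's evaluation code: the function name `binary_metrics`, the label encoding (1 = offensive, 0 = not offensive), and the toy labels are assumptions made only for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy for binary labels.

    Assumption: `positive` marks the offensive class (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels for illustration only; these are not results from the study.
example_true = [1, 1, 0, 0, 1, 0]
example_pred = [1, 0, 0, 1, 1, 0]
example_scores = binary_metrics(example_true, example_pred)
```

Running the same function on the predictions of each model (SVM, LSTM, transformer) over a shared test split would give directly comparable scores.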
Transformer, LSTM, SVM, NLP, Machine learning, offensive language, [INFO] Computer Science [cs]
