
The application of Large Language Models (LLMs) to low-resource languages and dialects, such as Moroccan Arabic (MA), remains a relatively unexplored area. This study evaluates the performance of ChatGPT-4, fine-tuned BERT models, FastText embeddings, and traditional machine learning approaches for sentiment analysis on MA. Using two publicly available MA datasets—the Moroccan Arabic Corpus (MAC) from X (formerly Twitter) and the Moroccan Arabic YouTube Corpus (MYC)—we assess the ability of these models to detect sentiment across different contexts. Although the fine-tuned models performed well, ChatGPT-4 also showed substantial potential for sentiment analysis, even in zero-shot scenarios. However, performance on MA was generally lower than on Modern Standard Arabic (MSA), which we attribute to factors such as regional variability, lack of standardization, and limited data availability. Future work should focus on expanding and standardizing MA datasets, as well as exploring new methods such as combining FastText and BERT embeddings with attention mechanisms to improve performance on this challenging dialect.
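The abstract's proposed direction of combining FastText and BERT embeddings with an attention mechanism could be sketched as follows. This is a minimal, hypothetical illustration, not the authors' method: the module name, dimensions, and fusion scheme are all assumptions, and random tensors stand in for real FastText and BERT token embeddings.

```python
import torch
import torch.nn as nn

class EmbeddingFusion(nn.Module):
    """Hypothetical attention-based fusion of FastText and BERT embeddings.

    Both streams are projected to a shared hidden dimension, then a learned
    attention score decides, per token, how much each source contributes.
    Dimensions (300 for FastText, 768 for BERT) are typical defaults, used
    here only as illustrative assumptions.
    """
    def __init__(self, fasttext_dim=300, bert_dim=768, hidden_dim=256):
        super().__init__()
        self.proj_ft = nn.Linear(fasttext_dim, hidden_dim)
        self.proj_bert = nn.Linear(bert_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, 1)  # one score per source per token

    def forward(self, ft_emb, bert_emb):
        # ft_emb: (batch, seq, fasttext_dim); bert_emb: (batch, seq, bert_dim)
        h = torch.stack([self.proj_ft(ft_emb), self.proj_bert(bert_emb)], dim=2)
        # h: (batch, seq, 2, hidden_dim); softmax over the two embedding sources
        weights = torch.softmax(self.attn(torch.tanh(h)), dim=2)
        return (weights * h).sum(dim=2)  # (batch, seq, hidden_dim)

# Toy usage with random tensors standing in for real embeddings
fusion = EmbeddingFusion()
ft = torch.randn(4, 16, 300)    # subword-based FastText embeddings
bert = torch.randn(4, 16, 768)  # contextual BERT embeddings
fused = fusion(ft, bert)
print(fused.shape)  # torch.Size([4, 16, 256])
```

The fused representation could then feed a standard sentiment classification head; the attention weights let the model lean on FastText's subword robustness for dialectal spelling variation and on BERT's contextual signal elsewhere.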