Parameter-Efficient Fine-Tuning of XLS-R for Arabic Speech Recognition

Arabic Automatic Speech Recognition (ASR) faces persistent challenges due to complex morphology, dialectal variation, and limited labeled data. Large self-supervised models such as wav2vec2-XLSR (XLS-R) have demonstrated strong performance for Arabic ASR, but their size makes full fine-tuning computationally expensive and impractical in many settings. This release accompanies our study on parameter-efficient fine-tuning (PEFT) methods for Arabic ASR, which provides the first systematic evaluation of LoRA and DoRA applied to a CTC-based self-supervised model (XLS-R). We evaluate full fine-tuning, LoRA, and DoRA on the newly released Mozilla Common Voice Arabic v24.0 dataset. Full fine-tuning achieves 23.03% Word Error Rate (WER), establishing a new state of the art among XLS-R-based Arabic ASR models. LoRA achieves 36.10% WER while training only ~2.2% of the model parameters, offering a strong accuracy–efficiency trade-off and enabling lightweight deployment via small adapters. To our knowledge, this is also the first evaluation of DoRA for Arabic speech recognition. This Zenodo record includes the training and evaluation code, configuration files, and trained LoRA and DoRA adapters, supporting reproducibility and future research on efficient Arabic ASR systems.
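For readers unfamiliar with the metric reported above, the following is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions and are not taken from the released code. The sketch applies no text normalization (for example, diacritic removal), which the study's evaluation scripts may handle differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration only; words are split on whitespace
    and compared exactly, with no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative sentences, not drawn from Common Voice.
    reference = "السلام عليكم ورحمة الله"
    hypothesis = "السلام عليكم رحمة الله"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Corpus-level WER is usually obtained by summing edit errors and reference word counts over all utterances before dividing, rather than by averaging per-utterance values.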

Related Organizations

King Fahd University of Petroleum and Minerals
Saudi Arabia
