Powered by OpenAIRE graph
Phonica
Article . 2025 . Peer-reviewed
License: CC BY
Data sources: Crossref
Phonica
Article . 2025
Data sources: DOAJ

This Research product is the result of merged Research products in OpenAIRE.

Assessment of L2 Spanish pronunciation accuracy via Automatic Speech Recognition

Authors: Albina Sarymsakova; Patricia Martín Rodilla

Abstract

This study contributes to the evaluation of non-native Spanish speakers’ acoustic production using Artificial Intelligence (AI) tools, specifically Automatic Speech Recognition (ASR) models. In order to determine whether leading ASR models can provide adequate feedback on L2 Spanish pronunciation, we evaluated four models (Wav2Vec, Whisper-large-v2, Whisper-large-v3, and SeamlessM4T) using datasets of non-native Spanish speakers with English, Russian, and German as L1s. Based on a Word Error Rate and Character Error Rate evaluation framework, Whisper-large-v3 and SeamlessM4T demonstrated the highest accuracy for non-native speech recognition. A qualitative and phonetic error analysis revealed that these models struggle when vowel formant boundaries of L2 speakers exceed those of standard Spanish or when voiceless consonants are influenced by phonetic assimilation processes. Additionally, we identified gender bias, with models performing better on female speech than male speech, and substitution errors as the most frequent error type. In conclusion, while ASR models like Whisper-large-v3 and SeamlessM4T perform adequately, an accurate pronunciation assessment for L2 Spanish learners requires their outputs to be complemented by a detailed phonetic analysis.
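The Word Error Rate (WER) and Character Error Rate (CER) named in the abstract are conventionally defined as the edit distance between the reference transcript and the ASR output, divided by the reference length in words or characters. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function names and the example sentence are hypothetical, and no text normalization (lowercasing, punctuation stripping) is applied, although a real evaluation would likely need it.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing in output)
                curr[j - 1] + 1,     # insertion (extra token in output)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example (not taken from the study's data): the ASR output
# renders "perro" as "pero", so one of four words is wrong.
reference = "el perro come pan"
hypothesis = "el pero come pan"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this example a single missing character counts as one full word error for WER but only one character error out of seventeen for CER, which is one reason the two metrics are usually reported together.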

Keywords

ASR, Phonetics, Spanish L2, Non-native speakers, Whisper, Computational linguistics. Natural language processing (P98-98.5), Philology. Linguistics (P1-1091)

  • Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Open Access route: gold