Languages are disappearing at an alarming rate, and the linguistic rights of the speakers of most of the world's 7,000 languages are at risk. ICT plays a key role in the preservation of endangered languages. Natural language processing, as the ultimate use of ICT, deserves particular attention: in this century, the lack of such support hampers literacy acquisition and prevents the use of the Internet and other electronic media. The first step is building resources for processing; we therefore introduce Siminchik, the first speech corpus of Southern Quechua, suitable for training and evaluating speech recognition systems. The corpus consists of 97 hours of spontaneous conversations recorded from radio programs in the southern regions of Peru. Annotation was carried out by native speakers from those regions using the unified written convention. We present initial experiments on speech recognition and language modeling and explain the challenges inherent to the nature and current status of this ancestral language.
Cardenas, R., Zevallos, R., Baquerizo, R., & Camacho, L. (2018). Siminchik: A Speech Corpus for Preservation of Southern Quechua. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC'18).
Quechua, endangered languages, corpus, speech recognition