
Low-resource languageslanguages with limited annotated corpora, lexicons, and digital resourcespose major challenges for modern natural language processing (NLP). Recent progress in transfer learning, multilingual pretraining, parameter-efficient adaptation, data augmentation, and community-driven dataset creation has substantially improved capabilities for many such languages, yet large performance gaps remain compared to high-resource languages. This article surveys the technical advances that enable NLP for low-resource languages (including unsupervised and weakly supervised methods, multilingual and massively multilingual models, few-shot and in-context learning with large language models, and adapter/LoRA-style parameter-efficient fine-tuning). We examine practical pipelines for tasks such as machine translation, speech recognition, OCR, and information extraction; describe prominent dataset and community projects; summarize typical evaluation strategies and their pitfalls; and outline promising research directions (community data collection, privacy-preserving methods, on-device adaptation, and ethics-aware deployments). The review highlights approaches that balance performance, compute cost, and data-efficiency, and recommends research and deployment practices to accelerate inclusive language technology.
RA / adapters, data augmentation, machine translation, speech datasets, Masakhane, Common Voice Reference:
RA / adapters, data augmentation, machine translation, speech datasets, Masakhane, Common Voice Reference:
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
