
doi: 10.3390/app14219992
This paper presents a contrastive learning approach to morphological disambiguation (MD) using large language models (LLMs). The approach is trained with a contrastive loss function that reduces the distance between the contextual embedding and the embedding of the correct analysis while maintaining a margin between the embeddings of correct and incorrect analyses. One aim of the paper is to analyze the effect of fine-tuning an LLM on MD in morphologically complex languages (MCLs), with particular attention to low-resource languages such as Kazakh, as well as Turkish. Another aim is to evaluate various distance measures for the contrastive loss, since disambiguation is performed by computing the distance between the context embedding and the analysis embeddings. Existing approaches to MD, such as HMM-based and feature-engineering approaches, are limited in modeling long-range dependencies and in handling large, sparse tagsets. The proposed approach mitigates these limitations by leveraging LLMs, achieving better accuracy on ambiguous and out-of-vocabulary (OOV) tokens without relying on additional features. Experiments were conducted on three datasets for two MCLs, Kazakh and Turkish, the former being a typical low-resource language. The results show that the proposed approach with contrastive loss improves MD performance when integrated with knowledge from LLMs.
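As an illustration of the kind of loss the abstract describes, the following is a minimal sketch of a margin-based contrastive loss with a pluggable distance measure, together with disambiguation by nearest analysis embedding. It is not the authors' implementation: the triplet-style hinge form, the function names, the default margin, and the toy two-dimensional vectors are all assumptions made for illustration, and in practice the embeddings would come from a fine-tuned LLM.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    """One minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def contrastive_margin_loss(
    context: Vector,
    correct: Vector,
    incorrect: Sequence[Vector],
    distance: Distance = euclidean_distance,
    margin: float = 1.0,  # hypothetical default, not taken from the paper
) -> float:
    """Hinge loss (assumed form): the correct analysis should be closer to
    the context than every incorrect analysis by at least `margin`."""
    d_pos = distance(context, correct)
    return sum(
        max(0.0, margin + d_pos - distance(context, neg)) for neg in incorrect
    )


def disambiguate(
    context: Vector,
    candidates: Sequence[Vector],
    distance: Distance = euclidean_distance,
) -> int:
    """Return the index of the candidate analysis closest to the context."""
    return min(range(len(candidates)), key=lambda i: distance(context, candidates[i]))


# Toy embeddings (made up): one context vector and three candidate analyses.
context = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]

best = disambiguate(context, candidates, distance=cosine_distance)
loss_small_margin = contrastive_margin_loss(
    context, candidates[0], candidates[1:], margin=0.5
)
loss_large_margin = contrastive_margin_loss(
    context, candidates[0], candidates[1:], margin=2.0
)
```

Passing the distance as a parameter mirrors the abstract's stated goal of comparing distance measures: the same loss and the same decision rule can be evaluated with Euclidean, cosine, or any other distance without changing the rest of the code.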
Technology, contrastive learning, QH301-705.5, T, Physics, QC1-999, Engineering (General). Civil engineering (General), low-resource language, Chemistry, morphological disambiguation, large language models, TA1-2040, Biology (General), QD1-999
| Indicator | Value |
| --- | --- |
| Selected citations | 3 |
| Popularity | Top 10% |
| Influence | Average |
| Impulse | Average |
