
doi: 10.5128/lv28.01
"Patterns of language use and linguistic cluster analysis"
Automatic processing of large corpora relies on a range of language software and statistical analysis, whose choice and combination depend on the language, the object of study, and the research goals. In this article, we introduce Klastrileidja (Cluster Catcher), an integrated software application for finding language use patterns in text, describe how it works, and give an overview of the research results obtained with linguistic cluster analysis. The aim is to explain what applying this method in natural language processing makes it possible to discover about Estonian and about learner language use, and how this knowledge can be applied to pedagogical needs.
Finnic. Baltic-Finnic, PH91-98.5, Estonian, learner language, language use patterns, natural language processing
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources. An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
