
Textual data is the basis for most historical research, which makes the development of natural language processing methods and technologies especially significant for historical science. In recent years, deep learning methods have dominated the field of natural language processing, and many variants of large pre-trained language models have emerged. This article analyzes the experience of creating transformer-based language models for historical languages and considers the possible risks and prospects of their implementation.
The article analyzes the experience of creating transformer-based language models for historical languages, since textual data is the basis for most historical research, which makes the development of natural language processing methods and technologies especially significant for historical science. Possible risks and prospects of adopting such language models are considered.
DIGITAL HUMANITIES, NATURAL LANGUAGE PROCESSING, DIGITAL HISTORY, MACHINE LEARNING
