Powered by OpenAIRE graph

This Research product is the result of merged Research products in OpenAIRE.

Fine-tuning large language models for specialized tasks: master's thesis

Authors: Molchanova, T. A.

Abstract

This thesis examines methods for adapting large language models to specialized tasks. The task chosen is multilingual translation in the information security domain. To tune and evaluate the models, a dataset of 1001 triples of parallel sentences in Russian, English and Spanish was collected from documents published by Trellix, IBM, Kaspersky and Dr. Web. Mistral Instruct 7B and Llama Chat 7B were selected as the models to tune. The models were adapted using zero-shot, few-shot and PEFT methods, a choice dictated by the study's hardware constraint of a single machine with one GPU with 12-24 GB of memory. The translation quality of the resulting models was measured with the BLEU metric.

Keywords

LANGUAGE MODELLING, MACHINE TRANSLATION, MASTER'S THESIS, MODEL TUNING, TRANSFORMERS, MULTILINGUAL MACHINE TRANSLATION, LARGE LANGUAGE MODELS, MODEL QUANTIZATION

  • BIP! impact indicators (derived from the underlying citation network)
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average