Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages

Publication type: Article, Preprint

Date: 01 Jan 2021 (embargo end date: 01 Jan 2021)

Publisher: Association for Computational Linguistics (ACL)

Journal: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Authors: Dhamecha, Tejas Indulal; Murthy V, Rudra; Bharadwaj, Samarth; Sankaranarayanan, Karthik; Bhattacharyya, Pushpak

doi: 10.18653/v1/2021.emnlp-main.675, 10.48550/arxiv.2109.10534

arXiv: 2109.10534

Abstract

We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications than fine-tuning on individual languages. We present a first-of-its-kind detailed study that tracks the change in performance as languages are added to a base language in a graded and greedy manner (greedy in the sense that each step adds the language giving the best boost in performance). The study reveals that a carefully selected subset of related languages can yield significantly better performance than using all related languages. The Indo-Aryan (IA) language family is chosen for the study, the exact languages being Bengali, Gujarati, Hindi, Marathi, Oriya, Punjabi and Urdu. The script barrier is crossed by simple rule-based transliteration of the text of all languages to Devanagari. Experiments are performed on mBERT, IndicBERT, MuRIL and two RoBERTa-based LMs, the last two being pre-trained by us. Low-resource languages, such as Oriya and Punjabi, are found to be the largest beneficiaries of multilingual fine-tuning. Textual Entailment, Entity Classification, Section Title Prediction, tasks of IndicGLUE and POS tagging form our test bed. Compared to monolingual fine-tuning, we obtain a relative performance improvement of up to 150% on the downstream tasks. The surprising take-away is that for any language there is a particular combination of other languages that yields the best performance, and adding any further language is in fact detrimental.
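The graded, greedy addition of languages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the `evaluate` callback, the language codes, and the toy scores are all assumptions made for the example. In a real setting `evaluate` would fine-tune a model on the given set of languages and return its downstream score on the base language.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# One entry per step: (languages in the order they were added, score).
Trajectory = List[Tuple[Tuple[str, ...], float]]


def greedy_language_trajectory(
    base: str,
    candidates: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Trajectory:
    """Add languages to `base` one at a time, always picking the best boost.

    `evaluate` is a hypothetical stand-in for "fine-tune on these languages
    and measure performance on the base language".
    """
    selected = [base]
    remaining = sorted(set(candidates) - {base})
    trajectory: Trajectory = [(tuple(selected), evaluate(frozenset(selected)))]
    while remaining:
        # Greedy step: try each remaining language and keep the best one.
        best_lang = max(
            remaining,
            key=lambda lang: evaluate(frozenset(selected + [lang])),
        )
        selected.append(best_lang)
        remaining.remove(best_lang)
        trajectory.append((tuple(selected), evaluate(frozenset(selected))))
    return trajectory


def best_subset(trajectory: Trajectory) -> Tuple[Tuple[str, ...], float]:
    """Return the step of the trajectory with the highest score."""
    return max(trajectory, key=lambda step: step[1])


# Toy scores, invented purely for illustration (not results from the paper).
TOY_GAINS = {"hi": 10, "mr": 5, "bn": 2, "or": 1, "gu": -1, "ur": -3}


def toy_evaluate(langs: FrozenSet[str]) -> float:
    # Additive toy model: a base score plus a fixed gain or loss per language.
    return 50 + sum(TOY_GAINS.get(lang, 0) for lang in langs)


trajectory = greedy_language_trajectory("pa", list(TOY_GAINS), toy_evaluate)
best_langs, best_score = best_subset(trajectory)
```

With these invented scores the trajectory rises while helpful languages are added and falls once the remaining ones are included, so the best subset is a strict prefix of the full trajectory. That mirrors the qualitative finding stated in the abstract, but the numbers themselves carry no meaning.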

Accepted in EMNLP 2021

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)

Impact by BIP!

- Selected citations (derived from selected sources): 8
- Popularity (the "current" impact or attention of the article in the research community at large, based on the underlying citation network): Top 10%
- Influence (the overall impact of the article in the research community at large, based on the underlying citation network, diachronically): Average
- Impulse (the initial momentum of the article directly after its publication, based on the underlying citation network): Top 10%