Comparison of Simultaneous Multi-Task and Sequential Training for Zero-Shot Cross-Lingual Accuracy in Low-Resource Languages

Assignee Research

Publication type: Report. Status: Under curation. Language: English. Publisher: Zenodo

Authors: Assignee Research

doi: 10.5281/zenodo.20817982

Abstract

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these encoders do not precisely align words and phrases across languages. In particular, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to obtain for low-resource languages. An alternative is to make the multilingual encoders more robust: when fine-tuning the encoder on a downstream task, we train the encoder to tolerate noise in the contex...

Research goal: How does simultaneous multi-task intermediate training compare to sequential training in improving zero-shot cross-lingual accuracy on XTREME-R for low-resource languages?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 7.9/10.
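
The research goal contrasts two ways of ordering intermediate tasks. The following is a minimal sketch of that distinction only, not the report's training code: it builds the order in which batches would be visited under each regime. The task names, batch counts, and function names are hypothetical, and uniform shuffling is just one possible way to mix tasks.

```python
import random
from typing import Dict, List, Tuple

# A scheduled training step: (task name, batch index within that task).
Batch = Tuple[str, int]


def sequential_schedule(task_batches: Dict[str, int]) -> List[Batch]:
    """Visit every batch of one task before moving on to the next task."""
    return [(task, i) for task, n in task_batches.items() for i in range(n)]


def simultaneous_schedule(task_batches: Dict[str, int], seed: int = 0) -> List[Batch]:
    """Visit the same batches, but mixed across tasks in one shuffled pass."""
    schedule = sequential_schedule(task_batches)
    random.Random(seed).shuffle(schedule)  # assumption: uniform mixing
    return schedule


# Hypothetical intermediate tasks and batch counts, for illustration only.
tasks = {"nli": 3, "qa": 2}
seq = sequential_schedule(tasks)
sim = simultaneous_schedule(tasks)
```

Both schedules cover exactly the same batches; only the ordering differs, which is the variable the research goal asks about.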