Scaling Passage Representations with Cross-Lingual Query Generation for Multilingual PLM Alignment in MMMU Benchmark

Data sources: ZENODO

Publication type: Report; Status: Under curation; Language: English; Publisher: Zenodo

Authors: Assignee Research

doi: 10.5281/zenodo.20709705

Abstract

Effective cross-lingual dense retrieval methods that rely on multilingual pre-trained language models (PLMs) need to be trained on both the relevance matching task and the cross-language alignment task. However, cross-lingual training data is often scarce. In this paper, rather than using more cross-lingual data for training, we propose to use cross-lingual query generation to augment passage representations with queries in languages other than the original passage language. These augmented representations are used at inference time so that the representation can enco

Research goal: How does the scaling of passage representations with cross-lingual query generation influence the alignment performance of multilingual PLMs, as evaluated by the MMMU benchmark for multimodal multilingual understanding?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 9.2/10.
