Neural Machine Translation with Phrase-Level Universal Visual Representations

Publication type: Article, Preprint, Conference object
Date: 01 Jan 2022
Embargo end date: 01 Jan 2022
Publisher: Association for Computational Linguistics (ACL)
Journal: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Authors: Qingkai Fang; Yang Feng

doi: 10.18653/v1/2022.acl-long.390, 10.48550/arxiv.2203.10299

arXiv: 2203.10299


Abstract

Multimodal machine translation (MMT) aims to improve neural machine translation (NMT) with additional visual information, but most existing MMT methods require paired input of a source sentence and an image, which makes them suffer from a shortage of sentence-image pairs. In this paper, we propose a phrase-level retrieval-based method for MMT that obtains visual information for the source input from existing sentence-image datasets, so that MMT is no longer limited to paired sentence-image input. Our method performs retrieval at the phrase level and hence learns visual information from pairs of source phrases and grounded regions, which can mitigate data sparsity. Furthermore, our method employs a conditional variational auto-encoder to learn visual representations that filter out redundant visual information and retain only the visual information related to the phrase. Experiments show that the proposed method significantly outperforms strong baselines on multiple MMT datasets, especially when the textual context is limited.
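As a rough illustration of the phrase-level retrieval idea described in the abstract, the following minimal sketch looks up visual features for each phrase of a source sentence in an index built from (phrase, grounded region) pairs. It is not the authors' implementation: the function names (`build_phrase_index`, `retrieve_visual_context`), the n-gram matching, the exact-match lookup, and the mean aggregation are all assumptions made for illustration, and the conditional variational auto-encoder step is omitted.

```python
from collections import defaultdict


def build_phrase_index(pairs):
    """Group region feature vectors by the phrase they are grounded to.

    `pairs` is an iterable of (phrase, region_vector) tuples, assumed to come
    from an existing sentence-image dataset with phrase grounding.
    """
    index = defaultdict(list)
    for phrase, region_vec in pairs:
        index[phrase.lower()].append(region_vec)
    return index


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def retrieve_visual_context(sentence, index, max_len=3):
    """Map each indexed n-gram in `sentence` to an averaged region vector.

    No paired image is needed at this point: the visual information is
    retrieved purely from the phrase index (hypothetical exact matching).
    """
    tokens = sentence.lower().split()
    found = {}
    for n in range(max_len, 0, -1):  # try longer phrases first
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in index and phrase not in found:
                found[phrase] = mean_vector(index[phrase])
    return found
```

Because each phrase is matched independently, a phrase seen in any training image can contribute visual information to a new sentence, which is the intuition behind the abstract's claim that phrase-level retrieval mitigates data sparsity.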

ACL 2022 main conference

Related Organizations

- University of Chinese Academy of Sciences, China (People's Republic of)
- Chinese Academy of Sciences, China (People's Republic of)
- Institute of Computing Technology, China (People's Republic of)
- Institute of Computing Technology, Chinese Academy of Sciences, China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, I.2.7, Computation and Language (cs.CL)

Related research products (3)

- dataset: software on GitHub (IsRelatedTo)
- fairseq: software on GitHub (IsRelatedTo)
- sacrebleu: software on GitHub (IsRelatedTo)

Impact by BIP!

- selected citations: 17. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.