Publication: Article, Preprint | 18 Jul 2021 | Embargo end date: 01 Jan 2021 | Publisher: IEEE | Journal: 2021 International Joint Conference on Neural Networks (IJCNN)

Authors: Michael Zeng; Linjun Shou; Ming Gong; Hong Qu; Yu Shi; Junwei Liao;

doi: 10.1109/ijcnn52387.2021.9534401 , 10.48550/arxiv.2102.06578

arXiv: http://arxiv.org/abs/2102.06578

Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders

Abstract

Recently, universal neural machine translation (NMT) with a shared encoder-decoder has achieved good performance on zero-shot translation. Unlike universal NMT, jointly trained language-specific encoders-decoders aim to achieve a universal representation across non-shared modules, each of which serves one language or language family. The non-shared architecture has the advantage of mitigating internal language competition, especially when the shared vocabulary and model parameters are restricted in size. However, the zero-shot translation performance of multiple encoders and decoders still lags behind universal NMT. In this work, we study zero-shot translation using language-specific encoders-decoders. We propose to generalize the non-shared architecture and universal NMT by differentiating the Transformer layers into language-specific and interlingua layers. By selectively sharing parameters and applying cross-attentions, we explore maximizing the universality of the representation and realizing the best alignment of language-agnostic information. We also introduce a denoising auto-encoding (DAE) objective to jointly train the model with the translation task in a multi-task manner. Experiments on two public multilingual parallel datasets show that our proposed model achieves competitive or better results than universal NMT and a strong pivot baseline. Moreover, we experiment with incrementally adding a new language to the trained model by updating only the new model parameters. With this little effort, zero-shot translation between the newly added language and the existing languages achieves results comparable to those of a model trained jointly from scratch on all languages.
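
The parameter-sharing idea in the abstract can be illustrated with a small, framework-free sketch. It is not the authors' implementation: the class names (`Layer`, `MultilingualEncoder`), the layer counts, and the ordering (language-specific layers first, shared interlingua layers on top) are assumptions made for illustration, and cross-attention and the DAE objective are left out. It only shows that every language reuses the same interlingua layer objects, and that adding a language while freezing the existing parameters leaves only the new language's layers trainable.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Layer:
    """Stand-in for one Transformer layer; tracks only a name and trainability."""
    name: str
    trainable: bool = True


@dataclass
class MultilingualEncoder:
    """Hypothetical container: per-language layers plus one shared interlingua stack."""
    n_specific: int
    n_shared: int
    shared: List[Layer] = field(init=False)
    specific: Dict[str, List[Layer]] = field(init=False, default_factory=dict)

    def __post_init__(self) -> None:
        # Interlingua layers are created once and reused by every language.
        self.shared = [Layer(f"interlingua.{i}") for i in range(self.n_shared)]

    def all_layers(self) -> List[Layer]:
        layers = list(self.shared)
        for stack in self.specific.values():
            layers.extend(stack)
        return layers

    def add_language(self, lang: str, freeze_existing: bool = False) -> None:
        if lang in self.specific:
            raise ValueError(f"language already registered: {lang}")
        if freeze_existing:
            # Incremental setting: existing parameters are no longer updated.
            for layer in self.all_layers():
                layer.trainable = False
        self.specific[lang] = [Layer(f"{lang}.{i}") for i in range(self.n_specific)]

    def stack(self, lang: str) -> List[Layer]:
        # Assumed ordering: language-specific layers, then shared layers.
        return self.specific[lang] + self.shared

    def trainable_names(self) -> List[str]:
        return [layer.name for layer in self.all_layers() if layer.trainable]


enc = MultilingualEncoder(n_specific=2, n_shared=1)
enc.add_language("en")
enc.add_language("de")
enc.add_language("fr", freeze_existing=True)  # add a new language incrementally

print([layer.name for layer in enc.stack("fr")])  # ['fr.0', 'fr.1', 'interlingua.0']
print(enc.trainable_names())                      # ['fr.0', 'fr.1']
```

In a real model each `Layer` would hold Transformer weights, and freezing would mean excluding those weights from the optimizer rather than flipping a flag.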

Related Organizations

University of Electronic Science and Technology of China
Department of Computer Sciences
Austria
Microsoft (United States)
United States
University of Electronic Science and Technology of China
China (People's Republic of)
Department of Computer Science
Spain

Keywords

FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)

1 Research products, page 1 of 1

transformers software on GitHub
IsRelatedTo

Impact by BIP!

- selected citations: 5. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
