publication . Preprint . 2016

Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder

Ha, Thanh-Le; Niehues, Jan; Waibel, Alexander;
Open Access English
  • Published: 15 Nov 2016
Abstract
In this paper, we present our first attempts at building a multilingual Neural Machine Translation framework under a unified approach. We are then able to employ attention-based NMT for many-to-many multilingual translation tasks. Our approach does not require any special treatment of the network architecture, and it allows us to learn a minimal number of free parameters in a standard way of training. Our approach has shown its effectiveness in an under-resourced translation scenario, with considerable improvements of up to 2.6 BLEU points. In addition, the approach has achieved interesting and promising results when applied in the translation task that there is no dir...
Subjects
free text keywords: Computer Science - Computation and Language
18 references, page 1 of 2

[Bahdanau et al.2014] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. CoRR, abs/1409.0473.

[Bojar et al.2016] Ondrej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, et al. 2016. Findings of the 2016 Conference on Machine Translation (WMT16). In Proceedings of the First Conference on Machine Translation (WMT16), pages 12-58, Berlin, Germany. Association for Computational Linguistics.

[Cettolo et al.2012] Mauro Cettolo, Christian Girardi, and Marcello Federico. 2012. Wit3: Web inventory of transcribed and translated talks. In Proceedings of the 16th Conference of the European Association for Machine Translation (EAMT), pages 261-268, Trento, Italy, May.

[Cettolo et al.2015] M Cettolo, J Niehues, S Stüker, L Bentivogli, R Cattoni, and M Federico. 2015. The IWSLT 2015 Evaluation Campaign. In Proceedings of the 12th International Workshop on Spoken Language Translation (IWSLT 2015), Danang, Vietnam.

[Cho et al.2014] Kyunghyun Cho, Bart van Merrienboer, Çağlar Gülçehre, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In Proceedings of Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8), Baltimore, MD, USA, Jule. Association for Computational Linguistics.

[Dong et al.2015] Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang. 2015. Multi-Task Learning for Multiple Language Translation. In Proceedings of ACL-IJCNLP 2015, pages 1723-1732, Beijing, China, July. Association for Computational Linguistics.

[Firat et al.2016] Orhan Firat, KyungHyun Cho, and Yoshua Bengio. 2016. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism. CoRR, abs/1601.01073.

[Gülçehre et al.2015] Çağlar Gülçehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loïc Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2015. On Using Monolingual Corpora in Neural Machine Translation. CoRR, abs/1503.03535.

[Hochreiter and Schmidhuber1997] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput., 9(8):1735-1780, November.

[Luong et al.2015a] Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015a. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 15), pages 1412-1421, Lisbon, Portugal, September. Association for Computational Linguistics.

[Luong et al.2015b] Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. 2015b. Addressing the rare word problem in neural machine translation. In Proceedings of ACL-IJCNLP 2015, pages 11-19, Beijing, China, July. Association for Computational Linguistics.

[Luong et al.2016] Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. 2016. Multi-task sequence to sequence learning. In International Conference on Learning Representations (ICLR), San Juan, Puerto Rico, May.

[Maaten and Hinton2008] Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579-2605.

[Papineni et al.2002] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL 2002), pages 311-318. Association for Computational Linguistics.

[Sennrich et al.2016a] Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016a. Improving Neural Machine Translation Models with Monolingual Data. In Association for Computational Linguistics (ACL 2016), Berlin, Germany, August.
