publication . Other literature type . Conference object . Preprint . 2019

How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems

Ngo, Thi-Vinh; Ha, Thanh-Le; Nguyen, Phuong-Thai; Nguyen, Le-Minh
Open Access English
  • Published: 02 Nov 2019
  • Publisher: Zenodo
Abstract
For translation between East Asian languages, many works have found clear advantages in using characters as the translation unit. Unfortunately, the architectural limitations of traditional recurrent neural machine translation systems hinder the practical use of such character-based systems: they handle extremely long sequences poorly and are highly restricted in parallelizing computation. In this paper, we demonstrate that the new transformer architecture can perform character-based translation better than the recurrent one. We conduct experiments on a low-resource language pair: Japanese-Vietnamese. Our models considerab...
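As a rough illustration of why character units produce the long sequences the abstract mentions, the sketch below compares whitespace-word and character segmentation of one sentence per language. The example sentences, the function names, and the visible space marker are illustrative assumptions, not taken from the paper or its toolkit.

```python
import unicodedata


def word_tokens(sentence: str) -> list[str]:
    """Split on whitespace (only meaningful for space-delimited text)."""
    return sentence.split()


def char_tokens(sentence: str, space_symbol: str = "_") -> list[str]:
    """Split into single characters, replacing spaces with a visible marker.

    NFC normalization keeps accented Vietnamese letters as one code point each.
    """
    text = unicodedata.normalize("NFC", sentence.strip())
    return [space_symbol if ch == " " else ch for ch in text]


# Hypothetical example sentences ("I am a student").
vietnamese = "Tôi là sinh viên"
japanese = "私は学生です"

vi_words = word_tokens(vietnamese)
vi_chars = char_tokens(vietnamese)
ja_chars = char_tokens(japanese)

print(len(vi_words), len(vi_chars), len(ja_chars))
```

Under these assumptions, the Vietnamese sentence grows from 4 word tokens to 16 character tokens, which is the kind of length increase that makes step-by-step recurrent processing costly. Japanese has no spaces, so character segmentation needs no external word segmenter.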
Subjects
free text keywords: Computer Science - Computation and Language

