
Generic and Specialized Word Embeddings for Multi-Domain Machine Translation

Pham, MinhQuang; Crego, Josep; Yvon, François; Senellart, Jean
Open Access
  • Published: 02 Nov 2019
  • Publisher: Zenodo
  • Country: France
Abstract
Supervised machine translation works well when the training and test data are sampled from the same distribution. When this is not the case, adaptation techniques help ensure that the knowledge learned from out-of-domain texts generalises to in-domain sentences. We study here a related setting, multi-domain adaptation, where the number of domains is potentially large and adapting separately to each domain would waste training resources. Our proposal transposes to neural machine translation the feature expansion technique of (Daumé III, 2007): it isolates domain-agnostic from domain-specific lexical representations, while sharing the most o...
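The abstract stops short of the details, but the core idea follows Daumé III's "frustratingly easy" feature augmentation: every input gets a generic representation shared across all domains plus a representation drawn from a domain-specific copy. A minimal sketch of how this could be transposed to an NMT embedding layer is given below; it is an illustration, not the authors' implementation, and the class name, dimensions, and the choice of concatenation over summation are all assumptions.

```python
# Illustrative sketch (not the paper's code) of Daumé III (2007)-style
# feature augmentation applied to word embeddings: each token receives a
# domain-agnostic embedding shared across domains, concatenated with an
# embedding from a table specific to the sentence's domain.
import torch
import torch.nn as nn

class DomainAugmentedEmbedding(nn.Module):
    def __init__(self, vocab_size, n_domains, generic_dim, domain_dim):
        super().__init__()
        # One table shared by every domain (the "generic" representation).
        self.generic = nn.Embedding(vocab_size, generic_dim)
        # One table per domain (the "specialized" representations).
        self.specific = nn.ModuleList(
            [nn.Embedding(vocab_size, domain_dim) for _ in range(n_domains)]
        )

    def forward(self, token_ids, domain_id):
        # token_ids: (batch, seq_len); domain_id: domain tag for the batch.
        g = self.generic(token_ids)              # domain-agnostic part
        s = self.specific[domain_id](token_ids)  # domain-specific part
        return torch.cat([g, s], dim=-1)         # augmented embedding

# Usage: embed a batch of token ids tagged as coming from domain 2.
emb = DomainAugmentedEmbedding(vocab_size=32000, n_domains=4,
                               generic_dim=384, domain_dim=128)
vectors = emb(torch.randint(0, 32000, (8, 20)), domain_id=2)
print(vectors.shape)  # torch.Size([8, 20, 512])
```

In a full NMT system a module like this would replace the standard embedding lookup, with the domain tag supplied per sentence; one plausible fallback for an unseen domain is to rely on the generic part alone.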
Subjects
free text keywords: Machine Translation, Domain Adaptation, [INFO]Computer Science [cs], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
References (first 15 of 44)

[1] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, “On the properties of neural machine translation: Encoder-decoder approaches,” in Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, Doha, Qatar, October 2014, pp. 103-111. [Online]. Available: http://www.aclweb.org/anthology/W14-4012

[2] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in Proceedings of the International Conference on Learning Representations, ser. ICLR, San Diego, CA, 2015. [Online]. Available: https://arxiv.org/pdf/1409.0473.pdf

[3] J. Gehring, M. Auli, D. Grangier, D. Yarats, and Y. N. Dauphin, “Convolutional sequence to sequence learning,” in Proceedings of the 34th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, D. Precup and Y. W. Teh, Eds., vol. 70, International Convention Centre, Sydney, Australia, 2017, pp. 1243-1252. [Online]. Available: http://proceedings.mlr.press/v70/gehring17a.html

[4] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds. Curran Associates, Inc., 2017, pp. 5998-6008. [Online]. Available: http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf

[5] O. Firat, K. Cho, and Y. Bengio, “Multi-way, multilingual neural machine translation with a shared attention mechanism,” in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2016, pp. 866-875. [Online]. Available: http://www.aclweb.org/anthology/N16-1101

[6] T.-L. Ha, J. Niehues, and A. Waibel, “Toward multilingual neural machine translation with universal encoder and decoder,” in Proceedings of the International Workshop on Spoken Language Translation. Vancouver, Canada: IWSLT, 2016.

[7] M. Johnson, M. Schuster, Q. Le, M. Krikun, Y. Wu, Z. Chen, N. Thorat, F. Viégas, M. Wattenberg, G. Corrado, M. Hughes, and J. Dean, “Google's multilingual neural machine translation system: Enabling zero-shot translation,” Transactions of the Association for Computational Linguistics, vol. 5, pp. 339-351, 2017. [Online]. Available: https://transacl.org/ojs/index.php/tacl/article/view/1081

[8] G. Foster and R. Kuhn, “Mixture-model adaptation for SMT,” in Proceedings of the Second Workshop on Statistical Machine Translation, Prague, Czech Republic, 2007, pp. 128-135. [Online]. Available: http://www.aclweb.org/anthology/W/W07/W07-0717

[9] A. Axelrod, X. He, and J. Gao, “Domain adaptation via pseudo in-domain data selection,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, ser. EMNLP '11, Edinburgh, United Kingdom, 2011, pp. 355-362. [Online]. Available: http://dl.acm.org/citation.cfm?id=2145432.2145474

[10] C. Chu and R. Wang, “A survey of domain adaptation for neural machine translation,” in Proceedings of the 27th International Conference on Computational Linguistics, ser. COLING 2018, Santa Fe, New Mexico, USA, 2018, pp. 1304-1319. [Online]. Available: http://aclweb.org/anthology/C18-1111

[11] R. Sennrich, H. Schwenk, and W. Aransa, “A multi-domain translation model framework for statistical machine translation,” in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Sofia, Bulgaria: Association for Computational Linguistics, Aug. 2013, pp. 832-840. [Online]. Available: https://www.aclweb.org/anthology/P13-1082

[12] M. A. Farajian, M. Turchi, M. Negri, and M. Federico, “Multi-domain neural machine translation through unsupervised adaptation,” in Proceedings of the Second Conference on Machine Translation, Copenhagen, Denmark, Sept. 2017, pp. 127-137. [Online]. Available: https://www.aclweb.org/anthology/W17-4713

[13] R. Caruana, “Multitask learning,” Mach. Learn., vol. 28, no. 1, pp. 41-75, July 1997. [Online]. Available: https://doi.org/10.1023/A:1007379606734

[14] H. Daumé III, “Frustratingly easy domain adaptation,” in Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, 2007, pp. 256-263. [Online]. Available: http://aclweb.org/anthology/P07-1033

[15] P. Koehn, “Europarl: A parallel corpus for Statistical Machine Translation,” in 2nd Workshop on EBMT of MT-Summit X, Phuket, Thailand, 2005, pp. 79-86.
