Chinese–Portuguese machine translation: a study on building parallel corpora from comparable texts

Conference object, Preprint English OPEN
Liu, Siyou; Wang, Longyue; Liu, Chao-Hong
(2018)
  • Publisher: European Language Resource Association
  • Subject: Computer Science - Computation and Language | Chinese–Portuguese; Low-Resource; Statistical Machine Translation; Neural Machine Translation; Parallel Corpus | Machine learning
    acm: ComputingMethodologies_DOCUMENTANDTEXTPROCESSING | ComputingMethodologies_ARTIFICIALINTELLIGENCE

Although there are increasing and significant ties between China and Portuguese-speaking countries, there are few parallel corpora for the Chinese-Portuguese language pair. Both languages have large speaker populations, with 1.2 billion native Chinese speakers and 279 million nat...