publication . Conference object . Contribution for newspaper or weekly magazine . 2015

A joint dependency model of morphological and syntactic structure for statistical machine translation

Rico Sennrich; Barry Haddow
Open Access
  • Published: 01 Jan 2015
Abstract
When translating between two languages that differ in their degree of morphological synthesis, syntactic structures in one language may be realized as morphological structures in the other, and SMT models need a mechanism to learn such translations. Prior work has used morpheme splitting with flat representations that do not encode the hierarchical structure between morphemes, but this structure is relevant for learning morphosyntactic constraints and selectional preferences. We propose to model syntactic and morphological structure jointly in a dependency translation model, allowing the system to generalize to the level of morphemes. We present a dependency rep...
Subjects
free text keywords: Machine translation, computer.software_genre, computer, Natural language processing, Speech recognition, Rule-based machine translation, German, language.human_language, language, Computer science, Syntactic structure, Morpheme, ENCODE, Syntax, Artificial intelligence, business.industry, business
Funded by
EC| QT21
Project
QT21
QT21: Quality Translation 21
  • Funder: European Commission (EC)
  • Project Code: 645452
  • Funding stream: H2020 | RIA
Validated by funder
EC| HimL
Project
HimL
Health in my Language
  • Funder: European Commission (EC)
  • Project Code: 644402
  • Funding stream: H2020 | IA
SNSF| Smarter Model Learning in Syntax-based Statistical Machine Translation
Project
  • Funder: Swiss National Science Foundation (SNSF)
  • Project Code: P2ZHP1_148717
  • Funding stream: Careers;Fellowships | Early Postdoc.Mobility
EC| TraMOOC
Project
TraMOOC
Translation for Massive Open Online Courses
  • Funder: European Commission (EC)
  • Project Code: 644333
  • Funding stream: H2020 | IA
Communities
Digital Humanities and Cultural Heritage
25 references, page 1 of 2

Colin Cherry and George Foster. 2012. Batch Tuning Strategies for Statistical Machine Translation. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT '12, pages 427-436, Montreal, Canada. Association for Computational Linguistics.

Footnote 9: We released source code and configuration files at https://github.com/rsennrich/wmt2014-scripts.

Jonathan H. Clark, Chris Dyer, Alon Lavie, and Noah A. Smith. 2011. Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 176-181, Portland, Oregon. Association for Computational Linguistics.

Killian A. Foth. 2005. Eine umfassende Constraint-Dependenz-Grammatik des Deutschen. University of Hamburg, Hamburg.

Alexander Fraser, Marion Weller, Aoife Cahill, and Fabienne Cap. 2012. Modeling Inflection and Word-Formation in SMT. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 664-674, Avignon, France. Association for Computational Linguistics.

Fabienne Fritzinger and Alexander Fraser. 2010. How to Avoid Burning Ducks: Combining Linguistic Analysis and Corpus Statistics for German Compound Processing. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, WMT '10, pages 224-234, Uppsala, Sweden. Association for Computational Linguistics.

Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pages 961-968, Sydney, Australia. Association for Computational Linguistics.

Matthias Huck, Hieu Hoang, and Philipp Koehn. 2014. Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation. In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, pages 148-156, Doha, Qatar. Association for Computational Linguistics.

Sébastien Jean, Orhan Firat, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2015. Montreal Neural Machine Translation Systems for WMT'15. In Proceedings of the Tenth Workshop on Statistical Machine Translation.

Philipp Koehn and Kevin Knight. 2003. Empirical Methods for Compound Splitting. In EACL '03: Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics, pages 187-193, Budapest, Hungary. Association for Computational Linguistics.

Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the ACL-2007 Demo and Poster Sessions, pages 177-180, Prague, Czech Republic.

Klaus Macherey, Andrew Dai, David Talbot, Ashok Popat, and Franz Och. 2011. Language-independent compound splitting with morphological operations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 1395-1404, Portland, Oregon, USA. Association for Computational Linguistics.

Sonja Nießen and Hermann Ney. 2000. Improving SMT quality with morpho-syntactic analysis. In 18th Int. Conf. on Computational Linguistics, pages 1081-1085.

Sonja Nießen and Hermann Ney. 2001. Morphosyntactic analysis for Reordering in Statistical Machine Translation. In Machine Translation Summit, pages 247-252, Santiago de Compostela, Spain.

Maja Popovic, Daniel Stein, and Hermann Ney. 2006. Statistical Machine Translation of German Compound Words. In Advances in Natural Language Processing, 5th International Conference on NLP, FinTAL 2006, pages 616-624, Turku, Finland.
