publication . Conference object . 2017

Discovering Light Verb Constructions and their Translations from Parallel Corpora without Word Alignment

Vargas, Natalie; Ramisch, Carlos; Caseli, Helena
Open Access English
  • Published: 01 Jan 2017
  • Publisher: HAL CCSD
Abstract
We propose a method for joint unsupervised discovery of multiword expressions (MWEs) and their translations from parallel corpora. First, we apply independent monolingual MWE extraction in source and target languages simultaneously. Then, we calculate translation probability, association score and distributional similarity of co-occurring pairs. Finally, we rank all translations of a given MWE using a linear combination of these features. Preliminary experiments on light verb constructions show promising results.
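The ranking step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how each feature is estimated, so the relative-frequency translation probability, the pointwise mutual information association score, the cosine similarity, the equal weights, and all function names and toy counts below are hypothetical choices.

```python
import math


def translation_probability(pair_count, source_count):
    # Assumed estimator: relative frequency of the target candidate among
    # the aligned sentence pairs that contain the source MWE.
    return pair_count / source_count if source_count else 0.0


def pmi(pair_count, source_count, target_count, total):
    # Assumed association score: pointwise mutual information computed
    # over aligned sentence pairs.
    if min(pair_count, source_count, target_count) == 0:
        return 0.0
    return math.log2(pair_count * total / (source_count * target_count))


def cosine(u, v):
    # Assumed distributional similarity: cosine between two vectors that
    # are taken to live in a shared bilingual space.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_translations(candidates, weights=(1.0, 1.0, 1.0)):
    # candidates maps each target MWE to its three feature values:
    # (translation probability, association score, similarity).
    # The score is a linear combination of the features.
    w_p, w_a, w_s = weights
    scored = {t: w_p * p + w_a * a + w_s * s
              for t, (p, a, s) in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Toy, invented counts for one source MWE and two co-occurring candidates.
total = 1000                      # aligned sentence pairs in the corpus
source_count = 20                 # pairs containing the source MWE
source_vec = [1.0, 0.0, 1.0]
toy = {
    # target: (co-occurrence count, target count, target vector)
    "candidate_a": (15, 30, [1.0, 0.0, 1.0]),
    "candidate_b": (5, 200, [0.0, 1.0, 0.0]),
}
candidates = {
    t: (translation_probability(c, source_count),
        pmi(c, source_count, tc, total),
        cosine(source_vec, vec))
    for t, (c, tc, vec) in toy.items()
}
ranking = rank_translations(candidates)
```

In this sketch the three features are on different scales, so in practice the weights (or a normalisation step) would need tuning; the equal weights here are only a placeholder.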
Subjects
free text keywords: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]