publication . Preprint . 2018

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks

Zhelezniak, Vitalii; Busbridge, Dan; Shen, April; Smith, Samuel L.; Hammerla, Nils Y.
Open Access English
Published: 09 May 2018
Abstract
Experimental evidence indicates that simple models outperform complex deep networks on many unsupervised similarity tasks. We provide a simple yet rigorous explanation for this behaviour by introducing the concept of an optimal representation space, in which semantically close symbols are mapped to representations that are close under a similarity measure induced by the model's objective function. In addition, we present a straightforward procedure that, without any retraining or architectural modifications, allows deep recurrent models to perform equally well (and sometimes better) when compared to shallow models. To validate our analysis, we conduct a set of c...
Subjects
free text keywords: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Learning