

Identification of Adjective-Noun Neologisms using Pretrained Language Models
doi: 10.18653/v1/w19-5116
Neologism detection is a key task in the construction of lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention. In this paper, we show that the distinction between compositional and non-compositional adjective-noun pairs can be identified effectively by using pretrained language models and comparing their representations with individual word embeddings. Our results show that these models significantly improve over baseline linguistic features; however, combining them with linguistic features improves the results further, suggesting the strength of a hybrid approach.
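As a rough illustration of the comparison the abstract describes, the sketch below scores an adjective-noun pair by the cosine similarity between a vector for the whole phrase and a vector composed from the two word embeddings. This is a minimal sketch under stated assumptions, not the authors' pipeline: the additive composition, the function names `cosine` and `compositionality_score`, and the toy three-dimensional vectors are all hypothetical. In practice the phrase vector would come from a pretrained language model and the word vectors from a word-embedding model, and such a score would be one feature among others rather than a classifier on its own.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(phrase_vec, adj_vec, noun_vec):
    """Similarity between the phrase vector and the sum of its word vectors.

    Assumption: additive composition stands in for whatever composition
    the real system uses. A high score suggests a compositional pair,
    a low score a candidate non-compositional (possibly neologistic) pair.
    """
    composed = [a + n for a, n in zip(adj_vec, noun_vec)]
    return cosine(phrase_vec, composed)


# Toy vectors, invented for illustration only (not real embeddings).
TOY_VECTORS = {
    "red": [1.0, 0.0, 0.0],
    "car": [0.0, 1.0, 0.0],
    "red car": [1.0, 1.0, 0.0],  # phrase lies close to red + car
    "hot": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "hot dog": [0.0, 0.0, 1.0],  # phrase lies far from hot + dog
}

if __name__ == "__main__":
    for adj, noun in [("red", "car"), ("hot", "dog")]:
        phrase = f"{adj} {noun}"
        score = compositionality_score(
            TOY_VECTORS[phrase], TOY_VECTORS[adj], TOY_VECTORS[noun]
        )
        print(f"{phrase}: {score:.2f}")
```

With these invented vectors, "red car" scores close to 1 and "hot dog" scores 0, which mirrors the intuition that a non-compositional phrase drifts away from the meaning of its parts.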
Microsoft Academic Graph classification: business.industry; Computer science; computer.software_genre; Task (project management); Identification (information); Noun; Artificial intelligence; Language model; business; Adjective; computer; Neologism; Natural language processing; Word (computer architecture)
Citations: 2; popularity: Average; influence: Average; impulse: Average; views: 61; downloads: 75



- Funder: European Commission (EC)
- Project Code: 731015
- Funding stream: H2020 | RIA
- Funder: European Commission (EC)
- Project Code: 825182
- Funding stream: H2020 | RIA