LMMS Wordnet Embeddings for SemCor corpus

This dataset contains word vectors generated by training the LMMS (Language Modelling Makes Sense, ACL 2019) model, as adapted by rdenaux, on the whole training set of SemCor. The main modifications include:

* support for the transformers backend
** this makes it possible to experiment with transformer architectures other than BERT, e.g. XLNet, XLM, RoBERTa
** optimised training, since sequences no longer have to be padded to 512 wordpiece tokens
* a new SentenceEncoder, an experimental generalisation of bert-as-service-like encoding services built on the transformers backend
** allows extracting various types of embeddings from a single execution of a batch of sequences
* rolling cosine similarity metrics during the training phase

The original repository includes the code to replicate the experiments in the "Language Modelling Makes Sense (ACL 2019)" paper. The project is designed to be modular so that others can easily modify or reuse the portions that are relevant to them. It is composed of a series of scripts that, when run in sequence, produce most of the work described in the paper (for simplicity, we've focused this release on BERT; let us know if you need ELMo). The code is available here.
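Two of the modifications mentioned above, padding only as far as needed instead of to 512 wordpiece tokens and tracking a rolling cosine similarity during training, can be illustrated with a minimal sketch. This is not the repository's code: the names `pad_batch`, `cosine_similarity` and `RollingCosine`, the padding id of 0 and the window size are all assumptions made for illustration, and the description does not say which pairs of vectors the metric compares, so the caller supplies any pair here.

```python
import math
from collections import deque


def pad_batch(token_id_seqs, pad_id=0):
    """Pad each sequence to the longest one in the batch (not to a fixed 512).

    Returns the padded sequences and a 1/0 attention mask of the same shape.
    `pad_id=0` is an assumed padding id, not taken from the original code.
    """
    max_len = max(len(seq) for seq in token_id_seqs)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in token_id_seqs]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in token_id_seqs]
    return padded, mask


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class RollingCosine:
    """Mean cosine similarity over the most recent `window` updates (hypothetical helper)."""

    def __init__(self, window=100):
        # A bounded deque drops the oldest value once the window is full.
        self.values = deque(maxlen=window)

    def update(self, u, v):
        self.values.append(cosine_similarity(u, v))
        return sum(self.values) / len(self.values)
```

With a batch of short sentences, `pad_batch` produces rows only as long as the longest sentence in that batch, which is where the saving over fixed 512-token padding comes from. `RollingCosine.update` could be called once per training step to get a smoothed similarity value for logging.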

Funding: ELG (EU H2020 project, grant number: 825627) and Co-Inform (EU H2020 project, grant number: 770302)
