Powered by OpenAIRE graph

This Research product is the result of merged Research products in OpenAIRE.

A Weakly supervised word sense disambiguation for Polish using rich lexical resources

Authors: Arkadiusz Janz; Maciej Piasecki;

Abstract

Automatic word sense disambiguation (WSD) has proven to be an important technique in many natural language processing tasks. For many years the problem of sense disambiguation has been approached with a wide range of methods; however, it remains a challenging problem, especially in the unsupervised setting. Knowledge-based methods that leverage lexical knowledge resources such as wordnets are among the well-known and successful approaches to WSD. As knowledge-based approaches mostly do not use any labelled training data, their performance strongly relies on the structure and quality of the knowledge sources used. However, a pure knowledge base such as a wordnet cannot reflect all the semantic knowledge necessary to correctly disambiguate word senses in text. In this paper we explore various expansions to plWordNet as knowledge bases for WSD. Semantic links extracted from a large valency lexicon (Walenty), glosses and usage examples, Wikipedia articles and the SUMO ontology are combined with plWordNet and tested in a PageRank-based WSD algorithm. We also analyse the influence of lexical semantics vector models extracted with the help of distributional semantics methods. Several new Polish test data sets for WSD are also introduced. All the resources, methods and tools are available on open licences.
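
To make the idea of a PageRank-based WSD algorithm over a knowledge base concrete, here is a minimal sketch of personalized PageRank on a toy sense graph. It is not the authors' implementation: the graph, the sense labels (such as `zamek#lock`), the function names and the parameter values are all hypothetical, and a real system would run over plWordNet with the additional links described above.

```python
def personalized_pagerank(edges, restart, damping=0.85, iterations=50):
    """Power iteration for PageRank with a restart (personalization) vector."""
    neighbours = {}
    for a, b in edges:  # semantic links are treated as undirected here
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    nodes = set(neighbours) | set(restart)
    total = sum(restart.values())
    restart = {n: restart.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            linked = neighbours.get(n, ())
            if linked:
                share = damping * rank[n] / len(linked)
                for m in linked:
                    nxt[m] += share
            else:
                # Isolated node: hand its mass back to the restart distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank


def disambiguate(target, context, senses, edges):
    """Pick the sense of `target` ranked highest when restarting from context senses."""
    restart = {s: 1.0 for word in context for s in senses[word]}
    rank = personalized_pagerank(edges, restart)
    return max(senses[target], key=lambda s: rank.get(s, 0.0)), rank


# Hypothetical toy data: Polish "zamek" can mean a castle or a lock.
senses = {
    "zamek": ["zamek#castle", "zamek#lock"],
    "klucz": ["klucz#key"],
    "drzwi": ["drzwi#door"],
}
edges = [
    ("zamek#lock", "drzwi#door"),
    ("zamek#lock", "klucz#key"),
    ("klucz#key", "drzwi#door"),
    ("drzwi#door", "budynek#building"),
    ("zamek#castle", "budynek#building"),
    ("zamek#castle", "mur#wall"),
]

best, rank = disambiguate("zamek", ["klucz", "drzwi"], senses, edges)
print(best)
```

Adding links from other resources (for example a valency lexicon or glosses) amounts to extending `edges` in this sketch, which changes how the restart mass spreads through the graph and therefore which sense is ranked highest.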

  • Impact by BIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    7
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Top 10%
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average