Knowledge-based Word Sense Disambiguation using Topic Models

Publication: Article, Preprint (27 Apr 2018)
Embargo end date: 01 Jan 2018
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)
Journal: Proceedings of the AAAI Conference on Artificial Intelligence, volume 32 (ISSN: 2159-5399, eISSN: 2374-3468)

Authors: Chaplot, Devendra Singh; Salakhutdinov, Ruslan

DOI: 10.1609/aaai.v32i1.12027, 10.48550/arxiv.1801.01900

arXiv: 1801.01900


Abstract

Word Sense Disambiguation (WSD) is an open problem in Natural Language Processing. It is particularly challenging, and particularly useful, in the unsupervised setting, where all the words in a given text must be disambiguated without any labeled data. WSD systems typically use the sentence or a small window of words around the target word as the context for disambiguation, because their computational complexity scales exponentially with the size of the context. In this paper, we leverage the formalism of topic models to design a WSD system that scales linearly with the number of words in the context. As a result, our system can use the whole document as the context for a word to be disambiguated. The proposed method is a variant of Latent Dirichlet Allocation in which the topic proportions for a document are replaced by synset proportions. We further use the information in WordNet by assigning a non-uniform prior to the synset distribution over words and a logistic-normal prior to the document distribution over synsets. We evaluate the proposed method on the Senseval-2, Senseval-3, SemEval-2007, SemEval-2013, and SemEval-2015 English All-Word WSD datasets and show that it outperforms the state-of-the-art unsupervised knowledge-based WSD system by a significant margin.
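The core idea in the abstract, replacing per-document topic proportions with synset proportions, can be illustrated with a small sketch. This is not the authors' implementation and it simplifies heavily: the sense inventory `TOY_SYNSETS` is a hypothetical stand-in for WordNet, a symmetric Dirichlet prior (`alpha`) is assumed in place of the logistic-normal prior, and each synset's distribution over words is fixed to uniform over its lemmas instead of being learned. The function name `disambiguate` and its parameters are likewise illustrative.

```python
import random
from collections import Counter

# Hypothetical toy sense inventory (NOT WordNet): synset id -> lemmas.
TOY_SYNSETS = {
    "financial_institution": {"bank", "depository"},
    "riverside": {"bank", "shore"},
    "loan": {"loan"},
}

def disambiguate(doc, synsets, alpha=0.1, sweeps=200, seed=0):
    """Gibbs-style sampler over per-word synset assignments.

    Assumptions (simplifications, not the paper's model):
    - symmetric Dirichlet prior `alpha` on document synset proportions;
    - fixed emission p(word | synset) = 1 / |lemmas(synset)|.
    Every word in `doc` must appear in at least one synset.
    """
    rng = random.Random(seed)
    # A word may only be assigned to synsets that list it as a lemma.
    candidates = {
        w: [s for s, lemmas in synsets.items() if w in lemmas]
        for w in set(doc)
    }
    assign = [rng.choice(candidates[w]) for w in doc]
    counts = Counter(assign)            # document-level synset counts
    votes = [Counter() for _ in doc]    # sampled senses per position

    for _ in range(sweeps):
        # One sweep visits each word once, so the cost is linear in
        # document length (times the number of candidate senses).
        for i, w in enumerate(doc):
            counts[assign[i]] -= 1      # remove the current assignment
            weights = [
                (counts[s] + alpha) / len(synsets[s])
                for s in candidates[w]
            ]
            assign[i] = rng.choices(candidates[w], weights=weights)[0]
            counts[assign[i]] += 1
            votes[i][assign[i]] += 1

    # Report the most frequently sampled synset for each position.
    return [v.most_common(1)[0][0] for v in votes]

if __name__ == "__main__":
    doc = ["bank", "depository", "depository", "depository"]
    print(list(zip(doc, disambiguate(doc, TOY_SYNSETS))))
```

In this toy document the unambiguous occurrences of "depository" raise the count of their synset across the whole document, which pulls the ambiguous "bank" toward the same sense. That is the sense in which the whole document, not a fixed window, serves as context.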

Related Organizations

Carnegie Mellon University
United States

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine Learning (cs.LG)

Related research

Word Sense Disambiguation based on Sequence Topic Model using sense dependency (2021)

Impact by BIP!

selected citations: 36 (derived from selected sources; an alternative to the "Influence" indicator)
popularity: Top 1% (the "current" impact or attention of an article in the research community at large, based on the underlying citation network)
influence: Top 10% (the overall impact of an article in the research community at large, based on the underlying citation network, diachronically)
impulse: Top 10% (the initial momentum of an article directly after its publication, based on the underlying citation network)


Fields of Science

natural sciences
