Word Sense Disambiguation for Assamese

Authors: Shikhar Kr. Sarma; Jumi Sarmah;

Abstract

Word Sense Disambiguation (WSD) is the process of identifying the proper sense of an ambiguous word depending on the particular context. The goal is to find the correct sense s_i among the set of senses {s_1, s_2, ..., s_n}. The task is motivated by its role in various Natural Language Processing (NLP) applications such as IR, MT, QA, TC, SP, etc. In this paper, a machine learning technique, the Naive Bayes (NB) classifier, is used for the automatic disambiguation task. Training data was prepared with sense-annotated features, using a sense inventory to produce the annotations. Currently, about 160 ambiguous words are present in the sense inventory, derived from 18K and 25K words from the Assamese Corpus and WordNet. The system is implemented in two phases. In the first phase, 2.7K sense-annotated training instances and 800 test instances were used, and an accuracy of 71% was obtained. Analysis of the results shows that accuracy improves as the training data size gradually increases and as the learned model generated in the previous iteration is reused. In the second phase, we manually validate the outcomes of the first phase and add the clean sense-tagged data to the previous training set. We then train the system on the enlarged training data (3.5K), which improves accuracy. With this iterative learning, the system gains a further 7% in accuracy. This paper aims to implement an Assamese WSD system with an NB classifier using lexical features, and the enhancement of the baseline method improves the classifier accuracy to 78%.
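To make the approach in the abstract concrete, the following is a minimal sketch of a Naive Bayes sense classifier over bag-of-words context features, followed by one round of retraining on manually validated output. It is not the authors' implementation: the class name `NaiveBayesWSD`, the `accuracy` helper, the add-one smoothing, the choice to skip unseen words, and the English toy data for the word "bank" are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Toy Naive Bayes sense classifier (hypothetical, for illustration only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # add-alpha smoothing (assumed)
        self.sense_counts = Counter()               # training instances per sense
        self.feature_counts = defaultdict(Counter)  # context-word counts per sense
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (context_words, sense) pairs
        for context_words, sense in examples:
            self.sense_counts[sense] += 1
            for word in context_words:
                self.feature_counts[sense][word] += 1
                self.vocab.add(word)

    def predict(self, context_words):
        # Pick the sense maximising log P(sense) + sum of log P(word | sense).
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, float("-inf")
        for sense, count in self.sense_counts.items():
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + self.alpha * vocab_size
            for word in context_words:
                if word not in self.vocab:
                    continue  # ignore words never seen in training (assumed policy)
                score += math.log((self.feature_counts[sense][word] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


def accuracy(model, examples):
    # Fraction of examples whose predicted sense matches the gold sense.
    correct = sum(model.predict(ctx) == sense for ctx, sense in examples)
    return correct / len(examples)


# Phase 1: train on a small sense-annotated set (toy English data, not Assamese).
train_set = [
    (["river", "water", "fishing"], "bank/shore"),
    (["money", "loan", "deposit"], "bank/finance"),
    (["water", "boat", "shore"], "bank/shore"),
    (["account", "money", "cash"], "bank/finance"),
]
test_set = [
    (["cash", "deposit"], "bank/finance"),
    (["river", "boat"], "bank/shore"),
]
clf = NaiveBayesWSD()
clf.train(train_set)

# Phase 2: add manually validated predictions to the training set and retrain.
validated = [(["loan", "interest", "money"], "bank/finance")]
clf2 = NaiveBayesWSD()
clf2.train(train_set + validated)
```

Calling `accuracy(clf, test_set)` and `accuracy(clf2, test_set)` compares the two phases on the same held-out data. A real system would replace the toy contexts with sense-annotated Assamese instances and the lexical features described in the paper.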

Impact indicators (provided by BIP!)
  • Citations: 9
  • Popularity: Top 10%
  • Influence: Top 10%
  • Impulse: Average