
doi: 10.1109/iacc.2016.36
Word Sense Disambiguation (WSD) is the process of identifying the proper sense of an ambiguous word in a particular context, that is, finding the correct sense s_i among the set of senses {s_1, s_2, ..., s_n}. The task is motivated by its role in various Natural Language Processing (NLP) applications such as IR, MT, QA, TC, and SP. In this paper, a machine learning technique, the Naive Bayes (NB) classifier, is used for automatic disambiguation. Training data was prepared with sense-annotated features, using a sense inventory to annotate the data. Currently, about 160 ambiguous words are present in the sense inventory, derived from 18K and 25K words from the Assamese Corpus and WordNet. The system is implemented in two phases. In the first phase, 2.7K sense-annotated training instances and 800 test instances were used, yielding 71% accuracy. Analysis of the results shows that accuracy improves as the training set gradually grows and as the model learned in the previous iteration is reused. In the second phase, we manually validate the outcomes of the first phase and add the clean sense-tagged data to the previous training set. We then retrain the system on the enlarged training data (3.5K), which improves accuracy. With this iterative learning, the system gains a further 7 percentage points of accuracy. This paper implements an Assamese WSD system with an NB classifier using lexical features, and the enhancement of the baseline method improves classifier accuracy to 78%.
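The two-phase procedure can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes bag-of-words context features and add-one smoothing, since the abstract specifies only "lexical features". The class name `NaiveBayesWSD`, the English toy contexts for the word "bank", and the sense labels are all hypothetical and stand in for the Assamese sense-annotated data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words (add-one smoothing).

    Hypothetical illustration: features and smoothing are assumptions.
    """

    def __init__(self):
        self.sense_counts = Counter()             # training examples per sense
        self.word_counts = defaultdict(Counter)   # sense -> context-word counts
        self.vocab = set()                        # all context words seen so far

    def train(self, examples):
        """Add (context_words, sense) pairs; may be called repeatedly."""
        for words, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(words)
            self.vocab.update(words)

    def predict(self, words):
        """Return the sense maximizing log P(sense) + sum log P(word | sense)."""
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / total)
            denom = sum(self.word_counts[sense].values()) + vocab_size
            for w in words:
                if w in self.vocab:  # skip words never seen in training
                    score += math.log((self.word_counts[sense][w] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Phase 1: train on the initial sense-annotated data (toy, hypothetical).
clf = NaiveBayesWSD()
clf.train([
    (["deposit", "money", "account"], "finance"),
    (["loan", "interest", "money"], "finance"),
    (["river", "water", "fishing"], "river"),
    (["muddy", "river", "shore"], "river"),
])
first = clf.predict(["money", "loan"])     # "finance"
second = clf.predict(["river", "water"])   # "river"

# Phase 2: fold manually validated predictions back into the training data.
clf.train([(["cash", "deposit"], "finance")])
third = clf.predict(["cash"])              # "finance"
```

Because `train` only accumulates counts, calling it again with validated examples corresponds to retraining on the enlarged data set, which is one simple way to model the iterative step described above.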
| Indicator | Value |
| --- | --- |
| Citations | 9 |
| Popularity | Top 10% |
| Influence | Top 10% |
| Impulse | Average |
