Word Sense Disambiguation in the Biomedical Domain: An Overview

Publication: Article, 01 Jun 2005, English
Publisher: Mary Ann Liebert Inc
Journal: Journal of Computational Biology, volume 12, pages 554-565 (issn: 1066-5277, eissn: 1557-8666)

Authors: Schuemie, Martijn; Kors, Jan; Mons, B

doi: 10.1089/cmb.2005.12.554

pmid: 15952878

Abstract

There is a trend towards automatic analysis of large amounts of literature in the biomedical domain. However, this can be effective only if the ambiguity in natural language is resolved. In this paper, the current state of research in word sense disambiguation (WSD) is reviewed. Several methods for WSD have already been proposed, but many systems have been tested only on evaluation sets of limited size. There are currently very few applications of WSD in the biomedical domain. The current direction of research points towards statistically based algorithms that use existing curated data and can be applied to large sets of biomedical literature. There is a need for manually tagged evaluation sets to test WSD algorithms in the biomedical domain. WSD algorithms should preferably be able to take into account both known and unknown senses of a word. Without WSD, automatic meta-analysis of large corpora of text will be error-prone.

Related Organizations

Erasmus University Rotterdam
Netherlands

Keywords

EMC NIHES-03-77-01, Information Storage and Retrieval, Databases, Bibliographic, Algorithms

Impact by BIP!

selected citations: 84. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 1%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.