
arXiv: cs/0608100
There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason:stone is analogous to the pair carpenter:wood. This article introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, and information retrieval. Recently the Vector Space Model (VSM) of information retrieval has been adapted to measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) The patterns are derived automatically from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data, and (3) automatically generated synonyms are used to explore variations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying semantic relations, LRA achieves similar gains over the VSM.
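The core of the approach described above is representing each word pair as a vector of pattern frequencies, smoothing the matrix of those vectors with a truncated SVD, and comparing pairs by the cosine of their smoothed vectors. The following is a minimal sketch of that idea, not the paper's implementation: the word pairs, patterns, and counts are invented for illustration, and the real LRA pipeline additionally derives patterns from the corpus and expands pairs with synonyms.

```python
import numpy as np

# Hypothetical pair-by-pattern frequency matrix: rows are word pairs,
# columns are counts of corpus patterns such as "X works with Y".
# The numbers are illustrative, not from any real corpus.
X = np.array([
    [12.0, 3.0, 0.0, 5.0],   # mason:stone
    [10.0, 4.0, 1.0, 6.0],   # carpenter:wood
    [ 0.0, 1.0, 9.0, 0.0],   # traffic:street
])

# Smooth the frequency data with a truncated SVD (keep k singular
# values), reducing noise and sparsity in the pattern counts.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Relational similarity = cosine between smoothed row vectors.
sim_analogous = cosine(X_k[0], X_k[1])  # mason:stone vs carpenter:wood
sim_unrelated = cosine(X_k[0], X_k[2])  # mason:stone vs traffic:street
print(sim_analogous > sim_unrelated)
```

With these toy counts, the analogous pairs share a similar pattern profile, so their smoothed vectors have a much higher cosine than the unrelated pair does.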
FOS: Computer and information sciences
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Information Retrieval (cs.IR)
ACM classes: H.3.1; I.2.6; I.2.7
LCC: P98-98.5 (Computational linguistics. Natural language processing)
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 168 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 1% |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
