Fast approximate string matching

Publication: Article, 01 Apr 1988, English
Publisher: Wiley
Journal: Software: Practice and Experience, volume 18, pages 387-393 (ISSN: 0038-0644, eISSN: 1097-024X)

Authors: Olumide Owolabi; Douglas R. McGregor;

doi: 10.1002/spe.4380180407

Abstract

Approximate string matching is an important operation in information systems because an input string is often an inexact match to the strings already stored. Commonly known accurate methods are computationally expensive as they compare the input string to every entry in the stored dictionary. This paper describes a two‐stage process. The first uses a very compact n‐gram table to preselect sets of roughly similar strings. The second stage compares these with the input string using an accurate method to give an accurately matched set of strings. A new similarity measure based on the Levenshtein metric is defined for this comparison. The resulting method is both computationally fast and storage‐efficient.
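
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' method: the abstract does not specify the layout of the compact n-gram table or the new similarity measure, so the sketch substitutes a plain (not compact) trigram inverted index for stage one and a length-normalised Levenshtein distance for stage two. The function names and the `min_shared` and `threshold` parameters are hypothetical.

```python
from collections import defaultdict


def ngrams(s, n=3):
    """Return the set of character n-grams of s, padded with spaces at both ends."""
    padded = " " * (n - 1) + s.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(words, n=3):
    """Map each n-gram to the set of dictionary words that contain it.

    Assumption: a plain inverted index stands in for the paper's compact table.
    """
    index = defaultdict(set)
    for w in words:
        for g in ngrams(w, n):
            index[g].add(w)
    return index


def levenshtein(a, b):
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed measure: 1 - distance / longer length (not the paper's definition)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def search(query, index, n=3, min_shared=2, threshold=0.7):
    """Stage 1: preselect by shared n-grams. Stage 2: rank survivors by similarity."""
    counts = defaultdict(int)
    for g in ngrams(query, n):
        for w in index.get(g, ()):
            counts[w] += 1
    candidates = [w for w, c in counts.items() if c >= min_shared]
    scored = ((similarity(query.lower(), w.lower()), w) for w in candidates)
    return sorted(((s, w) for s, w in scored if s >= threshold), reverse=True)
```

With a dictionary such as `["levenshtein", "matching", "marching", "string", "strong"]`, calling `search("matchng", build_index(words))` runs the edit-distance comparison only on the words that share at least two trigrams with the query, rather than on every stored entry, which is the cost saving the abstract describes.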

Related Organizations

University of Strathclyde
United Kingdom

Impact by BIP!

- selected citations: 39. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 1%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.