A survey of OpenRefine reconciliation services

Preprint English OPEN
Delpeuch, Antonin
(2019)
  • Subject: Computer Science - Information Retrieval | Computer Science - Databases

We review the services implementing the OpenRefine reconciliation API, comparing their design to the state of the art in record linkage. Due to the design of the API, the matching scores returned by the services are of little help to guide matching decisions. This sugge...