Publication: Article, Preprint | 01 Oct 2022 | Embargo end date: 01 Jan 2022 | English | Publisher: Association for Computing Machinery (ACM) | Journal: Proceedings of the VLDB Endowment, volume 16, pages 369-378 (issn: 2150-8097,

Authors: Wang, Pengfei; Zeng, Xiaocan; Chen, Lu; Ye, Fan; Mao, Yuren; Zhu, Junhao; Gao, Yunjun

doi: 10.14778/3565816.3565836, 10.48550/arxiv.2207.04802

arXiv: 2207.04802

PromptEM


Abstract

Entity Matching (EM), which aims to identify whether two entity records from two relational tables refer to the same real-world entity, is one of the fundamental problems in data management. Traditional EM assumes that the two tables are homogeneous with an aligned schema, yet in practice entity records often come in different formats (e.g., relational, semi-structured, or textual). Unifying their schemas is impractical because of these format differences. To support EM on records of different formats, Generalized Entity Matching (GEM) has been proposed and has gained much attention recently. Existing GEM methods are typically supervised and rely on a large number of high-quality labeled examples. However, the labeling process is extremely labor-intensive, which hinders the use of GEM. Low-resource GEM, i.e., GEM that requires only a small number of labeled examples, has therefore become an urgent need. To this end, this paper focuses, for the first time, on low-resource GEM and proposes a novel low-resource GEM method, termed PromptEM. PromptEM addresses three challenging issues in low-resource GEM: designing GEM-specific prompt-tuning, improving pseudo-label quality, and running efficient self-training. Extensive experimental results on eight real benchmarks demonstrate the superiority of PromptEM in terms of effectiveness and efficiency.
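
The abstract names pseudo-label quality and self-training without describing the mechanics. As a rough illustration only, the sketch below shows one generic way a self-training loop might pick pseudo-labels: keep the unlabeled record pairs whose predicted match probability is confidently high or low. This is not PromptEM's actual procedure; the function name `select_pseudo_labels`, the 0.9 threshold, and the sample records are all assumptions made for this example.

```python
from typing import List, Tuple

# A candidate pair of entity records, each already serialized to text.
Pair = Tuple[str, str]


def select_pseudo_labels(
    pairs: List[Pair],
    match_probs: List[float],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[Tuple[Pair, int]]:
    """Return (pair, label) for pairs whose prediction is confident enough.

    match_probs[i] is a model's estimated probability that pairs[i] is a match.
    Label 1 means "match", label 0 means "non-match".
    """
    selected = []
    for pair, prob in zip(pairs, match_probs):
        # Confidence in whichever label the model currently prefers.
        confidence = max(prob, 1.0 - prob)
        if confidence >= threshold:
            selected.append((pair, 1 if prob >= 0.5 else 0))
    return selected


# Hypothetical records in different formats (relational vs. textual).
unlabeled = [
    ("name: iPhone 13, brand: Apple", "Apple iPhone 13 smartphone, 128 GB"),
    ("name: Galaxy S21, brand: Samsung", "Samsung phone released in 2021"),
    ("name: ThinkPad X1, brand: Lenovo", "A history of the city of Hangzhou"),
]
probs = [0.97, 0.55, 0.04]  # made-up model outputs for illustration

pseudo_labeled = select_pseudo_labels(unlabeled, probs)
```

In this sketch the first pair would be kept as a match and the third as a non-match, while the uncertain second pair stays unlabeled. A self-training loop would then add the kept pairs to the small labeled set and retrain.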

Related Organizations

Zhejiang University
China (People's Republic of)
Zhejiang Ocean University
China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Science - Databases, Databases (cs.DB)

12 Research products, page 1 of 2

Prompt-tuned Code Language Model as a Neural Knowledge Base for Type Inference in Statically-Typed Partial Code (2022, IsAmongTopNSimilarDocuments)
Continuous Detection, Rapidly React: Unseen Rumors Detection based on Continual Prompt-Tuning (2022, IsAmongTopNSimilarDocuments)
Prompt-tuning in ASR systems for efficient domain-adaptation (2021, IsAmongTopNSimilarDocuments)
Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer (2022, IsAmongTopNSimilarDocuments)
Automating Method Naming with Context-Aware Prompt-Tuning (2023, IsAmongTopNSimilarDocuments)
Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models (2022, IsAmongTopNSimilarDocuments)
deepmatcher software on GitHub (IsRelatedTo)
hotstuff software on GitHub (IsRelatedTo)
PromptMR software on GitHub (IsRelatedTo)
DADER software on GitHub (IsRelatedTo)


Impact by BIP!

- selected citations: 23. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Open Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
