Distant Supervision for Relation Extraction with Matrix Completion

Publication: Article, Conference object, 01 Jan 2014
Publisher: Association for Computational Linguistics (ACL)
Journal: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Authors: Miao Fan; Deli Zhao; Qiang Zhou; Zhiyuan Liu; Thomas Fang Zheng; Edward Y. Chang

doi: 10.3115/v1/p14-1079


Abstract

The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features. To tackle the sparsity and noise challenges, we propose solving the classification problem using matrix completion on a factorized matrix of minimized rank. We formulate relation classification as completing the unknown labels of testing items (entity pairs) in a sparse matrix that concatenates training and testing textual features with training labels. Our algorithmic framework is based on the assumption that the rank of the joint item-by-feature and item-by-label matrix is low. We apply two optimization models to recover the underlying low-rank matrix, leveraging the sparsity of the feature-label matrix. The matrix completion problem is then solved by the fixed point continuation (FPC) algorithm, which can find the global optimum. Experiments on two widely used datasets with different dimensions of textual features demonstrate that our low-rank matrix completion approach significantly outperforms the baseline and the state-of-the-art methods.
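
The formulation in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the squared loss, the fixed regularization weight `mu`, the toy data, and the names `shrink_singular_values` and `complete_matrix` are all assumptions made for illustration. The sketch stacks training and testing items into one joint matrix of features and labels, marks the testing labels as unobserved, and runs a fixed-point iteration with singular value shrinkage, the core step of FPC. The continuation part of FPC (solving for a decreasing sequence of `mu` values) is omitted.

```python
import numpy as np

def shrink_singular_values(M, tau):
    """Soft-threshold the singular values of M by tau (nuclear-norm proximal step)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_matrix(Z_obs, mask, mu=0.1, step=1.0, n_iter=500):
    """Fixed-point iteration for
        min_Z  mu * ||Z||_*  +  0.5 * ||mask * (Z - Z_obs)||_F^2
    Assumption: squared loss on observed cells; the paper's loss may differ."""
    Z = np.zeros_like(Z_obs, dtype=float)
    for _ in range(n_iter):
        grad = mask * (Z - Z_obs)  # gradient of the loss, zero on unobserved cells
        Z = shrink_singular_values(Z - step * grad, step * mu)
    return Z

# Hypothetical toy data: 6 entity pairs, 4 textual features, 2 relation labels.
# Columns 0-3 are features, columns 4-5 are labels.
# Rows 0-3 are training items; rows 4-5 are testing items with unknown labels.
Z_obs = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 0, 0],  # label cells are placeholders, masked out below
    [0, 0, 1, 1, 0, 0],
], dtype=float)

mask = np.ones_like(Z_obs)
mask[4:, 4:] = 0.0  # testing labels are the entries to be completed

Z_hat = complete_matrix(Z_obs, mask)

# Read the completed label block for the testing items and pick the top label.
predicted = Z_hat[4:, 4:].argmax(axis=1)
```

In this toy case each testing item shares its feature pattern with one group of training items, so the low-rank completion fills in the label that group carries. Real distant supervision data is far sparser and noisier, which is why the choice of loss and of `mu` matters in practice.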

Related Organizations

The Chinese University of Hong Kong, Hong Kong
Chinese University of Hong Kong, China (People's Republic of)
Google (United States), United States
Tsinghua University, China (People's Republic of)

Impact by BIP!

- Selected citations: 18. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.