Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval

Publication type: Article, Other literature type, Preprint, Conference object
Date: 01 Jan 2021 (embargo end date: 01 Jan 2021)
Publisher: Association for Computational Linguistics (ACL)
Journal: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Authors: Zijing Ou; Qinliang Su; Jianxing Yu; Bang Liu; Jingwen Wang; Ruihui Zhao; Changyou Chen; +1 Authors

doi: 10.18653/v1/2021.acl-long.174, 10.48550/arxiv.2105.13066, 10.60692/kwcnw-nrs71, 10.60692/9td5k-b0448

arXiv: 2105.13066

Abstract

Driven by the need for fast retrieval and a small memory footprint, document hashing plays a crucial role in large-scale information retrieval. To generate high-quality hashing codes, both semantics and neighborhood information are crucial. However, most existing methods leverage only one of them or simply combine them via intuitive criteria, lacking a theoretical principle to guide the integration. In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution and propose to integrate the two types of information with a graph-driven generative model. To deal with the complicated correlations among documents, we further propose a tree-structured approximation method for learning. Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones. Extensive experimental results on three benchmark datasets show that our method achieves superior performance over state-of-the-art methods, demonstrating the effectiveness of the proposed model for simultaneously preserving semantic and neighborhood information.
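As a rough illustration of why document hashing gives fast retrieval with a small memory footprint, the sketch below binarizes real-valued document representations into short bit codes and ranks documents by Hamming distance. It is not the paper's graph-driven generative model: the function names, the fixed thresholds, and the toy vectors are all assumptions made for this example only.

```python
from typing import List


def to_hash_code(vector: List[float], thresholds: List[float]) -> int:
    """Binarize a real-valued representation into an integer bit code.

    Hypothetical helper: each dimension contributes one bit, set to 1
    when the value exceeds its threshold.
    """
    code = 0
    for value, threshold in zip(vector, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, db_codes: List[int], top_k: int) -> List[int]:
    """Return indices of the top_k stored codes closest to the query."""
    ranked = sorted(
        range(len(db_codes)),
        key=lambda i: hamming_distance(query_code, db_codes[i]),
    )
    return ranked[:top_k]


# Toy data (assumed): two similar documents and one dissimilar document.
docs = [
    [0.9, 0.1, 0.8, 0.2],
    [0.8, 0.2, 0.7, 0.4],
    [0.1, 0.9, 0.2, 0.7],
]
thresholds = [0.5, 0.5, 0.5, 0.5]
codes = [to_hash_code(d, thresholds) for d in docs]
query = to_hash_code([0.7, 0.3, 0.9, 0.1], thresholds)
print(retrieve(query, codes, top_k=2))  # prints [0, 1]
```

Each document is stored as a handful of bits, and comparing two codes is a single XOR plus a bit count, which is where the speed and memory advantages come from. How the codes are learned, which is the subject of the paper, is deliberately left out of this sketch.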

Related Organizations

Tencent (China), China (People's Republic of)
Sun Yat-sen University, China (People's Republic of)
Université de Montréal, Canada

Keywords

FOS: Computer and information sciences, Artificial intelligence, Computer Science - Artificial Intelligence, Generative grammar, Chen, Representation Learning, Graph, Computer Science - Information Retrieval, Theoretical computer science, Information retrieval, Signal Processing on Graphs, Biology, Natural Language Processing, Knowledge Graph Embedding, Paleontology, Semantics (computer science), Statistical Machine Translation and Natural Language Processing, Named Entity Recognition, Computer science, Programming language, Artificial Intelligence (cs.AI), Physical Sciences, Graph Neural Network Models and Applications, Information Retrieval (cs.IR)

Related research

Among top similar documents:
Block Model Guided Unsupervised Feature Selection (2020)
Open architecture of CNC system research based on CAD graph-driven technology (2010)
Graph-Driven Diffusion and Random Walk Schemes for Image Segmentation (2017)
Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval (2021)
Parity Graph-Driven Read-Once Branching Programs and An Exponential Lower Bound for Integer Multiplication (2002)
A Lower Bound Technique for Nondeterministic Graph-Driven Read-Once-Branching Programs and Its Applications (2002)
Complexity Theoretical Results on Nondeterministic Graph-driven Read-Once Branching Programs (2003)
Graph Convolutional Networks for Multi-modality Medical Imaging: Methods, Architectures, and Clinical Applications (2022)

Related to:
RBSH software on GitHub
SNUH software on GitHub

Impact by BIP!

selected citations: 1. Citations derived from selected sources; an alternative to the "influence" indicator.
popularity: Average. The "current" impact or attention of an article in the research community at large, based on the underlying citation network.
influence: Average. The overall impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. The initial momentum of an article directly after its publication, based on the underlying citation network.