
Driven by the need for fast retrieval and a small memory footprint, document hashing plays a crucial role in large-scale information retrieval. Generating high-quality hash codes requires both semantic and neighborhood information. However, most existing methods leverage only one of the two, or simply combine them via intuitive criteria, lacking a theoretical principle to guide the integration. In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution and propose to integrate the two types of information with a graph-driven generative model. To deal with the complicated correlations among documents, we further propose a tree-structured approximation method for learning. Under this approximation, we prove that the training objective decomposes into terms involving only single documents or pairs of documents, enabling the model to be trained as efficiently as models that treat documents as uncorrelated. Extensive experiments on three benchmark datasets show that our method outperforms state-of-the-art methods, demonstrating the effectiveness of the proposed model for simultaneously preserving semantic and neighborhood information.
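To make the retrieval setting concrete, the following is a minimal sketch of how binary hash codes are typically used at query time: documents are ranked by Hamming distance to the query code. It is a generic illustration, not the model proposed here; the names `hamming_distance`, `hamming_rank`, `database_codes`, and `query_code`, as well as the 8-bit toy codes, are assumptions made for the example.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def hamming_rank(query: int, codes: list[int]) -> list[int]:
    """Return indices of `codes` ordered by Hamming distance to `query`.

    Python's sort is stable, so ties keep their original database order.
    """
    return sorted(range(len(codes)), key=lambda i: hamming_distance(query, codes[i]))


# Hypothetical 8-bit codes for four documents (toy data, not from the paper).
database_codes = [0b01101001, 0b01101011, 0b10010110, 0b01001001]
query_code = 0b01101001

ranking = hamming_rank(query_code, database_codes)
print(ranking)  # [0, 1, 3, 2]
```

Because each comparison is a single XOR followed by a bit count, ranking stays cheap even for large collections, which is the speed and memory advantage that motivates hashing in the first place.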
FOS: Computer and information sciences, Artificial intelligence, Computer Science - Artificial Intelligence, Generative grammar, Chen, Representation Learning, Graph, Computer Science - Information Retrieval, Theoretical computer science, Artificial Intelligence, Information retrieval, Signal Processing on Graphs, Biology, Natural Language Processing, Natural language processing, Knowledge Graph Embedding, Paleontology, Semantics (computer science), Statistical Machine Translation and Natural Language Processing, Named Entity Recognition, Computer science, Programming language, Artificial Intelligence (cs.AI), Computer Science, Physical Sciences, Information Retrieval, Graph Neural Network Models and Applications, Information Retrieval (cs.IR)
| selected citations | 1 |
| popularity | Average |
| influence | Average |
| impulse | Average |
