Distributed top-k query processing on multi-dimensional data with keywords

Publication: Article, 29 Jun 2015. Publisher: ACM. Journal: Proceedings of the 27th International Conference on Scientific and Statistical Database Management

Authors: Daichi Amagata; Takahiro Hara; Shojiro Nishio

doi: 10.1145/2791347.2791355

Abstract

In the big data era, techniques for retrieving only the data objects that users want from massive and diverse datasets are required. Ranking queries, e.g., top-k queries, which rank data objects based on a user-specified scoring function, make it possible to find such interesting data, and have received significant attention due to their wide range of applications. While many techniques for both centralized and distributed top-k query processing have been developed, they do not consider query keywords, i.e., they simply retrieve the k data objects with the best score. Utilizing keywords, on the other hand, is a common approach in data (and information) retrieval. Despite this fact, there is no study on retrieving top-k data containing all query keywords. In this paper, we define a new query which enriches conventional top-k queries, and propose algorithms to solve the novel problem of how to efficiently retrieve the k data objects with the best score and all query keywords from distributed databases. Extensive experiments on both real and synthetic data demonstrate the efficiency and scalability of our algorithms in terms of communication cost and running time.
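
To make the query semantics concrete, the following is a minimal sketch of a naive baseline, not the algorithms proposed in the paper. Each node keeps only the objects that contain every query keyword, reports its local top-k, and a coordinator merges the candidates. The names DataObject, weighted_sum, local_top_k and distributed_top_k are hypothetical, the scoring function is assumed to be a weighted sum, and a lower score is assumed to be better; the abstract specifies none of these.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple


@dataclass(frozen=True)
class DataObject:
    """Hypothetical record: numeric attributes plus a keyword set."""
    oid: str
    attrs: Tuple[float, ...]
    keywords: FrozenSet[str]


def weighted_sum(weights: Sequence[float]) -> Callable[[DataObject], float]:
    """Assumed scoring function: weighted sum of attributes, lower is better."""
    def score(obj: DataObject) -> float:
        return sum(w * a for w, a in zip(weights, obj.attrs))
    return score


def local_top_k(objects: Sequence[DataObject],
                score: Callable[[DataObject], float],
                query_keywords: Set[str],
                k: int) -> List[DataObject]:
    """Run on one node: keep objects containing ALL query keywords, return the k best."""
    matching = (o for o in objects if query_keywords <= o.keywords)
    return heapq.nsmallest(k, matching, key=score)


def distributed_top_k(nodes: Sequence[Sequence[DataObject]],
                      score: Callable[[DataObject], float],
                      query_keywords: Set[str],
                      k: int) -> List[DataObject]:
    """Coordinator: gather each node's local top-k and merge into the global top-k."""
    candidates: List[DataObject] = []
    for node in nodes:  # in a real deployment, each call would be a network round trip
        candidates.extend(local_top_k(node, score, query_keywords, k))
    return heapq.nsmallest(k, candidates, key=score)


# Illustrative data only (two nodes, two attributes per object).
node1 = [
    DataObject("a", (1.0, 2.0), frozenset({"hotel", "wifi"})),
    DataObject("b", (0.5, 0.5), frozenset({"hotel"})),
    DataObject("c", (3.0, 1.0), frozenset({"hotel", "wifi", "pool"})),
]
node2 = [
    DataObject("d", (0.2, 0.9), frozenset({"wifi", "hotel"})),
    DataObject("e", (0.1, 0.1), frozenset({"wifi"})),
    DataObject("f", (2.0, 2.5), frozenset({"hotel", "wifi"})),
]

result = distributed_top_k([node1, node2], weighted_sum((0.5, 0.5)), {"hotel", "wifi"}, k=2)
print([o.oid for o in result])  # prints ['d', 'a']
```

Objects b and e have the lowest raw scores but are excluded because each lacks one query keyword. The baseline is correct because any object in the global top-k must also be in its own node's local top-k, but every node always ships k candidates, which is the kind of communication cost the abstract says the proposed algorithms are evaluated on.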

Related Organizations

Osaka University
Japan

Impact by BIP!

	selected citations (derived from selected sources; an alternative to the "Influence" indicator): 10
	popularity (the "current" impact/attention of an article in the research community at large, based on the underlying citation network): Average
	influence (the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically): Average
	impulse (the initial momentum of an article directly after its publication, based on the underlying citation network): Average