Threshold queries in theory and in the wild

Publication type: Article, Preprint
Date: 01 Jan 2022 (embargo end date: 01 Jan 2021)
Language: English
Publisher: Springer Science and Business Media LLC
Journal: The VLDB Journal, volume 34 (ISSN: 1066-8888, eISSN: 0949-877X)
Funded by: EC | SmartDataLake, DFG | unidentified, ANR | VeriGraph

Authors: Angela Bonifati; Stefania Dumbrava; George Fletcher; Jan Hidders; Matthias Hofer; Wim Martens; Filip Murlak; +3 Authors

doi: 10.1007/s00778-025-00916-w , 10.14778/3510397.3510407 , 10.48550/arxiv.2106.15703

arXiv: 2106.15703

Abstract

Threshold queries are an important class of queries that only require computing or counting answers up to a specified threshold value. To the best of our knowledge, threshold queries have been largely disregarded in the research literature, which is surprising considering how common they are in practice. In this paper, we present a deep theoretical analysis of threshold query evaluation and show that thresholds can be used to significantly improve the asymptotic bounds of state-of-the-art query evaluation algorithms. We also empirically show that threshold queries are significant in practice. In surprising contrast to conventional wisdom, we found important scenarios in real-world data sets in which users are interested in computing the results of queries up to a certain threshold, independent of a ranking function that orders the query results.
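To make the notion of "computing or counting answers up to a specified threshold value" concrete, the following is a minimal sketch of early termination over a lazily enumerated answer stream. It is not the evaluation algorithm analyzed in the paper, whose improved asymptotic bounds rely on more than simply stopping early. The helper names (`count_up_to`, `at_least`, `first_k`, `two_hop_paths`) and the toy edge list are hypothetical and chosen only for illustration.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")
Edge = Tuple[int, int]


def count_up_to(answers: Iterable[T], threshold: int) -> int:
    """Count answers, consuming at most `threshold` of them."""
    return sum(1 for _ in islice(answers, threshold))


def at_least(answers: Iterable[T], threshold: int) -> bool:
    """Decide whether the query has at least `threshold` answers."""
    return count_up_to(answers, threshold) >= threshold


def first_k(answers: Iterable[T], k: int) -> List[T]:
    """Return up to `k` answers, with no ranking function involved."""
    return list(islice(answers, k))


def two_hop_paths(edges: List[Edge]) -> Iterator[Tuple[int, int, int]]:
    """Lazily enumerate (a, b, c) such that a->b and b->c are edges.

    Hypothetical toy query standing in for a real query plan.
    """
    successors: Dict[int, List[int]] = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    for a, b in edges:
        for c in successors.get(b, []):
            yield (a, b, c)


# Assumed toy graph, not data from the paper.
edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)]

print(count_up_to(two_hop_paths(edges), 3))  # stops after the third path
print(at_least(two_hop_paths(edges), 5))     # only four paths exist
print(first_k(two_hop_paths(edges), 2))
```

Because the answers are produced by a generator, each helper stops pulling results once the threshold is reached, so the remaining answers are never materialized.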

Related Organizations

- University of Lyon System, France
- French National Centre for Scientific Research, France
- Institut National des Sciences Appliquées de Lyon, France
- University of Białystok, Poland
- Claude Bernard University Lyon 1, France

Keywords

FOS: Computer and information sciences, Theory and algorithms for application domains, Database query processing, Database theory, Databases (cs.DB), Database management system engines, Computer Science - Databases, Query languages, Information systems, [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB], Data management systems, Theory of computation

Related research products (4)

- community: software on GitHub (IsRelatedTo)
- covidgraph_org: software on GitHub (IsRelatedTo)
- TQs: software on GitHub (IsRelatedTo)
- detkdecomp: software on GitHub (IsRelatedTo)

Impact by BIP!

- selected citations: 8. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.