Powered by OpenAIRE graph
Found an issue? Give us feedback
addClaim

Approximate Nearest Neighbor Search with Filters

Authors: Landrum, Benjamin Thomas;

Approximate Nearest Neighbor Search with Filters

Abstract

Approximate nearest neighbor search (ANNS) on high-dimensional vectors is a fundamental primitive used widely for search over neural embeddings of unstructured data. Prior work on ANNS has produced indices which provide fast and accurate search on datasets up to billions of points, but are not well suited to queries restricted to some subset of the original dataset. Filtered ANNS is a formulation of the problem which adds metadata to points in the dataset which can be used to filter points at query time. This setting requires indexing a dataset in a metadata-aware way to support filtered queries. Filtered ANNS is important for applications such as product and image search, and necessary to give recently popular `vector databases' functionality similar to more traditional tabular databases. This work concerns two versions of the filtered ANNS problem. The most popular formulation in prior work associates points with boolean metadata in the form of labels and filters queries using a boolean predicate on these labels. In this setting, we present a novel index with state-of-the-art performance for queries with filters requiring either one label or both of a pair of labels which won a large benchmarking competition's track focused on the problem. We also introduce a novel formulation of filtered ANNS called `window filtered' ANNS, in which points are associated with a continuous metadata value (in practical use, this corresponds to a timestamp, measure of popularity, etc.), and queries are filtered to a range of metadata values. In addition to describing the problem, we present a practical and theoretically motivated index which handily outperforms baselines.

Keywords

020, filtered anns, anns, vector search, Computer science, ann, filtered vector search, 004, approximate nearest neighbors

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average
Upload OA version
Are you the author of this publication? Upload your Open Access version to Zenodo!
It’s fast and easy, just two clicks!