NAE-SAT-based probabilistic membership filters

Publication: Article, Preprint, 01 Jan 2018
Embargo end date: 01 Jan 2018
Publisher: arXiv
Journal: CoRR, volume abs/1801.06232
Funded by: NSF | CAREER: Designing quantum...

Authors: Chao Fang; Zheng Zhu; Helmut G. Katzgraber

doi: 10.48550/arxiv.1801.06232

arXiv: 1801.06232


Abstract

Probabilistic membership filters are a type of data structure designed to quickly verify whether an element of a large data set belongs to a subset of the data. While false negatives are not possible, false positives are. Therefore, the main goal of any good probabilistic membership filter is to have a small false-positive rate while being memory efficient and fast to query. Although Bloom filters are fast to construct, their memory efficiency is limited by a strict theoretical upper bound. Weaver et al. introduced random satisfiability-based filters that significantly improve the memory efficiency of probabilistic filters, but at the cost of solving a complex random satisfiability (SAT) formula when constructing the filter. Here we present an improved SAT filter approach with a focus on reducing the filter building times, as well as query times. Our approach is based on using not-all-equal (NAE) SAT formulas to build the filters, solving these via a mapping to random SAT using traditionally fast random SAT solvers, as well as bit packing and a reduction of the number of hash functions. Paired with fast hardware, NAE-SAT filters could enable enterprise-size applications.
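As a rough illustration of the idea in the abstract, the following Python sketch builds a toy NAE-SAT filter. It is not the authors' implementation: the hashing scheme (`element_clause`), the brute-force solver in `build_filter`, and all function names are assumptions made for this example, and the NAE-to-SAT step uses the standard textbook reduction (each NAE clause becomes the clause plus its fully negated copy), which may differ from the mapping used in the paper. A real filter would replace the brute-force search with a fast random SAT solver.

```python
import hashlib
from itertools import product


def element_clause(element, num_vars, k=3):
    """Hash an element to k signed literals over distinct variables.

    A literal is (variable_index, negated). Hypothetical scheme for this sketch.
    """
    literals, used, counter = [], set(), 0
    while len(literals) < k:
        digest = hashlib.sha256(f"{element}:{counter}".encode()).digest()
        counter += 1
        var = int.from_bytes(digest[:4], "big") % num_vars
        if var in used:
            continue  # re-hash until the clause has k distinct variables
        used.add(var)
        literals.append((var, bool(digest[4] & 1)))
    return tuple(literals)


def nae_to_sat(clause):
    """Standard reduction: NAE(l1..lk) = (l1 or .. or lk) and (~l1 or .. or ~lk)."""
    flipped = tuple((var, not negated) for var, negated in clause)
    return [clause, flipped]


def sat_clause_holds(clause, assignment):
    """An ordinary clause holds if at least one literal is true."""
    return any(assignment[var] != negated for var, negated in clause)


def nae_clause_holds(clause, assignment):
    """A NAE clause holds if its literals take both truth values."""
    return len({assignment[var] != negated for var, negated in clause}) == 2


def build_filter(elements, num_vars):
    """Find one assignment satisfying every member's clause (toy brute force)."""
    clauses = [c for e in elements for c in nae_to_sat(element_clause(e, num_vars))]
    for bits in product([False, True], repeat=num_vars):
        if all(sat_clause_holds(c, bits) for c in clauses):
            return bits  # the stored assignment is the filter
    raise ValueError("formula is unsatisfiable; use more variables")


def query(element, assignment):
    """Members always pass; non-members pass only by chance (false positive)."""
    return nae_clause_holds(element_clause(element, len(assignment)), assignment)


members = ["apple", "banana", "cherry", "date"]
solution = build_filter(members, num_vars=16)
print(all(query(m, solution) for m in members))
```

Every member passes the query by construction, which is why false negatives cannot occur. A non-member whose clause is effectively random passes a single stored solution with probability about 1 - 2^(1-k), that is 3/4 for k = 3, so a practical filter would presumably combine several independent solutions or use larger clauses to drive the false-positive rate down.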

13 pages, 4 figures, 3 pages

Related Organizations

Texas A&M University
United States
Santa Fe Institute
United States
The University of Texas System
United States
Department of Physics and Astronomy
United States


Keywords

FOS: Computer and information sciences, Computer Science - Cryptography and Security, Statistical Mechanics (cond-mat.stat-mech), Computer Science - Data Structures and Algorithms, FOS: Physical sciences, Data Structures and Algorithms (cs.DS), Cryptography and Security (cs.CR), Condensed Matter - Statistical Mechanics

2 Research products

Approximating Max NAE-k-SAT by anonymous local search (2017, among top similar documents)
smhasher software on GitHub (related to this publication)

Impact by BIP!

selected citations: 0 (citations derived from selected sources)
popularity: Average (the current impact or attention of the article in the research community, based on the underlying citation network)
influence: Average (the overall impact of the article in the research community, based on the underlying citation network)
impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)