Cache-Oblivious Peeling of Random Hypergraphs

descriptionPublicationkeyboard_double_arrow_right Article , Preprint , Conference object , Report , Part of book or chapter of book 01 Mar 2014Embargo end date: 01 Jan 2013 Italy Publisher:IEEEJournal:2014 Data Compression ConferenceFunded by:EC | MIDAS, EC | NADINE

Authors: Djamal Belazzougui; Paolo Boldi; Giuseppe Ottaviano; Rossano Venturini; Sebastiano Vigna;

doi: 10.1109/dcc.2014.48 , 10.48550/arxiv.1312.0526

arXiv: 1312.0526

handle: 20.500.14243/245103 , 2434/243463 , 11568/753937

Cache-Oblivious Peeling of Random Hypergraphs

- Summary
- Subjects
- Metrics

Abstract

The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random $r$-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires $O(\mathrm{sort}(n))$ I/Os and $O(n \log n)$ time to peel a random hypergraph with $n$ edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of $7.6$ billion keys in less than $21$ hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.

Country

Italy

Related Organizations

View all View all

Keywords

FOS: Computer and information sciences, Perfect hash functions, Computer Networks and Communications, Computer Science - Data Structures and Algorithms, Hypergraph algorithms, Data Structures and Algorithms (cs.DS), Cache-oblivious algorithms, hypergraph peeling; minimal perfect hash function

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	8
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average