Buffered Streaming Graph Partitioning

Publication: Article, Preprint, 21 Oct 2022

Embargo end date: 01 Jan 2021

Language: English

Publisher: Association for Computing Machinery (ACM)

Journal: ACM Journal of Experimental Algorithmics, volume 27, pages 1-26 (ISSN: 1084-6654, eISSN: 1084-6654)

Authors: Marcelo Fonseca Faraj; Christian Schulz

doi: 10.1145/3546911, 10.48550/arxiv.2102.09384

arXiv: 2102.09384


Abstract

Partitioning graphs into blocks of roughly equal size is a widely used tool when processing large graphs. Currently, there is a gap in the space of available partitioning algorithms. On the one hand, there are streaming algorithms that have been adopted to partition massive graph data on small machines. In the streaming model, vertices arrive one at a time together with their neighborhoods and must be assigned to a block immediately. These algorithms can partition huge graphs quickly with little memory, but they produce low-quality partitions. On the other hand, there are offline (shared-memory) multilevel algorithms that produce high-quality partitions but need a machine with enough memory to partition huge networks. In this work, we make a first step to close this gap by presenting an algorithm that computes significantly improved partitions of huge graphs using a single machine with little memory in a streaming setting. First, we adopt the buffered streaming model, which is a more reasonable approach in practice. In this model, a processing element can store a buffer of nodes along with their edges before making assignment decisions. When our algorithm receives a batch of nodes, we build a model graph that represents the nodes of the batch and the already present partition structure. This model enables us to apply multilevel algorithms and, in turn, to compute much higher-quality solutions of huge graphs on cheap machines than previously possible. To partition the model graph, we develop a multilevel algorithm that optimizes an objective function that has previously been shown to be effective for the streaming setting. Surprisingly, this also removes the dependency on the number of blocks k from the running time compared to the previous state-of-the-art. Overall, our algorithm computes, on average, 75.9% better solutions than Fennel [35] using a very small buffer size. In addition, for large values of k our algorithm becomes faster than Fennel.
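
To make the buffered streaming model described in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It reads the stream in batches, builds a small model graph per batch (edges inside the batch plus, for each batch node, the number of its neighbors already assigned to each block), and then assigns the batch nodes. Several things here are assumptions that the abstract does not state: the function names `batches`, `build_model_graph`, and `partition_stream` are hypothetical; the model graph layout is one plausible reading of "the nodes of the batch and the already present partition structure"; the score is the Fennel-style objective as commonly stated (neighbors in the block minus a size penalty with alpha = sqrt(k) * m / n^1.5 and gamma = 1.5); and a simple greedy pass stands in for the multilevel algorithm, which is not reproduced here. The sketch also assumes n and m are known in advance and omits any hard balance constraint.

```python
import math
from itertools import islice


def batches(stream, size):
    """Yield lists of up to `size` items from the stream (the buffer)."""
    it = iter(stream)
    while batch := list(islice(it, size)):
        yield batch


def build_model_graph(batch, block_of, k):
    """Build a model graph for one batch (hypothetical layout).

    inner[v]    : neighbors of v that are in the same batch
    to_block[v] : for each block i, how many neighbors of v are already in i
    Neighbors that are neither in the batch nor assigned yet are dropped.
    """
    in_batch = {v for v, _ in batch}
    inner = {v: set() for v in in_batch}
    to_block = {v: [0] * k for v in in_batch}
    for v, neighbors in batch:
        for u in neighbors:
            if u == v:
                continue  # ignore self-loops
            if u in in_batch:
                inner[v].add(u)
                inner[u].add(v)
            elif u in block_of:
                to_block[v][block_of[u]] += 1
    return inner, to_block


def partition_stream(stream, k, n, m, buffer_size, gamma=1.5):
    """Assign every streamed node to one of k blocks; returns node -> block."""
    # Assumption: Fennel-style parameter choice, not taken from the abstract.
    alpha = math.sqrt(k) * m / n ** 1.5
    block_of, sizes = {}, [0] * k
    for batch in batches(stream, buffer_size):
        inner, to_block = build_model_graph(batch, block_of, k)
        # Greedy stand-in for the multilevel algorithm used in the paper.
        for v, _ in batch:
            gain = list(to_block[v])
            for u in inner[v]:
                if u in block_of:  # batch neighbor assigned earlier in this pass
                    gain[block_of[u]] += 1
            best = max(
                range(k),
                key=lambda i: gain[i] - alpha * gamma * sizes[i] ** (gamma - 1),
            )
            block_of[v] = best
            sizes[best] += 1
    return block_of


# Usage: two triangles joined by one edge, streamed with a buffer of 3 nodes.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
assignment = partition_stream(graph.items(), k=2, n=6, m=7, buffer_size=3)
```

With `buffer_size=1` the sketch degenerates to a one-pass streaming heuristic, since every model graph then contains a single node and its links to the existing blocks. Larger buffers expose edges between not-yet-assigned nodes, which is the extra information a multilevel algorithm on the model graph could exploit.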


Keywords

FOS: Computer and information sciences, streaming algorithms, graph partitioning, Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.), Graph theory (including graph drawing) in computer science, Graph algorithms (graph-theoretic aspects), Computer Science - Data Structures and Algorithms, Online algorithms; streaming algorithms, Data Structures and Algorithms (cs.DS), multilevel algorithms

Impact by BIP!

- Selected citations: 11. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
