Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

Publication: Article, Preprint, Conference object, Other literature type
Date: 11 Jul 2018
Embargo end date: 01 Jan 2018
Country: United States
Publisher: ACM
Journal: Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures
Funded by: NSF | XPS: FULL: FP: Write-Effi..., NSF | CAREER: Parallel Algorith..., NSF | SHF: Medium: Collaborativ...

Authors: Laxman Dhulipala; Guy E. Blelloch; Julian Shun

doi: 10.1145/3210377.3210414, 10.1145/3434393, 10.48550/arxiv.1805.05208

arXiv: 1805.05208

handle: 1721.1/143885, 1721.1/135027

Abstract

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly available real-world graph (the Hyperlink Web graph, with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the results reported for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically efficient parallel graph algorithms can scale to the largest publicly available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically efficient parallel algorithms for 20 important graph problems. We also present the interfaces, optimizations, and graph processing techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly available as the Graph Based Benchmark Suite (GBBS).

Country

United States

Related Organizations

Massachusetts Institute of Technology
United States
Carnegie Mellon University
United States

Keywords

FOS: Computer and information sciences; Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

Related research

gbbs software on GitHub (IsRelatedTo)

Impact by BIP!

Selected citations: 114
These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

Popularity: Top 1%
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.

Influence: Top 10%
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

Impulse: Top 1%
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.