
arXiv: 1503.02353
handle: 11577/3300961 , 11577/3290299
Motivated by the increasing need to understand the algorithmic foundations of distributed large-scale graph computations, we study a number of fundamental graph problems in a message-passing model for distributed computing where k ≥2 machines jointly perform computations on graphs with n nodes (typically, n > k ). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main result is an (almost) optimal distributed randomized algorithm for graph connectivity. Our algorithm runs in Õ( n / k 2 ) rounds (Õ notation hides a polylog( n ) factor and an additive polylog( n ) term). This improves over the best previously known bound of Õ( n / k ) [Klauck et al., SODA 2015] and is optimal (up to a polylogarithmic factor) in light of an existing lower bound of Ω ˜ ( n / k 2 ). Our improved algorithm uses a bunch of techniques, including linear graph sketching, that prove useful in the design of efficient distributed graph algorithms. Using the connectivity algorithm as a building block, we then present fast randomized algorithms for computing minimum spanning trees, (approximate) min-cuts, and for many graph verification problems. All these algorithms take Õ( n / k 2 ) rounds and are optimal up to polylogarithmic factors. We also show an almost matching lower bound of Ω ˜ ( n / k 2 ) rounds for many graph verification problems by leveraging lower bounds in random-partition communication complexity.
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Data Structures and Algorithms, C.2.4, C.2.4; F.1.1, Data Structures and Algorithms (cs.DS), Distributed graph algorithms; Graph connectivity; Graph sketching; Massive graphs; Minimum spanning trees; Software; Theoretical Computer Science; Hardware and Architecture, Distributed, Parallel, and Cluster Computing (cs.DC), F.1.1, Distributed graph algorithms; Graph connectivity; Graph sketching; Minimum spanning trees; Computational Theory and Mathematics; Hardware and Architecture; Software; Modeling and Simulation
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Data Structures and Algorithms, C.2.4, C.2.4; F.1.1, Data Structures and Algorithms (cs.DS), Distributed graph algorithms; Graph connectivity; Graph sketching; Massive graphs; Minimum spanning trees; Software; Theoretical Computer Science; Hardware and Architecture, Distributed, Parallel, and Cluster Computing (cs.DC), F.1.1, Distributed graph algorithms; Graph connectivity; Graph sketching; Minimum spanning trees; Computational Theory and Mathematics; Hardware and Architecture; Software; Modeling and Simulation
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 17 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
