
arXiv: 1508.04265
Component-centric distributed graph processing platforms that use a bulk synchronous parallel (BSP) programming model have gained traction. These address the short-comings of Big Data abstractions/platforms like MapReduce/Hadoop for large-scale graph processing. However, there is limited literature on foundational aspects of the behavior of these component-centric abstractions for different graphs, graph partitioning, and graph algorithms. Here, we propose a analytical approach based on a meta-graph sketch to examine the characteristics of component-centric graph programming models at a coarse granularity. In particular, we apply this sketch to subgraph- and block-centric abstractions, and draw a comparison with vertex-centric models like Google's Pregel. First, we explore the impact of various graph partitioning techniques on the meta-graph, and next consider the impact of the meta-graph on graph algorithms. This decouples the unwieldy large graph and their partitioning specific artifacts from their algorithmic analysis. We use 5 spatial and powerlaw graphs as exemplars, four different partitioning strategies, and PageRank and Breadth First Search as canonical algorithms. These analysis over the meta-graphs provide a reliable measure of the expected number of supersteps, and the communication and computational complexity of the algorithms for various graphs, and the relative merits of subgraph-centric models over vertex-centric ones.
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)
FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 6 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
