PARTIAL KEY GROUPING: Load-Balanced Partitioning of Distributed Streams

Preprint English OPEN
Nasir, Muhammad Anis Uddin; Morales, Gianmarco De Francisci; Garcia-Soriano, David; Kourtellis, Nicolas; Serafini, Marco;
  • Publisher: Qatar Computing Research Institute
  • Subject: Computer Science - Distributed, Parallel, and Cluster Computing | Load balancing | stream processing | Computer Systems | Datorsystem | stream grouping | power of both choices

We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce PARTIAL KEY GROUPING (PKG), a new stream partitioning scheme that adapts the classical “power of two choices” to a distributed str... View more
  • References (33)
    33 references, page 1 of 4

    [1] Y. Ben-Haim and E. Tom-Tov, “A Streaming Parallel Decision Tree Algorithm,” JMLR, vol. 11, pp. 849-872, 2010.

    [2] R. Berinde, P. Indyk, G. Cormode, and M. J. Strauss, “Space-optimal heavy hitters with strong error bounds,” ACM Trans. Database Syst., vol. 35, no. 4, pp. 1-28, 2010.

    [3] G. H. Gonnet, “Expected length of the longest probe sequence in hash code searching,” J. ACM, vol. 28, no. 2, pp. 289-304, 1981.

    [4] M. Mitzenmacher, “The power of two choices in randomized load balancing,” IEEE Trans. Parallel Distrib. Syst., vol. 12, no. 10, pp. 1094-1104, 2001.

    [5] M. A. Uddin Nasir, G. De Francisci Morales, D. Garcia-Soriano, N. Kourtellis, and M. Serafini, “The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines,” in ICDE, 2015.

    [6] M. A. Shah, J. M. Hellerstein, S. Chandrasekaran, and M. J. Franklin, “Flux: An adaptive partitioning operator for continuous query systems,” in ICDE, 2003, pp. 25-36.

    [7] M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Cetintemel, Y. Xing, and S. B. Zdonik, “Scalable distributed stream processing,” in CIDR, vol. 3, 2003, pp. 257-268.

    [8] Y. Xing, S. Zdonik, and J.-H. Hwang, “Dynamic load distribution in the borealis stream processor,” in ICDE, 2005, pp. 791-802.

    [9] B. Gedik, “Partitioning functions for stateful data parallelism in stream processing,” The VLDB Journal, pp. 1-23, 2013.

    [10] C. Balkesen, N. Tatbul, and M. T. O¨zsu, “Adaptive input admission and management for parallel stream processing,” in DEBS. ACM, 2013, pp. 15-26.

  • Related Research Results (1)
  • Metrics
Share - Bookmark