PARTIAL KEY GROUPING: Load-Balanced Partitioning of Distributed Streams

Preprint English
Nasir, Muhammad Anis Uddin; Morales, Gianmarco De Francisci; Garcia-Soriano, David; Kourtellis, Nicolas; Serafini, Marco
  • Publisher: Qatar Computing Research Institute
  • Subject: Computer Science - Distributed, Parallel, and Cluster Computing | Load balancing | stream processing | Computer Systems | stream grouping | power of both choices

We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce PARTIAL KEY GROUPING (PKG), a new stream partitioning scheme that adapts the classical “power of two choices” to a distributed str...
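
The abstract is truncated here, so the following is only a minimal sketch of the general "power of two choices" idea applied to key-based routing, not the authors' implementation. The class name TwoChoicePartitioner, the SHA-256-based hash with two seeds, and the per-sender local load counters are all assumptions introduced for illustration. Each key gets two candidate workers, and every message goes to whichever candidate currently has the lower locally observed load.

```python
import hashlib


def _seeded_hash(key: str, seed: int, num_workers: int) -> int:
    """Deterministic hash of (seed, key) mapped to a worker index (illustrative choice)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


class TwoChoicePartitioner:
    """Hypothetical sketch: route each message to the less loaded of two candidates."""

    def __init__(self, num_workers: int) -> None:
        self.num_workers = num_workers
        # Load as observed locally by this sender (assumption: message counts).
        self.load = [0] * num_workers

    def candidates(self, key: str) -> tuple[int, int]:
        # Two independent hash functions give two candidate workers per key.
        # The two candidates may coincide for some keys.
        return (
            _seeded_hash(key, 1, self.num_workers),
            _seeded_hash(key, 2, self.num_workers),
        )

    def route(self, key: str) -> int:
        first, second = self.candidates(key)
        # Pick the candidate with the lower load; ties go to the first.
        chosen = first if self.load[first] <= self.load[second] else second
        self.load[chosen] += 1
        return chosen
```

Under these assumptions, a single hot key is spread over at most two workers instead of one, which is why any downstream per-key state has to be merged across both candidates.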
