
One of the key advantages of cloud computing is the elasticity with which computing resources, such as virtual machines, can be scaled up or down. Current state-of-the-art shared-nothing parallel SQL processing systems, on the other hand, are often designed and optimized for a fixed number of database nodes. To take advantage of the elasticity afforded by cloud computing, cloud-based SQL processing systems need the ability to repartition data easily when the number of database nodes is scaled up or down. In this paper, we investigate the problem of supporting elastic partitioning of data in cloud-based parallel SQL processing systems. We propose several algorithms and associated data organization techniques that minimize the repartitioning of tuples and the movement of data between nodes. Our experimental evaluation demonstrates the effectiveness of the proposed methods.
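To make the cost of repartitioning concrete, the following minimal sketch compares how many tuples change nodes when a cluster grows from four to five nodes under two textbook schemes: hashing modulo the node count, and a consistent-hashing ring. This is an illustration only and is not one of the algorithms proposed in this paper. All names (`stable_hash`, `modulo_node`, `HashRing`, `fraction_moved`), the MD5-based hash, the 64 virtual nodes per database node, and the synthetic keys are assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (assumed for illustration; any stable hash works)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def modulo_node(key: str, num_nodes: int) -> int:
    """Naive partitioning: node = hash(key) mod number of nodes."""
    return stable_hash(key) % num_nodes


class HashRing:
    """Consistent-hashing ring with virtual nodes (hypothetical helper)."""

    def __init__(self, num_nodes: int, vnodes: int = 64):
        points = []
        for node in range(num_nodes):
            for v in range(vnodes):
                points.append((stable_hash(f"node{node}#{v}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    def node_for(self, key: str) -> int:
        # The first ring point clockwise from the key's hash owns the key.
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._nodes[i]


def fraction_moved(assign_before, assign_after, keys) -> float:
    """Fraction of keys whose assigned node differs between two assignments."""
    moved = sum(1 for k in keys if assign_before(k) != assign_after(k))
    return moved / len(keys)


# Synthetic partitioning keys standing in for tuples.
keys = [f"tuple-{i}" for i in range(10000)]

# Scale from 4 to 5 nodes under each scheme.
mod_moved = fraction_moved(
    lambda k: modulo_node(k, 4), lambda k: modulo_node(k, 5), keys
)
ring4, ring5 = HashRing(4), HashRing(5)
ring_moved = fraction_moved(ring4.node_for, ring5.node_for, keys)

print(f"modulo hashing: {mod_moved:.1%} of tuples moved")
print(f"consistent hashing: {ring_moved:.1%} of tuples moved")
```

Under modulo hashing, a tuple stays in place only when its hash gives the same remainder for both node counts, so most tuples move. On the ring, the only tuples that move are those claimed by the newly added node, which is the kind of reduction in data movement that elastic partitioning aims for.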
