
A scale-out system is a cluster of commodity machines, and offers a good platform to support steadily increasing workloads that process growing data sets. Sharding [4] is a method of partitioning data and processing a computation on a scale-out system. In a database system, a large table can be partitioned into small tables so each node can process its part of the computation. The sharding approach in a large batch transaction processing, which is important in financial area, presents two hard problems to programmers. Programmers have to write complex code (1) to transfer the input data so as to align the computations with the data partitions, and (2) to manage the distributed transactions. This paper presents a new parallel programming framework that makes parallel transactional programming easier by specifying transaction scopes and partitioners to simplify the code. Transaction scopes include series of subtransactions, each of which performs local operations. The system manages the distributed transactions automatically. A partitioner represents how the computation should be decomposed and aligned with the data partitions to avoid remote database accesses. Between paired of subtransactions, the system handles the data shuffling across the network. We implemented our parallel programming framework as a new Java class library. We hide all of the complex details of data transfer and distributed transaction management in the library. Our programming framework can eliminate almost 66% of the lines of code compared to a current programming approach without programming framework support. We also confirmed good scalability, with a scaling factor of 20.6 on 24 nodes using our modified batch program for the TPC-C benchmark.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
