Subject: Computer Science - Distributed, Parallel, and Cluster Computing | Computer Science - Data Structures and Algorithms
When handling large datasets that exceed the capacity of the main memory, movement of data between main memory and external memory (disk), rather than actual (CPU) computation time, is often the bottleneck in the computation. Since data is moved between disk and main me...
External Memory Pipelining Made Easy With TPIE
Contribution for newspaper or weekly magazine | English | OPEN