Block-parallel data analysis with DIY2

Publication type: Article, Report, Conference object
Date: 01 Oct 2016
Country: United States
Publisher: IEEE
Journal: 2016 IEEE 6th Symposium on Large Data Analysis and Visualization (LDAV)

Authors: Morozov, Dmitriy; Peterka, Tom

doi: 10.1109/ldav.2016.7874307, 10.2172/1377403

Abstract

DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on complete analysis codes.
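The block abstraction described in the abstract can be illustrated with a small, serial sketch. This is not the DIY2 API, and it does not model assignment of blocks to processes or threads. The names `BlockStore`, `decompose`, and `run`, the pickle-based swapping, and the ring-shaped neighbor exchange are all hypothetical choices made for illustration. The sketch only shows the idea that a computation written as callbacks over blocks can run unchanged whether all blocks fit in memory or only a few do.

```python
import os
import pickle
import tempfile


class BlockStore:
    """Hypothetical block container; keeps at most memory_limit (>= 1) blocks in memory."""

    def __init__(self, memory_limit, directory):
        self.memory_limit = memory_limit
        self.directory = directory
        self.in_memory = {}   # gid -> block (a plain dict)
        self.on_disk = set()  # gids currently swapped out to files

    def _path(self, gid):
        return os.path.join(self.directory, f"block_{gid}.pkl")

    def _evict(self, keep):
        # Swap out blocks other than `keep` until the limit is respected.
        while len(self.in_memory) > self.memory_limit:
            victim = next(g for g in self.in_memory if g != keep)
            with open(self._path(victim), "wb") as f:
                pickle.dump(self.in_memory.pop(victim), f)
            self.on_disk.add(victim)

    def put(self, gid, block):
        self.in_memory[gid] = block
        self._evict(keep=gid)

    def get(self, gid):
        # Load the block from disk if it was swapped out.
        if gid in self.on_disk:
            with open(self._path(gid), "rb") as f:
                self.in_memory[gid] = pickle.load(f)
            self.on_disk.discard(gid)
            self._evict(keep=gid)
        return self.in_memory[gid]

    def foreach(self, callback):
        # The computation is expressed as a callback applied to every block.
        for gid in sorted(set(self.in_memory) | self.on_disk):
            callback(gid, self.get(gid))


def decompose(data, nblocks):
    """Split a list into nblocks contiguous blocks of near-equal size."""
    n = len(data)
    bounds = [n * i // nblocks for i in range(nblocks + 1)]
    return [data[bounds[i]:bounds[i + 1]] for i in range(nblocks)]


def run(data, nblocks, memory_limit):
    with tempfile.TemporaryDirectory() as tmp:
        store = BlockStore(memory_limit, tmp)
        for gid, values in enumerate(decompose(data, nblocks)):
            store.put(gid, {"values": values, "local_sum": 0, "inbox": []})

        # Step 1: local computation on each block.
        def local_sum(gid, block):
            block["local_sum"] = sum(block["values"])
        store.foreach(local_sum)

        # Step 2: a simple communication pattern; each block sends its
        # local sum to its right neighbor in a ring.
        outgoing = {}
        def enqueue(gid, block):
            outgoing[(gid + 1) % nblocks] = block["local_sum"]
        store.foreach(enqueue)

        def receive(gid, block):
            block["inbox"].append(outgoing[gid])
        store.foreach(receive)

        result = []
        store.foreach(lambda gid, b: result.append((gid, b["local_sum"], b["inbox"][0])))
        return result


in_core = run(list(range(10)), nblocks=4, memory_limit=4)
out_of_core = run(list(range(10)), nblocks=4, memory_limit=1)
```

Here `in_core` and `out_of_core` are computed by the same `run` function; only `memory_limit` differs, which stands in for the choice between in-core and out-of-core execution.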

Country

United States

Related Organizations

Lawrence Livermore National Laboratory, United States
Lawrence Berkeley National Laboratory
Argonne National Laboratory, United States

Impact by BIP!

Selected citations: 33 (derived from selected sources; an alternative to the "Influence" indicator, which also reflects the overall impact of an article in the research community at large, based on the underlying citation network)
Popularity: Top 10% (the "current" impact or attention of an article in the research community at large, based on the underlying citation network)
Influence: Top 10% (the overall impact of an article in the research community at large, based on the underlying citation network, diachronically)
Impulse: Top 10% (the initial momentum of an article directly after its publication, based on the underlying citation network)

Open Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
