publication . Research . Preprint . Other literature type . 2017

A Formal Semantics for Data Analytics Pipelines

Drocco, Maurizio; Misale, Claudia; Tremblay, Guy; Aldinucci, Marco
Open Access
  • Published: 03 May 2017
Abstract
Comment: 24 pages
Subjects
free text keywords: Big Data analytics, Types, Parallel computing, Distributed computing, Computer Science - Programming Languages, D.1.3, D.3.2, D.2.4
Funded by
EC| TOREADOR
Project
TOREADOR
TrustwOrthy model-awaRE Analytics Data platfORm
  • Funder: European Commission (EC)
  • Project Code: 688797
  • Funding stream: H2020 | RIA
EC| RePhrase
Project
RePhrase
REfactoring Parallel Heterogeneous Resource-Aware Applications - a Software Engineering Approach
  • Funder: European Commission (EC)
  • Project Code: 644235
  • Funding stream: H2020 | RIA

