Distributed Computing

Publication: Article, 01 Jun 2019
Publisher: IEEE
Journal: 2019 14th Iberian Conference on Information Systems and Technologies (CISTI)

Authors: Rui Santos Cruz; Miguel Casquilho

doi: 10.23919/cisti.2019.8760827

Abstract

This study is based on a concrete problem in a fertilizer factory concerning the estimation of process parameters: calculating the mean and standard deviation of bag weights when only the total weights (sums) of loads are available, each load containing an unequal but known number of bags (the “equal” case being trivial). With many distribution depots, the data for each depot must be collected for processing. These issues are addressed in a Cloud Computing, big-data framework. Apache Spark is described and adopted as advantageous over Hadoop because of its “in-memory computation” and Resilient Distributed Datasets. The computation uses Terraform and Ansible as configuration tools and is deployed on the Google Cloud Platform. Preliminary evaluation tests confirmed good accuracy and produced low runtimes.
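The estimation problem stated in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes bag weights are independent with a common mean and variance, so that a load of n bags has expected total n·mean and variance n·variance, and it uses a standard moment-based estimator under that assumption. The names `load_stats`, `merge`, `estimate`, and the depot data are hypothetical, and plain Python stands in for Spark. The point of the sketch is that each depot only needs to contribute four partial sums, which can then be merged in any order, the kind of reduction a distributed framework performs.

```python
from functools import reduce
from math import sqrt


def load_stats(n_bags, total_weight):
    """Partial sums contributed by one load: (loads, bags, sum W, sum W^2/n)."""
    return (1, n_bags, total_weight, total_weight ** 2 / n_bags)


def merge(a, b):
    """Combine two tuples of partial sums (associative and commutative)."""
    return tuple(x + y for x, y in zip(a, b))


def estimate(stats):
    """Estimate the per-bag mean and standard deviation from merged sums.

    Assumes independent bag weights with common mean and variance
    (an assumption of this sketch, not a claim about the paper's method).
    """
    k, n_total, s, q = stats
    if k < 2:
        raise ValueError("at least two loads are needed to estimate the spread")
    mean = s / n_total
    # Equals sum((W_i - n_i * mean)^2 / n_i) / (k - 1), written in one-pass form.
    # The one-pass form can lose precision when the variance is tiny
    # relative to the mean, so negative round-off is clamped to zero.
    variance = (q - s * s / n_total) / (k - 1)
    return mean, sqrt(max(variance, 0.0))


# Hypothetical data: (number of bags, total load weight) per load, per depot.
depots = {
    "depot_a": [(2, 100.0), (3, 153.0)],
    "depot_b": [(5, 245.0)],
}

# Each depot reduces its own loads; the results are then merged centrally.
per_depot = [
    reduce(merge, (load_stats(n, w) for n, w in loads))
    for loads in depots.values()
]
mean, std = estimate(reduce(merge, per_depot))
```

With these illustrative numbers the estimated mean is 49.8 and the estimated variance is 3.8 (standard deviation about 1.95). Dividing each squared residual by the load's bag count is what makes loads of unequal size comparable; with equal counts the formula reduces to the ordinary sample variance of the load totals divided by the bag count.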

Related Organizations

University of Lisbon
Portugal

Impact by BIP!

	citations: 2 (an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
	popularity: Average (reflects the "current" impact/attention, the "hype", of an article in the research community at large, based on the underlying citation network)
	influence: Average (reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
	impulse: Average (reflects the initial momentum of an article directly after its publication, based on the underlying citation network)