Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment

Article, 20 Mar 2024. Publisher: Politechnika Lubelska. Journal: Journal of Computer Sciences Institute, volume 30, pages 1-8 (eISSN: 2544-0764)

Authors: Mikołaj Skrzypczyński; Piotr Muryjas

doi: 10.35784/jcsi.4060

Abstract

The aim of this paper is to analyze data processing efficiency using Apache Hive and Apache Pig in a Hadoop environment. The analysis compares the two tools on a large data set of 28 million records. The research used scripts and queries written for Apache Hive and Apache Pig, each executed 10 times on a virtual machine created for this purpose. These methods were applied to the same data sets 16 times, according to previously prepared research scenarios. The authors conclude that Apache Hive is a more efficient tool than Apache Pig.
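The repeated-execution approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' measurement code: the function names `time_runs` and `summarize` and the stand-in workload `placeholder_job` are hypothetical, and the sketch only assumes that each job is timed by wall clock over a fixed number of repetitions.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(job: Callable[[], object], repeats: int = 10) -> List[float]:
    """Run `job` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to run count, mean, minimum and maximum."""
    return {
        "runs": len(durations),
        "mean_s": statistics.mean(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


def placeholder_job() -> int:
    # Hypothetical stand-in workload; a real benchmark would launch
    # a Hive query or a Pig script here instead.
    return sum(range(100_000))


if __name__ == "__main__":
    # 10 repetitions, matching the repetition count stated in the abstract.
    print(summarize(time_runs(placeholder_job, repeats=10)))
```

In an actual comparison, `placeholder_job` would presumably be replaced by a call that invokes the Hive or Pig command-line client on the same input data, so that the two summaries can be compared scenario by scenario.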

Keywords

Hadoop, Electronic computers. Computer science, Apache Pig, Information technology, QA75.5-76.95, Apache Hive, T58.5-58.64, data processing

Published in a Diamond OA journal

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
