Automated Configuration Parameter Classification Model for Hive Query Plan on the Apache Yarn

Publication: Article, Conference object
Date: 01 May 2019
Publisher: IEEE
Journal: 2019 IEEE International Conference on Big Data, Cloud Computing, Data Science & Engineering (BCD)

Authors: Jongyeop Kim; Seongsoo Kim; Donghoon Kim; Hong Liu

doi: 10.1109/bcd.2019.8885220


Abstract

This research proposes an automated configuration parameter classification model for arranging an optimized Hive query processing environment on the Apache Hadoop Distributed File System. In this model, the Analyze statistics command is issued to measure the expected performance of Hive tables on the Hadoop YARN platform under varying combinations of parameter configurations. The e-heuristic methodology is applied to effectively shrink the parameter search space during the automated tuning process. We control the transition between evaluation spaces using one main parameter and one auxiliary parameter that are expected to reach the global optimum in each evaluation space. The model identifies the Hive parameters that access Hive tables optimally and is expected to improve query execution time by 15% compared with the default Hive settings.
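
The abstract does not spell out the e-heuristic itself, so the following is only a minimal sketch of the general idea of stepping through evaluation spaces with one main and one auxiliary parameter and pruning the rest of the search space early. Everything in it is an assumption made for illustration: the names `main_param` and `aux_param` are hypothetical placeholders rather than real Hive settings, the windowed early-stopping rule is a stand-in for the paper's method, and `synthetic_cost` replaces an actual measurement on a cluster.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]


def synthetic_cost(config: Config) -> float:
    """Stand-in for a measured query time; NOT real Hive behavior."""
    main, aux = config["main_param"], config["aux_param"]
    return (main - 6) ** 2 + 0.5 * (aux - 3) ** 2 + 10.0


def tune(
    measure: Callable[[Config], float],
    main_values: List[int],
    aux_values: List[int],
    window: int = 3,
) -> Tuple[Config, float, int]:
    """Walk through windows ("evaluation spaces") of main-parameter values.

    Inside each window, try every (main, aux) pair and keep the best one.
    Stop as soon as a window fails to improve on the best cost seen so far,
    which skips the remaining windows and shrinks the search.
    """
    best_config: Config = {}
    best_cost = float("inf")
    evaluations = 0
    for start in range(0, len(main_values), window):
        space = main_values[start:start + window]
        improved = False
        for main, aux in product(space, aux_values):
            config = {"main_param": main, "aux_param": aux}
            cost = measure(config)
            evaluations += 1
            if cost < best_cost:
                best_config, best_cost, improved = config, cost, True
        if not improved:
            break  # hypothetical pruning rule: later windows are skipped
    return best_config, best_cost, evaluations


best, cost, used = tune(synthetic_cost, list(range(1, 13)), list(range(1, 6)))
print(best, cost, used)
```

With this synthetic cost, the search stops after three of the four windows, so 45 of the 60 possible configurations are evaluated. In a real setting, `measure` would be replaced by whatever procedure reports the expected or observed query time for a given configuration.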

Related Organizations

Southern Arkansas University
United States
Indiana University Kokomo
United States
Arkansas State University
United States
Indiana University
United States
Oklahoma State University–Stillwater
United States


Impact by BIP!

- Selected citations: 0 (derived from selected sources; an alternative to the "Influence" indicator)
- Popularity: Average (the "current" impact or attention of an article in the research community, based on the underlying citation network)
- Influence: Average (the overall impact of an article in the research community, based on the underlying citation network, diachronically)
- Impulse: Average (the initial momentum of an article directly after its publication, based on the underlying citation network)


Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

