Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Closed Access logo, derived from PLoS Open Access logo. This version with transparent background. http://commons.wikimedia.org/wiki/File:Closed_Access_logo_transparent.svg Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Computer Communicati...arrow_drop_down
image/svg+xml Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao Closed Access logo, derived from PLoS Open Access logo. This version with transparent background. http://commons.wikimedia.org/wiki/File:Closed_Access_logo_transparent.svg Jakob Voss, based on art designer at PLoS, modified by Wikipedia users Nina and Beao
Computer Communications
Article . 2020 . Peer-reviewed
License: Elsevier TDM
Data sources: Crossref
DBLP
Article . 2025
Data sources: DBLP
versions View all 2 versions
addClaim

Benchmarking big data systems: A survey

Authors: Fuad Bajaber; Sherif Sakr; Omar Batarfi; Abdulrahman H. Altalhi; Ahmed Barnawi;

Benchmarking big data systems: A survey

Abstract

Abstract With the enormous growth on the availability and usage of Big Data storage and processing systems, it has become essential to assess the various performance aspects of these systems so that we can carefully understand their strong and weak aspects. In practice, currently, when an individual/enterprise aims to develop a Big Data storage and processing solution for harnessing the knowledge inside their data, they will get challenged by the availability of several frameworks from which they need to select. This is a challenging task which needs to directed by with good knowledge about various perspectives of such systems. Additionally, the choice normally vary from one scenario to another according to the essential needs of the application. In practice, there is no single benchmark study which can cover the different types of big data processing requirements, systems, application scenarios and metrics. Several benchmarks and benchmarking studies have been developed where each study focuses on some representative type of frameworks and only consider some aspects to cover. In this article, we provide a comprehensive survey and analysis of the state-of-the-art of benchmarking the different types of big data systems (e.g., NoSQL databases, Big SQL engines, Big Streaming engines, Big Graph Processing engines, Big Machine/Deep Learning engines). Additionally, we highlight some of the significant open challenges and missing requirements of current benchmarks of big data systems with suggestions of directions for future extensions and improvements.

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    20
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Top 10%
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Top 10%
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Top 10%
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
20
Top 10%
Top 10%
Top 10%
Upload OA version
Are you the author of this publication? Upload your Open Access version to Zenodo!
It’s fast and easy, just two clicks!