IEEE Access
Article . 2020 . Peer-reviewed
License: CC BY
Data sources: Crossref
Other versions (9 in total):
IEEE Access . Article . License: CC BY . Data sources: UnpayWall
IEEE Access . Article . 2020 . Data sources: DOAJ
ZENODO . Article . 2020 . License: CC BY . Data sources: ZENODO
IEEE Access . Article . 2020 . Peer-reviewed
https://dx.doi.org/10.60692/rg... . Other literature type . 2020 . Data sources: Datacite
https://dx.doi.org/10.60692/65... . Other literature type . 2020 . Data sources: Datacite
DBLP . Article . 2021 . Data sources: DBLP

DSPBench: A Suite of Benchmark Applications for Distributed Data Stream Processing Systems

Authors: Maycon Viana Bordin; Dalvan Griebler; Gabriele Mencagli; Cláudio F. R. Geyer; Luiz Gustavo Leão Fernandes

Abstract

Applications characterized by the continuous processing of large data streams have recently attracted the attention of the scientific community and industrial stakeholders. The need for high-level programming tools has led to the design of Data Stream Processing Systems (DSPSs) able to ease the development of streaming applications in distributed computing environments. Several systems of this kind have been released and are currently maintained as open source projects, such as Apache Storm and Spark Streaming. Although the scientific community often uses some benchmark applications to test and evaluate new techniques for improving the performance and usability of DSPSs, the available benchmark suites still lack representative workloads from the different areas of interest in the stream processing domain. The goal of this paper is to present a new benchmark suite composed of 15 applications from areas such as Finance, Telecommunication, Sensor Networks, Social Networks and others. The paper describes in detail the nature of these applications and their full workload characterization in terms of selectivity, processing cost, input size and overall memory occupation, and provides a first assessment of the usefulness of our benchmark suite for comparing real DSPSs, using Apache Storm and Spark Streaming for this analysis.

Country
Italy
Keywords

Big Data, FOS: Computer and information sciences, History, Data Stream Management Systems and Techniques, Computer Networks and Communications, Usability, Spark Streaming, Data Stream Processing, Cloud Computing and Big Data Technologies, Leverage (statistics), Apache Storm, Artificial Intelligence, Data Streams, Machine learning, Suite, Benchmarking, Adaptation to Concept Drift in Data Streams, Data mining, Stream Processing, Geography, Streaming Data, Data stream mining, Computer science, Distributed computing, TK1-9971, Data Stream Management, Programming language, Operating system, Archaeology, Distributed Systems, Physical Sciences, SPARK (programming language), Electrical engineering. Electronics. Nuclear engineering, Benchmark (surveying), Geodesy, Information Systems

  • Impact by BIP!
    selected citations: 28 (derived from selected sources; an alternative to the "Influence" indicator)
    popularity: Top 10% (the "current" impact/attention of the article in the research community at large, based on the underlying citation network)
    influence: Top 10% (the overall/total impact of the article in the research community at large, based on the underlying citation network, diachronically)
    impulse: Top 10% (the initial momentum of the article directly after its publication, based on the underlying citation network)
  • Usage by OpenAIRE UsageCounts
    views: 8
    downloads: 10