ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite
Data sources: ZENODO

Task graphs for benchmarking schedulers

Authors: Beránek, Jakub; Böhm, Stanislav; Cima, Vojtěch

Abstract

Workflow Task Graph Dataset

This dataset contains three sets of task graphs representing different types of task workflows:

- Elementary: contains trivial graph shapes, such as tasks with no dependencies or simple fork-join graphs. This set tests how scheduler heuristics react to basic graph scenarios that frequently form parts of larger workflows.
- IRW: inspired by real-world workflows, such as machine learning cross-validation or map-reduce.
- Pegasus: derived from graphs created by the Pegasus Synthetic Workflow Generators (https://github.com/pegasus-isi/WorkflowGenerator).

All of the provided task graphs were generated with, and are compatible with, ESTEE (https://github.com/It4innovations/estee), which can simulate their execution on a distributed system under various scheduling heuristics and environment conditions.

Data Format

Task graphs are stored in {elementary, irw, pegasus}.zip files that contain JSON representations of the respective task graphs with the following fields:

- `graph_name`: task graph name
- `graph_id`: unique task graph identifier
- `graph`: task graph representation, a list of tasks where each task is a dictionary with the following keys:
  - `d`: actual task duration in seconds (float value)
  - `e_d`: user-estimated task duration in seconds (float value)
  - `cpus`: task CPU core requirements (integer value)
  - `outputs`: list of task outputs (list of integers giving the sizes of the task outputs in MiB)
  - `inputs`: list of task inputs, each in the format `[task_id, output_index]`. The output index is zero-based.

For example, this task graph:

    [{'d': 200, 'e_d': 180, 'cpus': 1, 'outputs': [100], 'inputs': []},
     {'d': 50, 'e_d': 60, 'cpus': 2, 'outputs': [], 'inputs': [[0, 0]]}]

contains two tasks. The first requires no input and a single CPU core, has an estimated duration of 180 s and an actual duration of 200 s, and produces a single output of 100 MiB. The second takes the 0-th output of task 0 as its input, requires 2 CPU cores, produces no output, and has an estimated duration of 60 s and an actual duration of 50 s.

Parsing the data

In Python, to load the elementary task graph set, run the following snippet:

    import pandas as pd
    graphs = pd.read_json("./elementary.zip")

If you have Estee installed, you can use its `json_deserialize` function to parse the JSON-encoded graphs into the Estee TaskGraph data structure:

    from estee.serialization.dask_json import json_deserialize
    graph_json = graphs.loc[0, "graph"]
    graph = json_deserialize(graph_json)
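The task list format can also be inspected without Estee. The following is a minimal sketch that resolves each `inputs` entry to the producing task and the size of the referenced output, and computes the longest dependency chain by duration. It assumes that a task's id is its position in the `graph` list, which is consistent with the two-task example but is not stated explicitly in the description. The helper names `dependency_edges` and `critical_path_length` are hypothetical and are not part of Estee.

```python
from typing import Dict, List, Tuple

# The two-task example from the dataset description.
example_graph = [
    {"d": 200, "e_d": 180, "cpus": 1, "outputs": [100], "inputs": []},
    {"d": 50, "e_d": 60, "cpus": 2, "outputs": [], "inputs": [[0, 0]]},
]


def dependency_edges(graph: List[dict]) -> List[Tuple[int, int, float]]:
    """Return (producer_id, consumer_id, size_mib) for every task input.

    Assumption: a task's id is its index in the list.
    """
    edges = []
    for consumer_id, task in enumerate(graph):
        for producer_id, output_index in task["inputs"]:
            # Look up the size of the referenced output on the producer.
            size = graph[producer_id]["outputs"][output_index]
            edges.append((producer_id, consumer_id, size))
    return edges


def critical_path_length(graph: List[dict], key: str = "d") -> float:
    """Length of the longest dependency chain, summing `key` per task.

    Ignores data transfer times and CPU limits, so this is only a lower
    bound on the makespan, not a simulation result.
    """
    finish: Dict[int, float] = {}

    def finish_time(task_id: int) -> float:
        # Memoized recursion; fine for small graphs in this sketch.
        if task_id not in finish:
            start = max(
                (finish_time(p) for p, _ in graph[task_id]["inputs"]),
                default=0.0,
            )
            finish[task_id] = start + graph[task_id][key]
        return finish[task_id]

    return max((finish_time(i) for i in range(len(graph))), default=0.0)


print(dependency_edges(example_graph))
print(critical_path_length(example_graph))          # actual durations
print(critical_path_length(example_graph, "e_d"))   # estimated durations
```

Passing `key="e_d"` instead of the default `"d"` gives the same bound from the user estimates, which makes it easy to see how far the estimates diverge from the actual durations for a given graph.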

Keywords

task graph, benchmark, workflow, scheduling
