# Workflow Task Graph Dataset

This dataset contains three sets of task graphs representing different types of task workflows:

- **Elementary** - contains trivial graph shapes, such as tasks with no dependencies or simple fork-join graphs. This set tests how scheduler heuristics react to basic graph scenarios that frequently form parts of larger workflows.
- **IRW** - inspired by real-world workflows, such as machine learning cross-validation or map-reduce.
- **Pegasus** - derived from graphs created by the Pegasus Synthetic Workflow Generators (https://github.com/pegasus-isi/WorkflowGenerator).

All of the provided task graphs are generated to be compatible with ESTEE (https://github.com/It4innovations/estee), which allows simulating their execution on a distributed system under various scheduling heuristics and environment conditions.

## Data Format

Task graphs are stored in `elementary.zip`, `irw.zip` and `pegasus.zip` files that contain JSON representations of the respective task graphs with the following fields:

- `graph_name` - task graph name
- `graph_id` - unique task graph identifier
- `graph` - task graph representation: a list of tasks, where each task is a dictionary with the following keys:
  - `d`: actual task duration in seconds (float)
  - `e_d`: user-estimated task duration in seconds (float)
  - `cpus`: task CPU core requirement (integer)
  - `outputs`: list of task outputs (integers giving the sizes of the outputs in MiB)
  - `inputs`: list of task inputs, each a pair `[task_id, output_index]`; the output index is zero-based

For example, this task graph:

```json
[{"d": 200, "e_d": 180, "cpus": 1, "outputs": [100], "inputs": []},
 {"d": 50, "e_d": 60, "cpus": 2, "outputs": [], "inputs": [[0, 0]]}]
```

contains two tasks. The first requires no input and a single CPU core, has an estimated duration of 180 s and an actual duration of 200 s, and produces a single output of 100 MiB. The second requires task 0's 0-th output as its input, requires 2 CPU cores, produces no output, and has an estimated duration of 60 s and an actual duration of 50 s.

## Parsing the data

In Python, to load the elementary task graph set, run the following snippet:

```python
import pandas as pd

graphs = pd.read_json("./elementary.zip")
```

If you have Estee installed, you can use its `json_deserialize` function to parse the JSON-encoded graphs into the Estee `TaskGraph` data structure:

```python
from estee.serialization.dask_json import json_deserialize

graph_json = graphs.loc[0, "graph"]
graph = json_deserialize(graph_json)
```
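Because each task's `inputs` entry references the producing task by index, simple graph metrics can be computed directly from the JSON, without Estee. The following sketch (a hypothetical helper, not part of the dataset or Estee) computes the critical path length of a task graph using the actual durations `d`, ignoring data-transfer costs; it assumes every task appears in the list after all of its dependencies, which holds for the example above.

```python
# Minimal sketch: critical path length of a task graph in the dataset's JSON
# format. Assumption (not guaranteed by the dataset description): tasks are
# listed in dependency order, i.e. every task follows its input tasks.

def critical_path(tasks):
    """Return the length of the longest dependency chain, in seconds."""
    finish = [0.0] * len(tasks)  # earliest possible finish time per task
    for i, task in enumerate(tasks):
        # A task may start once every task whose output it consumes is done.
        start = max((finish[dep] for dep, _out_idx in task["inputs"]), default=0.0)
        finish[i] = start + task["d"]
    return max(finish, default=0.0)

example = [
    {"d": 200, "e_d": 180, "cpus": 1, "outputs": [100], "inputs": []},
    {"d": 50, "e_d": 60, "cpus": 2, "outputs": [], "inputs": [[0, 0]]},
]
print(critical_path(example))  # 200 s + 50 s -> 250.0
```

The same loop pattern extends naturally to other per-graph statistics, such as total work (`sum(t["d"] for t in tasks)`) or maximum CPU requirement.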
Keywords: task graph, benchmark, workflow, scheduling
