ZENODO · Dataset · 2019 · License: CC BY

Server-side I/O request arrival traces

Authors: ZANON BOITO, Francieli; BEZ, Jean Luca

Abstract

Dataset generated for the "On server-side file access pattern matching" paper (Boito et al., HPCS 2019). The traces were obtained following the methodology described in the paper. In addition to the two data sets discussed in the paper, we are also making available an extra data set of server traces.

Traces from I/O nodes

IOnode_traces/output/commands has the list of commands used to generate these traces. Each test is identified by a label, and the test_info.csv file contains the mapping of labels to access patterns. Some files include information about experiments with 8 I/O nodes, but these were removed from the data set because they had some errors.

IOnode_traces/output contains .map files, which detail the mapping of clients to I/O nodes for each experiment, and .out files, which contain the output of the benchmark. IOnode_traces/ contains one folder per experiment. Inside it, there is one folder per I/O node, and inside these folders there are trace files for the read and write portions of the experiments. Due to a mistake during the integration between IOFSL and AGIOS, read requests appear as "W" and writes as "R". This has no impact on results once it is accountedted for when processing the traces.

pattern_length.csv contains the average pattern length (average number of requests per second) for each experiment and operation, obtained with the get_pattern_length.py script.

Each line of a trace looks like this:

277004729325 00000000eaffffffffffff1f729db77200000000000000000000000000000000 W 0 262144

The first number is an internal timestamp in nanoseconds, the second value is the file handle, and the third is the type of the request (inverted: "W" for reads and "R" for writes). The last two numbers give the request offset and size in bytes, respectively.

Traces from parallel file system data servers

These traces are inside the server_traces/ folder. Each experiment has two concurrent applications, "app1" and "app2", and its traces are inside a folder named accordingly:

NOOP_app1_(identification of app1)_app2_(identification of app2)_(repetition)_pvfstrace/

Each application is identified by:

(contig/noncontig)_(number and size of requests per process)_(number of processes)_(number of client machines)_(nto1/nton regarding the number of files)

Inside each folder there are eight trace files, two per data server: one for the read portion and one for the write portion. Each line looks like this:

[D 02:54:58.386900] REQ SCHED SCHEDULING, handle: 5764607523034231596, queue_element: 0x2a11360, type: 0, offset: 458752, len: 32768

The part between [] is a timestamp, "handle" gives the file handle, "type" is 0 for reads and 1 for writes, and "offset" and "len" (length) are in bytes.

server_traces/pattern_length.csv contains the average pattern length for each experiment and operation, obtained with the server_traces/count_pattern_length.py script.

Extra traces from data servers

These traces were not used for the paper because we do not have performance measurements for them with different scheduling policies, so it would not be possible to estimate the results of using the pattern matching approach to select scheduling policies. Still, we share them in the extra_server_traces/ folder in the hope that they will be useful. They were obtained in the same experimental campaign as the other data server traces and have the same format; the difference is that these traces are for single-application scenarios.
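
Both trace formats are simple to parse. Below is a minimal sketch in Python for the I/O node format, assuming whitespace-separated fields as in the sample line above; the IONodeRequest type and parse_ionode_line name are illustrative, not part of the dataset or of the apmatching code.

    from dataclasses import dataclass

    @dataclass
    class IONodeRequest:
        timestamp_ns: int  # internal timestamp in nanoseconds
        handle: str        # file handle (hex string)
        is_read: bool      # corrected for the inverted "W"/"R" labels
        offset: int        # request offset in bytes
        size: int          # request size in bytes

    def parse_ionode_line(line: str) -> IONodeRequest:
        ts, handle, rtype, offset, size = line.split()
        # As noted above, "W" marks reads and "R" marks writes due to the
        # IOFSL/AGIOS integration mistake; flip the labels here.
        return IONodeRequest(int(ts), handle, rtype == "W", int(offset), int(size))

    # Example with the sample line from the description:
    req = parse_ionode_line(
        "277004729325 "
        "00000000eaffffffffffff1f729db77200000000000000000000000000000000 "
        "W 0 262144"
    )
    assert req.is_read and req.offset == 0 and req.size == 262144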

The source code used in the paper to handle these trace files is available in a git repository: https://gitlab.inria.fr/frzanonb/apmatching
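
For the data server traces, a similar hedged sketch (independent of the repository above) extracts the fields from the trace line format shown in the abstract; the regular expression and names are assumptions based only on the sample line.

    import re
    from datetime import datetime

    LINE_RE = re.compile(
        r"\[D (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\] REQ SCHED SCHEDULING, "
        r"handle: (?P<handle>\d+), queue_element: (?P<qe>0x[0-9a-fA-F]+), "
        r"type: (?P<type>[01]), offset: (?P<offset>\d+), len: (?P<len>\d+)"
    )

    def parse_server_line(line):
        m = LINE_RE.match(line.strip())
        if m is None:
            return None  # skip lines that are not scheduling records
        return {
            "time": datetime.strptime(m["time"], "%H:%M:%S.%f").time(),
            "handle": int(m["handle"]),
            "is_read": m["type"] == "0",  # type is 0 for reads, 1 for writes
            "offset": int(m["offset"]),
            "len": int(m["len"]),
        }

    rec = parse_server_line(
        "[D 02:54:58.386900] REQ SCHED SCHEDULING, handle: 5764607523034231596, "
        "queue_element: 0x2a11360, type: 0, offset: 458752, len: 32768"
    )
    assert rec is not None and rec["is_read"] and rec["len"] == 32768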

Keywords

high performance computing, I/O forwarding, parallel I/O, file access, parallel file system, traces, I/O requests
