Dataset Artifact for paper "Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?"

Artifacts for the paper titled "Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?". This artifact repository contains 9 compressed folders, as follows:

| ID | File Name | Description |
|----|-----------|-------------|
| 1 | syn_circa.zip | CIRCA10 and CIRCA50 datasets for Causal Discovery |
| 2 | syn_rcd.zip | RCD10 and RCD50 datasets for Causal Discovery |
| 3 | syn_causil.zip | CausIL10 and CausIL50 datasets for Causal Discovery |
| 4 | rca_circa.zip | CIRCA10 and CIRCA50 datasets for RCA |
| 5 | rca_rcd.zip | RCD10 and RCD50 datasets for RCA |
| 6 | online-boutique.zip | Online Boutique dataset for RCA |
| 7 | sock-shop-1.zip | Sock Shop 1 dataset for RCA |
| 8 | sock-shop-2.zip | Sock Shop 2 dataset for RCA |
| 9 | train-ticket.zip | Train Ticket dataset for RCA |

Each zip file contains the generated/collected data from the corresponding data generator or microservice benchmark system (e.g., online-boutique.zip contains metrics data collected from the Online Boutique system).

Details about the generation of our datasets

1. Synthetic datasets

We use three different synthetic data generators from three previous RCA studies [15, 25, 28] to create the synthetic datasets: CIRCA, RCD, and CausIL data generators. Their mechanisms are as follows:

1. CIRCA data generator [28] generates a random causal directed acyclic graph (DAG) based on a given number of nodes and edges. From this DAG, time series data for each node is generated using a vector auto-regression (VAR) model. A fault is injected into a node by altering the noise term in the VAR model for two timestamps.
2. RCD data generator [25] uses the pyAgrum package [3] to generate a random DAG based on a given number of nodes, subsequently generating discrete time series data for each node, with values ranging from 0 to 5. A fault is introduced into a node by changing its conditional probability distribution.
3. CausIL data generator [15] generates causal graphs and time series data that simulate the behavior of microservice systems. It first constructs a DAG of services and metrics based on domain knowledge, then generates metric data for each node of the DAG using regressors trained on real metrics data. Unlike the CIRCA and RCD data generators, the CausIL data generator does not have the capability to inject faults.

To create our synthetic datasets, we first generate 10 DAGs whose nodes range from 10 to 50 for each of the synthetic data generators. Next, we generate fault-free datasets using these DAGs with different seedings, resulting in 100 cases for the CIRCA and RCD generators and 10 cases for the CausIL generator. We then create faulty datasets by introducing ten faults into each DAG and generating the corresponding faulty data, yielding 100 cases for the CIRCA and RCD data generators. The fault-free datasets (e.g., `syn_rcd`, `syn_circa`) are used to evaluate causal discovery methods, while the faulty datasets (e.g., `rca_rcd`, `rca_circa`) are used to assess RCA methods.

2. Data collected from benchmark microservice systems

We deploy three popular benchmark microservice systems: Sock Shop [6], Online Boutique [4], and Train Ticket [8], on a four-node Kubernetes cluster hosted by AWS. Next, we use the Istio service mesh [2] with Prometheus [5] and cAdvisor [1] to monitor and collect resource-level and service-level metrics of all services, as in previous works [25, 39, 59]. To generate traffic, we use the load generators provided by these systems and customise them to explore all services with 100 to 200 users concurrently. We then introduce five common faults (CPU hog, memory leak, disk IO stress, network delay, and packet loss) into five different services within each system. Finally, we collect metrics data before and after the fault injection operation. An overview of our setup is presented in the figure below.

Code

The code to reproduce the experimental results in the paper is available at https://github.com/phamquiluan/RCAEval.

References

As in our paper.
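
Illustrative sketch

The CIRCA-style mechanism described under the synthetic datasets (a random DAG, a VAR model over that DAG, and a fault that alters one node's noise term for two timestamps) can be sketched roughly as follows. This is not the CIRCA generator itself: the function names, the lag-1 model, the coefficients (0.3 self-weight, 0.5 parent weight), and the choice to model the fault by scaling the noise standard deviation are all assumptions made for illustration only.

```python
import random


def random_dag(num_nodes, num_edges, rng):
    """Sample a random DAG as (parent, child) pairs with parent < child.

    Restricting edges to parent < child guarantees acyclicity (hypothetical helper).
    """
    candidates = [(i, j) for i in range(num_nodes) for j in range(i + 1, num_nodes)]
    return sorted(rng.sample(candidates, num_edges))


def simulate(num_nodes, edges, length, rng, fault_node=None, fault_start=None,
             fault_len=2, fault_scale=10.0, noise_std=1.0):
    """Generate a lag-1 VAR time series over the DAG.

    If fault_node and fault_start are given, the noise of that node is
    scaled by fault_scale for fault_len consecutive timestamps (assumed
    fault model; the real generator may alter the noise differently).
    """
    parents = {j: [i for i, k in edges if k == j] for j in range(num_nodes)}
    series = [[0.0] * num_nodes]
    for t in range(1, length):
        prev = series[-1]
        row = []
        for j in range(num_nodes):
            in_fault = (
                fault_node == j
                and fault_start is not None
                and fault_start <= t < fault_start + fault_len
            )
            std = noise_std * fault_scale if in_fault else noise_std
            # Each node depends on its own previous value and its parents' previous values.
            value = 0.3 * prev[j] + sum(0.5 * prev[i] for i in parents[j])
            row.append(value + rng.gauss(0.0, std))
        series.append(row)
    return series


# Same DAG and same seed, so the two runs differ only because of the fault.
edges = random_dag(10, 15, random.Random(0))
normal = simulate(10, edges, 200, random.Random(1))
faulty = simulate(10, edges, 200, random.Random(1), fault_node=3, fault_start=100)
```

Because both runs share a seed, `normal` and `faulty` are identical before timestamp 100 and diverge afterwards, first at node 3 and then at its descendants in the DAG, which is the kind of propagation an RCA method is expected to trace back to the faulty node.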

Related Organizations

RMIT University
Australia
Chongqing University
China (People's Republic of)

Keywords

Microservices, Microservice Systems, AIOps, Root Cause Analysis
