
We present a large-scale anomaly detection dataset collected from the IBM Cloud Console over approximately 4.5 months. This high-dimensional dataset captures telemetry from multiple data centers and is designed to help researchers develop and benchmark anomaly detection methods for large-scale cloud environments. It contains 39,365 entries, each representing a 5-minute interval. The pivoted table has 117,449 columns; with interval_start used as the index, this leaves 117,448 features/attributes. The dataset covers request counts, HTTP response codes, and various aggregated statistics. It also includes labeled anomaly events identified through IBM's internal monitoring tools, making it a resource for real-world anomaly detection research and evaluation.

File Descriptions

location_downtime.csv - Planned and unplanned downtimes for IBM Cloud data centers, with start and end times in ISO 8601 format.
unpivoted_data.parquet - Raw telemetry data with more than 413 million rows, covering location, HTTP status codes, request types, and aggregated statistics (min, max, median response times).
anomaly_windows.csv - Ground truth for anomalies: start and end times of recorded anomalies, categorized by source (Issue Tracker, Instant Messenger, Test Log).
pivoted_data_all.parquet - Pivoted version of the telemetry dataset with 39,365 rows and 117,449 columns, including aggregated statistics across multiple metrics and intervals.
demo/demo.[ipynb|html] - Examples of how to access data in the Parquet files, provided as a Jupyter Notebook (.ipynb) and as HTML (.html).

Further details of the dataset can be found in Appendix B: Dataset Characteristics of the paper titled "Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset." Sample code for training anomaly detectors using this data is provided in this package.
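As a rough illustration of how the ground-truth windows in anomaly_windows.csv could be mapped onto the 5-minute intervals, the sketch below labels every interval that overlaps an anomaly window. It is not the code shipped in the package: the column names `start`, `end`, and `source`, the helper names `read_windows` and `label_intervals`, and the sample timestamps are all assumptions made for illustration, and it uses only the Python standard library on synthetic input. The last lines check the stated coverage: 39,365 intervals of 5 minutes each.

```python
import csv
import io
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=5)  # each dataset entry covers one 5-minute interval


def read_windows(csv_text, start_col="start", end_col="end"):
    """Parse anomaly windows from CSV text.

    The column names are assumptions; check the header of the real file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (datetime.fromisoformat(row[start_col]), datetime.fromisoformat(row[end_col]))
        for row in reader
    ]


def label_intervals(interval_starts, windows):
    """Return 1 for every interval [t, t + 5 min) that overlaps a window, else 0."""
    labels = []
    for t in interval_starts:
        overlaps = any(start < t + INTERVAL and t < end for start, end in windows)
        labels.append(1 if overlaps else 0)
    return labels


# Tiny synthetic example (not real dataset values).
sample_csv = "start,end,source\n2024-01-01T00:07:00,2024-01-01T00:16:00,Issue Tracker\n"
windows = read_windows(sample_csv)
starts = [datetime(2024, 1, 1) + i * INTERVAL for i in range(5)]
labels = label_intervals(starts, windows)
print(labels)

# Sanity check on the stated coverage: 39,365 intervals of 5 minutes each.
coverage_days = 39_365 * 5 / (60 * 24)
print(round(coverage_days, 1))
```

On the synthetic input, the three intervals starting at 00:05, 00:10, and 00:15 are labeled 1. The coverage works out to about 136.7 days, consistent with the stated 4.5 months. Note that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward, so timestamps in that form may need a different parser on older versions.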
When using the dataset, please cite it as follows:

@misc{islam2024anomaly,
  title={Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset},
  author={Mohammad Saiful Islam and Mohamed Sami Rakha and William Pourmajidi and Janakan Sivaloganathan and John Steinbacher and Andriy Miranskyy},
  year={2024},
  eprint={2411.09047},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2411.09047}
}
Software Engineering, Anomaly Detection, Deep learning, Cloud Computing, IBM Cloud Console
