TUD Anomaly Detection Model (ONNX)

Model Info

This repository contains a trained Autoencoder-based anomaly detection model developed in the context of the MLSysOps project (Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum), funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912. The model is exported in ONNX format for efficient inference on edge or cloud devices.

Purpose

This model performs unsupervised anomaly detection on node/VM telemetry metrics by learning to reconstruct normal observations.

- Input: A feature vector of telemetry metrics (float values), normalized with Min-Max scaling.
- Output: The reconstructed feature vector.
- Anomaly score: RMSE between input and reconstruction.
- Decision rule: anomaly if RMSE > threshold (threshold stored in model_config.json).

Repository Structure

The repository provides the trained model and its configuration for easy deployment.

    .
    ├── demo.py                  # Inference script (ONNXRuntime)
    ├── model/
    │   ├── autoencoder.onnx     # ONNX model
    │   └── model_config.json    # Model configuration (features, normalization, threshold)
    ├── requirements.txt         # Python dependencies
    └── README.md                # Documentation

Training Data

The model was trained on telemetry data representing normal system behavior. The training dataset is not included in this Zenodo record unless explicitly provided in the uploaded files.

Important: The inference input must use the same feature ordering as the training data.
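One way to guard against ordering mistakes is to build the input vector by name instead of by position. The following is a minimal sketch, not code taken from demo.py: the helper order_features is hypothetical, and the three-entry feature_names list is a shortened illustration that stands in for the full 38-entry list (which would normally be read from model/model_config.json, whose exact key names are not documented here).

```python
from typing import List, Mapping, Sequence


def order_features(sample: Mapping[str, float],
                   feature_names: Sequence[str]) -> List[float]:
    """Return the values of `sample` arranged in the order of `feature_names`.

    Raises KeyError if any expected feature is absent, so a mismatched
    input fails loudly instead of being silently misaligned.
    """
    missing = [name for name in feature_names if name not in sample]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(sample[name]) for name in feature_names]


# Shortened, illustrative feature list (assumption: the real list has 38 names).
feature_names = ["cpu_0_idle", "cpu_0_iowait", "memory_used_bytes"]

# Telemetry arriving in arbitrary key order.
sample = {"memory_used_bytes": 2.0e9, "cpu_0_iowait": 0.01, "cpu_0_idle": 0.93}

vector = order_features(sample, feature_names)
print(vector)
```

Extra keys in the sample are ignored, while missing keys raise an error, which matches the requirement that the model always receives the full feature set in a fixed order.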
Features Used (Feature Order)

The expected feature order (last dimension of the input tensor) is:

- cpu_0_idle
- cpu_0_iowait
- cpu_0_irq
- cpu_0_nice
- cpu_0_softirq
- cpu_0_steal
- cpu_0_system
- cpu_0_user
- cpu_1_idle
- cpu_1_iowait
- cpu_1_irq
- cpu_1_nice
- cpu_1_softirq
- cpu_1_steal
- cpu_1_system
- cpu_1_user
- cpu_2_idle
- cpu_2_iowait
- cpu_2_irq
- cpu_2_nice
- cpu_2_softirq
- cpu_2_steal
- cpu_2_system
- cpu_2_user
- cpu_3_idle
- cpu_3_iowait
- cpu_3_irq
- cpu_3_nice
- cpu_3_softirq
- cpu_3_steal
- cpu_3_system
- cpu_3_user
- memory_used_bytes
- node_memory_Buffers_bytes
- node_memory_Cached_bytes
- node_memory_MemAvailable_bytes
- node_memory_MemFree_bytes
- node_memory_MemTotal_bytes

(These names must match model/model_config.json.)

Model Architecture

This model is a fully-connected Autoencoder with ReLU activations:

- Encoder dims: feature_size -> int(0.75*feature_size) -> int(0.5*feature_size) -> int(0.25*feature_size) -> int(0.1*feature_size)
- Decoder dims: symmetric back to feature_size

With feature_size = 38, the encoder dims evaluate to 38 -> 28 -> 19 -> 9 -> 3.

Model Specification

Inputs

- Input name: x
- Shape: [batch_size, 38]
- Type: float32
- Description: Min-Max normalized feature vector

Preprocessing

    x_norm = (x - min) / (max - min)

- If a feature has max == min (constant feature in training), normalization must avoid division by zero (recommended: set the normalized feature to 0.0).
- Optionally clamp x_norm to [0, 1] (configurable via model_config.json).

Outputs

- Output name: reconstruction
- Shape: [batch_size, 38]
- Type: float32
- Description: Reconstructed feature vector

Post-processing (Anomaly Detection)

    rmse = sqrt(mean((x_norm - reconstruction)^2))
    anomaly = 1 if rmse > threshold else 0

The RMSE is computed per sample, i.e. the mean is taken over the feature dimension. The threshold is stored in model/model_config.json.

Limitations

- Feature order & dimension are fixed: Inputs must have exactly 38 features in the specified order.
- Normalization is training-dependent: Min/Max parameters are derived from the training data distribution; out-of-distribution inputs may yield unreliable anomaly scores.
- Constant features: Features with max == min require special handling during normalization (avoid division by zero).
- ONNX output is reconstruction only: The anomaly score/label is computed in the inference script.

Usage Demo

1. Setup Environment

    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

2. Run Inference Script

    python demo.py --model model/autoencoder.onnx --config model/model_config.json --csv telemetry.csv --row 0

CSV Format Requirements

- CSV must include a header row.
- Numeric columns only (or ensure the numeric columns match the 38 features exactly).
- Column order must match the feature list and model_config.json.

Citation

If you wish to cite this model, please use the citation generated by Zenodo (located in the right sidebar of this record).

Acknowledgement & Funding

This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912. More information about the project is available at https://mlsysops.eu/
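Appendix: sketch of the scoring arithmetic

The preprocessing and post-processing rules in the Model Specification (Min-Max normalization with the max == min guard, optional clamping, per-sample RMSE, threshold comparison) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in demo.py: the function names, the example min/max values, the placeholder reconstruction, and the threshold of 0.1 are all hypothetical. In practice the reconstruction would come from running model/autoencoder.onnx, and the parameters from model/model_config.json.

```python
import math
from typing import List, Sequence


def min_max_normalize(x: Sequence[float], mins: Sequence[float],
                      maxs: Sequence[float], clamp: bool = False) -> List[float]:
    """Apply x_norm = (x - min) / (max - min) feature by feature."""
    if not (len(x) == len(mins) == len(maxs)):
        raise ValueError("x, mins and maxs must have the same length")
    out = []
    for value, lo, hi in zip(x, mins, maxs):
        if hi == lo:
            # Constant feature in training: avoid division by zero.
            norm = 0.0
        else:
            norm = (value - lo) / (hi - lo)
            if clamp:
                norm = min(1.0, max(0.0, norm))
        out.append(norm)
    return out


def rmse(x_norm: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Root-mean-square error over the feature dimension of one sample."""
    if len(x_norm) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(x_norm, reconstruction)]
    return math.sqrt(sum(squared) / len(squared))


def is_anomaly(score: float, threshold: float) -> bool:
    """Decision rule: anomaly if the RMSE strictly exceeds the threshold."""
    return score > threshold


# Toy three-feature example (the real model expects 38 features).
x = [5.0, 10.0, 3.0]
mins = [0.0, 10.0, 1.0]   # hypothetical training minima
maxs = [10.0, 10.0, 2.0]  # hypothetical training maxima; feature 2 is constant

x_norm = min_max_normalize(x, mins, maxs, clamp=True)
reconstruction = [0.5, 0.0, 0.0]  # placeholder for the ONNX model output
score = rmse(x_norm, reconstruction)
print(x_norm, score, is_anomaly(score, threshold=0.1))
```

In the toy input, the third feature lies outside its training range, so its clamped value differs strongly from the placeholder reconstruction and dominates the score. This mirrors the limitation noted above that out-of-distribution inputs can drive the anomaly score.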

Keywords

telemetry, anomaly detection

