Powered by OpenAIRE graph
World Journal of Advanced Research and Reviews
Article . 2025 . Peer-reviewed
Data sources: Crossref
ZENODO
Article . 2025
License: CC BY
Data sources: ZENODO
SSRN Electronic Journal
Article . 2025 . Peer-reviewed
Data sources: Crossref
ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

AI-driven anomaly detection and root cause analysis: Using machine learning on logs, metrics, and traces to detect subtle performance anomalies, security threats, or failures in complex cloud environments

Authors: Guntupalli, Raviteja;

Abstract

Modern cloud environments combine growing complexity, dense service dependencies, and dynamic scaling, which makes rapid anomaly detection and root cause analysis (RCA) both critical and difficult. Traditional rule-based monitoring frameworks cannot detect the subtle or novel anomalies that precede system outages or security breaches. This paper examines how artificial intelligence (AI), machine learning (ML), and deep learning applied to logs, metrics, and traces can automate anomaly detection and RCA in cloud-native platforms. It covers the use of supervised, unsupervised, and reinforcement learning on diverse telemetry data to detect performance degradation, system errors, and anomalous usage patterns in real time. Such systems can correlate incidents across distributed systems, pinpoint underlying causes, and recommend solutions faster than human operators can. Real-world deployments at Adobe, Uber, Zalando, and LinkedIn illustrate the operational effects of these techniques. The paper also details the ethical and technical challenges facing automated RCA systems, including model drift, the interpretability of complex models, and observability gaps. As cloud systems continue to expand, AI-driven anomaly detection becomes essential for maintaining resilience, optimizing performance, and strengthening cyber defense in both multi-cloud and hybrid cloud systems.
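As a rough illustration of the unsupervised detection on metrics that the abstract mentions, the sketch below flags outliers in a latency series with a modified z-score based on the median and the median absolute deviation (MAD). This is not the paper's method: the technique is swapped in for illustration, and the function names, the 3.5 threshold, and the sample latencies are all assumptions.

```python
from statistics import median


def robust_zscores(values):
    """Return a modified z-score for each value, using the median and MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Simplification for this sketch: with zero spread, flag nothing.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    # when the data are roughly normal.
    return [0.6745 * (v - med) / mad for v in values]


def detect_anomalies(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds the threshold in magnitude."""
    return [i for i, z in enumerate(robust_zscores(values)) if abs(z) > threshold]


# Hypothetical request latencies in milliseconds, with one obvious spike.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 480, 101, 100]
print(detect_anomalies(latencies_ms))  # prints [7]
```

A median-based score is used here because a single large spike barely shifts the median, whereas it would inflate a mean and standard deviation and could mask itself. A production system of the kind the abstract describes would likely need seasonality handling and correlation across logs and traces, which this sketch does not attempt.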

Keywords

Observability, Cloud resilience, Traces, Security threats, Machine learning, Root cause analysis, Deep learning, Metrics, Anomaly detection, AI operations, Cloud monitoring, Logs

  • BIP!
    Impact by BIP!
    selected citations: 3
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Top 10%
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.