ZENODO
Report, 2019
License: CC BY
Data sources: Datacite

Anomaly Detection in the Elasticsearch Service

Authors: Andersson, Jennifer

Abstract

The Elasticsearch Service is a distributed search and analytics engine widely used across CERN. Currently, issues in the service are resolved manually after service managers detect them through internal monitoring. However, the number of clusters and metrics is large, which makes them difficult to track, and issues are often discovered and reported by users instead. This is time-consuming and disrupts the workflow of the service users. The main objective of this project is therefore to develop a model capable of identifying anomalies in the Elasticsearch Service clusters, in order to predict and eliminate service issues before they cause problems. This is done by analyzing the history of cluster data using machine learning methods. In this way, a single metric signaling service issues can be obtained and used to alert service managers to upcoming issues. In 2017, a deep neural network model was developed for this purpose. However, several issues were identified with the model, the most severe being convergence issues in the autoencoder. In this project, a revised autoencoder based on long short-term memory (LSTM) neural networks is developed, tuned and evaluated. Finally, it is applied to new Elasticsearch Service cluster data. The final model shows improved convergence compared to the previous model, and is able to detect real service issues based on the anomaly scores obtained. Combining these anomaly scores with those obtained from a model that simply predicts the cluster state as a moving average of preceding states reduces the rate of false positives. The conclusion is that a combined model, which reports anomalies based on the anomaly scores of both the LSTM-based model and the moving average model, is the most sensitive to real service issues.
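
To illustrate the idea of combining two anomaly scores, the following is a minimal sketch and not the report's implementation. It assumes a single numeric metric, a window of three preceding states for the moving average, and a rule that raises an alert only when both scores exceed their thresholds. The abstract does not specify the combination rule, so this rule, the function names, the thresholds and the LSTM scores shown here are all hypothetical.

```python
from statistics import mean


def moving_average_scores(series, window=3):
    """Score each point by its absolute error against the mean of the
    preceding `window` points. Points without enough history score 0.0."""
    scores = []
    for i, value in enumerate(series):
        if i < window:
            scores.append(0.0)
        else:
            prediction = mean(series[i - window:i])
            scores.append(abs(value - prediction))
    return scores


def combined_alerts(lstm_scores, ma_scores, lstm_threshold, ma_threshold):
    """Flag a time step only when BOTH scores exceed their thresholds.
    This AND rule is an assumption made for illustration."""
    if len(lstm_scores) != len(ma_scores):
        raise ValueError("score sequences must have the same length")
    return [
        l > lstm_threshold and m > ma_threshold
        for l, m in zip(lstm_scores, ma_scores)
    ]


# Toy metric with one spike at index 4.
metric = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0]
ma_scores = moving_average_scores(metric, window=3)

# Hypothetical reconstruction errors standing in for the LSTM autoencoder;
# index 2 is a deliberate false positive of that model alone.
lstm_scores = [0.1, 0.1, 0.9, 0.1, 0.8, 0.2, 0.1]

alerts = combined_alerts(lstm_scores, ma_scores, lstm_threshold=0.5, ma_threshold=2.0)
print(alerts)
```

In this toy data the hypothetical LSTM score alone would flag indices 2 and 4, while the combined rule keeps only index 4, where the moving average model also sees a large deviation. That is one way a combination could lower the false positive rate.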

Keywords

summer-student programme, CERN openlab
