Powered by OpenAIRE graph
This Research product is the result of merged Research products in OpenAIRE.

Robust anomaly detection algorithms for real-time big data: Comparison of algorithms

Authors: Zirije Hasani

Abstract

Most of today's data are streaming time-series data, in which anomaly detection provides significant information about possible critical situations. Yet detecting anomalies in big streaming data is a difficult task: detectors must acquire and process data in real time, as they occur, even before they are stored, and instantly raise an alarm on potential threats. Given the need for real-time alarms and unsupervised procedures for anomaly detection in massive streaming data, algorithms have to be robust and have low processing time, possibly at the cost of accuracy. In this work we explore several such fast algorithms: MAD, RunMAD, Boxplot, Twitter ADVec, DBSCAN, Moving Range Technique, Statistical Control Chart Techniques, ARIMA and Moving Average. The algorithms are tested, and the results visualized, in R on three Numenta datasets with known anomalies and on our own e-dnevnik dataset with unknown anomalies. Evaluation is done by comparing the achieved results (algorithm execution time, CPU usage and the number of anomalies found) with the Numenta HTM algorithm, which detects all the anomalies in the Numenta datasets. Our interest is in monitoring the streaming log data generated in the national educational network (e-dnevnik), which receives a massive number of online queries, and in detecting anomalies in order to scale up performance, prevent network outages, raise alarms on possible attacks, and the like.
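
To make the simplest of the listed techniques concrete, the following is a minimal sketch of a median absolute deviation (MAD) detector, together with a windowed variant in the spirit of a running MAD. It is written in Python for illustration only; the paper's experiments were run in R, and this is not the author's implementation. The function names, the 0.6745 scaling constant, the threshold of 3.5 and the window size are assumptions taken from common practice, not from the abstract.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`.

    The score is 0.6745 * |x - median| / MAD; the constant and the
    default threshold are conventional choices (assumed here).
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


def rolling_mad_anomalies(values, window=5, threshold=3.5):
    """Score each point against the median and MAD of the `window`
    points that precede it, so only past data is needed (streaming use)."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        med = median(history)
        mad = median([abs(h - med) for h in history])
        if mad == 0:
            continue  # skip windows with no spread
        if 0.6745 * abs(values[i] - med) / mad > threshold:
            flagged.append(i)
    return flagged


# Hypothetical toy series with one obvious spike at index 6.
series = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
print(mad_anomalies(series))          # [6]
print(rolling_mad_anomalies(series))  # [6]
```

Because the median and MAD are barely moved by a single extreme value, the spike does not mask itself the way it could with a mean and standard deviation, which is one reason such detectors are considered robust. The windowed variant trades some accuracy for the ability to score each point as it arrives.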

  • BIP! impact indicators
    selected citations: 18 (derived from selected sources; an alternative to the "Influence" indicator)
    popularity: Top 10% (the "current" impact/attention of the article, based on the underlying citation network)
    influence: Top 10% (the overall/total impact of the article, based on the underlying citation network, diachronically)
    impulse: Top 10% (the initial momentum of the article directly after its publication)