publication . Preprint . 2017

On the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data

Choudhary, Dhruv; Kejariwal, Arun; Orsini, Francois
Open Access English
  • Published: 12 Oct 2017
Abstract
The ever-growing volume and velocity of data, coupled with the decreasing attention span of end users, underscore the critical need for real-time analytics. In this regard, anomaly detection plays a key role both as an application and as a means to verify data fidelity. Although anomaly detection has been researched for over 100 years in a multitude of disciplines, including but not limited to astronomy, statistics, manufacturing, econometrics, and marketing, most existing techniques cannot be used as is on real-time data streams. Further, the lack of characterization of performance, with respect to both real-timeliness and accuracy, on production data...
Subjects
free text keywords: Statistics - Machine Learning, Computer Science - Information Retrieval, Computer Science - Learning, Electrical Engineering and Systems Science - Signal Processing