A Review of Local Outlier Factor Algorithms for Outlier Detection in Big Data Streams

Publication: Article, Other literature type
Date: 29 Dec 2020
Language: English
Publisher: MDPI AG
Journal: Big Data and Cognitive Computing, volume 5, page 1 (eissn: 2504-2289)
Funded by: NSF | RII Track-2 FEC: Leveragi...

Authors: Omar Alghushairy; Raed Alsini; Terence Soule; Xiaogang Ma

doi: 10.3390/bdcc5010001

Abstract

Outlier detection is a statistical procedure that aims to find suspicious events or items that are different from the normal form of a dataset. It has drawn considerable interest in the field of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset, but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams are still deficient and better algorithms need to be developed that can effectively analyze the high velocity of data streams to detect local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
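The distinction between global and local outliers, and the density-based idea behind LOF, can be illustrated with a small example. The following is a minimal sketch, assuming a small static dataset and the standard LOF definition (k-distance, reachability distance, local reachability density); it is not taken from the paper. The function name `lof_scores`, the parameter `k`, and the toy data are hypothetical choices made for illustration, and the brute-force distance matrix would not scale to the data streams the paper is concerned with.

```python
import math  # math.dist requires Python 3.8+


def lof_scores(points, k=3):
    """Brute-force Local Outlier Factor for a small static dataset.

    Simplifications (assumptions of this sketch): ties at the k-th
    neighbour are broken arbitrarily, and duplicate points (which would
    give a zero reachability sum) are not handled.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbours and k-distance of every point (self excluded).
    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_distance(j), dist(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (k * lrd[i])
        for i in range(n)
    ]


# Toy data: a dense cluster, a sparse cluster, and one point that sits
# close to the dense cluster but is isolated relative to its neighbours.
POINTS = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),  # dense
    (5.0, 5.0), (7.0, 5.0), (5.0, 7.0), (7.0, 7.0), (6.0, 6.0),    # sparse
    (0.6, 0.6),                                                    # local outlier
]

scores = lof_scores(POINTS, k=3)
for point, score in zip(POINTS, scores):
    print(point, round(score, 2))
```

In this toy data the last point is closer to its nearest neighbour than the points of the sparse cluster are to theirs, so a single global distance threshold would not single it out. Its LOF score is nevertheless well above 1, because its neighbours lie in a much denser region than it does, while the members of both clusters score close to 1.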

Related Organizations

King Abdulaziz University, Saudi Arabia
Jeddah University, Saudi Arabia
Regent University, United States
University of Idaho, United States

Keywords

Technology, stream data mining, T, genetic algorithm, data science, outlier detection, local outlier factor

Impact by BIP!

selected citations: 237 (derived from selected sources; an alternative to the "Influence" indicator)
popularity: Top 0.1% (the "current" impact/attention of the article in the research community at large, based on the underlying citation network)
influence: Top 1% (the overall/total impact of the article in the research community at large, based on the underlying citation network, diachronically)
impulse: Top 1% (the initial momentum of the article directly after its publication, based on the underlying citation network)