An Unsupervised Feature Selection Method for Data-Driven Anomaly Detection Systems

Name: An Unsupervised Feature Selection Method for Data-Driven Anomaly Detection Systems
Creator: Naif Almusallam
Keywords: 0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

Naif Almusallam

https://doi.org/10.1109/wetice...

Article . 2020 . Peer-reviewed

License: IEEE Copyright

Data sources: Crossref

https://dx.doi.org/10.1109/wet...

Article

Data sources: Microsoft Academic Graph

Publication type: Article
Date: 01 Sep 2020
Publisher: IEEE
Journal: 2020 IEEE 29th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)

Authors: Naif Almusallam

doi: 10.1109/wetice49692.2020.00016

Abstract

Feature selection is widely used as a pre-processing step to improve the performance of data-driven intrusion and anomaly detection systems. For example, when data are grouped into normal and outlier groups, redundant and non-representative features reduce the accuracy of classifying the data points and increase the processing time. Feature selection is therefore applied before anomaly detection to improve both classification accuracy and running time. Most existing feature selection methods have limitations when dealing with high-dimensional data, because they search different subsets of features to find an accurate representation of the full feature set. Searching over combinations of features is computationally very expensive, which makes existing methods inefficient for high-dimensional data.

This work presents a similarity-based unsupervised feature selection method for efficient and accurate anomaly detection (UFSAD). It addresses the selection of a reduced set of representative features from high-dimensional data without the data class labels. The selected features should improve the accuracy and performance of anomaly detection systems because redundant and non-representative features are eliminated. The proposed UFSAD method extends the k-means clustering algorithm to partition the features into k clusters based on a similarity measure (e.g. PCC, the Pearson Correlation Coefficient; LSRE, the Least Square Regression Error; or MICI, the Maximal Information Compression Index). A centroid-based feature selection step is then applied: the feature most similar to its cluster centroid is selected as the representative feature of that cluster, and the others are discarded.

Extensive experimental work has shown that UFSAD can generate a reduced, representative and non-redundant feature set that achieves good classification accuracy in comparison with well-known unsupervised feature selection methods.
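
The cluster-then-select idea described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes NumPy, uses the absolute PCC as the only similarity measure, and requires every feature to have non-zero variance. The function names `abs_pcc` and `select_features`, the farthest-first initialisation, the sign alignment used when averaging cluster members, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def abs_pcc(A, B):
    """|Pearson correlation| between each column of A and each column of B."""
    A = (A - A.mean(axis=0)) / A.std(axis=0)
    B = (B - B.mean(axis=0)) / B.std(axis=0)
    return np.abs(A.T @ B) / A.shape[0]


def select_features(X, k, n_iter=20):
    """Cluster the columns (features) of X into k groups, keep one per group.

    Assumes k <= number of columns and that no column is constant.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardised features
    d = Z.shape[1]

    # Assumed initialisation (farthest-first): start from feature 0, then
    # repeatedly add the feature least similar to the ones chosen so far.
    init = [0]
    while len(init) < k:
        sim_to_init = abs_pcc(Z, Z[:, init]).max(axis=1)
        init.append(int(sim_to_init.argmin()))
    centroids = Z[:, init].copy()

    labels = np.full(d, -1)
    for _ in range(n_iter):
        # Assignment step: each feature joins its most similar centroid.
        new_labels = abs_pcc(Z, centroids).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: centroid = mean of the (sign-aligned) member features.
        for j in range(k):
            members = Z[:, labels == j]
            if members.shape[1] == 0:
                continue  # empty cluster: keep the previous centroid
            # Flip negatively correlated members so they do not cancel out.
            signs = np.where(members.T @ centroids[:, j] >= 0, 1.0, -1.0)
            centroids[:, j] = (members * signs).mean(axis=1)

    # Centroid-based selection: keep the member closest to each centroid.
    sim = abs_pcc(Z, centroids)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size:
            selected.append(int(idx[sim[idx, j].argmax()]))
    return sorted(selected), labels


# Synthetic data: nine features, three noisy copies of each of three factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(1000, 3))
X = np.repeat(factors, 3, axis=1) + 0.1 * rng.normal(size=(1000, 9))

selected, labels = select_features(X, k=3)
print(selected)  # one column index from each of the groups 0-2, 3-5 and 6-8
```

Because the three copies in each group are almost perfectly correlated, the sketch keeps a single representative per group and discards the other six columns. Swapping `abs_pcc` for another pairwise similarity (for example one based on LSRE or MICI) would only change that one function in this sketch.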

Related Organizations

Imam Muhammad ibn Saud Islamic University
Saudi Arabia
Islamic University
Bangladesh

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
