Feature Selection for Unsupervised Machine Learning

Publication type: Article, Conference object; 16 Sep 2023; United States; Publisher: IEEE; Journal: 2023 IEEE 8th International Conference on Smart Cloud (SmartCloud)

Authors: Huyunting Huang; Ziyang Tang; Tonglin Zhang; Baijian Yang; Qianqian Song; Jing Su

doi: 10.1109/smartcloud58862.2023.00036

pmid: 38706555

pmc: PMC11070246

handle: 1805/42634

Abstract

Feature selection for unsupervised machine learning (ML) lags far behind its counterpart for supervised ML. To address this gap, this research proposes a stepwise feature selection approach for clustering methods, specialized to the Gaussian mixture model (GMM) and k-means. Whereas the existing GMM and k-means methods use all the features, the proposed method selects a subset of features on which to run each of the two methods. The research also finds that the existing GMM and k-means methods give better results when they are modified to use good initializations. Experiments based on Monte Carlo simulations show that the proposed method is more computationally efficient and more accurate than the existing GMM and k-means methods based on all the features. An experiment on a real-world dataset confirms this finding.
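To make the idea concrete, the following is a minimal sketch of forward stepwise feature selection wrapped around k-means. It is not the authors' algorithm: the abstract does not state the selection criterion or the initialization scheme, so the sketch assumes the mean silhouette width as the criterion and a deterministic farthest-point initialization as one possible "good" initialization. All function names (`kmeans`, `silhouette`, `forward_select`, `adjusted_rand_index`) and the synthetic data are hypothetical. The adjusted Rand index, listed among the keywords, is used only to score the result against the known labels.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization (assumed)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    sizes = Counter(labels)
    if len(sizes) < 2:
        return -1.0
    total = 0.0
    for p, own in zip(points, labels):
        if sizes[own] == 1:
            continue
        sums = Counter()
        for q, lab in zip(points, labels):
            sums[lab] += math.dist(p, q)
        a = sums[own] / (sizes[own] - 1)  # mean distance within own cluster
        b = min(sums[c] / sizes[c] for c in sizes if c != own)  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def project(data, features):
    """Restrict every row to the chosen feature indices."""
    return [tuple(row[f] for f in features) for row in data]


def forward_select(data, k):
    """Add one feature at a time while the assumed criterion keeps improving."""
    selected, best = [], -1.0
    remaining = list(range(len(data[0])))
    while remaining:
        scored = []
        for f in remaining:
            pts = project(data, selected + [f])
            scored.append((silhouette(pts, kmeans(pts, k)), f))
        score, f = max(scored)
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    def pairs(counts):
        return sum(math.comb(c, 2) for c in counts)
    same_both = pairs(Counter(zip(a, b)).values())
    same_a, same_b = pairs(Counter(a).values()), pairs(Counter(b).values())
    expected = same_a * same_b / math.comb(len(a), 2)
    max_index = (same_a + same_b) / 2
    return 1.0 if max_index == expected else (same_both - expected) / (max_index - expected)


# Synthetic data (an assumption): feature 0 separates two clusters,
# features 1-3 are pure noise with a larger spread.
rng = random.Random(0)
truth = [i % 2 for i in range(60)]
data = [(rng.gauss(8.0 * t, 1.0),) + tuple(rng.gauss(0.0, 3.0) for _ in range(3))
        for t in truth]

selected, score = forward_select(data, k=2)
ari_selected = adjusted_rand_index(truth, kmeans(project(data, selected), 2))
ari_all = adjusted_rand_index(truth, kmeans(data, 2))
print("selected features:", selected)
print("ARI (selected) =", round(ari_selected, 3), "ARI (all) =", round(ari_all, 3))
```

On this synthetic data the search should keep only the informative feature, because adding a noise feature inflates within-cluster distances far more than between-cluster distances and so lowers the silhouette. A GMM variant would follow the same loop with a likelihood-based criterion in place of the silhouette; that substitution is likewise an assumption, not something the abstract specifies.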

Country

United States

Related Organizations

Indiana University, United States
University of South Carolina, United States
Indiana University – Purdue University Indianapolis, United States
Indiana University School of Medicine, United States
Purdue University West Lafayette, United States

Keywords

adjusted Rand index, Gaussian mixture model, k-means, stepwise, Nursing, 004

Impact by BIP!

- selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
