Genetic algorithm for clustering mixed-type data

Publication: Article, 01 Jan 2011, English
Publisher: SPIE-Intl Soc Optical Eng
Journal: Journal of Electronic Imaging, volume 20, page 13003 (ISSN: 1017-9909)

Authors: Shiueng-Bien Yang; Yung-Gi Wu

doi: 10.1117/1.3537836

Abstract

The k-modes algorithm was recently proposed to cluster mixed-type data. However, in solving clustering problems, the k-modes algorithm and its variants usually require the user to provide the number of clusters in the data set. Unfortunately, the number of clusters is generally unknown to the user. Clustering therefore becomes a tedious task of trial and error, and the clustering result is often poor, especially when the number of clusters is large and not easy to guess. It is also hard for a user to select the weight between categorical and numeric attributes in the k-modes algorithm. In this paper, a genetic algorithm for clustering large data sets with mixed-type data is proposed; this algorithm can automatically search for the number of clusters in the data set. The genetic algorithm can also automatically select a weight that prevents favoring either type of attribute. Experimental results illustrate the effectiveness of the genetic algorithm.
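The abstract does not give the dissimilarity measure, so the following is only a minimal sketch of why the weight between categorical and numeric attributes matters. It assumes the common k-prototypes-style measure (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches); the names `mixed_distance`, `nearest_prototype`, and `gamma` are illustrative and are not taken from the paper.

```python
def mixed_distance(x_num, x_cat, p_num, p_cat, gamma):
    """Assumed k-prototypes-style dissimilarity between a point and a prototype.

    numeric part:     squared Euclidean distance
    categorical part: number of mismatching categories, scaled by gamma
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, p_num))
    categorical_part = sum(1 for a, b in zip(x_cat, p_cat) if a != b)
    return numeric_part + gamma * categorical_part


def nearest_prototype(x_num, x_cat, prototypes, gamma):
    """Return the index of the prototype with the smallest mixed distance."""
    distances = [
        mixed_distance(x_num, x_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(prototypes)), key=distances.__getitem__)


# Two hypothetical prototypes: (numeric attributes, categorical attributes).
prototypes = [
    ([0.0, 0.0], ["red", "small"]),
    ([1.0, 1.0], ["blue", "large"]),
]

# A point that is numerically close to prototype 0 but categorically
# identical to prototype 1.
point_num, point_cat = [0.2, 0.1], ["blue", "large"]

# A small gamma lets the numeric attributes dominate; a large gamma lets
# the categorical attributes dominate, so the assignment flips.
low = nearest_prototype(point_num, point_cat, prototypes, gamma=0.1)
high = nearest_prototype(point_num, point_cat, prototypes, gamma=2.0)
print(low, high)
```

With these made-up values the same point is assigned to a different prototype depending only on `gamma`, which is the sensitivity the abstract says is hard for a user to tune by hand.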

Related Organizations

Chang Jung Christian University
Taiwan
Ursuline College
United States

Impact by BIP!

- selected citations: 1
- popularity: Average
- influence: Average
- impulse: Average

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
