Computationally intensive parameter selection for clustering algorithms: The case of fuzzy c-means with tolerance

Publication: Article, 07 Dec 2010, English
Publisher: Wiley
Journal: International Journal of Intelligent Systems, volume 26, pages 313-322 (ISSN: 0884-8173)

Authors: Vicenç Torra; Yasunori Endo; Sadaaki Miyamoto

doi: 10.1002/int.20467

handle: 10261/138218


Abstract

Parameter selection is a well-known problem in the fuzzy clustering community. In this paper, we propose to tackle this problem using a computationally intensive approach. We apply this approach to a clustering method recently introduced in the literature: fuzzy c-means with tolerance. This method permits data to include some error, which is modeled by moving data in a particular direction within a particular range when clusters are defined. The proper application of this method requires a correct definition of the parameter κ, a value that might be different for each record and that corresponds to the maximum shift allowed to the data. In this paper, we review this method and study the definition of κ when the same value of κ is used for all data elements. Our approach is based on the analysis of sets of data with increasing noise and an exhaustive analysis of the behavior of the algorithm with different values of κ. The analysis is motivated by privacy-preserving data mining. The same approach can be used for parameter selection in other clustering algorithms. © 2010 Wiley Periodicals, Inc.
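
The procedure outlined in the abstract (data sets with increasing noise, each clustered under many values of κ) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the paper: the objective is assumed to be the usual fuzzy c-means sum of u_ik^m ||x_k + ε_k − v_i||² with the constraint ||ε_k|| ≤ κ, the noise is assumed Gaussian, and the names `fcm_tolerance`, `add_noise`, `center_error` and `kappa_grid`, as well as the error measure, are hypothetical.

```python
import math
import random


def fcm_tolerance(data, c, kappa, m=2.0, iters=50, seed=0):
    """Simplified fuzzy c-means where each point x_k may shift by eps_k, ||eps_k|| <= kappa.

    Assumed objective (not quoted from the paper):
        sum_k sum_i u_ik^m * ||x_k + eps_k - v_i||^2
    """
    rng = random.Random(seed)
    dim = len(data[0])
    centers = [list(p) for p in rng.sample(data, c)]  # initial centers: random data points
    eps = [[0.0] * dim for _ in data]
    u = []
    for _ in range(iters):
        # Shifted data y_k = x_k + eps_k
        shifted = [[x + e for x, e in zip(p, ep)] for p, ep in zip(data, eps)]
        # Standard fuzzy c-means membership update, computed on the shifted points
        u = []
        for y in shifted:
            d2 = [max(sum((a - b) ** 2 for a, b in zip(y, v)), 1e-12) for v in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u.append([w / total for w in inv])
        # Center update: weighted mean of the shifted points
        for i in range(c):
            w = [row[i] ** m for row in u]
            sw = sum(w)
            centers[i] = [sum(wk * y[j] for wk, y in zip(w, shifted)) / sw
                          for j in range(dim)]
        # Tolerance update: unconstrained minimizer -g/S, scaled back onto the kappa-ball
        for k, p in enumerate(data):
            w = [u[k][i] ** m for i in range(c)]
            g = [sum(w[i] * (p[j] - centers[i][j]) for i in range(c)) for j in range(dim)]
            norm = math.sqrt(sum(t * t for t in g))
            alpha = 0.0 if norm == 0.0 else min(kappa / norm, 1.0 / sum(w))
            eps[k] = [-alpha * t for t in g]
    return centers, u, eps


def add_noise(data, sigma, seed=0):
    """Additive Gaussian noise with standard deviation sigma (an assumed noise model)."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in p] for p in data]


def center_error(found, reference):
    """Hypothetical score: distance from each reference center to its nearest found center."""
    return sum(min(math.dist(r, f) for f in found) for r in reference)


def kappa_grid(data, c, sigmas, kappas):
    """Exhaustively cluster every (noise level, kappa) pair and score it against
    the centers obtained on the clean data with kappa = 0."""
    reference, _, _ = fcm_tolerance(data, c, kappa=0.0)
    table = {}
    for s in sigmas:
        noisy = add_noise(data, s)
        for kap in kappas:
            centers, _, _ = fcm_tolerance(noisy, c, kap)
            table[(s, kap)] = center_error(centers, reference)
    return table
```

Calling `kappa_grid(data, 2, [0.0, 0.5, 1.0], [0.0, 0.25, 0.5])` on a list of numeric points returns one score per (noise level, κ) pair; for each noise level, the κ with the lowest score would be a candidate choice under this particular, assumed criterion. With κ = 0 the shifts vanish and the routine reduces to ordinary fuzzy c-means.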

Partial support by the Spanish MEC (projects ARES – CONSOLIDER INGENIO 2010 CSD2007-00004 – and eAEGIS – TSI2007-65406-C03-02) is acknowledged.

Peer Reviewed

Related Organizations

University of Tsukuba, Japan
Spanish National Research Council, Spain

Keywords

Intensive parameters, Parameter selection, Clustering algorithms, Fuzzy clustering, Privacy preserving data mining, Fuzzy c-means, Fuzzy C mean

Related research (first 10 of 16 research products)

On a comparison between Mahalanobis distance and Choquet integral: The Choquet–Mahalanobis operator (2012)
How to group attributes in multivariate microaggregation (2008)
Record linkage for database integration using fuzzy integrals (2008)
On the f-divergence for non-additive measures (2016)
On the comparison of some fuzzy clustering methods for privacy preserving data mining: Towards the development of specific information loss measures (2009)
Semantic microaggregation for the anonymization of query logs using the Open Directory Project (2011)
On the disclosure risk of multivariate microaggregation (2008)
A View of Averaging Aggregation Operators (2007)
Reidentification and k-anonymity: a model for disclosure risk in graphs (2012)
Soft Computing in decision modeling (2009)


Impact by BIP!

Selected citations (derived from selected sources): 6
Popularity (current impact or attention, based on the underlying citation network): Average
Influence (overall impact in the research community, based on the underlying citation network): Average
Impulse (initial momentum directly after publication, based on the underlying citation network): Average