
As research on clustering algorithms has deepened, clustering validity has become an indispensable part of cluster analysis. However, because data structures are complex and attributes differ, no single clustering validity function can be applied to all datasets, so new validity functions continue to be proposed. This paper therefore proposes a clustering validity function fusion model based on D–S evidence theory (DS-CVFFM). The model adopts the FCM clustering algorithm as the base algorithm, calculates the values of different clustering validity functions, and uses these values as evidence sources to construct the basic probability assignment (BPA) function. It then combines the evidence using the fusion rules of D–S evidence theory and outputs the optimal number of clusters according to the decision conditions. DS-CVFFM judges the optimal number of clusters by fusing the information from multiple clustering validity functions, without the need to propose complex new validity functions, and it avoids the influence of expert factors in the weighted combination clustering validity evaluation method. Finally, 4 artificial datasets and 14 UCI datasets are selected to verify the effectiveness of the proposed model. The experimental results show that, compared with traditional clustering validity evaluation methods, the proposed fusion model significantly improves the accuracy of judging the optimal number of clusters and is more stable under different values of the fuzzy exponent, which can overcome the shortcomings of traditional clustering validity evaluation methods.
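The fusion step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several parts are assumptions made only for illustration: the three index names and their values are hypothetical, the BPA is built by simple min-max scaling (the abstract does not state the actual construction), every BPA places mass only on single candidate cluster numbers, and the decision condition is taken to be the largest fused mass. Under the singleton assumption, Dempster's combination rule reduces to a normalised element-wise product.

```python
from functools import reduce


def bpa_from_scores(scores, larger_is_better=True):
    """Map validity-index values (one per candidate c) to a singleton BPA.

    ASSUMPTION: min-max scaling followed by normalisation to sum 1.
    This is an illustrative choice, not the construction used in the paper.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # The index cannot discriminate: spread the mass uniformly.
        return {c: 1.0 / len(scores) for c in scores}
    eps = 1e-6  # keep every mass strictly positive to avoid total conflict
    scaled = {}
    for c, v in scores.items():
        s = (v - lo) / (hi - lo) if larger_is_better else (hi - v) / (hi - lo)
        scaled[c] = s + eps
    total = sum(scaled.values())
    return {c: s / total for c, s in scaled.items()}


def dempster_combine(m1, m2):
    """Dempster's rule for two BPAs whose focal elements are all singletons."""
    agree = {c: m1[c] * m2[c] for c in m1}
    k = sum(agree.values())  # equals 1 minus the conflict mass
    if k == 0:
        raise ValueError("total conflict: the evidence cannot be combined")
    return {c: a / k for c, a in agree.items()}


# Hypothetical validity-index values for candidate cluster numbers c = 2..5,
# as they might be computed from FCM partitions (values are made up).
index_a = {2: 0.70, 3: 0.85, 4: 0.80, 5: 0.60}  # larger is better
index_b = {2: 0.40, 3: 0.25, 4: 0.20, 5: 0.50}  # smaller is better
index_c = {2: 0.55, 3: 0.90, 4: 0.60, 5: 0.50}  # larger is better

bpas = [
    bpa_from_scores(index_a, larger_is_better=True),
    bpa_from_scores(index_b, larger_is_better=False),
    bpa_from_scores(index_c, larger_is_better=True),
]

fused = reduce(dempster_combine, bpas)

# ASSUMPTION: the decision condition is simply the largest fused mass.
best_c = max(fused, key=fused.get)
print(best_c, fused)
```

In this toy example the second index on its own prefers a different cluster number from the other two, and the fused mass settles the disagreement without hand-chosen weights, which is the motivation the abstract gives for fusing evidence instead of using a weighted combination.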
