
doi: 10.1002/sta4.405
Mapper is a popular topological data analysis method for analysing the structure of complex high-dimensional data sets. Because the Mapper algorithm supports clustering and feature selection together with visualization, it is used in various fields such as biology and chemistry. However, the user must choose several resolution parameters before applying the Mapper algorithm, and the results are sensitive to this choice. In this paper, we focus on the selection of two resolution parameters: the number of intervals and the overlapping percentage. We propose a new resolution parameter selection method for Mapper based on the ensemble technique. We generate multiple Mapper results under various parameter values and apply the fuzzy clustering ensemble method to combine the results. To evaluate Mapper algorithms, including the proposed one, three real data sets are considered. The results demonstrate the superiority of the proposed ensemble Mapper method.
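To make the two resolution parameters concrete, the following is a minimal sketch of how a one-dimensional filter range might be covered by overlapping intervals, and how a grid of parameter settings could be enumerated before running Mapper once per setting. The function names `interval_cover` and `parameter_grid` are hypothetical, and the cover formula is one common convention (equal-length intervals, with each consecutive pair sharing a fixed fraction of their length); the paper's exact construction and its fuzzy clustering ensemble step are not reproduced here.

```python
from itertools import product


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n_intervals equal-length intervals.

    `overlap` is the fraction (0 <= overlap < 1) of an interval's length
    shared with its neighbour. This is an assumed convention, not
    necessarily the one used in the paper.
    """
    if n_intervals < 1 or not 0 <= overlap < 1:
        raise ValueError("need n_intervals >= 1 and 0 <= overlap < 1")
    # Choose the length so that the last interval ends exactly at hi.
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def parameter_grid(interval_counts, overlaps):
    """Enumerate every (number of intervals, overlap) pair to try."""
    return list(product(interval_counts, overlaps))


if __name__ == "__main__":
    for n, p in parameter_grid([4, 6], [0.25, 0.5]):
        print(n, p, interval_cover(0.0, 1.0, n, p))
```

Each pair from the grid would yield one Mapper result; combining those results is the ensemble step the abstract describes, which this sketch deliberately leaves out.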
machine learning, Statistics, graphical models, high dimensional data, data mining, clustering
