
The traditional K-means algorithm has been widely studied and applied in data analysis because of its simplicity, efficiency, and effectiveness. However, it relies strongly on a predefined number of clusters k, which causes the number of training iterations to vary and can trap the clustering result in a local minimum. In addition, it is computationally expensive and handles outliers with a proximity allocation strategy, i.e., assigning each point to its nearest centroid, which lowers clustering accuracy. To address these issues, we present a novel Circular Units-based Adaptive Clustering (CUCA) algorithm in this paper. The proposed algorithm describes each cluster in terms of circular data units and selects the initial centroids based on the density of each unit, which addresses both the varying iteration counts and the risk of converging to a local minimum. Using a tree neighbour cluster search, the algorithm finds exactly the neighbouring circular data units of each prime centroid, so relationships are judged only between each prime centroid and its neighbouring circular data units rather than all units, which reduces the time overhead. Finally, the algorithm uses a predictive affiliation mechanism to determine the relationship between outliers and each cluster, deciding whether an outlier belongs to an existing cluster or forms a separate cluster, which improves clustering accuracy.
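The idea of seeding centroids from dense circular units can be illustrated with a minimal sketch. This is not the CUCA procedure itself, which the abstract does not specify in detail. It assumes a greedy fixed-radius covering, defines density as the member count of a unit, and uses hypothetical names and parameters (`build_circular_units`, `density_seeded_centroids`, `radius`, `min_members`) chosen only for illustration.

```python
import math


def build_circular_units(points, radius):
    """Greedily cover the points with circles of a fixed radius.

    Assumption for illustration: a point joins the first existing unit
    whose centre lies within `radius`; otherwise it opens a new unit.
    """
    units = []  # each unit: {"center": point, "members": [points]}
    for p in points:
        for unit in units:
            if math.dist(p, unit["center"]) <= radius:
                unit["members"].append(p)
                break
        else:
            units.append({"center": p, "members": [p]})
    return units


def density_seeded_centroids(points, radius, min_members):
    """Return initial centroids taken from sufficiently dense units.

    All units share the same radius, so the member count serves as the
    density. Units below `min_members` are treated as sparse and skipped.
    """
    units = build_circular_units(points, radius)
    dense = [u for u in units if len(u["members"]) >= min_members]
    # Densest units first; sorted() is stable, so ties keep creation order.
    dense = sorted(dense, key=lambda u: len(u["members"]), reverse=True)
    centroids = []
    for unit in dense:
        members = unit["members"]
        dim = len(members[0])
        centroids.append(
            tuple(sum(m[i] for m in members) / len(members) for i in range(dim))
        )
    return centroids
```

In this sketch the number of seeds follows from the data and the density threshold rather than from a fixed k, and isolated points never become seeds. Neighbouring dense units are not merged here, so a real implementation would need an extra step to decide which adjacent units belong to the same cluster.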
