Downloads provided by UsageCounts
handle: 2117/23414
Mini Batch K-means (Sculley, 2010) has been proposed as an alternative to the K-means algorithm for clustering massive datasets. Its advantage is a lower computational cost: each iteration uses a fixed-size subsample rather than the whole dataset. This strategy reduces the number of distance computations per iteration at the cost of lower cluster quality. This paper reports empirical experiments on artificial datasets with controlled characteristics to assess how much cluster quality is lost when applying this algorithm. The goal is to obtain guidelines on the circumstances in which the algorithm is best applied and on the maximum gain in computational time achievable without compromising the overall quality of the partition.
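The subsampling strategy described in the abstract can be sketched as follows. This is a minimal pure-Python illustration, not the code used in the paper: the function names (`mini_batch_kmeans`, `inertia`), the random initialisation, and the per-centre learning rate of 1/count are assumptions made here, the last following the commonly described form of the mini-batch update.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the centre closest to `point`."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def mini_batch_kmeans(data, k, batch_size, iterations, seed=0):
    """Hypothetical helper: cluster `data` using fixed-size random batches."""
    rng = random.Random(seed)
    # Assumption: centres are initialised from k distinct random examples.
    centers = [list(p) for p in rng.sample(data, k)]
    counts = [0] * k  # number of points assigned to each centre so far
    for _ in range(iterations):
        # Only `batch_size` points are examined, not the whole dataset.
        batch = rng.sample(data, min(batch_size, len(data)))
        # Assign the whole batch first, with the centres held fixed.
        labels = [nearest(p, centers) for p in batch]
        # Then move each centre towards its assigned points.
        for p, j in zip(batch, labels):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centre learning rate
            centers[j] = [(1 - eta) * c + eta * x
                          for c, x in zip(centers[j], p)]
    return centers


def inertia(data, centers):
    """Sum of squared distances to the nearest centre (lower is better)."""
    return sum(squared_distance(p, centers[nearest(p, centers)])
               for p in data)
```

Each iteration costs `batch_size * k` distance computations instead of `len(data) * k`, which is where the saving comes from. Comparing `inertia` for the returned centres against a full K-means run on the same data is one way to measure the quality loss the abstract refers to.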
Scalable algorithms, Machine learning, UPC subject areas::Computer science::Artificial intelligence, Unsupervised learning, K-means
| selected citations: derived from selected sources; an alternative to the "influence" indicator. | 0 |
| popularity: the "current" impact or attention of an article in the research community at large, based on the underlying citation network. | Average |
| influence: the overall impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse: the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 673 |
| downloads | 4K |

Views provided by UsageCounts