COMPARATIVE STUDY OF CLUSTERING ALGORITHMS FOR STUDENT PERFORMANCE EVALUATION

Predicting student performance is essential for enhancing educational outcomes, enabling educators to identify students who may need additional support or intervention. Clustering algorithms, as unsupervised data mining techniques, are particularly effective at uncovering patterns in student performance data. These algorithms can group students based on their exam scores, providing insights that allow for more tailored and targeted educational strategies. This study compares four unsupervised methods on a dataset of 200 students’ scores across five exam questions: K-Means, DBSCAN, Hierarchical Clustering (Ward linkage), and Gaussian Mixture Models (GMM). After standardizing the data, we project it into two dimensions via Principal Component Analysis (PCA) for visualization. We then evaluate each model using three validation metrics: Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. K-Means with k = 5 achieves the highest Silhouette (0.387) and Calinski-Harabasz (90.156) scores and the lowest Davies-Bouldin Index (0.883), outperforming the alternatives in both visual separation and quantitative metrics. DBSCAN identifies noise but yields overlapping clusters; Hierarchical Clustering shows moderate cohesion; GMM produces softer boundaries. Our results demonstrate that K-Means offers the most interpretable and robust grouping for this educational dataset, providing a practical tool for segmenting students into performance tiers. Future work may explore dynamic k-selection methods, incorporation of additional student features, and deployment in intelligent tutoringsystems.
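
The comparison described above can be sketched in Python with scikit-learn. This is a minimal illustration under stated assumptions, not the study's actual code: the score matrix is synthetic random data standing in for the 200 x 5 dataset, the DBSCAN parameters (`eps`, `min_samples`) and all random seeds are arbitrary choices, and clustering is assumed to run on the standardized five-dimensional scores, with PCA used only to obtain plotting coordinates. The helper `evaluate` is a hypothetical name introduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for 200 students x 5 exam questions.
# These are random numbers, not the data used in the study.
rng = np.random.default_rng(0)
scores = rng.uniform(0, 20, size=(200, 5))

# Standardize each question to zero mean and unit variance.
X = StandardScaler().fit_transform(scores)

# Two-dimensional PCA projection, used here for visualization only.
X_2d = PCA(n_components=2, random_state=0).fit_transform(X)

k = 5  # cluster count reported for K-Means in the abstract
labelings = {
    "K-Means": KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X),
    # ASSUMPTION: eps and min_samples are illustrative, not tuned values.
    "DBSCAN": DBSCAN(eps=1.2, min_samples=5).fit_predict(X),
    "Ward": AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X),
    "GMM": GaussianMixture(n_components=k, random_state=0).fit_predict(X),
}


def evaluate(X, labels):
    """Return the three validation metrics, or None if they are undefined."""
    mask = labels != -1  # DBSCAN marks noise points with the label -1
    X_kept, labels_kept = X[mask], labels[mask]
    n_clusters = len(set(labels_kept))
    # All three metrics need at least 2 clusters and more points than clusters.
    if n_clusters < 2 or len(labels_kept) <= n_clusters:
        return None
    return {
        "silhouette": silhouette_score(X_kept, labels_kept),  # higher is better
        "davies_bouldin": davies_bouldin_score(X_kept, labels_kept),  # lower is better
        "calinski_harabasz": calinski_harabasz_score(X_kept, labels_kept),  # higher is better
    }


results = {name: evaluate(X, labels) for name, labels in labelings.items()}

for name, metrics in results.items():
    if metrics is None:
        print(f"{name}: fewer than two clusters, metrics undefined")
    else:
        print(
            f"{name}: silhouette={metrics['silhouette']:.3f}, "
            f"DB={metrics['davies_bouldin']:.3f}, "
            f"CH={metrics['calinski_harabasz']:.3f}"
        )
```

Because the input is random, the printed values will not match the figures reported in the abstract. Excluding DBSCAN noise points before scoring is one possible convention; whether the study did the same is not stated.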
