
handle: 2117/433881
Clustering is one of the most relevant data analysis techniques for grouping data. Many clustering methods exist, each with its own strengths and weaknesses depending on the data. This study focuses on clustering methods for ordinal data. Its purpose is to collect, analyze, and present the scientific literature on this type of cluster analysis, with the final objective of identifying the most widely used and effective clustering methods for ordinal data. To this end, we conduct a systematic review following the PRISMA methodology (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), which sets the guidelines for carrying out this type of study. The literature was searched in databases such as Scopus and Web of Science (WoS), which are among the most recognized sources of scientific literature. Only articles that met certain criteria were reviewed, to ensure their quality and consistency with the objective of the study. Before the systematic review, basic background on clustering, and more specifically on clustering of ordinal data, is presented to contextualize the study for readers who are not familiar with the subject. Then, in the systematic review, several relevant aspects of the articles are analyzed, such as whether they present probabilistic or distance-based models, the evaluation metrics used, and the dataset category. Finally, a case study is carried out in which different R libraries created for ordinal data clustering are tested, and the most relevant results of the work are discussed, along with the limitations found.
UPC subject areas::Mathematics and statistics, Multivariate analysis, systematic review, AMS Classification::68 Computer science::68T Artificial intelligence, R (Computer program language), AMS Classification::62 Statistics::62H Multivariate analysis, PRISMA, clustering, ordinal data
