
The data mining technique of time series clustering is well established in many fields. However, as an unsupervised learning method, it requires choices that depend nontrivially on the nature of the data involved. The aim of this paper is to verify the usefulness of time series clustering for macroeconomic research and to develop the most suitable methodology. By extensively testing various possibilities, we arrive at a dissimilarity measure (the compression-based dissimilarity measure, or CDM) that is particularly suitable for clustering macroeconomic variables. We check that the results are stable over time and reflect large-scale phenomena such as crises. We also successfully apply our findings to the analysis of national economies, specifically to identifying their structural relations.
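For orientation, CDM is commonly defined as CDM(x, y) = C(xy) / (C(x) + C(y)), where C(.) is the size of the compressed input and xy is the concatenation of the two sequences. The sketch below illustrates that ratio only; it is not the paper's implementation. The choice of zlib as the compressor, the equal-width binning into letters, and the helper names `compressed_size`, `discretize` and `cdm` are all assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def discretize(series, n_bins=4):
    """Map a numeric series to a byte string of letters using equal-width bins.

    Hypothetical preprocessing step: compressors work on symbols, so the
    real-valued series has to be turned into a symbolic sequence first.
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    symbols = []
    for value in series:
        index = min(int((value - lo) / span * n_bins), n_bins - 1)
        symbols.append(ord("a") + index)
    return bytes(symbols)


def cdm(x: bytes, y: bytes) -> float:
    """Compression-based dissimilarity: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest the sequences share most of their structure;
    values near 1 suggest they have little in common.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))
```

A pairwise matrix of such values could then be passed to any standard clustering routine; which routine the authors use is not stated in this summary.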
14 pages, 3 figures, 1 table
FOS: Economics and business, C63, ddc:330, C18, Econometrics (econ.EM), E00, similarity, GDP, time series clustering, cluster analysis, Economics - Econometrics
| Indicator | Value |
| --- | --- |
| Selected citations | 7 |
| Popularity | Top 10% |
| Influence | Average |
| Impulse | Top 10% |
