
doi: 10.1002/sam.10101
handle: 11379/41866
Abstract: The concept of symbolic data was developed to represent variables whose measurement is affected by some internal variation. The idea has mainly been driven by the need to aggregate individuals in order to summarize large datasets into smaller matrices of manageable size, retaining as much of the original knowledge as possible. Nevertheless, it is also often applied to variables that are structured from the outset as symbolic variables, although measured on single individuals. This paper deals with the latter framework and aims to show that symbolic data analysis techniques can be applied to the treatment of missing values. The algorithm for a symbolic imputation technique in principal component analysis is presented as a generalization of the basic strategy called interval imputation. An illustrative example and a real data case study show how the proposed technique works. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 171–183, 2011
keywords: missing data, principal component analysis, interval-valued data, histogram-valued data, Statistics, Computer science
