
handle: 11573/816629
Big administrative data sets often contain a large number of variables with different measurement levels and many missing values. The correct approach to handling them depends on the type of data and the purpose of the analysis. However, we cannot simply delete the incomplete records, because doing so amounts to a substantial loss of data that were costly to collect. Single imputation and multiple imputation serve different aims: the former creates an ‘imputed’ data matrix with the same characteristics as the observed data, while the latter takes into account, in the estimation of a model, the additional variability due to the imputation process. For big administrative data, several approaches have been proposed in the literature. In this paper we compare different approaches, considering both single and multiple imputation, and we propose a new method, named Multitree. Through simulations, we show that Multitree is competitive with the best methods considered in the literature.
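The distinction between single and multiple imputation drawn in the abstract can be illustrated with a minimal sketch. It is a generic example and not the Multitree or IVEWARE procedure: it assumes a single numeric variable with values missing completely at random, uses mean filling for single imputation, and uses a simple bootstrap donor draw for multiple imputation, pooled with Rubin's rules. The function names (`single_mean_impute`, `multiple_impute_mean`, `rubin_pool`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Pool m point estimates and their within-imputation variances (Rubin's rules)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()             # pooled point estimate
    within = variances.mean()            # average within-imputation variance
    between = estimates.var(ddof=1)      # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total

def single_mean_impute(x):
    """Single imputation: fill every missing value with the observed mean."""
    filled = x.copy()
    filled[np.isnan(x)] = np.nanmean(x)
    return filled

def multiple_impute_mean(x, m=20, seed=0):
    """Multiple imputation of the mean (illustrative assumption: bootstrap donor draws)."""
    rng = np.random.default_rng(seed)
    missing = np.isnan(x)
    observed = x[~missing]
    n_missing = int(missing.sum())
    estimates, variances = [], []
    for _ in range(m):
        # Resample the observed values first, so that uncertainty about the
        # observed distribution is reflected in each completed data set.
        donors = rng.choice(observed, size=observed.size, replace=True)
        filled = x.copy()
        filled[missing] = rng.choice(donors, size=n_missing, replace=True)
        estimates.append(filled.mean())
        variances.append(filled.var(ddof=1) / filled.size)
    return rubin_pool(estimates, variances)

# Toy data: 200 values, 30% set to missing completely at random.
rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=2.0, size=200)
x[rng.random(200) < 0.3] = np.nan

completed = single_mean_impute(x)           # one completed data matrix
pooled_mean, total_var = multiple_impute_mean(x)
```

Single imputation returns one completed vector, which is convenient when an ‘imputed’ data matrix is the goal. The pooled total variance from `rubin_pool` adds the between-imputation component, which is the additional variability that a single filled-in data set leaves out.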
Missing Data, multiple imputation, IVEWARE, Multitree
