
arXiv: 2011.12509
This work considers the problem of fitting functional models with sparsely and irregularly sampled functional data. It overcomes the limitations of state-of-the-art methods, which face major challenges in fitting more complex non-linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with the sample size, whereas we show numerically that a modified approach with more modern multiple imputation methods can produce better estimates in general. We also propose a new imputation approach that combines the ideas of MissForest with Local Linear Forest and compare its performance with PACE and several other multivariate multiple imputation methods. This work is motivated by a longitudinal study on smoking cessation, in which the electronic health records (EHR) from Penn State PaTH to Health allow for the collection of a great deal of data, with highly variable sampling. To illustrate our approach, we explore the relation between relapse and diastolic blood pressure. We also consider a variety of simulation schemes with varying levels of sparsity to validate our methods.
FOS: Computer and information sciences, Computer Science - Machine Learning, multiple imputation, functional regression, longitudinal data analysis, Statistics, Machine Learning (stat.ML), Machine Learning (cs.LG), Methodology (stat.ME), missing data, Statistics - Machine Learning, Statistics - Methodology, functional data analysis
