Modern multiple imputation with functional data

Publication: Article, Preprint, Other literature type
Date: 24 Feb 2021 (embargo end date: 01 Jan 2020)
Language: English
Publisher: Wiley
Journal: Stat, volume 10 (ISSN: 2049-1573, eISSN: 2049-1573)
Funded by: NSF | Formal Privacy for Comple...

Authors: Aniruddha Rajendra Rao; Matthew Reimherr

doi: 10.1002/sta4.331 , 10.48550/arxiv.2011.12509

arXiv: 2011.12509


Abstract

This work considers the problem of fitting functional models with sparsely and irregularly sampled functional data. It overcomes the limitations of the state-of-the-art methods, which face major challenges in the fitting of more complex non-linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with the sample size, whereas we show numerically that a modified approach with more modern multiple imputation methods can produce better estimates in general. We also propose a new imputation approach that combines the ideas of MissForest with Local Linear Forest and compare their performance with PACE and several other multivariate multiple imputation methods. This work is motivated by a longitudinal study on smoking cessation, in which the electronic health records (EHR) from Penn State PaTH to Health allow for the collection of a great deal of data, with highly variable sampling. To illustrate our approach, we explore the relation between relapse and diastolic blood pressure. We also consider a variety of simulation schemes with varying levels of sparsity to validate our methods.
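As a rough illustration of the general workflow the abstract alludes to, the sketch below shows two generic building blocks of multiple imputation for sparse curves: placing irregular observations on a common grid with missing cells, and pooling estimates from several imputed data sets with Rubin's rules. This is not the authors' implementation. The helper names `to_grid` and `pool_rubin`, the nearest-grid-point snapping, and the use of Rubin's rules for pooling are all assumptions made for illustration; the abstract does not specify any of them, and the imputation step itself (for example a MissForest-style imputer) is deliberately left out.

```python
import numpy as np

def to_grid(subject_ids, times, values, grid):
    """Place irregular (subject, time, value) records on a common grid.

    Hypothetical helper: each time is snapped to its nearest grid point,
    and cells with no observation are left as NaN for a later imputer.
    If two records snap to the same cell, the later one overwrites.
    """
    subject_ids = np.asarray(subject_ids)
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    subjects = np.unique(subject_ids)
    row = np.searchsorted(subjects, subject_ids)
    col = np.abs(times[:, None] - grid[None, :]).argmin(axis=1)
    X = np.full((subjects.size, grid.size), np.nan)
    X[row, col] = values
    return X

def pool_rubin(estimates, variances):
    """Pool M scalar estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its total variance:
    within-imputation variance + (1 + 1/M) * between-imputation variance.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.size
    pooled = estimates.mean()
    within = variances.mean()
    between = estimates.var(ddof=1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total

# Toy usage: two subjects observed at irregular times on [0, 1].
grid = np.linspace(0.0, 1.0, 5)
X = to_grid([0, 0, 1], [0.0, 0.52, 1.0], [1.0, 2.0, 3.0], grid)

# Pretend three imputed data sets each gave an estimate and a variance.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In such a workflow, an imputer would fill the NaN cells of `X` several times, the functional model would be fit to each completed data set, and the resulting estimates would then be combined by a rule such as `pool_rubin`.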

Related Organizations

Pennsylvania State University
United States

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, multiple imputation, functional regression, longitudinal data analysis, Statistics, Machine Learning (stat.ML), Machine Learning (cs.LG), Methodology (stat.ME), missing data, Statistics - Machine Learning, Statistics - Methodology, functional data analysis

Impact by BIP!

- Selected citations: 13. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Funded by

NSF | Formal Privacy for Complex Data Objects