Functional Data Analysis for Sparse Functional Data

Doctoral thesis . 2018

Data sources: Datacite

Publication: Doctoral thesis, 27 Apr 2018. Publisher: University of Virginia

doi: 10.18130/v3h70806k

Abstract

With the development of science and modern technology, more and more data are being collected continuously over a time interval in disciplines such as public health, biology, medicine and finance. Such data can be viewed as "functional data". Functional data analysis (FDA), which deals with the analysis and theory of functional data, has grown steadily in popularity over the past decades. In this dissertation, we propose several functional data analysis methods and apply them to an NIH cohort study in the field of growth modeling. It is well known that early-year catch-down growth is highly prevalent in developing countries because of malnutrition (Black et al. [2008]). Children who suffer from malnutrition in the first 5 years of life are at increased risk of impaired cognitive and physical development. Therefore, characterizing catch-down growth and identifying the associated important risk factors is an active research topic. In our study, we aim to investigate the relationship between the height-for-age Z score (HAZ) at year 3 and a collection of predictors. However, we face two problems. First, all functional predictors are sparsely and irregularly observed; that is, the measurement times vary from individual to individual. The functional predictors must therefore be estimated over the entire time interval before the regression can be performed. In addition, some predictors, such as height, should be monotone over time, and a non-monotone estimate of height would make no sense. Second, the relationship between the response and the functional predictors is usually not linear. Furthermore, there are outliers in the response. To address the first problem, we propose a new method based on a monotone transformation, functional principal component (FPC) analysis and a penalized regression to estimate monotone functions for sparse growth data. We also prove the asymptotic properties of the proposed estimator.
Extensive numerical studies show that our proposed method outperforms the existing methods in terms of model fitting and monotonicity of the estimation. In addition, the proposed method can also be used as a data preprocessing procedure for other methods, such as functional clustering and classification, which require the functional predictors to be completely known. To address the second problem, we build a functional single index model for the non-linear relationship between the response and the functional predictors. The functional single index model is not only flexible but also interpretable. To deal with outliers, we propose an estimation method based on local modal regression (LMR) (Yao et al. [2012]). We show that, with the optimal bandwidth, the LMR estimator is not only robust when there are outliers or the error distribution is heavy-tailed, but also asymptotically as efficient as the ordinary least squares estimator when the error distribution is Gaussian. In addition, we conduct extensive simulation studies to demonstrate the robustness and efficiency of the resulting estimator by comparing it with the least squares estimator and the Huber estimator across different error distributions.
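
The robustness claim for modal regression can be illustrated with a small sketch. The code below is not the estimator developed in the thesis: it assumes a plain scalar linear model instead of the functional single index model, uses a Gaussian kernel, and fixes the bandwidth by hand rather than choosing the optimal bandwidth mentioned above. The function name `modal_linear_regression`, the simulated data and all parameter values are hypothetical. The fit uses a modal EM-type iteration: kernel weights are computed from the current residuals, then a weighted least squares problem is solved.

```python
import numpy as np


def modal_linear_regression(x, y, bandwidth, n_iter=500, tol=1e-8):
    """Modal regression for y = b0 + b1 * x via a modal EM-type iteration.

    Illustrative sketch only: scalar predictor, Gaussian kernel,
    user-supplied bandwidth (an assumption, not an optimal choice).
    """
    X = np.column_stack([np.ones(len(y)), x])       # design matrix with intercept
    beta = np.linalg.lstsq(X, y, rcond=None)[0]     # start from ordinary least squares
    for _ in range(n_iter):
        resid = y - X @ beta
        # E-step: Gaussian kernel weights, small for large residuals
        w = np.exp(-0.5 * (resid / bandwidth) ** 2)
        w /= w.sum()
        # M-step: weighted least squares with the current weights
        XtW = X.T * w
        new_beta = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Simulated data (hypothetical): true line y = 1 + 2x, with 10% of the
# responses shifted upward by 10 to act as outliers.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, size=n)
y[:50] += 10.0

true_beta = np.array([1.0, 2.0])
X_design = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X_design, y, rcond=None)[0]
beta_modal = modal_linear_regression(x, y, bandwidth=0.5)

print("OLS estimate:  ", beta_ols)
print("Modal estimate:", beta_modal)
```

In this setup the outliers receive weights close to zero after the first iteration, so the modal fit is driven by the bulk of the data, while the least squares fit is pulled toward the shifted responses. A smaller bandwidth increases robustness at the cost of efficiency, which is why the choice of bandwidth matters for the efficiency result stated in the abstract.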

Related Organizations

University of Virginia
United States

Keywords

local modal regression, monotone function estimation, penalized regression

Impact by BIP!

- Selected citations: 0
- Popularity: Average
- Influence: Average
- Impulse: Average