Automatic sparse principal component analysis

Publication: Article, 20 Dec 2020, English. Publisher: Wiley. Journal: Canadian Journal of Statistics, volume 49, pages 678-697 (ISSN: 0319-5724, eISSN: 1708-945X)

Authors: Heewon Park; Rui Yamaguchi; Seiya Imoto; Satoru Miyano

doi: 10.1002/cjs.11579


Abstract

The wide availability of computers enables us to accumulate huge amounts of data, so effective tools for extracting information from such data have become critical. Principal component analysis (PCA) is a useful and traditional tool for dimensionality reduction of massive high‐dimensional datasets. Recently, sparse principal component (PC) loading estimation based on L1‐type regularization has drawn a large amount of attention. Although sparse PCA makes interpretation easy and performs dimension reduction without disturbance from noisy features, existing studies on sparse PCA have been based on an arbitrary number of PCs without any statistical justification. We propose a novel method, called automatic sparse PCA, which performs PC selection and sparse PC loading estimation simultaneously. For PC selection, we first develop a sparse singular value decomposition (sparse SVD), then incorporate sparsity into PC loading estimation. The proposed method thus performs dimension reduction and PC loading estimation simultaneously, and does so without disturbance from noisy features. Monte Carlo experiments show that the proposed automatic sparse PCA outperforms competing methods in sparse structure identification and in reconstructing data based on low‐dimensional projection. The proposed method is also applied to a number of real datasets, where it proves effective in terms of estimation accuracy and interpretability of the PCA results.
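To make the idea of L1‐type regularized loading estimation concrete, the following is a minimal Python sketch of a generic sparse PCA built from rank-one sparse SVD layers with soft-thresholding. It is not the authors' automatic sparse PCA: the function names (`soft_threshold`, `sparse_rank_one`, `sparse_pca`), the penalty level `lam`, and the synthetic data are all assumptions made for illustration, and the number of components is fixed by the caller rather than selected statistically as the abstract describes.

```python
import numpy as np


def soft_threshold(x, lam):
    """Elementwise soft-thresholding, the proximal operator of the L1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def sparse_rank_one(X, lam, n_iter=200, tol=1e-8):
    """One sparse SVD layer: alternate a soft-thresholded v-update and a u-update."""
    # Start from the leading singular triplet of X.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], s[0] * Vt[0]
    for _ in range(n_iter):
        v_new = soft_threshold(X.T @ u, lam)
        Xv = X @ v_new
        norm = np.linalg.norm(Xv)
        if norm == 0.0:
            # The penalty removed every loading in this layer.
            return u, np.zeros_like(v_new)
        converged = np.linalg.norm(v_new - v) < tol
        u, v = Xv / norm, v_new
        if converged:
            break
    return u, v


def sparse_pca(X, n_components, lam):
    """Sparse loadings via successive rank-one layers with deflation.

    Illustrative only: n_components is chosen by the caller here, whereas
    the paper proposes selecting the number of PCs automatically.
    """
    R = X - X.mean(axis=0)  # centre the columns
    loadings = []
    for _ in range(n_components):
        u, v = sparse_rank_one(R, lam)
        if not np.any(v):
            break  # crude stop: nothing survives the threshold
        loadings.append(v / np.linalg.norm(v))
        R = R - np.outer(u, v)  # deflate the fitted layer
    return np.column_stack(loadings)


# Synthetic data (assumed for illustration): two latent factors on
# features 0-2 and 3-5, and four pure-noise features 6-9.
rng = np.random.default_rng(0)
n, p = 200, 10
a1 = np.zeros(p)
a1[:3] = 1 / np.sqrt(3)
a2 = np.zeros(p)
a2[3:6] = 1 / np.sqrt(3)
z1 = 3.0 * rng.standard_normal(n)
z2 = 2.0 * rng.standard_normal(n)
X = np.outer(z1, a1) + np.outer(z2, a2) + 0.1 * rng.standard_normal((n, p))

loadings = sparse_pca(X, n_components=2, lam=1.0)
print(np.round(loadings, 2))
```

In this sketch the L1 penalty sets the loadings of the noise features exactly to zero, which is what makes sparse loadings easier to interpret than dense PCA loadings. Choosing `lam` and the number of components is left to the user here, which is precisely the gap the abstract says the proposed method addresses.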


Keywords

Computational methods for sparse matrices, \(L_1\)-type regularization, principal component analysis, sparsity, singular value decomposition, Factor analysis and principal components; correspondence analysis, dimensionality reduction

