SPARSE PRINCIPAL COMPONENT ANALYSIS WITH MISSING OBSERVATIONS
Publication type:
Article
Authors:
Park, Seyoung; Zhao, Hongyu
Affiliations:
Sungkyunkwan University (SKKU); Yale University
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/18-AOAS1220
Publication date:
2019
Pages:
1016-1042
Keywords:
covariance matrices
gene-expression
power method
identification
consistency
noisy
Abstract:
Principal component analysis (PCA) is a commonly used statistical method in a wide range of applications. However, it does not work well when the number of features is larger than the sample size. Motivated by the analysis of single-cell RNA sequence data, we consider the estimation of the sparse principal subspace in the high-dimensional setting with missing data. We propose a two-step estimation procedure and establish the rates of convergence for estimating the principal subspace. Simulated examples with various missing mechanisms show that the procedure performs competitively with existing sparse PCA methods. We apply the method to single-cell data and show that it distinguishes cell types better than other PCA methods.
Source URL: