DEFLATED HETEROPCA: OVERCOMING THE CURSE OF ILL-CONDITIONING IN HETEROSKEDASTIC PCA

Type:
Article
Authors:
Zhou, Yuchen; Chen, Yuxin
Affiliations:
University of Illinois System; University of Illinois Urbana-Champaign; University of Pennsylvania
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/24-AOS2456
Publication date:
2025
Pages:
91-116
Keywords:
covariance-matrix estimation; principal component analysis; optimal shrinkage; factor models; noisy; approximation; number; eigenstructure; asymptotics; completion
Abstract:
This paper is concerned with estimating the column subspace of a low-rank matrix $X^\star \in \mathbb{R}^{n_1 \times n_2}$ from contaminated data. How to obtain optimal statistical accuracy while accommodating the widest range of signal-to-noise ratios (SNRs) becomes particularly challenging in the presence of heteroskedastic noise and unbalanced dimensionality (i.e., $n_2 \gg n_1$). While the state-of-the-art algorithm HeteroPCA emerges as a powerful solution to this problem, it suffers from the curse of ill-conditioning, namely, its performance degrades as the condition number of $X^\star$ grows. In order to overcome this critical issue without compromising the range of allowable SNRs, we propose a novel algorithm, called Deflated-HeteroPCA, that achieves near-optimal and condition-number-free theoretical guarantees in terms of both $\ell_2$ and $\ell_{2,\infty}$ statistical accuracy. The proposed algorithm divides the spectrum of $X^\star$ into well-conditioned and mutually well-separated subblocks, and applies HeteroPCA to conquer each subblock successively. Further, an application of our algorithm and theory to two canonical examples, the factor model and tensor PCA, leads to remarkable improvement for each application.