CONVERGENCE AND PREDICTION OF PRINCIPAL COMPONENT SCORES IN HIGH-DIMENSIONAL SETTINGS
成果类型:
Article
署名作者:
Lee, Seunggeun; Zou, Fei; Wright, Fred A.
署名单位:
University of North Carolina; University of North Carolina Chapel Hill
刊物名称:
ANNALS OF STATISTICS
ISSN/ISSBN:
0090-5364
DOI:
10.1214/10-AOS821
发表日期:
2010
页码:
3605-3629
关键词:
sample covariance matrices
whole-genome association
geometric representation
LARGEST EIGENVALUE
Consistency
survival
models
摘要:
A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased toward 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results, and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the improved bias and numerical properties of our estimators.