FUNDAMENTAL LIMITS OF LOW-RANK MATRIX ESTIMATION WITH DIVERGING ASPECT RATIOS
Document type:
Article
Authors:
Montanari, Andrea; Wu, Yuchen
Affiliations:
Stanford University; Stanford University; University of Pennsylvania
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/24-AOS2400
Publication date:
2024
Pages:
1460-1484
Keywords:
principal-components
Mutual information
largest eigenvalue
EM algorithm
mixtures
population
PCA
Abstract:
We consider the problem of estimating the factors of a low-rank n × d matrix, when this is corrupted by additive Gaussian noise. A special example of our setting corresponds to clustering mixtures of Gaussians with equal (known) covariances. Simple spectral methods do not take into account the distribution of the entries of these factors and are therefore often suboptimal. Here, we characterize the asymptotics of the minimum estimation error under the assumption that the distribution of the entries is known to the statistician. Our results apply to the high-dimensional regime n, d → ∞ and d/n → ∞ (or d/n → 0) and generalize earlier work that focused on the proportional asymptotics n, d → ∞, d/n → δ ∈ (0, ∞). We outline an interesting signal strength regime in which d/n → ∞ and partial recovery is possible for the left singular vectors while impossible for the right singular vectors. We illustrate the general theory by deriving consequences for Gaussian mixture clustering and carrying out a numerical study on genomics data.
Source URL: