EXACT MINIMAX RISK FOR LINEAR LEAST SQUARES, AND THE LOWER TAIL OF SAMPLE COVARIANCE MATRICES
Type:
Article
Author:
Mourtada, Jaouad
Affiliation:
Institut Polytechnique de Paris; ENSAE Paris
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/22-AOS2181
Publication date:
2022
Pages:
2157-2178
Keywords:
Littlewood-Offord problem
robust regression
learning-theory
singular-value
optimal rates
lower bounds
distributions
asymptotics
performance
prediction
Abstract:
We consider random-design linear prediction and related questions on the lower tail of random matrices. It is known that, under boundedness constraints, the minimax risk is of order d/n in dimension d with n samples. Here, we study the minimax expected excess risk over the full linear class, depending on the distribution of covariates. First, the least squares estimator is exactly minimax optimal in the well-specified case, for every distribution of covariates. We express the minimax risk in terms of the distribution of statistical leverage scores of individual samples, and deduce a minimax lower bound of d/(n - d + 1) for any covariate distribution, nearly matching the risk for Gaussian design. We then obtain sharp nonasymptotic upper bounds for covariates that satisfy a small ball-type regularity condition in both well-specified and misspecified cases. Our main technical contribution is the study of the lower tail of the smallest singular value of empirical covariance matrices at small values. We establish a lower bound on this lower tail, valid for any distribution in dimension d >= 2, together with a matching upper bound under a necessary regularity condition. Our proof relies on the PAC-Bayes technique for controlling empirical processes, and extends an analysis of Oliveira devoted to a different part of the lower tail.
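As a rough numerical illustration of how close the lower bound d/(n - d + 1) is to the Gaussian-design risk, the sketch below estimates the expected excess risk of least squares by Monte Carlo. It rests on assumptions not stated in the abstract: covariates are i.i.d. N(0, I_d), the model is well specified with unit noise variance, and the Gaussian risk is taken to be d/(n - d - 1), the standard inverse-Wishart identity, which requires n > d + 1. The function name and the values of n, d and the number of trials are hypothetical choices for illustration only.

```python
import numpy as np


def ols_excess_risk_gaussian(n, d, trials=2000, seed=0):
    """Monte Carlo estimate of E[tr((X^T X)^{-1})] for X with i.i.d. N(0, I_d) rows.

    Assumption: well-specified linear model, unit noise variance, identity
    covariance. Under these assumptions, the excess risk of least squares
    conditional on X is tr((X^T X)^{-1}), so its average estimates the
    expected excess risk.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        total += np.trace(np.linalg.inv(X.T @ X))
    return total / trials


n, d = 50, 5  # illustrative sizes, chosen so that n > d + 1
estimate = ols_excess_risk_gaussian(n, d)
lower_bound = d / (n - d + 1)     # minimax lower bound quoted in the abstract
gaussian_exact = d / (n - d - 1)  # inverse-Wishart closed form (assumed setting)

print(f"Monte Carlo estimate : {estimate:.4f}")
print(f"d / (n - d + 1)      : {lower_bound:.4f}")
print(f"d / (n - d - 1)      : {gaussian_exact:.4f}")
```

In this setting the two closed-form expressions differ only in the denominator, by 2, which is one way to read "nearly matching" in the abstract.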