OBSERVABLE ADJUSTMENTS IN SINGLE-INDEX MODELS FOR REGULARIZED M-ESTIMATORS WITH BOUNDED P/N

Document type:
Article
Authors:
Bellec, Pierre C.
Affiliations:
State University System of Florida; Florida State University
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/24-AOS2464
Publication date:
2025
Pages:
531-560
Keywords:
confidence-intervals robust regression Lasso freedom
Abstract:
We consider observations (X, y) from single index models with unknown link function, Gaussian covariates and a regularized M-estimator beta constructed from a convex loss function and regularizer. In the regime where sample size n and dimension p are both increasing such that p/n has a finite limit, the behavior of the empirical distribution of beta and the predicted values X beta has been previously characterized in a number of models: the empirical distributions are known to converge to proximal operators of the loss and penalty in a related Gaussian sequence model, which captures the interplay between the ratio p/n, loss, regularization and the data generating process. This connection between (beta, X beta) and the corresponding proximal operators involves mean-field parameters defined as solutions to a nonlinear system of equations. This system typically involves unobservable quantities such as the prior distribution on the index or the link function, so the mean-field parameters need to be estimated. Although estimators for the mean-field parameters have been proposed in specific cases, a general framework that applies simultaneously to a broad class of losses and penalties has so far been missing. This paper develops a different theory to describe the empirical distribution of beta and X beta: approximations of (beta, X beta) in terms of proximal operators are provided that only involve observable adjustments in place of the mean-field parameters. These proposed observable adjustments are data-driven; for example, they do not require prior knowledge of the index or the link function. These new adjustments yield confidence intervals for individual components of the index, as well as estimators of the correlation of beta with the index, enabling parameter tuning to maximize the correlation. The interplay between loss, regularization and the model is captured in a data-driven manner, without relying on the nonlinear systems studied in previous works. The results are proved to hold for both strongly convex regularizers and unregularized M-estimation.
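
Note on terminology: the abstract refers repeatedly to proximal operators of the loss and penalty. As a minimal illustration of that term only (not of the paper's observable adjustments or its estimators), the sketch below assumes the standard definition prox_f(x) = argmin_u { f(u) + (x - u)^2 / 2 } and checks the well-known closed form for the penalty f(u) = lam * |u| (soft-thresholding, as in the Lasso) against a brute-force grid search. The function names `prox_l1` and `prox_numeric` and the value of `lam` are hypothetical choices made for this example.

```python
def prox_l1(x, lam):
    """Closed-form proximal operator of u -> lam * |u| (soft-thresholding)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def prox_numeric(f, x, lo=-5.0, hi=5.0, steps=20_000):
    """Brute-force prox: minimize f(u) + (x - u)^2 / 2 over an even grid on [lo, hi]."""
    best_u, best_val = lo, float("inf")
    for i in range(steps + 1):
        u = lo + (hi - lo) * i / steps
        val = f(u) + 0.5 * (x - u) ** 2
        if val < best_val:
            best_u, best_val = u, val
    return best_u


lam = 0.5  # hypothetical penalty level, chosen only for illustration

# The closed form and the grid search should agree up to the grid resolution.
for x in (-2.0, -0.3, 0.0, 0.4, 1.7):
    closed_form = prox_l1(x, lam)
    brute_force = prox_numeric(lambda u: lam * abs(u), x)
    assert abs(closed_form - brute_force) < 1e-3
```

The grid search is deliberately naive: it serves only to confirm that the closed form solves the minimization problem in the definition, and it makes no claim about how proximal operators are evaluated in the paper.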