COMPUTATIONALLY EFFICIENT AND STATISTICALLY OPTIMAL ROBUST HIGH-DIMENSIONAL LINEAR REGRESSION

Publication type:
Article
Authors:
Shen, Yinan; Li, Jingyang; Cai, Jian-feng; Xia, Dong
Affiliation:
Hong Kong University of Science & Technology
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/24-AOS2473
Publication date:
2025
Pages:
374-399
Keywords:
rank matrix recovery; oracle inequalities; sparse recovery; Riemannian optimization; quantile regression; tensor completion; mean estimation; cancer rates; algorithm
Abstract:
High-dimensional linear regression under heavy-tailed noise or outlier corruption is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs, especially since the robust loss functions are usually nonsmooth. More recently, computationally fast nonconvex approaches via subgradient descent have been proposed, which, unfortunately, fail to deliver a statistically consistent estimator even under sub-Gaussian noise. In this paper, we introduce a projected subgradient descent algorithm for both the sparse linear regression and low-rank linear regression problems. The algorithm is not only computationally efficient with linear convergence but also statistically optimal, whether the noise is Gaussian or heavy-tailed with a finite 1 + epsilon moment. The convergence theory is established for a general framework, and its specific applications to the absolute loss, Huber loss and quantile loss are investigated. Compared with existing nonconvex methods, ours reveals a surprising phenomenon of two-phase convergence. In phase one, the algorithm behaves as in typical nonsmooth optimization, requiring gradually decaying stepsizes. However, phase one delivers only a statistically suboptimal estimator, a limitation already observed in the existing literature. Interestingly, during phase two, the algorithm converges linearly as if it were minimizing a smooth and strongly convex objective function, so a constant stepsize suffices. Underlying the phase-two convergence is the smoothing effect of random noise on the nonsmooth robust losses in a region close, but not too close, to the truth. Numerical simulations confirm our theoretical discovery and showcase the superiority of our algorithm over prior methods.
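To make the two-phase scheme described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of projected subgradient descent for sparse regression with the absolute loss: hard thresholding plays the role of the projection onto s-sparse vectors, phase one uses geometrically decaying stepsizes, and phase two switches to a constant stepsize. All function names, stepsize values, and iteration counts here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def hard_threshold(w, s):
    """Project onto the set of s-sparse vectors: keep the s largest-magnitude entries."""
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-s:]
    out[keep] = w[keep]
    return out

def projected_subgradient_descent(X, y, s, iters_phase1=300, iters_phase2=200,
                                  eta0=1.0, decay=0.99, eta_const=0.05):
    """Two-phase projected subgradient descent for min_w (1/n)||y - Xw||_1, ||w||_0 <= s."""
    n, p = X.shape
    w = np.zeros(p)
    # Phase one: geometrically decaying stepsizes, as in standard nonsmooth optimization.
    eta = eta0
    for _ in range(iters_phase1):
        g = -X.T @ np.sign(y - X @ w) / n   # subgradient of the mean absolute loss
        w = hard_threshold(w - eta * g, s)
        eta *= decay
    # Phase two: a constant stepsize suffices near the truth, where random noise
    # effectively smooths the nonsmooth loss (the phenomenon the paper analyzes).
    for _ in range(iters_phase2):
        g = -X.T @ np.sign(y - X @ w) / n
        w = hard_threshold(w - eta_const * g, s)
    return w
```

A typical use would draw a Gaussian design, a sparse truth, and heavy-tailed (e.g. Student-t) noise, then check that the returned iterate recovers the true support and coefficients up to the statistical error level.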