ROBUST MACHINE LEARNING BY MEDIAN-OF-MEANS: THEORY AND PRACTICE

Publication type:
Article
Authors:
Lecué, Guillaume; Lerasle, Matthieu
Affiliations:
Institut Polytechnique de Paris; ENSAE Paris; École Polytechnique; Université Paris-Saclay; Centre National de la Recherche Scientifique (CNRS)
Journal:
ANNALS OF STATISTICS
ISSN:
0090-5364
DOI:
10.1214/19-AOS1828
Publication year:
2020
Pages:
906-931
Keywords:
model selection; variable selection; risk minimization; regularization; convergence; Lasso estimators; regression; recovery; bounds
Abstract:
Median-of-means (MOM) based procedures have recently been introduced in learning theory (Lugosi and Mendelson (2019); Lecué and Lerasle (2017)). These estimators outperform classical least-squares estimators when data are heavy-tailed and/or corrupted. However, none of these procedures can be implemented in practice, which is the major issue with current MOM procedures (Ann. Statist. 47 (2019) 783-794). In this paper, we introduce minmax MOM estimators and show that they achieve the same sub-Gaussian deviation bounds as the alternatives (Lugosi and Mendelson (2019); Lecué and Lerasle (2017)), in both small- and high-dimensional statistics. In particular, these estimators are efficient under moment assumptions on data that may have been corrupted by a few outliers. Beyond these theoretical guarantees, the definition of minmax MOM estimators suggests simple and systematic modifications of standard algorithms used to approximate least-squares estimators and their regularized versions. As a proof of concept, we perform an extensive simulation study of these algorithms for robust versions of the LASSO.
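The "simple and systematic modification" the abstract alludes to can be illustrated with a hedged sketch: in an ISTA-style iteration for the LASSO, the full-sample gradient is replaced by the gradient computed on the data block achieving the median of the block-wise empirical losses. This is an illustrative simplification, not the paper's exact minmax MOM algorithm; the function names (`mom_lasso`, `soft_threshold`), the step size, and the number of blocks `K` are all assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm (standard LASSO shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def mom_lasso(X, y, lam, K=20, step=0.05, n_iter=500, seed=0):
    """Illustrative MOM-modified ISTA for the LASSO (a sketch, not the
    paper's exact procedure): each iteration splits the data into K random
    blocks, selects the block whose empirical squared loss is the median of
    the block-wise losses, and takes a proximal gradient step on that block.
    Outliers concentrate in a few blocks and are thus ignored by the median."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        blocks = np.array_split(rng.permutation(n), K)
        # Per-block empirical squared losses at the current iterate.
        losses = [np.mean((y[b] - X[b] @ beta) ** 2) for b in blocks]
        med = blocks[np.argsort(losses)[K // 2]]
        # Gradient of the squared loss on the median block only.
        g = -2.0 * X[med].T @ (y[med] - X[med] @ beta) / len(med)
        beta = soft_threshold(beta - step * g, step * lam)
    return beta
```

Because only the median block drives each step, a handful of grossly corrupted observations (fewer than half the blocks) cannot steer the iterates, which is the heuristic behind the robustness guarantees discussed in the abstract.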