Trimmed Statistical Estimation via Variance Reduction

Type:
Article
Authors:
Aravkin, Aleksandr; Davis, Damek
Affiliations:
University of Washington, Seattle; Cornell University
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN:
0364-765X
DOI:
10.1287/moor.2019.0992
Publication year:
2020
Pages:
292-322
Keywords:
variable selection; regression; nonconvex
Abstract:
In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization. For data sets of size n, our approach requires O(n^{2/3}/epsilon) gradient evaluations to reach epsilon-accuracy, and when a certain error bound holds, the complexity improves to O(kappa * n^{2/3} * log(1/epsilon)), where kappa is a condition number. These rates are n^{1/3} times better than those achieved by typical, nonstochastic methods.
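The trimmed-fitting idea described in the abstract can be illustrated with a minimal alternating scheme for least trimmed squares: fit the model on a candidate subset of h samples, then reselect the h samples with the smallest residuals, and repeat. This is a classical heuristic sketch of the "trim while fitting" principle, not the paper's stochastic variance-reduced proximal-gradient algorithm; the function name, data, and contamination pattern below are illustrative assumptions.

```python
import numpy as np

def trimmed_least_squares(X, y, h, n_iters=50):
    """Alternate between (1) least-squares fitting on the current
    h-sample subset and (2) reselecting the h samples with the
    smallest squared residuals (an illustrative trimmed-loss scheme)."""
    n, d = X.shape
    idx = np.arange(n)[:h]       # start from an arbitrary subset
    beta = np.zeros(d)
    for _ in range(n_iters):
        # fit the model only on the currently trusted samples
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # keep the h samples the current model explains best
        resid = (X @ beta - y) ** 2
        idx = np.argsort(resid)[:h]
    return beta, idx

# synthetic data: 90 clean points plus 10 grossly contaminated responses
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + 0.1 * rng.normal(size=100)
y[:10] += 20.0                   # contaminate the first 10 samples
beta, kept = trimmed_least_squares(X, y, h=90)
```

On this example the alternating scheme recovers a fit close to beta_true and excludes the contaminated indices from the kept set, matching the abstract's goal of simultaneously trimming contaminated data and fitting on what remains.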