COCOLASSO FOR HIGH-DIMENSIONAL ERROR-IN-VARIABLES REGRESSION
Document type:
Article
Authors:
Datta, Abhirup; Zou, Hui
Affiliations:
Johns Hopkins University; University of Minnesota System; University of Minnesota Twin Cities
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/16-AOS1527
Publication date:
2017
Pages:
2400-2426
Keywords:
Dantzig selector
statistical estimation
oracle properties
Lasso
regularization
recovery
sparsity
larger
models
noisy
Abstract:
Much theoretical and applied work has been devoted to high-dimensional regression with clean data. However, we often face corrupted data in many applications where missing data and measurement errors cannot be ignored. Loh and Wainwright [Ann. Statist. 40 (2012) 1637-1664] proposed a nonconvex modification of the Lasso for doing high-dimensional regression with noisy and missing data. It is generally agreed that the virtues of convexity contribute fundamentally to the success and popularity of the Lasso. In light of this, we propose a new method named CoCoLasso that is convex and can handle a general class of corrupted datasets. We establish the estimation error bounds of CoCoLasso and its asymptotic sign-consistent selection property. We further elucidate how the standard cross-validation techniques can be misleading in the presence of measurement error and develop a novel calibrated cross-validation technique by using the basic idea in CoCoLasso. The calibrated cross-validation has its own importance. We demonstrate the superior performance of our method over the nonconvex approach by simulation studies.
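The abstract does not spell out how convexity is restored, so the following is only a minimal sketch of one way a corrected Lasso can be kept convex, not the authors' implementation. It assumes additive measurement error, Z = X + A, with a known error covariance `Sigma_A`. The bias-corrected Gram matrix Z'Z/n - Sigma_A can have negative eigenvalues, which makes the objective nonconvex; the sketch replaces it with a positive definite surrogate before solving the Lasso. The projection used here is plain eigenvalue clipping (a Frobenius-norm projection), chosen for brevity and possibly different from the projection used in the paper. All function names (`nearest_psd`, `lasso_from_moments`, `corrected_convex_lasso`) and the parameters `eps` and `lam` are hypothetical.

```python
import numpy as np


def nearest_psd(S, eps=1e-6):
    """Project a symmetric matrix onto matrices with eigenvalues >= eps.

    Assumption: eigenvalue clipping (Frobenius-norm projection) is used
    here for simplicity; it is a stand-in, not the paper's projection.
    """
    S = (S + S.T) / 2.0
    w, V = np.linalg.eigh(S)
    return (V * np.maximum(w, eps)) @ V.T


def soft_threshold(x, t):
    """Scalar soft-thresholding operator."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def lasso_from_moments(Sigma, rho, lam, n_iter=500, tol=1e-8):
    """Coordinate descent for
        min_b  0.5 * b' Sigma b - rho' b + lam * ||b||_1,
    which is convex whenever Sigma is positive semidefinite.
    """
    p = len(rho)
    beta = np.zeros(p)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Partial residual that excludes coordinate j's own contribution.
            r = rho[j] - Sigma[j] @ beta + Sigma[j, j] * beta[j]
            new = soft_threshold(r, lam) / Sigma[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta


def corrected_convex_lasso(Z, y, Sigma_A, lam):
    """Hypothetical end-to-end sketch for additive measurement error."""
    n = Z.shape[0]
    Sigma_hat = Z.T @ Z / n - Sigma_A      # unbiased for X'X/n, maybe indefinite
    Sigma_tilde = nearest_psd(Sigma_hat)   # positive definite surrogate
    rho = Z.T @ y / n                      # unbiased for X'y/n if A is independent of y
    return lasso_from_moments(Sigma_tilde, rho, lam)
```

A call such as `corrected_convex_lasso(Z, y, 0.25 * np.eye(Z.shape[1]), lam=0.1)` would fit the model under the assumption that each covariate carries independent error with variance 0.25. The penalty `lam` would still need to be chosen, and the abstract warns that standard cross-validation can be misleading for that purpose when measurement error is present.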