Analysis of Testing-Based Forward Model Selection

Document type:
Article
Author:
Kozbur, Damian
Affiliation:
University of Zurich
Journal:
ECONOMETRICA
ISSN/ISBN:
0012-9682
DOI:
10.3982/ECTA16273
Publication date:
2020
Pages:
2147-2173
Keywords:
VARIABLE SELECTION confidence-intervals least-squares regression heteroskedasticity inference Lasso time
Abstract:
This paper analyzes a procedure called Testing-Based Forward Model Selection (TBFMS) in linear regression problems. This procedure inductively selects covariates that add predictive power into a working statistical model before estimating a final regression. The criterion for deciding which covariate to include next and when to stop including covariates is derived from a profile of traditional statistical hypothesis tests. This paper proves probabilistic bounds, which depend on the quality of the tests, for prediction error and the number of selected covariates. As an example, the bounds are then specialized to a case with heteroscedastic data, with tests constructed with the help of Huber-Eicker-White standard errors. Under the assumed regularity conditions, these tests lead to estimation convergence rates matching other common high-dimensional estimators including Lasso.
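Illustrative sketch (not taken from the paper): the abstract describes a greedy loop that adds one covariate at a time and stops when a hypothesis test no longer supports inclusion. The Python below is a minimal, simplified version of that idea under stated assumptions. The function names `robust_t_stat` and `forward_select`, the rule "add the candidate with the largest robust |t|", the fixed `threshold`, and the absence of an intercept (data assumed centered) are all choices made for this example; the paper's exact selection criterion and test thresholds may differ.

```python
import numpy as np

def robust_t_stat(X, y, cols, j):
    """t-statistic for column j in an OLS fit of y on columns cols + [j],
    using an HC0 (Huber-Eicker-White) sandwich covariance estimate."""
    Z = X[:, cols + [j]]
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta = ZtZ_inv @ Z.T @ y
    resid = y - Z @ beta
    # "Meat" of the sandwich: sum_i e_i^2 z_i z_i'
    meat = Z.T @ (Z * resid[:, None] ** 2)
    cov = ZtZ_inv @ meat @ ZtZ_inv
    # Column j was appended last, so its coefficient is the last entry.
    return beta[-1] / np.sqrt(cov[-1, -1])

def forward_select(X, y, threshold, max_steps=None):
    """Greedy forward selection (assumed rule): at each step add the candidate
    with the largest robust |t|; stop when no candidate exceeds threshold."""
    p = X.shape[1]
    max_steps = p if max_steps is None else max_steps
    selected = []
    while len(selected) < max_steps:
        candidates = [j for j in range(p) if j not in selected]
        if not candidates:
            break
        stats = {j: abs(robust_t_stat(X, y, selected, j)) for j in candidates}
        best = max(stats, key=stats.get)
        if stats[best] <= threshold:
            break  # no remaining covariate passes the test
        selected.append(best)
    return selected

def final_fit(X, y, selected):
    """Final OLS regression on the selected covariates only."""
    coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    return coef
```

A typical call would be `selected = forward_select(X, y, threshold=5.0)` followed by `final_fit(X, y, selected)`. The threshold controls the trade-off between false inclusions and missed covariates; the value 5.0 is an arbitrary illustration, not a recommendation from the paper.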