A flexible framework for hypothesis testing in high dimensions
成果类型:
Article
署名作者:
Javanmard, Adel; Lee, Jason D.
署名单位:
University of Southern California; Princeton University
刊物名称:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISSBN:
1369-7412
DOI:
10.1111/rssb.12373
发表日期:
2020
页码:
685-718
关键词:
Post-selection Inference
likelihood ratio tests
False Discovery Rate
confidence-intervals
variable selection
linear-regression
unknown sparsity
DANTZIG SELECTOR
Minimax Rates
Lasso
摘要:
Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high dimensional regime where the number of parameters exceeds the number of samples (p>n). To make informative inference, we assume that the model is approximately sparse, i.e. the effect of covariates on the response can be well approximated by conditioning on a relatively small number of covariates whose identities are unknown. We develop a framework for testing very general hypotheses regarding the model parameters. Our framework encompasses testing whether the parameter lies in a convex cone, testing the signal strength, and testing arbitrary functionals of the parameter. We show that the procedure proposed controls the type I error, and we also analyse the power of the procedure. Our numerical experiments confirm our theoretical findings and demonstrate that we control the false positive rate (type I error) near the nominal level and have high power. By duality between hypotheses testing and confidence intervals, the framework proposed can be used to obtain valid confidence intervals for various functionals of the model parameters. For linear functionals, the length of confidence intervals is shown to be minimax rate optimal.