Hierarchical testing of variable importance
成果类型:
Article
署名作者:
Meinshausen, Nicolai
署名单位:
University of Oxford
刊物名称:
BIOMETRIKA
ISSN/ISSBN:
0006-3444
DOI:
10.1093/biomet/asn007
发表日期:
2008
页码:
265278
关键词:
false discovery rate
multiple
摘要:
A frequently encountered challenge in high-dimensional regression is the detection of relevant variables. Variable selection suffers from instability and the power to detect relevant variables is typically low if predictor variables are highly correlated. When taking the multiplicity of the testing problem into account, the power diminishes even further. To gain power and insight, it can be advantageous to look for influence not at the level of individual variables but rather at the level of clusters of highly correlated variables. We propose a hierarchical approach. Variable importance is first tested at the coarsest level, corresponding to the global null hypothesis. The method then tries to attribute any effect to smaller subclusters or even individual variables. The smallest possible clusters, which still exhibit a significant influence on the response variable, are retained. It is shown that the proposed testing procedure controls the familywise error rate at a prespecified level, simultaneously over all resolution levels. The method has power comparable to the Bonferroni - Holm procedure on the level of individual variables and dramatically larger power for coarser resolution levels. The best resolution level is selected adaptively.