A General Framework for Inference on Algorithm-Agnostic Variable Importance

Publication type:
Article
Authors:
Williamson, Brian D.; Gilbert, Peter B.; Simon, Noah R.; Carone, Marco
Affiliations:
Fred Hutchinson Cancer Center; University of Washington; University of Washington Seattle
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2021.2003200
Publication date:
2023
Pages:
1645-1658
Keywords:
regression
Abstract:
In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response; in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment does not necessarily characterize the prediction potential of features, and may provide a misleading reflection of the intrinsic value of these features. To address this limitation, we propose a general framework for nonparametric inference on interpretable algorithm-agnostic variable importance. We define variable importance as a population-level contrast between the oracle predictiveness of all available features versus all features except those under consideration. We propose a nonparametric efficient estimation procedure that allows the construction of valid confidence intervals, even when machine learning techniques are used. We also outline a valid strategy for testing the null importance hypothesis. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection. Supplementary materials for this article are available online.
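
The contrast described in the abstract (predictiveness using all features minus predictiveness using all features except those of interest) can be illustrated with a rough numerical sketch. This is not the authors' estimator: it assumes R-squared as the predictiveness measure, uses ordinary least squares as a stand-in for an arbitrary learner, evaluates both fits on held-out folds, and produces only a point estimate with no confidence interval or test. The function names (`fit_predict_ols`, `r_squared`, `importance_estimate`) and the simulated data are hypothetical and exist only for this illustration.

```python
import numpy as np


def fit_predict_ols(X_train, y_train, X_test):
    """Least-squares fit with intercept (assumed stand-in for any learner)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_test)), X_test])
    return B @ coef


def r_squared(y, pred):
    """Predictiveness measure assumed here: 1 - MSE / Var(y)."""
    return 1.0 - np.mean((y - pred) ** 2) / np.var(y)


def importance_estimate(X, y, drop_cols, n_folds=5, seed=0):
    """Held-out R^2 with all features minus held-out R^2 without drop_cols."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    keep = [j for j in range(X.shape[1]) if j not in drop_cols]
    diffs = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Predictiveness of the model fitted on all features.
        full = r_squared(y[test], fit_predict_ols(X[train], y[train], X[test]))
        # Predictiveness of the model fitted without the features of interest.
        reduced = r_squared(
            y[test],
            fit_predict_ols(X[train][:, keep], y[train], X[test][:, keep]),
        )
        diffs.append(full - reduced)
    return float(np.mean(diffs))


# Simulated data (hypothetical): column 0 matters a lot, column 2 not at all.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000)

vim_x0 = importance_estimate(X, y, drop_cols=[0])
vim_x2 = importance_estimate(X, y, drop_cols=[2])
print(f"importance of feature 0: {vim_x0:.3f}")
print(f"importance of feature 2: {vim_x2:.3f}")
```

In this simulation the estimate for feature 0 should be large and the estimate for feature 2 should be close to zero, since feature 2 does not enter the data-generating model. Swapping the least-squares fit for a flexible learner changes only `fit_predict_ols`; the contrast itself does not depend on the algorithm, which is the sense of "algorithm-agnostic" in the abstract.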