A General Framework for Estimation and Inference From Clusters of Features

Publication type:
Article
Authors:
Reid, Stephen; Taylor, Jonathan; Tibshirani, Robert
Affiliation:
Stanford University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2016.1246368
Publication date:
2018
Pages:
280-293
Keywords:
regression
Abstract:
Applied statistical problems often come with prespecified groupings to predictors. It is natural to test for the presence of simultaneous group-wide signal for groups in isolation, or for multiple groups together. Current tests for the presence of such signals include the classical F-test or a t-test on unsupervised group prototypes (either group centroids or first principal components). In this article, we propose test statistics that aim for power improvements over these classical approaches. In particular, we first create group prototypes, with reference to the response, and then test with likelihood ratio statistics incorporating only these prototypes. We propose a model, called the prototype model, which naturally models this two-step procedure. Furthermore, we introduce an inferential schema detailing the unique considerations for different combinations of prototype formation and univariate/multivariate testing models. The prototype model also suggests new applications to estimation and prediction. Prototype formation often relies on variable selection, which invalidates classical Gaussian test theory. We use recent advances in selective inference to account for selection in the prototyping step and retain test validity. Simulation experiments suggest that our testing procedure enjoys more power than do classical approaches. Supplementary materials for this article are available online.
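As a rough illustration of one of the classical baselines the abstract mentions, the sketch below computes a t statistic for an unsupervised group prototype, here the group centroid. It is not the authors' prototype model or their selective-inference procedure, both of which form prototypes with reference to the response. The function names, the toy data, and the effect size 0.8 are all assumptions made for this example.

```python
import math
import random


def centroid_prototype(X):
    """Unsupervised prototype: average the group's predictors per observation."""
    return [sum(row) / len(row) for row in X]


def slope_t_statistic(z, y):
    """t statistic for the slope when regressing y on z with an intercept."""
    n = len(y)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = y_bar - slope * z_bar
    rss = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
    sigma2_hat = rss / (n - 2)  # residual variance estimate on n - 2 df
    return slope / math.sqrt(sigma2_hat / szz)


# Toy data (hypothetical): one group of 5 predictors, 50 observations,
# with the response driven by the group mean plus Gaussian noise.
rng = random.Random(0)
n_obs, group_size = 50, 5
X = [[rng.gauss(0, 1) for _ in range(group_size)] for _ in range(n_obs)]
y = [0.8 * sum(row) / group_size + rng.gauss(0, 1) for row in X]

t_stat = slope_t_statistic(centroid_prototype(X), y)
print(f"t statistic for the centroid prototype: {t_stat:.3f}")
```

Under the usual Gaussian assumptions this statistic is compared with a t distribution on n - 2 degrees of freedom. That reference distribution is valid here only because the centroid is formed without looking at the response; once a prototype is chosen using the response, as in the approach the abstract describes, classical theory no longer applies and selection has to be accounted for.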