Integrated Path Stability Selection
Publication type:
Article; Early Access
Authors:
Melikechi, Omar; Miller, Jeffrey W.
Affiliations:
Harvard University; Harvard T.H. Chan School of Public Health
Journal:
Journal of the American Statistical Association
ISSN:
0162-1459
DOI:
10.1080/01621459.2025.2525589
Publication date:
2025
Keywords:
variable selection
regularization
regression
Lasso
Abstract:
Stability selection is a popular method for improving feature selection algorithms. One of its key attributes is that it provides theoretical upper bounds on the expected number of false positives, E(FP), enabling false positive control in practice. However, stability selection often selects few features because existing bounds on E(FP) are relatively loose. In this article, we introduce a novel approach to stability selection based on integrating stability paths rather than maximizing over them. This yields upper bounds on E(FP) that are much stronger than previous bounds, leading to significantly more true positives in practice for the same target E(FP). Furthermore, our method requires no more computation than the original stability selection algorithm. We demonstrate the method on simulations and real data from two cancer studies. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
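The contrast between maximizing over stability paths and integrating them can be illustrated with a minimal sketch. It is not the authors' implementation: the base selector is a simple marginal-correlation threshold standing in for the lasso, the data are simulated, and the integrated score is a plain average of each path over the grid, since the abstract does not specify the integrand or measure. The names `stability_paths`, `max_score`, and `integrated_score` are hypothetical.

```python
import numpy as np

def stability_paths(X, y, lambdas, n_subsamples=50, seed=0):
    """Estimate selection probabilities for each (lambda, feature) pair.

    Assumption: the base selector keeps feature j when the absolute
    marginal covariance |x_j^T (y - mean(y))| / m exceeds lambda. This is
    a stand-in for the lasso, used only to keep the sketch dependency-free.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2
    counts = np.zeros((len(lambdas), p))
    for _ in range(n_subsamples):
        # Draw a random half of the observations without replacement.
        idx = rng.choice(n, size=half, replace=False)
        Xs, ys = X[idx], y[idx]
        score = np.abs(Xs.T @ (ys - ys.mean())) / half
        # Record which features are selected at each regularization level.
        counts += score[None, :] > lambdas[:, None]
    return counts / n_subsamples

# Simulated data (illustrative only): 3 signal features among 30.
rng = np.random.default_rng(1)
n, p = 200, 30
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([2.0, -2.0, 1.5]) + rng.standard_normal(n)

lambdas = np.linspace(0.1, 2.0, 25)
paths = stability_paths(X, y, lambdas)  # shape (len(lambdas), p)

# Original stability selection scores a feature by the maximum of its path.
max_score = paths.max(axis=0)
# Integration instead aggregates the whole path; here, a uniform average.
integrated_score = paths.mean(axis=0)
```

Both scores are computed from the same matrix of estimated selection probabilities, which is consistent with the abstract's statement that the integrated approach needs no more computation than the original algorithm.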