OBLIQUE RANDOM SURVIVAL FORESTS

Type:
Article
Authors:
Jaeger, Byron C.; Long, D. Leann; Long, Dustin M.; Sims, Mario; Szychowski, Jeff M.; Min, Yuan-I; McClure, Leslie A.; Howard, George; Simon, Noah
Affiliations:
University of Alabama System; University of Alabama Birmingham; University of Mississippi Medical Center; University of Mississippi; Drexel University; University of Washington; University of Washington Seattle
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/19-AOAS1261
Publication date:
2019
Pages:
1847-1883
Keywords:
gene-expression; regularization paths; regression trees; classification; chemotherapy; association; classifiers; models
Abstract:
We introduce and evaluate the oblique random survival forest (ORSF). The ORSF is an ensemble method for right-censored survival data that uses linear combinations of input variables to recursively partition a set of training data. Regularized Cox proportional hazards models are used to identify linear combinations of input variables in each recursive partitioning step. Benchmark results using simulated and real data indicate that the ORSF's predicted risk function has high prognostic value in comparison to random survival forests, conditional inference forests, regression and boosting. In an application to data from the Jackson Heart Study, we demonstrate variable and partial dependence using the ORSF and highlight characteristics of its ten-year predicted risk function for atherosclerotic cardiovascular disease events (AS-CVD; stroke, coronary heart disease). We present visualizations comparing variable and partial effect estimation according to the ORSF, the conditional inference forest, and the Pooled Cohort Risk equations. The obliqueRSF R package, which provides functions to fit the ORSF and create variable and partial dependence plots, is available on the Comprehensive R Archive Network (CRAN).
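Illustrative sketch (not part of the published record): the abstract describes partitioning on a linear combination of input variables rather than on a single variable. The Python below is a minimal sketch of one such oblique split, under stated assumptions. The coefficient vector `beta` is a hypothetical stand-in for coefficients from a regularized Cox fit, which is not implemented here. Choosing the cut-point with a log-rank statistic is also an assumption, since the abstract does not name the split criterion. The function names, the `min_node_size` parameter, and the toy data are invented for illustration and do not reflect the obliqueRSF package API.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for a left/right split."""
    # Distinct observed event times (censored times are skipped).
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_left = sum(1 for i in at_risk if in_left[i])
        died = [i for i in at_risk if times[i] == t and events[i] == 1]
        d = len(died)
        d_left = sum(1 for i in died if in_left[i])
        # Observed minus expected events in the left node at time t.
        obs_minus_exp += d_left - d * n_left / n
        if n > 1:
            # Hypergeometric variance of the left-node event count.
            p = n_left / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_oblique_split(X, times, events, beta, min_node_size=2):
    """Cut the linear combination x . beta where the log-rank statistic peaks."""
    # Project every observation onto the (assumed) coefficient vector.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
    best_cut, best_stat = None, 0.0
    # The largest score is excluded: it would leave the right node empty.
    for cut in sorted(set(scores))[:-1]:
        in_left = [s <= cut for s in scores]
        n_left = sum(in_left)
        if n_left < min_node_size or len(scores) - n_left < min_node_size:
            continue
        stat = logrank_statistic(times, events, in_left)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy data (invented): two predictors, follow-up time, event flag (0 = censored).
X = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.9, 0.8], [0.7, 0.9], [1.0, 0.6]]
times = [10, 9, 8, 2, 3, 1]
events = [0, 1, 1, 1, 1, 1]
beta = [1.0, 1.0]  # hypothetical; a regularized Cox fit would supply this

cut, stat = best_oblique_split(X, times, events, beta)
left_node = [i for i, row in enumerate(X)
             if sum(b * x for b, x in zip(beta, row)) <= cut]
```

In a full forest, a step like this would be repeated recursively within each child node, on bootstrap samples and random subsets of predictors, with `beta` re-estimated at every node.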