EXTENDING MODELS VIA GRADIENT BOOSTING: AN APPLICATION TO MENDELIAN MODELS

Document type:
Article
Authors:
Huang, Theodore; Idos, Gregory; Hong, Christine; Gruber, Stephen B.; Parmigiani, Giovanni; Braun, Danielle
Affiliations:
Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard University Medical Affiliates; Dana-Farber Cancer Institute; City of Hope
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/21-AOAS1482
Publication date:
2021
Pages:
1126-1146
Keywords:
logistic-regression colorectal-cancer predicting brca1 ovarian-cancer lynch syndrome mutations breast RISK performance validation
Abstract:
Improving existing widely adopted prediction models is often a more efficient and robust way toward progress than training new models from scratch. Existing models may: (a) incorporate complex mechanistic knowledge, (b) leverage proprietary information, and (c) have surmounted barriers to adoption. Compared to model training, model improvement and modification receive little attention. In this paper we propose a general approach to model improvement: we combine gradient boosting with any previously developed model to improve model performance while retaining important existing characteristics. To exemplify, we consider the context of Mendelian models, which estimate the probability of carrying genetic mutations that confer susceptibility to disease by using family pedigrees and health histories of family members. Via simulations, we show that integration of gradient boosting with an existing Mendelian model can produce an improved model that outperforms both that model and the model built using gradient boosting alone. We illustrate the approach on genetic testing data from the USC-Stanford Cancer Genetics Hereditary Cancer Panel (HCP) study.
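The abstract does not spell out how gradient boosting is combined with the existing model. One plausible reading, sketched below purely as an illustration, is to start boosting from the existing model's predicted log-odds instead of from a constant, so that each boosting round corrects the existing model rather than replacing it. Everything in the sketch is an assumption and not taken from the paper: the synthetic data, the stand-in "existing model" that sees only the first covariate, the regression stumps used as weak learners, and the names `fit_stump` and `boost_from_existing`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p, eps=1e-6):
    # Clip so that probabilities of exactly 0 or 1 do not produce infinities.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def log_loss(y, scores):
    # Mean negative Bernoulli log-likelihood of labels y given log-odds scores.
    total = 0.0
    for yi, s in zip(y, scores):
        p = min(max(sigmoid(s), 1e-12), 1.0 - 1e-12)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def fit_stump(X, r):
    # Least-squares regression stump: one split on one feature, fit to residuals r.
    best = None
    n = len(X)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [r[i] for i in range(n) if X[i][j] <= t]
            right = [r[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            ml = sum(left) / len(left)
            mr = sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr


def boost_from_existing(X, y, p_existing, n_rounds=15, lr=0.3):
    # Assumption: boosting is initialized at the existing model's log-odds.
    F = [logit(p) for p in p_existing]
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log loss with respect to the current log-odds.
        residuals = [yi - sigmoid(f) for yi, f in zip(y, F)]
        stump = fit_stump(X, residuals)
        F = [f + lr * stump(row) for f, row in zip(F, X)]
        stumps.append(stump)
    return stumps, F


# Synthetic data (hypothetical): the outcome depends on two covariates,
# but the stand-in existing model uses only the first one.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [1 if random.random() < sigmoid(-1 + 1.5 * a + 1.0 * b) else 0 for a, b in X]
p_existing = [sigmoid(-1 + 1.5 * a) for a, _ in X]

base_loss = log_loss(y, [logit(p) for p in p_existing])
stumps, F = boost_from_existing(X, y, p_existing)
boosted_loss = log_loss(y, F)
print("existing model log loss:", round(base_loss, 4))
print("after boosting log loss:", round(boosted_loss, 4))
```

The two printed values are training losses on the same synthetic sample, so they only show that the boosting rounds move the fit away from the starting model; a comparison like the one described in the abstract would need held-out data and the actual Mendelian model.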
Source URL: