Variable selection and model building via likelihood basis pursuit
Document type:
Article
Authors:
Zhang, HH; Wahba, G; Lin, Y; Voelker, M; Ferris, M; Klein, R; Klein, B
Affiliations:
North Carolina State University; University of Wisconsin System; University of Wisconsin Madison
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214504000000593
Publication date:
2004
Pages:
659-672
Keywords:
smoothing spline ANOVA
diabetic retinopathy
Bernoulli observations
penalized likelihood
regression
progression
diagnosis
risk
age
Abstract:
This article presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log-likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. Basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest L_1 norm on the coefficients. We use the functional l_1 norm to measure the importance of each component and determine the threshold value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross-validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large datasets, its randomized version is proposed as well. A technique called slice modeling is used to solve the optimization problem and makes the computation more efficient. LBP has great potential for a wide range of research and application areas such as medical studies, and in this article we apply it to two large ongoing epidemiologic studies, the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) and the Beaver Dam Eye Study (BDES).
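The LASSO-type shrinkage mentioned in the abstract can be illustrated with a much simpler stand-in: a linear logistic model with an L_1 penalty on the coefficients, fitted by proximal gradient descent (soft-thresholding). This is only a minimal sketch, not the LBP procedure itself: it uses no spline basis, no GACV tuning, and no slice modeling. The function names, the simulated data, and the penalty value `lam=0.1` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_logistic(X, y, lam, n_iter=2000):
    """Minimize the mean Bernoulli negative log-likelihood plus lam * ||beta||_1.

    Hypothetical helper for illustration: linear predictor, no intercept.
    """
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4 n) bounds the gradient's Lipschitz constant.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted success probabilities
        grad = X.T @ (prob - y) / n                # gradient of the smooth likelihood part
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Simulated Bernoulli data (assumed): only the first two of six covariates matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 6))
true_beta = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_hat = l1_logistic(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(beta_hat) > 1e-8)  # indices of retained covariates
print("estimated coefficients:", np.round(beta_hat, 3))
print("selected covariates:", selected)
```

Because the soft-thresholding step sets small coefficients exactly to zero, the fitted vector doubles as a variable selector, which is the property the abstract attributes to LBP at the level of functional components.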