Variable selection for support vector machines in moderately high dimensions

Type:
Article
Authors:
Zhang, Xiang; Wu, Yichao; Wang, Lan; Li, Runze
Affiliations:
North Carolina State University; University of Minnesota System; University of Minnesota Twin Cities; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN:
1369-7412
DOI:
10.1111/rssb.12100
Publication date:
2016
Pages:
53-76
Keywords:
nonconcave penalized likelihood; gene selection; classification; Lasso; regression; optimality; SCAD
Abstract:
The support vector machine (SVM) is a powerful binary classification tool with high accuracy and great flexibility. It has achieved great success, but its performance can be seriously impaired if many redundant covariates are included. Some efforts have been devoted to studying variable selection for SVMs, but asymptotic properties, such as variable selection consistency, are largely unknown when the number of predictors diverges to infinity. We establish a unified theory for a general class of non-convex penalized SVMs. We first prove that, in ultrahigh dimensions, the objective function of non-convex penalized SVMs has a local minimizer with the desired oracle property. We further address the problem of non-unique local minimizers by showing that the local linear approximation (LLA) algorithm is guaranteed to converge to the oracle estimator, even in the ultrahigh dimensional setting, provided that an appropriate initial estimator is available. This condition on the initial estimator is verified to hold automatically when the dimension is moderately high. Numerical examples provide supportive evidence.
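For illustration only (this is not the authors' implementation, and the step sizes, iteration counts, and thresholding rule below are ad hoc assumptions): the local linear approximation idea mentioned in the abstract can be sketched in numpy. Each LLA round majorizes a non-convex penalty (SCAD here) by a weighted L1 penalty at the current estimate, then minimizes hinge loss plus that weighted L1 term, done below by plain subgradient descent.

```python
import numpy as np

def scad_deriv(beta, lam, a=3.7):
    """Derivative of the SCAD penalty p'_lambda(|beta|) (Fan & Li form)."""
    b = np.abs(beta)
    return np.where(b <= lam, lam, np.maximum(a * lam - b, 0.0) / (a - 1.0))

def lla_penalized_svm(X, y, lam, n_lla=3, n_steps=2000, lr=0.01):
    """Toy LLA scheme for a SCAD-penalized linear SVM (no intercept).

    y must take values in {-1, +1}. Each outer LLA round fixes weighted-L1
    weights at the current estimate, then runs subgradient descent on
    (1/n) * sum hinge(y_i x_i'b) + sum_j w_j |b_j|.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_lla):
        w = scad_deriv(beta, lam)              # LLA weights at current beta
        b = beta.copy()
        for _ in range(n_steps):
            margin = y * (X @ b)
            # hinge-loss subgradient: only margin-violating rows contribute
            g = -(X * y[:, None])[margin < 1].sum(axis=0) / n
            g = g + w * np.sign(b)             # weighted-L1 subgradient
            b -= lr * g
        b[np.abs(b) < 1e-3] = 0.0              # crude hard threshold to zero
        beta = b
    return beta
```

The paper's theory concerns which local minimizer such a scheme reaches: with a suitable initial estimator, the LLA iterates land on the oracle estimator, i.e. the fit that uses only the truly relevant covariates.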
Source URL: