On L1-norm multiclass support vector machines:: Methodology and theory
成果类型:
Article
署名作者:
Wang, Lifeng; Shen, Xiaotong
署名单位:
University of Minnesota System; University of Minnesota Twin Cities
刊物名称:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISSBN:
0162-1459
DOI:
10.1198/016214506000001383
发表日期:
2007
页码:
583-594
关键词:
feature-selection
CLASSIFICATION
regression
cancer
摘要:
Binary support vector machines (SVMs) have been proven to deliver high performance. In multiclass classification, however, issues remain with respect to variable selection. One challenging issue is classification and variable selection in the presence of variables in the magnitude of thousands, greatly exceeding the size of training sample. This often occurs in genomics classification. To meet the challenge, this article proposes a novel multiclass support vector machine, which performs classification and variable selection simultaneously through an L-1-norm penalized sparse representation. The proposed methodology, together with the developed regularization solution path, permits variable selection in such a situation. For the proposed methodology, a statistical learning theory is developed to quantify the generalization error in an attempt to gain insight into the basic structure of sparse learning, permitting the number of variables to greatly exceed the sample size. The operating characteristics of the methodology are examined through both simulated and benchmark data and are compared against some competitors in terms of accuracy of prediction. The numerical results suggest that the proposed methodology is highly competitive.