Statistical behavior and consistency of classification methods based on convex risk minimization
成果类型:
Article
署名作者:
Zhang, T
署名单位:
International Business Machines (IBM); IBM USA
刊物名称:
ANNALS OF STATISTICS
ISSN/ISSBN:
0090-5364
发表日期:
2004
页码:
56-85
关键词:
regression
algorithms
摘要:
We study how closely the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error function. The measurement of closeness is characterized by the loss function used in the estimation. We show that such a classification scheme can be generally regarded as a (nonmaximum-likelihood) conditional in-class probability estimate, and we use this analysis to compare various convex loss functions that have appeared in the literature. Furthermore, the theoretical insight allows us to design good loss functions with desirable properties. Another aspect of our analysis is to demonstrate the consistency of certain classification methods using convex risk minimization. This study sheds light on the good performance of some recently proposed linear classification methods including boosting and support vector machines. It also shows their limitations and suggests possible improvements.