您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 统计学 > Journal of the Royal Statistical Society: Series B > 2011

Penalized classification using Fisher's linear discriminant

成果类型：

Article

署名作者：

Witten, Daniela M.; Tibshirani, Robert

署名单位：

University of Washington; University of Washington Seattle; Stanford University

刊物名称：

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY

ISSN/ISSBN：

1369-7412

DOI：

10.1111/j.1467-9868.2011.00783.x

发表日期：

2011

页码：

753-772

关键词：

multiclass cancer-diagnosis shrunken centroids variable selection optimization prediction regression

摘要：

We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high dimensional setting where p >> n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule that is obtained from LDA, since it involves all p features. We propose penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization-maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L-1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performances of the resulting methods on a simulation study, and on three gene expression data sets. We also survey past methods for extending LDA to the high dimensional setting and explore their relationships with our proposal.

来源URL：

访问原文