Classification via Bayesian Nonparametric Learning of Affine Subspaces

Document type:
Article
Authors:
Page, Garritt; Bhattacharya, Abhishek; Dunson, David
Affiliations:
Pontificia Universidad Catolica de Chile; Indian Statistical Institute; Indian Statistical Institute Kolkata; Duke University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2013.763566
Publication date:
2013
Pages:
187-201
Keywords:
sliced inverse regression; multivariate; mixture; model
Abstract:
It has become common for datasets to contain large numbers of variables in studies conducted in areas such as genetics, machine vision, image analysis, and many others. When analyzing such data, parametric models are often too inflexible while nonparametric procedures tend to be nonrobust because of insufficient data on these high-dimensional spaces. This is particularly true when interest lies in building efficient classifiers in the presence of many predictor variables. When dealing with these types of data, it is often the case that most of the variability tends to lie along a few directions, or more generally along a much smaller dimensional submanifold of the data space. In this article, we propose a class of models that flexibly learn about this submanifold while simultaneously performing dimension reduction in classification. This methodology allows the cell probabilities to vary nonparametrically based on a few coordinates expressed as linear combinations of the predictors. Also, as opposed to many black-box methods for dimensionality reduction, the proposed model is appealing in having clearly interpretable and identifiable parameters that provide insight into which predictors are important in determining accurate classification boundaries. Gibbs sampling methods are developed for posterior computation, and the methods are illustrated using simulated and real data applications.