A new approach to cluster analysis: the clustering-function-based method
Publication type:
Article
Author:
Li, Baibing
Affiliation:
Loughborough University
Journal:
Journal of the Royal Statistical Society Series B: Statistical Methodology
ISSN/ISBN:
1369-7412
DOI:
10.1111/j.1467-9868.2006.00549.x
Publication date:
2006
Pages:
457-476
Keywords:
gene-expression data
discriminant analysis
continuous variables
classification
attributes
subsets
objects
Abstract:
The purpose of the paper is to present a new statistical approach to hierarchical cluster analysis with n objects measured on p variables. Motivated by the model of multivariate analysis of variance and the method of maximum likelihood, a clustering problem is formulated as a least squares optimization problem, simultaneously solving for both an n-vector of unknown group membership of objects and a linear clustering function. This formulation is shown to be linked to linear regression analysis and Fisher linear discriminant analysis. It includes principal component regression for tackling multicollinearity or rank deficiency, polynomial or B-spline regression for handling non-linearity, and various variable selection methods for eliminating irrelevant variables from the data analysis. Algorithmic issues are investigated by using sign eigenanalysis.
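
The least squares formulation in the abstract can be illustrated with a small sketch for the two-group case. This is only one plausible reading, under assumptions not stated in the record: memberships are coded as +1/-1, the variables are centred, and the two unknowns are updated alternately starting from the signs of the first principal component scores. It is not the paper's sign eigenanalysis algorithm, and the name `cluster_two_groups` and its parameters are hypothetical.

```python
import numpy as np


def cluster_two_groups(X, max_iter=100):
    """Hypothetical sketch: jointly fit +1/-1 memberships y and a linear
    clustering function f(x) = x @ b by minimising ||y - Xc @ b||^2."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each variable

    # Assumption: start from the signs of the first principal component scores.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    y = np.where(Xc @ Vt[0] >= 0, 1.0, -1.0)

    # The pseudo-inverse gives the least squares solution and tolerates
    # rank-deficient or collinear data.
    X_pinv = np.linalg.pinv(Xc)

    for _ in range(max_iter):
        b = X_pinv @ y                            # regression step: best b for fixed y
        y_new = np.where(Xc @ b >= 0, 1.0, -1.0)  # membership step: nearest sign vector
        if np.array_equal(y_new, y):
            break                                 # memberships stable: stop
        y = y_new

    b = X_pinv @ y
    return y, b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group_a = rng.normal(0.0, 0.5, size=(20, 2)) + np.array([-3.0, 0.0])
    group_b = rng.normal(0.0, 0.5, size=(20, 2)) + np.array([3.0, 0.0])
    labels, coef = cluster_two_groups(np.vstack([group_a, group_b]))
    print(labels)
    print(coef)
```

Each step cannot increase the residual sum of squares: for fixed memberships the coefficient vector is the ordinary least squares fit, and for fixed coefficients the sign of the fitted value is the closest +1/-1 label. The loop therefore stops at a local solution, and the labels are identified only up to swapping the two groups.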