Cluster identification using projections

Document type:
Article; Proceedings Paper
Authors:
Peña, D; Prieto, FJ
Affiliation:
Universidad Carlos III de Madrid
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214501753382345
Publication date:
2001
Pages:
1433-1445
Keywords:
Classification; kurtosis
Abstract:
This article describes a procedure to identify clusters in multivariate data using information obtained from the univariate projections of the sample data onto certain directions. The directions are chosen as those that minimize and maximize the kurtosis coefficient of the projected data. It is shown that, under certain conditions, these directions provide the largest separation for the different clusters. The projected univariate data are used to group the observations according to the values of the gaps or spacings between consecutive-ordered observations. These groupings are then combined over all projection directions. The behavior of the method is tested on several examples, and compared to k-means, MCLUST, and the procedure proposed by Jones and Sibson in 1987. The proposed algorithm is iterative, affine equivariant, flexible, robust to outliers, fast to implement, and seems to work well in practice.
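The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes two-dimensional data, replaces the paper's search for extreme-kurtosis directions with a simple grid search over angles, and cuts the projected data at the single largest spacing instead of applying the paper's criterion for the gaps. All function names (`kurtosis`, `extreme_kurtosis_directions`, `split_at_largest_gap`) are hypothetical, and the data are synthetic.

```python
import numpy as np


def kurtosis(z):
    """Kurtosis coefficient (fourth standardized moment) of a 1-D sample."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    return np.mean(d**4) / np.mean(d**2) ** 2


def extreme_kurtosis_directions(X, n_angles=360):
    """Grid-search unit directions in 2-D (an assumption of this sketch).

    Returns the directions whose projections have the smallest and the
    largest kurtosis coefficient, in that order.
    """
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])
    k = np.array([kurtosis(X @ d) for d in dirs])
    return dirs[k.argmin()], dirs[k.argmax()]


def split_at_largest_gap(z):
    """Label points 0 or 1 by cutting the ordered projections at the largest spacing."""
    z = np.asarray(z, dtype=float)
    order = np.argsort(z)
    gaps = np.diff(z[order])          # spacings between consecutive ordered values
    cut = gaps.argmax()               # position of the widest spacing
    labels = np.zeros(len(z), dtype=int)
    labels[order[cut + 1:]] = 1       # everything above the gap goes to group 1
    return labels


# Synthetic example: two well-separated Gaussian clusters of equal size.
rng = np.random.default_rng(0)
A = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
B = rng.normal(loc=[10.0, 0.0], scale=1.0, size=(100, 2))
X = np.vstack([A, B])

d_min, d_max = extreme_kurtosis_directions(X)
labels = split_at_largest_gap(X @ d_min)
```

For two clusters of similar size, the projection onto the direction joining them is bimodal and therefore has low kurtosis, which is why the sketch splits along `d_min`. The maximum-kurtosis direction `d_max` is computed only to mirror the abstract, which uses both extremes and combines the groupings over all projection directions.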