Clustering objects on subsets of attributes

Document type:
Article
Authors:
Friedman, JH; Meulman, JJ
Affiliations:
Stanford University; Leiden University; Leiden University - Excl LUMC
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISBN:
1369-7412
DOI:
10.1111/j.1467-9868.2004.02059.x
Publication date:
2004
Pages:
815-839
Keywords:
variable-selection
Abstract:
A new procedure is proposed for clustering attribute value data. When used in conjunction with conventional distance-based clustering algorithms this procedure encourages those algorithms to detect automatically subgroups of objects that preferentially cluster on subsets of the attribute variables rather than on all of them simultaneously. The relevant attribute subsets for each individual cluster can be different and partially (or completely) overlap with those of other clusters. Enhancements for increasing sensitivity for detecting especially low cardinality groups clustering on a small subset of variables are discussed. Applications in different domains, including gene expression arrays, are presented.
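
The idea that objects may group on a subset of attributes rather than on all of them can be illustrated with a small sketch. This is not the authors' procedure: the abstract describes attribute subsets that differ per cluster and are found automatically, whereas the example below assumes a single hand-picked weight vector and a hypothetical helper `weighted_distances`. It only shows how reweighting attributes changes which objects a distance-based method would treat as neighbors.

```python
import numpy as np

def weighted_distances(X, w):
    """Pairwise distances as a weighted average of per-attribute absolute differences.

    Illustrative helper (not from the paper): `w` is one global weight per attribute.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                                  # normalize weights to sum to 1
    diffs = np.abs(X[:, None, :] - X[None, :, :])    # shape (n, n, p)
    return diffs @ w                                 # shape (n, n)

def nearest_neighbor(D, i):
    """Index of the object closest to object i, excluding i itself."""
    d = D[i].copy()
    d[i] = np.inf
    return int(np.argmin(d))

# Toy data: 4 objects, 3 attributes.
# Objects 0 and 1 nearly agree on attributes 0 and 1 but differ strongly on attribute 2.
X = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.1, 9.0],
    [3.0, 3.0, 0.1],
    [3.1, 2.9, 9.1],
])

uniform = weighted_distances(X, [1, 1, 1])   # all attributes count equally
subset = weighted_distances(X, [1, 1, 0])    # only attributes 0 and 1 count

print(nearest_neighbor(uniform, 0))  # neighbor of object 0 using all attributes
print(nearest_neighbor(subset, 0))   # neighbor of object 0 using the attribute subset
```

With equal weights, the large gap on attribute 2 dominates and object 0 sits closest to object 2. With weight placed only on attributes 0 and 1, object 0 sits closest to object 1. Either distance matrix could then be passed to a conventional distance-based clustering algorithm.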