Eigen Selection in Spectral Clustering: A Theory-Guided Practice
Document type:
Article
Authors:
Han, Xiao; Tong, Xin; Fan, Yingying
Affiliations:
Chinese Academy of Sciences; University of Science & Technology of China, CAS; University of Southern California
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2021.1917418
Publication date:
2023
Pages:
109-121
Keywords:
PCA
Abstract:
Based on a Gaussian mixture type model of K components, we derive eigen selection procedures that improve the usual spectral clustering algorithms in high-dimensional settings, which typically act on the top few eigenvectors of an affinity matrix (e.g., $X^{\top}X$) derived from the data matrix X. Our selection principle formalizes two intuitions: (i) eigenvectors should be dropped when they have no clustering power; (ii) some eigenvectors corresponding to smaller spiked eigenvalues should be dropped due to estimation inaccuracy. Our selection procedures lead to new spectral clustering algorithms: ESSC for K = 2 and GESSC for K > 2. The newly proposed algorithms enjoy better stability and compare favorably against canonical alternatives, as demonstrated in extensive simulation and multiple real data studies. Supplementary materials for this article are available online.
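Illustrative sketch (not part of the published abstract): the baseline that the abstract refers to, clustering on the top few eigenvectors of an affinity matrix, might look roughly like the Python below. Several things here are assumptions made only for illustration: the data matrix X is taken to be p x n with one observation per column, so that X^T X is n x n; the function names are made up; the k-means step is a plain Lloyd iteration with a farthest-point start; and the `keep` argument is a hypothetical hook showing where an eigenvector selection step would go. The actual ESSC and GESSC selection rules are not stated in the abstract and are not reproduced here.

```python
import numpy as np


def top_eigenvectors(X, k):
    """Return the k leading eigenvalues and eigenvectors of X^T X.

    Assumes X has shape (p, n) with observations in columns.
    """
    affinity = X.T @ X                      # n x n symmetric matrix
    vals, vecs = np.linalg.eigh(affinity)   # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]      # indices of the k largest
    return vals[order], vecs[:, order]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def spectral_cluster(X, k, keep=None):
    """Cluster the n columns of X using eigenvectors of X^T X.

    keep: optional list of eigenvector indices (0 = largest eigenvalue).
    This is a hypothetical placeholder for a selection rule; by default
    all k leading eigenvectors are used, as in the usual algorithm.
    """
    _, vecs = top_eigenvectors(X, k)
    if keep is not None:
        vecs = vecs[:, keep]
    return kmeans(vecs, k)
```

For K = 2, calling `spectral_cluster(X, 2)` uses both leading eigenvectors, while `spectral_cluster(X, 2, keep=[0])` clusters on the first one only; comparing the two on simulated data is one simple way to see the effect of dropping an eigenvector that carries no clustering signal.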