TESTING EQUIVALENCE OF CLUSTERING

Publication type:
Article
Authors:
Gao, Chao; Ma, Zongming
Affiliations:
University of Chicago; University of Pennsylvania
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/21-AOS2113
Publication date:
2022
Pages:
407-429
Keywords:
higher criticism; sparse PCA; discovery; mixtures; breast
Abstract:
In this paper, we test whether two data sets measured on the same set of subjects share a common clustering structure. As a leading example, we focus on comparing clustering structures in two independent random samples from two deterministic two-component mixtures of multivariate Gaussian distributions. Mean parameters of these Gaussian distributions are treated as potentially unknown nuisance parameters and are allowed to differ. Assuming knowledge of mean parameters, we first determine the phase diagram of the testing problem over the entire range of signal-to-noise ratios by providing both lower bounds and tests that achieve them. When nuisance parameters are unknown, we propose tests that achieve the detection boundary adaptively as long as ambient dimensions of the data sets grow at a sublinear rate with the sample size.