A Global Optimization Algorithm for K-Center Clustering of One Billion Samples

成果类型:
Article; Early Access
署名作者:
Ren, Jiayang; You, Ningning; Hua, Kaixun; Ji, Chaojie; Cao, Yankai
署名单位:
University of British Columbia; Shanghai Jiao Tong University; University of British Columbia
刊物名称:
MANAGEMENT SCIENCE
ISSN/ISSBN:
0025-1909
DOI:
10.1287/mnsc.2023.00218
发表日期:
2025
关键词:
global optimization K-center clustering branch and bound two-stage decomposition bounds tightening
摘要:
This paper presents a practical global optimization algorithm for the K-center clustering problem, which aims to select K samples as the cluster centers to minimize the maximum within-cluster distance. Specifically, we propose a reduced-space branch and bound scheme that guarantees convergence to the global optimum in a finite number of steps by only branching on the regions of centers. To improve efficiency, we design a twostage decomposable lower bound, the solution of which can be derived in a closed form. In addition, we also propose several structure-exploiting acceleration techniques to narrow down the region of centers, including bounds tightening, sample reduction, and parallelization. Extensive studies on synthetic and real-world data sets have demonstrated that our algorithm can solve the K-center problems to global optimal within four hours for 10 million samples in the serial mode and 1 billion samples in the parallel mode, whereas existing studies can only address small-scale problems (e.g., thousands of samples). Moreover, compared with the state-of-the-art heuristic methods, the global optimum obtained by our algorithm reduces the objective function by an average of 25.8% on these synthetic and real-world data sets.
来源URL: