Beyond symmetry: best submatrix selection for the sparse truncated SVD
成果类型:
Article
署名作者:
Li, Yongchun; Xie, Weijun
署名单位:
University System of Georgia; Georgia Institute of Technology
刊物名称:
MATHEMATICAL PROGRAMMING
ISSN/ISSBN:
0025-5610
DOI:
10.1007/s10107-023-02030-7
发表日期:
2024
页码:
1-50
关键词:
hyperspectral anomaly detection
low-rank representation
DECOMPOSITION
nonconvex
network
摘要:
The truncated singular value decomposition (SVD), also known as the best low-rank matrix approximation with minimum error measured by a unitarily invariant norm, has been applied to many domains such as biology, healthcare, among others, where high-dimensional datasets are prevalent. To extract interpretable information from the high-dimensional data, sparse truncated SVD (SSVD) has been used to select a handful of rows and columns of the original matrix along with the best low-rank approximation. Different from the literature on SSVD focusing on the top singular value or compromising the sparsity for the seek of computational efficiency, this paper presents a novel SSVD formulation that can select the best submatrix precisely up to a given size to maximize its truncated Ky Fan norm. The fact that the proposed SSVD problem is NP-hard motivates us to study effective algorithms with provable performance guarantees. To do so, we first reformulate SSVD as a mixed-integer semidefinite program, which can be solved exactly for small- or medium-sized instances within a branch-and-cut algorithm framework with closed-form cuts and is extremely useful for evaluating the solution quality of approximation algorithms. We next develop three selection algorithms based on different selection criteria and two searching algorithms, greedy and local search. We prove the approximation ratios for all the approximation algorithms and show that all the ratios are tight when the number of rows or columns of the selected submatrix is no larger than half of the data matrix, i.e., our derived approximation ratios are unimprovable. Our numerical study demonstrates the high solution quality and computational efficiency of the proposed algorithms. Finally, all our analysis can be extended to row-sparse PCA.
来源URL: