Binary Matrix Factorization and Completion via Integer Programming

成果类型:
Article
署名作者:
Gunluk, Oktay; Hauser, Raphael Andreas; Kovacs, Reka Agnes
署名单位:
Cornell University; University of Oxford; Alan Turing Institute
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
发表日期:
2024
页码:
1278-1302
关键词:
摘要:
Binary matrix factorization is an essential tool for identifying discrete patterns in binary data. In this paper, we consider the rank-k binary matrix factorization problem (kBMF) under Boolean arithmetic: we are given an n x m binary matrix X with possibly missing entries and need to find two binary matrices A and B of dimension n x k and k x m, respectively, which minimize the distance between X and the Boolean product of A and B in the squared Frobenius distance. We present a compact and two exponential size integer programs (IPs) for k-BMF and show that the compact IP has a weak linear programming (LP) relaxation, whereas the exponential size IPs have a stronger equivalent LP relaxation. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to zero entries of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank-k factorization. For one of the exponential size Ips, we describe a computational approach based on column generation. Experimental results on synthetic and real-world data sets suggest that our integer programming approach is competitive against available methods for k-BMF and provides accurate low-error factorizations.