Multiagent Low-Dimensional Linear Bandits

Document type:
Article
Authors:
Chawla, Ronshee; Sankararaman, Abishek; Shakkottai, Sanjay
Affiliations:
University of Texas System; University of Texas Austin; University of California System; University of California Berkeley; University of Texas System; University of Texas Austin
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2022.3179521
Publication date:
2023
Pages:
2667-2682
Keywords:
Collaboration; Servers; Stochastic processes; Sparse matrices; Advertising; Postal services; Information sharing; Decentralized learning; Gossip; Linear bandits; Networks; Regret minimization
Abstract:
We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm in which agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding low-dimensional subspace. By distributing the search for the optimal subspace across agents, and having each agent learn the unknown vector within the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than when agents do not communicate. Finally, we complement these results with simulations.
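
Illustrative sketch (not taken from the paper): the abstract describes each agent running a projected variant of LinUCB on a candidate low-dimensional subspace. The Python snippet below shows one plausible reading of that single-agent step, under assumptions of our own: the subspace is given as a matrix `U` with orthonormal columns, arms are random unit vectors redrawn each round, and the confidence multiplier `beta` is a fixed constant. The function name `projected_linucb` and all of its parameters are hypothetical, and the subspace-index communication between agents is not modeled.

```python
import numpy as np

def projected_linucb(U, theta_star, n_rounds=500, n_arms=20,
                     lam=1.0, beta=1.0, noise=0.1, seed=0):
    """Run LinUCB in the coordinates of the subspace spanned by U.

    U          : (d, m) matrix with orthonormal columns (assumed known subspace)
    theta_star : (d,) true parameter, used only to simulate rewards
    Returns the estimate lifted back to R^d and the cumulative pseudo-regret.
    """
    rng = np.random.default_rng(seed)
    d, m = U.shape
    V = lam * np.eye(m)   # regularized Gram matrix in subspace coordinates
    b = np.zeros(m)       # running sum of reward-weighted projected features
    regret = 0.0
    for _ in range(n_rounds):
        # Assumption: a fresh set of unit-norm arms in R^d each round.
        arms = rng.normal(size=(n_arms, d))
        arms /= np.linalg.norm(arms, axis=1, keepdims=True)
        Z = arms @ U                      # project arms onto the subspace
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b             # ridge estimate in m dimensions
        # Exploration width sqrt(z^T V^{-1} z) for every arm.
        width = np.sqrt(np.einsum("ij,jk,ik->i", Z, V_inv, Z))
        idx = int(np.argmax(Z @ theta_hat + beta * width))
        means = arms @ theta_star
        reward = means[idx] + noise * rng.normal()
        V += np.outer(Z[idx], Z[idx])
        b += reward * Z[idx]
        regret += means.max() - means[idx]
    return U @ (np.linalg.inv(V) @ b), regret

if __name__ == "__main__":
    U = np.eye(10)[:, :2]                   # 2-dimensional subspace of R^10
    theta_star = U @ np.array([0.6, -0.8])  # parameter lying in that subspace
    theta_hat, regret = projected_linucb(U, theta_star)
    print(theta_hat.round(2), round(regret, 2))
```

Because the estimate is maintained in m rather than d coordinates, the Gram matrix and confidence widths scale with the subspace dimension. That is the intuition behind the regret reduction the abstract claims, although the paper's actual algorithm and constants may differ.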