Multiagent Low-Dimensional Linear Bandits
Document type:
Article
Authors:
Chawla, Ronshee; Sankararaman, Abishek; Shakkottai, Sanjay
Affiliations:
University of Texas System; University of Texas Austin; University of California System; University of California Berkeley; University of Texas System; University of Texas Austin
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2022.3179521
Publication date:
2023
Pages:
2667-2682
Keywords:
Collaboration
Servers
Stochastic processes
Sparse matrices
Advertising
Postal services
information sharing
Decentralized learning
gossip
linear bandits
networks
Regret minimization
Abstract:
We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector θ* ∈ ℝ^d. The side information consists of a finite collection of low-dimensional subspaces, one of which contains θ*. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning of the unknown vector by each agent in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than the case when agents do not communicate. We finally complement these results through simulations.
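The abstract does not spell out the projected LinUCB variant, so the following is only a minimal single-agent sketch of the general idea: project each action onto a candidate subspace and run a standard ridge-regression UCB rule in the projected coordinates. The class name `ProjectedLinUCB`, the fixed confidence multiplier `beta`, and the regularizer `reg` are assumptions made for illustration; they are not taken from the paper, and the gossip step that exchanges subspace indices between agents is not modeled here.

```python
import numpy as np


class ProjectedLinUCB:
    """Illustrative LinUCB restricted to one subspace (hypothetical helper)."""

    def __init__(self, basis, reg=1.0, beta=1.0):
        # basis: (d, m) matrix whose orthonormal columns span the subspace.
        self.U = np.asarray(basis, dtype=float)
        m = self.U.shape[1]
        self.V = reg * np.eye(m)   # regularized Gram matrix in R^m
        self.b = np.zeros(m)       # sum of reward-weighted projected actions
        self.beta = beta           # assumed fixed confidence multiplier

    def select(self, actions):
        # Project the (K, d) action set to subspace coordinates, shape (K, m).
        Z = np.asarray(actions, dtype=float) @ self.U
        V_inv = np.linalg.inv(self.V)
        theta_hat = V_inv @ self.b
        # Confidence width sqrt(z^T V^{-1} z) for every projected action z.
        widths = np.sqrt(np.einsum("ij,jk,ik->i", Z, V_inv, Z))
        return int(np.argmax(Z @ theta_hat + self.beta * widths))

    def update(self, action, reward):
        # Rank-one update using the projected action.
        z = self.U.T @ np.asarray(action, dtype=float)
        self.V += np.outer(z, z)
        self.b += reward * z

    def estimate(self):
        # Ridge estimate mapped back to the ambient space R^d.
        return self.U @ np.linalg.solve(self.V, self.b)
```

A typical loop would call `select` on the current action set, observe a reward for the chosen action, and pass both to `update`. Working in m rather than d coordinates is what makes the per-round estimation cheaper and the confidence set smaller when the chosen subspace actually contains θ*.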