Multiagent Online Source Seeking Using Bandit Algorithm
Publication type:
Article
Authors:
Du, Bin; Qian, Kun; Claudel, Christian; Sun, Dengfeng
Affiliations:
Nanjing University of Aeronautics & Astronautics; University of Texas System; University of Texas Austin; Purdue University System; Purdue University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2022.3232190
Publication date:
2023
Pages:
3147-3154
Keywords:
Heuristic algorithms
estimation
Upper bound
Noise measurement
Position measurement
Kalman filters
Task analysis
adaptive learning
Multi-agent systems
upper confidence bound
Abstract:
This article presents a learning-based algorithm for solving the online source-seeking problem with a multiagent system in an unknown dynamical environment. The algorithm builds on a notion termed dummy confidence upper bound (D-UCB) and integrates estimation of the unknown environment with task planning for the multiple agents, which enables the agents to track the extremum spots of the dynamical environment in an online manner. Unlike the standard confidence upper bound algorithm in the context of multiarmed bandits, the notion of D-UCB significantly reduces the computational complexity of solving the task-planning subproblems, and thus renders the algorithm computationally efficient in the distributed setting. The performance of the algorithm is theoretically guaranteed by a sublinear upper bound on the cumulative regret. Numerical results on a real-world pollution monitoring and tracking problem are also provided to demonstrate the effectiveness of the algorithm.
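For readers unfamiliar with the multiarmed-bandit baseline the abstract contrasts against, the following is a minimal sketch of the textbook UCB1 rule on stationary Bernoulli arms. It is not the article's D-UCB algorithm and does not model multiple agents or a dynamical environment; the function name `ucb1`, its parameters, and the arm means are hypothetical choices made purely for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Textbook UCB1 on Bernoulli arms (illustrative only, not D-UCB).

    arm_means: assumed true success probabilities, hidden from the learner.
    Returns the pull count per arm and the cumulative (pseudo-)regret.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms      # number of times each arm was pulled
    sums = [0.0] * n_arms      # total observed reward per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Choose the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is sampled more often.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best arm's mean and the chosen arm's.
        regret += best_mean - arm_means[arm]

    return counts, regret
```

Calling `ucb1([0.2, 0.5, 0.8], horizon=5000)` returns the per-arm pull counts and the accumulated regret. In this simple stationary setting the regret of UCB1 is known to grow logarithmically in the horizon, which is one instance of the sublinear growth the abstract refers to.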
Source URL: