Stochastic Linear Quadratic Optimal Control Problem: A Reinforcement Learning Method

Publication type:
Article
Authors:
Li, Na; Li, Xun; Peng, Jing; Xu, Zuo Quan
Affiliations:
Shandong University of Finance & Economics; Hong Kong Polytechnic University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2022.3181248
Publication date:
2022
Pages:
5009-5016
Keywords:
Optimal control; Stochastic processes; Heuristic algorithms; Trajectory; Mathematics; Mathematical models; Riccati equations; Linear quadratic (LQ) problem; Reinforcement learning (RL); Stochastic optimal control
Abstract:
This article adopts a reinforcement learning (RL) method to solve infinite-horizon, continuous-time stochastic linear quadratic problems in which the drift and diffusion terms of the dynamics may depend on both the state and the control. Based on Bellman's dynamic programming principle, we present an online RL algorithm that attains the optimal control with only partial system information. The algorithm computes the optimal control directly, rather than estimating the system coefficients and solving the related Riccati equation. It requires only local trajectory information, which significantly simplifies the computation. Two numerical examples illustrate the theoretical findings.