Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations
Type:
Article
Authors:
Yaghmaie, Farnaz Adib; Modares, Hamidreza; Gustafsson, Fredrik
Affiliations:
Linköping University; Michigan State University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2024.3385680
Publication date:
2024
Pages:
6397-6404
Keywords:
Costs
History
Noise
Dynamical systems
Noise measurement
Heuristic algorithms
Data models
Linear quadratic Gaussian
Partially observable dynamical systems
Reinforcement learning
Abstract:
Reinforcement learning algorithms are commonly used to control dynamical systems with measurable state variables. If the dynamical system is partially observable, reinforcement learning algorithms must be modified to compensate for the effect of partial observability. One common approach is to feed the controller a finite history of input-output data in place of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data instead of the state, and we show that this approach increases the average cost.
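The idea of replacing the unmeasured state with a finite input-output history can be sketched as follows. This is an illustrative construction only, not the paper's algorithm: the system matrices, noise levels, and history length `L` below are hypothetical, chosen just to show how the stacked history vector is formed from noisy observations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state linear Gaussian system with one noisy output
# (illustrative values; the paper treats general linear Gaussian systems).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def history_vector(ys, us, L):
    """Stack the last L outputs and the last L inputs into one vector.

    An RL controller for the partially observable system receives this
    stacked vector in place of the unmeasured state x_t.
    """
    return np.concatenate([np.concatenate(ys[-L:]), np.concatenate(us[-L:])])

# Roll the system forward, collecting noisy observations.
x = np.zeros(2)
ys, us = [], []
for t in range(10):
    u = rng.normal(size=1)                # exploratory input
    y = C @ x + 0.1 * rng.normal(size=1)  # noisy observation of the state
    ys.append(y)
    us.append(u)
    x = A @ x + B @ u + 0.05 * rng.normal(size=2)  # process noise

L = 3                          # history length (hypothetical choice)
z = history_vector(ys, us, L)  # dimension L * (n_y + n_u) = 3 * (1 + 1) = 6
```

The history vector `z` then plays the role of the state in a standard RL update; the abstract's point is that this substitution, while convenient, incurs a quantifiable increase in average cost relative to full state feedback.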