Entropy-Regularized Partially Observed Markov Decision Processes
Publication Type:
Article
Authors:
Molloy, Timothy L.; Nair, Girish N.
Affiliations:
University of Melbourne; Australian National University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2023.3264177
Publication Year:
2024
Pages:
379-386
Keywords:
entropy
costs
standards
uncertainty
state estimation
process control
Markov processes
directed information
estimation
partially observed Markov decision process (POMDP)
Abstract:
In this article, we investigate partially observed Markov decision processes (POMDPs) whose cost functions are regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions possible when the regularization involves the joint entropy of the state, observation, and control trajectories. Our joint-entropy result is particularly surprising because it constitutes a novel, tractable formulation of active state estimation.