
Structural Estimation of Partially Observable Markov Decision Processes

Publication type:

Article

Authors:

Chang, Yanling; Garcia, Alfredo; Wang, Zhide; Sun, Lu

Affiliations:

Texas A&M University System; Texas A&M University College Station; Texas A&M University System; Texas A&M University College Station

Journal:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISBN:

0018-9286

DOI:

10.1109/TAC.2022.3217908

Publication date:

2023

Pages:

5135-5141

Keywords:

Dynamic programming; Maximum likelihood estimation; Observability

Abstract:

Partially observable Markov decision processes (POMDPs) are a well-developed framework for sequential decision-making under uncertainty and partial information. This article considers the (inverse) structural estimation of the primitives of a POMDP based upon data in the form of sequences of observables and implemented actions. We analyze the structural properties of an entropy-regularized POMDP and specify conditions under which the model is identifiable without knowledge of the state dynamics. We consider a soft policy gradient algorithm to compute a maximum likelihood estimator and illustrate the approach with an equipment replacement problem.
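
Illustrative sketch (not taken from the article): the abstract refers to three standard ingredients, namely a belief over hidden states, an entropy-regularized ("soft") policy, and a likelihood of the observed actions. The Python below shows generic textbook versions of each under stated assumptions. The function names, the nested-list layout `T[a][s][s2]` and `O[a][s2][o]`, and the use of given action values are all hypothetical choices made for illustration; in particular, the article's setting does not assume the state dynamics are known, whereas this toy belief update takes them as input.

```python
import math

def belief_update(belief, action, obs, T, O):
    """Generic Bayes filter: b'(s2) is proportional to
    O[action][s2][obs] * sum_s T[action][s][s2] * belief[s].
    T and O are assumed known here purely for illustration."""
    n = len(belief)
    unnorm = [
        O[action][s2][obs]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def soft_policy(q_values):
    """Softmax over action values: the choice probabilities that arise
    under entropy regularization (unit temperature assumed)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(q - m) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def log_likelihood(actions, q_per_step):
    """Log-likelihood of an observed action sequence, given hypothetical
    per-step action values under the soft policy."""
    return sum(
        math.log(soft_policy(q)[a]) for a, q in zip(actions, q_per_step)
    )

# Toy two-state, one-action example (all numbers are made up).
T = [[[0.9, 0.1], [0.0, 1.0]]]   # T[a][s][s2]: transition probabilities
O = [[[0.8, 0.2], [0.3, 0.7]]]   # O[a][s2][o]: observation probabilities
b_next = belief_update([0.5, 0.5], action=0, obs=1, T=T, O=O)
ll = log_likelihood([0, 1], [[0.0, 0.0], [1.0, 2.0]])
```

A maximum likelihood estimator would search over the parameters that generate the action values so as to maximize `log_likelihood` on the data; how the article does this (its soft policy gradient algorithm) is not reproduced here.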