
A Low-Rank Approximation for MDPs via Moment Coupling

Document type:

Article

Authors:

Zhang, Amy B. Z.; Gurvich, Itai

Affiliations:

Cornell University; Northwestern University

Journal:

OPERATIONS RESEARCH

ISSN/ISBN:

0030-364X

DOI:

10.1287/opre.2022.2392

Publication date:

2024

Pages:

1255-1277

Keywords:

policy-iteration; aggregation; complexity; algorithm

Abstract:

We introduce a framework to approximate Markov decision processes (MDPs) that stands on two pillars: (i) state aggregation, as the algorithmic infrastructure, and (ii) central-limit-theorem-type approximations, as the mathematical underpinning of optimality guarantees. The theory is grounded in recent work by Braverman et al. (2020) that relates the solution of the Bellman equation to that of a partial differential equation (PDE) where, in the spirit of the central limit theorem, the transition matrix is reduced to its local first and second moments. Solving the PDE is not required by our method. Instead, we construct a sister (controlled) Markov chain whose two local transition moments are approximately identical with those of the focal chain. Because of this moment matching, the original chain and its sister are coupled through the PDE, a coupling that facilitates optimality guarantees. Embedded into standard soft aggregation algorithms, moment matching provides a disciplined mechanism to tune the aggregation and disaggregation probabilities. Computational gains arise from the reduction of the effective state space from $N$ to $N^{1/2+\epsilon}$, as one might intuitively expect from approximations grounded in the central limit theorem.
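
The moment-matching idea in the abstract can be illustrated with a toy one-dimensional example. The sketch below is not the paper's algorithm: it assumes a single state of an uncontrolled chain on the integers and a hypothetical sister chain that lives on a coarser grid of spacing `h`, and it matches the first two local moments exactly rather than approximately. The function names `local_moments` and `sister_row` are illustrative only.

```python
def local_moments(row, x):
    """Return the first and second moments of the jump (y - x).

    `row` maps each next state y to its transition probability from x.
    """
    mu = sum(p * (y - x) for y, p in row.items())
    m2 = sum(p * (y - x) ** 2 for y, p in row.items())
    return mu, m2


def sister_row(x, mu, m2, h):
    """Build a transition row on a grid of spacing h with moments (mu, m2).

    Assumption: the sister chain may only move to x - h, x, or x + h.
    Solving h * (up - down) = mu and h**2 * (up + down) = m2 gives the
    probabilities below.
    """
    up = 0.5 * (m2 / h**2 + mu / h)
    down = 0.5 * (m2 / h**2 - mu / h)
    stay = 1.0 - up - down
    if min(up, down, stay) < 0:
        raise ValueError("these moments cannot be matched with spacing h")
    return {x + h: up, x - h: down, x: stay}


# Focal chain at state x = 10: unit steps up or down, or stay put.
focal = {11: 0.3, 9: 0.2, 10: 0.5}
mu, m2 = local_moments(focal, 10)

# Sister chain at the same state, restricted to a grid of spacing 2.
sister = sister_row(10, mu, m2, h=2)
sister_mu, sister_m2 = local_moments(sister, 10)
```

In this toy case the sister row visits only every second state yet reproduces the drift and second moment of the focal row. Coarsening of this kind is one way to picture how a moment-matched chain could operate on far fewer states.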
