您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 管理科学与工程 > Operations Research > 2025 > 1期

Online Matching with Bayesian Rewards

成果类型：

Article

署名作者：

Simchi-Levi, David; Sun, Rui; Wang, Xinshang

署名单位：

Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Shanghai Jiao Tong University

刊物名称：

OPERATIONS RESEARCH

ISSN/ISSBN：

0030-364X

DOI：

10.1287/opre.2021.0499

发表日期：

2025

关键词：

approximation algorithms stochastic knapsack restless bandits index

摘要：

We study in this paper an online matching problem where a central platform needs to match a number of limited resources to different groups of users that arrive sequentially over time. The reward of each matching option depends on both the type of resource and the time period the user arrives. The matching rewards are assumed to be unknown but drawn from probability distributions that are known a priori. The platform then needs to learn the true rewards online based on real-time observations of the matching results. The goal of the central platform is to maximize the total reward from all of the matchings without violating the resource capacity constraints. We formulate this matching problem with Bayesian rewards as a Markovian multiarmed bandit problem with budget constraints, where each arm corresponds to a pair of a resources and a time period. We devise our algorithm by first finding policies for each single arm separately via a relaxed linear program and then assembling these policies together through judicious selection criteria and well-designed pulling orders. We prove that the expected reward of our algorithm is at least 1 2 ( 2 � 1) of the expected reward of an optimal algorithm.

来源URL：

访问原文