-
作者:Freund, Daniel; Zhao, Jiayu (Kamessi)
作者单位:Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT)
摘要:We study a classic problem in revenue management: quantity-based, single resource revenue management with no-shows. In this problem, a firm observes a sequence of T customers requesting a service. Each arrival is drawn independently from a known distribution of k different types, and the firm needs to decide irrevocably whether to accept or reject requests in an online fashion. The firm has a capacity of resources B and wants to maximize its profit. Each accepted service request yields a type-...
-
作者:Xie, Qiaomin; Chen, Yudong; Wang, Zhaoran; Yang, Zhuoran
作者单位:University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison; Northwestern University; Yale University
摘要:We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games with simultaneous moves. To incorporate function approximation, we consider a family of Markov games where the reward function and transition kernel possess a linear structure. Both the offline and online settings of the problems are considered. In the offline setting, we control both players and aim to find the Nash equilibrium by minimizing the duality gap. In the online settin...
-
作者:Bonamy, Marthe; Pilipczuk, Michal; Sereni, Jean-Sebastien; Weber, Richard
作者单位:Universite de Bordeaux; Centre National de la Recherche Scientifique (CNRS); University of Warsaw; University of Cambridge
摘要:We consider a classic rendezvous game in which two players try to meet each other on a set of n locations. In each round, every player visits one of the locations, and the game finishes when the players meet at the same location. The goal is to devise strategies for both players that minimize the expected waiting time till the rendezvous. In the asymmetric case, when the strategies of the players may differ, it is known that the optimum expected waiting time of (n + 1)/2 is achieved by the wai...
-
作者:Chaudhury, Bhaskar Ray; Garg, Jugal; McGlaughlin, Peter; Mehta, Ruta
作者单位:University of Illinois System; University of Illinois Urbana-Champaign
摘要:We study the fair division problem of allocating a mixed manna under additively separable piecewise linear concave (SPLC) utilities. A mixed manna contains goods that everyone likes and bads (chores) that everyone dislikes as well as items that some like and others dislike. The seminal work of Bogomolnaia et al. argues why allocating a mixed manna is genuinely more complicated than a good or a bad manna and why competitive equilibrium is the best mechanism. It also provides the existence of eq...
-
作者:Flesch, Janos; Predtetchinski, Arkadi; Suomala, Ville
作者单位:Maastricht University; Maastricht University; University of Oulu
摘要:The paper proposes a natural measure space of zero-sum perfect information games with upper semicontinuous payoffs. Each game is specified by the game tree and by the assignment of the active player and the capacity to each node of the tree. The payoff in a game is defined as the infimum of the capacity over the nodes that have been visited during the play. The active player, the number of children, and the capacity are drawn from a given joint distribution independently across the nodes. We c...
-
作者:Boodaghians, Shant; Fusco, Federico; Lazos, Philip; Leonardi, Stefano
作者单位:University of Illinois System; University of Illinois Urbana-Champaign; Sapienza University Rome
摘要:The Pandora's Box problem, originally posed by Weitzman in 1979, models selection from a set of random-valued alternatives-the boxes-when evaluation is costly. Weitzman showed that the Pandora's Box problem admits a simple threshold-based solution that considers the options in decreasing order of reservation value, a proxy for the actual value of the boxes in the exploration process. We study for the first time this problem when the order in which the boxes are opened is constrained, forcing t...
-
作者:Mou, Wenlong; Pananjady, Ashwin; Wainwright, Martin J.
作者单位:University of California System; University of California Berkeley; University System of Georgia; Georgia Institute of Technology; Massachusetts Institute of Technology (MIT)
摘要:Linear fixed-point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations. We study methods that use a collection of random observations to com-pute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space. First, we prove an instance-dependent upper bound on the mean-squared error for a linear stochastic approximation scheme that exploits Polyak...
-
作者:Tsodikovich, Yevgeny; Venel, Xavier; Zseleva, Anna
作者单位:Centre National de la Recherche Scientifique (CNRS); Aix-Marseille Universite; Bar Ilan University; Luiss Guido Carli University; Maastricht University
摘要:We study repeated zero-sum games where one of the players pays a certain cost each time he changes his action. We derive the properties of the value and optimal strategies as a function of the ratio between the switching costs and the stage payoffs. In particular, the strategies exhibit a robustness property and typically do not change with a small perturbation of this ratio. Our analysis extends partially to the case where the players are limited to simpler strategies that are history indepen...
-
作者:Yu, Lun; Perry, Ohad
作者单位:Tsinghua University; Northwestern University
摘要:We characterize heavy-traffic process and steady-state limits for systems staffed according to the square-root safety rule, when the service requirements of the customers are perfectly correlated with their individual patience for waiting in queue. Under the usual many-server diffusion scaling, we show that the system is asymptotically equivalent to a system with no abandonment. In particular, the limit is the Halfin-Whitt diffusion for the M/M/n queue when the traffic intensity approaches its...
-
作者:Rutten, Daan; Mukherjeea, Debankur
作者单位:University System of Georgia; Georgia Institute of Technology
摘要:Consider a system with N identical single-server queues and a number of task types, where each server is able to process only a small subset of possible task types. Arriving tasks select d >= 2 random compatible servers and join the shortest queue among them. The compatibility constraints are captured by a fixed bipartite graph between the servers and the task types. When the graph is complete bipartite, the mean-field approximation is accurate. However, such dense compatibility graphs are inf...