-
作者:Ziliotto, Bruno
作者单位:Universite PSL; Universite Paris-Dauphine
摘要:We prove a Tauberian theorem for nonexpansive operators and apply it to the model of zero-sum stochastic game. Under mild assumptions, we prove that the value of the lambda-discounted game converges uniformly when lambda goes to zero if and only if the value of the n-stage game converges uniformly when n goes to infinity. This generalizes the Tauberian theorem of Lehrer and Sorin [Lehrer E, Sorin S (1992) A uniform Tauberian theorem in dynamic programming. Math. Oper. Res. 17(2): 303-307] to t...
-
作者:Seel, Christian; Strack, Philipp
作者单位:Maastricht University; University of California System; University of California Berkeley
摘要:This paper introduces a class of contest models in which each player decides when to stop a privately observed Brownian motion with drift and incurs costs depending on his stopping time. The player who stops his process at the highest value wins a prize. We prove existence and uniqueness of a Nash equilibrium outcome and derive the equilibrium distribution in closed form. As the variance tends to zero, the equilibrium outcome converges to the symmetric equilibrium of an all-pay auction. For tw...
-
作者:Matsui, Tomomi; Ano, Katsunori
作者单位:Institute of Science Tokyo; Tokyo Institute of Technology; Shibaura Institute of Technology
摘要:This paper addresses Bruss' odds problem with multiple stopping chances. A decision maker sequentially observes a sequence of independent 0/1 (failure/success) random variables to correctly predict the last success with multiple stopping chances. First, we give a nontrivial lower bound of the probability of win (obtaining the last success) for the problem with m-stoppings. Next, we show that the asymptotic value for each classical secretary problem with multiple stoppings attains our lower bou...
-
作者:Friggstad, Zachary; Gupta, Anupam; Singh, Mohit
作者单位:University of Alberta; Carnegie Mellon University; Microsoft
摘要:The asymmetric traveling salesperson path problem (ATSPP) is one where, given an asymmetric metric space (V, d) with specified vertices s and t, the goal is to find an s-t path of minimum length that passes through all the vertices in V. This problem is closely related to the asymmetric TSP (ATSP), which seeks to find a tour (instead of an s-t path) visiting all the nodes: for ATSP, a rho-approximation guarantee implies an O(rho)-approximation for ATSPP. However, no such connection is known fo...
-
作者:Pflug, Georg Ch.; Pichler, Alois
作者单位:University of Vienna; International Institute for Applied Systems Analysis (IIASA); Norwegian University of Science & Technology (NTNU)
摘要:In management and planning it is commonplace for additional information to become available gradually over time. It is well known that most risk measures (risk functionals) are time inconsistent in the following sense: it may happen that at a given time period, some loss distribution appears to be less risky than another one, but looking at the conditional distribution at a later time, the opposite relation holds almost surely. The extended conditional risk functionals introduced in this paper...
-
作者:Epstein, Leah; Levin, Asaf; van Stee, Rob
作者单位:University of Haifa; Technion Israel Institute of Technology; University of Leicester
摘要:We present a unified framework for designing deterministic monotone polynomial time approximation schemes (PTASs) for a wide class of scheduling problems on uniformly related machines. This class includes (among others) minimizing the makespan, maximizing the minimum load, and minimizing the p-norm of the machine loads vector. Previously, this kind of result was only known for the makespan objective. Monotone algorithms have the property that an increase in the speed of a machine cannot decrea...
-
作者:Atar, Rami; Cohen, Asaf
作者单位:Technion Israel Institute of Technology; University of Michigan System; University of Michigan
摘要:We study a differential game that governs the moderate-deviation heavy-traffic asymptotics of a multiclass single-server queueing control problem with a risk-sensitive cost. We consider a cost set on a finite but sufficiently large time horizon, and show that this formulation leads to stationary feedback policies for the game. Several aspects of the game are explored, including its characterization via a (one-dimensional) free boundary problem, the semi-explicit solution of an optimal strategy...
-
作者:Jaskiewicz, Anna; Nowak, Andrzej S.
作者单位:Wroclaw University of Science & Technology; University of Zielona Gora
摘要:In this paper, we study discounted stochastic games with Borel state and compact action spaces depending on the state variable. The primitives of our model satisfy standard continuity and measurability conditions. The transition probability is a convex combination of finitely many probability measures depending on states, and it is dominated by some finite measure on the state space. The coefficients of the combination depend on both states and action profiles. This class of models contains st...
-
作者:Mertikopoulos, Panayotis; Sandholm, William H.
作者单位:Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); University of Wisconsin System; University of Wisconsin Madison
摘要:We investigate a class of reinforcement learning dynamics where players adjust their strategies based on their actions' cumulative payoffs over time-specifically, by playing mixed strategies that maximize their expected cumulative payoff minus a regularization term. A widely studied example is exponential reinforcement learning, a process induced by an entropic regularization term which leads mixed strategies to evolve according to the replicator dynamics. However, in contrast to the class of ...
-
作者:Bouchard, Bruno; Nutz, Marcel
作者单位:Universite PSL; Universite Paris-Dauphine; Institut Polytechnique de Paris; ENSAE Paris; Columbia University; Columbia University
摘要:We study a class of stochastic target games where one player tries to find a strategy such that the state process almost surely reaches a given target, no matter which action is chosen by the opponent. Our main result is a geometric dynamic programming principle, which allows us to characterize the value function as the viscosity solution of a nonlinear partial differential equation. Because abstract measurable selection arguments cannot be used in this context, the main obstacle is the constr...