-
作者:Lam, Henry
作者单位:University of Michigan System; University of Michigan
摘要:We study a worst-case approach to measure the sensitivity to model misspecification in the performance analysis of stochastic systems. The situation of interest is when only minimal parametric information is available on the form of the true model. Under this setting, we post optimization programs that compute the worst-case performance measures, subject to constraints on the amount of model misspecification measured by Kullback-Leibler divergence. Our main contribution is the development of i...
-
作者:Arlotto, Alessandro; Steele, J. Michael
作者单位:Duke University; University of Pennsylvania
摘要:We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin for temporally nonhomogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asy...
-
作者:Yildiz, Sercan; Cornuejols, Gerard
作者单位:Carnegie Mellon University
摘要:For an integer linear program, Gomory's corner relaxation is obtained by ignoring the nonnegativity of the basic variables in a tableau formulation. In this paper, we do not relax these nonnegativity constraints. We generalize a classical result of Gomory and Johnson characterizing minimal cut-generating functions in terms of subadditivity, symmetry, and periodicity. Our result is based on the notion of generalized symmetry condition. We also prove a 2-slope theorem for extreme cut-generating ...
-
作者:Cominetti, Roberto; Torrico, Alfredo
作者单位:Universidad de Chile; Universidad de Chile
摘要:We investigate the use of risk measures and theories of choice to model risk-averse routing and traffic equilibrium on networks with random travel times. We interpret the postulates of these theories in the context of routing, identifying additive consistency as a plausible condition that allows to reduce risk-averse route choice to a standard shortest path problem. Within the classical theories of choice, we show that the only preferences that satisfy this condition are the ones induced by th...
-
作者:Gupta, Anupam; Molinaro, Marco
作者单位:Carnegie Mellon University; Pontificia Universidade Catolica do Rio de Janeiro
摘要:We consider the problem of solving packing/covering LPs online, when the columns of the constraint matrix are presented in random order. This problem has received much attention, and the main focus is to figure out how large the right-hand sides of the LPs have to be (compared to the entries on the left-hand side of the constraints) to allow (1 + epsilon)-approximations online. It is known that the right-hand sides have to be Omega (epsilon(-2) log m) times the left-hand sides, where m is the ...
-
作者:Flesch, Janos; Predtetchinski, Arkadi
作者单位:Maastricht University; Maastricht University
摘要:We prove the existence of a pure subgame-perfect epsilon-equilibrium, for every epsilon > 0, in multiplayer perfect information games, provided that the payoff functions are bounded and exhibit common preferences at the limit. If, in addition, the payoff functions have finite range, then there exists a pure subgame-perfect 0-equilibrium. These results extend and unify recent existence theorems for bounded and semicontinuous payoffs.
-
作者:Ziliotto, Bruno
作者单位:Universite PSL; Universite Paris-Dauphine
摘要:We prove a Tauberian theorem for nonexpansive operators and apply it to the model of zero-sum stochastic game. Under mild assumptions, we prove that the value of the lambda-discounted game converges uniformly when lambda goes to zero if and only if the value of the n-stage game converges uniformly when n goes to infinity. This generalizes the Tauberian theorem of Lehrer and Sorin [Lehrer E, Sorin S (1992) A uniform Tauberian theorem in dynamic programming. Math. Oper. Res. 17(2): 303-307] to t...
-
作者:Atar, Rami; Cohen, Asaf
作者单位:Technion Israel Institute of Technology; University of Michigan System; University of Michigan
摘要:We study a differential game that governs the moderate-deviation heavy-traffic asymptotics of a multiclass single-server queueing control problem with a risk-sensitive cost. We consider a cost set on a finite but sufficiently large time horizon, and show that this formulation leads to stationary feedback policies for the game. Several aspects of the game are explored, including its characterization via a (one-dimensional) free boundary problem, the semi-explicit solution of an optimal strategy...
-
作者:Mertikopoulos, Panayotis; Sandholm, William H.
作者单位:Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); University of Wisconsin System; University of Wisconsin Madison
摘要:We investigate a class of reinforcement learning dynamics where players adjust their strategies based on their actions' cumulative payoffs over time-specifically, by playing mixed strategies that maximize their expected cumulative payoff minus a regularization term. A widely studied example is exponential reinforcement learning, a process induced by an entropic regularization term which leads mixed strategies to evolve according to the replicator dynamics. However, in contrast to the class of ...
-
作者:Guo, Xianping; Zhang, Yi
作者单位:Sun Yat Sen University; University of Liverpool
摘要:This article concerns the average criteria for continuous-time Markov decision processes with N constraints. We show the following; (a) every extreme point of the space of performance vectors corresponding to the set of stable measures is generated by a deterministic stationary policy; and (b) there exists a mixed optimal policy, where the mixture is over no more than N + 1 deterministic stationary policies.