-
作者:Guo, Xianping; Zhang, Yi
作者单位:Sun Yat Sen University; University of Liverpool
摘要:This article concerns the average criteria for continuous-time Markov decision processes with N constraints. We show the following; (a) every extreme point of the space of performance vectors corresponding to the set of stable measures is generated by a deterministic stationary policy; and (b) there exists a mixed optimal policy, where the mixture is over no more than N + 1 deterministic stationary policies.
-
作者:Cai, Yang; Candogan, Ozan; Daskalakis, Constantinos; Papadimitriou, Christos
作者单位:McGill University; University of Chicago; Massachusetts Institute of Technology (MIT); University of California System; University of California Berkeley
摘要:We show that in zero-sum polymatrix games, a multiplayer generalization of two-person zero-sum games, Nash equilibria can be found efficiently with linear programming. We also show that the set of coarse correlated equilibria collapses to the set of Nash equilibria. In contrast, other important properties of two-person zero-sum games are not preserved: Nash equilibrium payoffs need not be unique, and Nash equilibrium strategies need not be exchangeable or max-min.
-
作者:Beck, Amir; Hallak, Nadav
作者单位:Technion Israel Institute of Technology
摘要:We consider the problem of minimizing a general continuously differentiable function over symmetric sets under sparsity constraints. These type of problems are generally hard to solve because the sparsity constraint induces a combinatorial constraint into the problem, rendering the feasible set to be nonconvex. We begin with a study of the properties of the orthogonal projection operator onto sparse symmetric sets. Based on this study, we derive efficient methods for computing sparse projectio...
-
作者:Braverman, Mark; Chen, Jing; Kannan, Sampath
作者单位:Princeton University; State University of New York (SUNY) System; Stony Brook University; University of Pennsylvania; Institute for Advanced Study - USA; Princeton University
摘要:We investigate computational and mechanism design aspects of allocating medical treatments at hospitals of different costs to patients who each value these hospitals differently. The payer wants to ensure that the total cost of all treatments is at most the budget, B. Access to overdemanded hospitals is rationed through waiting times. We first show that optimizing social welfare in equilibrium is NP-hard. But if the number of hospitals is small and the budget can be relaxed to (1 + epsilon)B f...
-
作者:Cavazos-Cadena, Rolando; Hernandez-Hernandez, Daniel
作者单位:CIMAT - Centro de Investigacion en Matematicas
摘要:This work is concerned with finite-state irreducible Markov decision chains satisfying continuity-compactness requirements. It is supposed that the system is driven by a decision maker with utility function U, which, aside mild conditions, is arbitrary, and the performance of a control policy is measured by the long-run average cost criterion induced by U. The main conclusions about this performance index are as follows: (i) the optimal U-average value function coincides with the optimal V-ave...
-
作者:Lim, Shiau Hong; Xu, Huan; Mannor, Shie
作者单位:National University of Singapore; Technion Israel Institute of Technology
摘要:An important challenge in Markov decision processes (MDP) is to ensure robustness with respect to unexpected or adversarial system behavior. A standard paradigm to tackle this challenge is the robust MDP framework that models the parameters as arbitrary elements of pre-defined uncertainty sets, and seeks the minimax policy-the policy that performs the best under the worst realization of the parameters in the uncertainty set. A crucial issue of the robust MDP framework, largely unaddressed in l...
-
作者:von Stengel, Bernhard
作者单位:University of London; London School Economics & Political Science
摘要:We consider a sequential inspection game where an inspector uses a limited number of inspections over a larger number of time periods to detect a violation (an illegal act) of an inspectee. Compared with earlier models, we allow varying rewards to the inspectee for successful violations. As one possible example, the most valuable reward may be the completion of a sequence of thefts of nuclear material needed to build a nuclear bomb. The inspectee can observe the inspector, but the inspector ca...
-
作者:Dentcheva, Darinka; Wolfhagen, Eli
作者单位:Stevens Institute of Technology
摘要:We propose a two-stage risk-averse stochastic optimization problem with a stochastic-order constraint on a vector-valued function of the second-stage decisions. This model is motivated by a multiobjective second-stage problem. We formulate optimality conditions for the problem and analyse the Lagrangian relaxation of the order constraint. We propose two decomposition methods to solve the problems and prove their convergence. The methods are based on Lagrangian relaxation of the order constrain...
-
作者:Fleiner, Tamas; Kamiyama, Naoyuki
作者单位:Budapest University of Technology & Economics; Eotvos Lorand University; Kyushu University
摘要:In 2010, Huang introduced the laminar classified stable matching problem (LCSM for short) that is motivated by academic hiring. This problem is an extension of the well-known hospitals/residents problem in which a hospital has laminar classes of residents and it sets lower and upper bounds on the number of residents that it can hire in each class. Against the intuition that variations of the stable matching problem with lower quotas are difficult in general, Huang proved that LCSM can be solve...
-
作者:Bravo, Mario
作者单位:Universidad de Santiago de Chile
摘要:We study a simple adaptive model in the framework of an N -player normal form game. The model consists of a repeated game where the players only know their own action space and their own payoff scored at each stage, not those of the other agents. Each player, in order to update her mixed action, computes the average vector payoff she has obtained by using the number of times she has played each pure action. The resulting stochastic process is analyzed via the ODE method from stochastic approxi...