-
Authors: Lim, Shiau Hong; Xu, Huan; Mannor, Shie
Affiliations: National University of Singapore; Technion Israel Institute of Technology
Abstract: An important challenge in Markov decision processes (MDPs) is to ensure robustness with respect to unexpected or adversarial system behavior. A standard paradigm for tackling this challenge is the robust MDP framework, which models the parameters as arbitrary elements of pre-defined uncertainty sets and seeks the minimax policy, i.e., the policy that performs best under the worst realization of the parameters in the uncertainty set. A crucial issue of the robust MDP framework, largely unaddressed in l...
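To illustrate the minimax criterion described in this abstract, here is a minimal sketch of robust value iteration. It is not the method of the paper. It assumes a finite, discounted MDP in which the uncertainty set is a finite list of candidate next-state distributions for each state-action pair. The function name `robust_value_iteration`, the discount `gamma`, and the two-state toy example are all hypothetical.

```python
def robust_value_iteration(rewards, kernels, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Worst-case (minimax) value iteration on a finite MDP.

    rewards[s][a]  : immediate reward for action a in state s
    kernels[s][a]  : list of candidate next-state distributions
                     (the assumed uncertainty set for that pair)
    """
    n = len(rewards)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = []
        for s in range(n):
            # Maximize over actions the worst case over candidate kernels.
            best = max(
                rewards[s][a]
                + gamma * min(sum(p[t] * v[t] for t in range(n)) for p in kernels[s][a])
                for a in range(len(rewards[s]))
            )
            new_v.append(best)
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical toy problem: state 1 is absorbing with zero reward.
# In state 0, action 0 is "safe" (reward 1, surely stays in state 0);
# action 1 is "risky" (reward 2, may stay in state 0 or fall into state 1).
rewards = [[1.0, 2.0], [0.0]]
kernels = [
    [[[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]],
    [[[0.0, 1.0]]],
]
values = robust_value_iteration(rewards, kernels)
```

In this toy problem the risky action would be worth 20 if the favorable kernel always held, but its worst-case value is only 2, so the robust solution settles on the safe action with a value of about 10 in state 0.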
-
Authors: Bravo, Mario
Affiliations: Universidad de Santiago de Chile
Abstract: We study a simple adaptive model in the framework of an N-player normal form game. The model consists of a repeated game in which the players know only their own action space and their own payoff at each stage, not those of the other agents. To update her mixed action, each player computes the average vector payoff she has obtained, using the number of times she has played each pure action. The resulting stochastic process is analyzed via the ODE method from stochastic approxi...
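The averaging step mentioned in this abstract can be sketched as an incremental mean kept per pure action. This is an assumption-laden illustration rather than the model from the paper: the function name `update_average_payoff` and the list-based bookkeeping are hypothetical, and the rule that maps the averages to a mixed action is not visible in the truncated abstract, so it is deliberately left out.

```python
def update_average_payoff(avg, counts, action, payoff):
    """Update the running average payoff of the pure action just played.

    avg[a]    : average payoff observed so far when playing action a
    counts[a] : number of times action a has been played
    Only the entry of the played action changes (hypothetical bookkeeping).
    """
    counts[action] += 1
    # Incremental mean: new_avg = old_avg + (payoff - old_avg) / count
    avg[action] += (payoff - avg[action]) / counts[action]


# One player with two pure actions; she plays action 0 twice.
avg = [0.0, 0.0]
counts = [0, 0]
update_average_payoff(avg, counts, action=0, payoff=1.0)
update_average_payoff(avg, counts, action=0, payoff=3.0)
```

Dividing by the per-action play count, rather than by the stage number, is what makes each entry an average over only the stages in which that action was actually used.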
-
Authors: Mannor, Shie; Mebel, Ofir; Xu, Huan
Affiliations: Technion Israel Institute of Technology; Alphabet Inc.; Google Incorporated; National University of Singapore
Abstract: Markov decision processes are a common tool for modeling sequential planning problems under uncertainty. In almost all realistic situations, the system model cannot be perfectly known and must be approximated or estimated. Thus, we consider Markov decision processes under parameter uncertainty, which effectively adds a second layer of uncertainty. Most previous studies restrict attention to the case where uncertainties among different states are uncoupled, which leads to conservative solutions. On the ot...
-
Authors: Ye, Heng-Qing; Yao, David D.
Affiliations: Hong Kong Polytechnic University; Columbia University
Abstract: We study a resource-sharing network where each job requires the concurrent occupancy of a subset of links (servers/resources), and each link's capacity is shared among job classes that require its service. The real-time allocation of the service capacity among job classes is determined by the so-called proportional fair scheme, which allocates the capacity among job classes taking into account the queue lengths and the shadow prices of link capacity. We show that the usual traffic condition is...
-
Authors: Gfrerer, Helmut; Outrata, Jiri V.
Affiliations: Johannes Kepler University Linz; Czech Academy of Sciences; Institute of Information Theory & Automation of the Czech Academy of Sciences; Federation University Australia
Abstract: The paper concerns the computation of the graphical derivative and the regular (Fréchet) coderivative of the normal-cone mapping related to C^2 inequality constraints under very weak qualification conditions. This enables us to provide the graphical derivative and the regular coderivative of the solution map to a class of parameterized generalized equations with the constraint set of the investigated type. On the basis of these results, we finally obtain a characterization of the isolated calm...
-
Authors: Adelman, Daniel; Mancini, Angelo J.
Affiliations: University of Chicago
Abstract: Quasi-open-loop policies consist of sequences of Markovian decision rules that are insensitive to one component of the state space. Given a semi-Markov decision process (SMDP), we distinguish between exogenous and endogenous state components as follows: (i) the decision-maker's actions do not impact the evolution of an exogenous state component, and (ii) between consecutive decision epochs, the exogenous and endogenous state components are conditionally independent given the decision-maker's l...
-
Authors: Martyr, Randall
Affiliations: University of London; Queen Mary University London
Abstract: This paper is concerned with optimal switching over multiple modes in continuous time and on a finite horizon. The performance index includes a running reward, a terminal reward, and switching costs that can belong to a large class of stochastic processes. In particular, the switching costs are modelled by processes that are right-continuous with left limits, quasi-left-continuous, and can take both positive and negative values. We provide sufficient conditions leading to a well-known probabilistic repr...