-
作者:Bladt, Martin; Peraltab, Oscar
作者单位:University of Copenhagen; Cornell University
摘要:The study of time-inhomogeneous Markov jump processes is a traditional topic within probability theory that has recently attracted substantial attention in various applications. However, their flexibility also incurs a substantial mathematical burden which is usually circumvented by using well-known generic distributional approximations or simulations. This article provides a novel approximation method that tailors the dynamics of a time-homogeneous Markov jump process to meet those of its tim...
-
作者:Tabri, Rami
作者单位:Monash University
摘要:Relative entropy minimization is a widely used method in decisions and operations research that incorporates information through constraints on the underlying probability model. The solution is called information projection, and we present new results for its existence, exponential family representation, and approximation in the infinite-dimensional setting for moment inequality constraint sets, nesting both conditional and unconditional moments and allowing for an infinite number of such ineq...
-
作者:Semirat, Stephan; Forges, Francoise
作者单位:Communaute Universite Grenoble Alpes; Universite Grenoble Alpes (UGA); INRAE; Centre National de la Recherche Scientifique (CNRS); Institut National Polytechnique de Grenoble; Universite PSL; Universite Paris-Dauphine; Centre National de la Recherche Scientifique (CNRS); Institut de Recherche pour le Developpement (IRD); Laboratoire dEconomie de Dauphine LEDa
摘要:We consider information transmission between a sender, who has finitely many types, and a receiver, who must choose a decision in a real interval. The payoffs depend on the sender's type and the receiver's decision. We assume that the payoff functions are wellbehaved. We characterize the pure strategy perfect Bayesian equilibrium outcomes as incentive-compatible partitions of the sender's types. We propose an algorithm, which starts from the finest partition. Then, at every step, if the curren...
-
作者:Hajiabolhassan, Hossein; Ortner, Ronald
作者单位:Medical University of Graz; University of Leoben
摘要:We consider general reinforcement learning under the average reward criterion in Markov decision processes (MDPs), when the learner's goal is not to learn an optimal policy, but accepts any policy whose average reward is above a given satisfaction level a. We show that with this more modest objective, it is possible to give algorithms that only have constant regret with respect to the level a, provided that there is a policy above this level. This is a generalization of known results from the ...
-
作者:Fujishie, Satoru; Yang, Zaifu
作者单位:Kyoto University; University of York - UK
摘要:We propose a novel strategy-proof dynamic auction for efficiently allocating heterogeneous indivisible commodities. The auction applies to all unimodular demand types of Baldwin and Klemperer's necessary and sufficient condition for the existence of competitive equilibrium which accommodate a wide variety of complements, substitutes, gross substitutes and complements, and any other kinds. Although bidders are not assumed to be price takers so they can act strategically, this auction induces bi...
-
作者:Brustle, Johannes; Correa, Jose; Dutting, Paul; Ezra, Tomer; Feldman, Michal; Verdugo, Victor
作者单位:Sapienza University Rome; Universidad de Chile; Harvard University; Tel Aviv University; Pontificia Universidad Catolica de Chile; Pontificia Universidad Catolica de Chile
摘要:We study the classic single-choice prophet inequality problem through a resource augmentation lens. Our goal is to bound the (1- E)-competition complexity of different types of online algorithms. This metric asks for the smallest k such that the expected value of the online algorithm on k copies of the original instance is at least a (1 - E)-approximation to the expected off-line optimum on a single copy. We show that block threshold algorithms, which set one threshold per copy, are optimal an...
-
作者:Garcia-Segador, Pedro; Grabisch, Michel; Miranda, Pedro
作者单位:Paris School of Economics; Complutense University of Madrid
摘要:We study the geometric structure of the set of cooperative transferable utility games having a nonempty core, characterized by Bondareva and Shapley as balanced games. We show that this set is a nonpointed polyhedral cone, and we find the set of its extremal rays and facets. This study is also done for the set of balanced games whose value for the grand coalition is fixed, which yields an affine nonpointed polyhedral cone. Finally, the case of nonnegative balanced games with fixed value for th...
-
作者:Grand-Clement, Julien; Petrik, Marek
作者单位:University System Of New Hampshire; University of New Hampshire
摘要:Robust Markov decision processes (MDPs) are used for applications of dynamic optimization in uncertain environments and have been studied extensively. Many of the main properties and algorithms of MDPs, such as value iteration and policy iteration, extend directly to RMDPs. Surprisingly, there is no known analog of the MDP convex optimization formulation for solving RMDPs. This work describes the first convex optimization formulation of RMDPs under the classical sa-rectangularity and s-rectang...
-
作者:Jin, Xinghu; Pang, Guodong; Xu, Lihu; Xu, Xin
作者单位:Hefei University of Technology; University of Macau; Rice University; South China Normal University
摘要:In this paper, we develop a stochastic algorithm based on the Euler-Maruyama scheme to approximate the invariant measure of the limiting multidimensional diffusion of G/Ph/n + GI queues in the Halfin-Whitt regime. Specifically, we prove a nonasymptotic error bound between the invariant measures of the approximate model from the algorithm and the limiting diffusion. To establish the error bound, we employ the recently developed Stein's method for multidimensional diffusions, in which the regula...
-
作者:Bazhba, Mihail; Blanchet, Jose; Rhee, Chang-Han; Zwart, Bert
作者单位:University of Amsterdam; Stanford University; Northwestern University; Eindhoven University of Technology
摘要:We prove a sample -path large deviation principle (LDP) with sublinear speed for unbounded functionals of certain Markov chains induced by the Lindley recursion. The LDP holds in the Skorokhod space D[0, 1] equipped with the M ' 1 topology. Our technique hinges on a suitable decomposition of the Markov chain in terms of regeneration cycles. Each regeneration cycle denotes the area accumulated during the busy period of the reflected random walk. We prove a large deviation principle for the area...