-
作者:Lugosi, Gabor; Mehrabian, Abbas
作者单位:Pompeu Fabra University; ICREA; McGill University
摘要:We study multiplayer stochastic multiarmed bandit problems in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup in which no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with a logarithmic regret and...
-
作者:Feinberg, Eugene A.; Mandava, Manasa; Shiryaev, Albert N.
作者单位:State University of New York (SUNY) System; Stony Brook University; Indian School of Business (ISB); Russian Academy of Sciences; Steklov Mathematical Institute of the Russian Academy of Sciences
摘要:One of the basic facts known for discrete-time Markov decision processes is that, if the probability distribution of an initial state is fixed, then for every policy it is easy to construct a (randomized) Markov policy with the same marginal distributions of state-action pairs as for the original policy. This equality of marginal distributions implies that the values of major objective criteria, including expected discounted total costs and average rewards per unit time, are equal for these tw...
-
作者:Apt, Krzysztof R.; Simon, Sunil; Wojtczak, Dominik
作者单位:University of Warsaw; Indian Institute of Technology System (IIT System); Indian Institute of Technology (IIT) - Kanpur; University of Liverpool
摘要:We study strategic games on weighted directed graphs, where each player's payoff is defined as the sum of the weights on the edges from players who chose the same strategy, augmented by a fixed nonnegative integer bonus for picking a given strategy. These games capture the idea of coordination in the absence of globally common strategies. We identify natural classes of graphs for which finite improvement or coalition-improvement paths of polynomial length always exist, and consequently a (pure...
-
作者:Vardi, Shai; Psomas, Alexandros; Friedman, Eric
作者单位:Purdue University System; Purdue University; Purdue University System; Purdue University
摘要:A single homogeneous resource needs to be fairly shared between users that dynamically arrive and depart over time. Although good allocations exist for any fixed number of users, implementing these allocations dynamically is impractical: it typically entails adjustments in the allocation of every user in the system whenever a new user arrives. We introduce a dynamic fair resource division problem in which there is a limit on the number of users that can be disrupted when a new user arrives and...
-
作者:Sandholm, William H.; Tran, Hung, V; Arigapudi, Srinivas
作者单位:University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison
摘要:We characterize solutions of a class of time-homogeneous optimal control problems with semilinear running costs and state constraints as maximal viscosity sub-solutions to Hamilton-Jacobi equations and show that optimal solutions to these problems can be constructed explicitly. We present applications to large deviations problems arising in evolutionary game theory.
-
作者:Nikolov, Aleksandar; Singh, Mohit; Tantipongpipat, Uthaipon (Tao)
作者单位:University of Toronto; University System of Georgia; Georgia Institute of Technology
摘要:We study optimal design problems in which the goal is to choose a set of linear measurements to obtain the most accurate estimate of an unknown vector. We study the A-optimal design variant where the objective is to minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. We introduce the proportional volume sampling algorithm to obtain nearly optimal bounds in the asymptotic regime when the number k of measurements made is significantly large...
-
作者:Lacker, Daniel; Soret, Agathe
作者单位:Columbia University
摘要:We study a class of linear-quadratic stochastic differential games in which each player interacts directly only with its nearest neighbors in a given graph. We find a semiexplicit Markovian equilibrium for any transitive graph, in terms of the empirical eigenvalue distribution of the graph's normalized Laplacian matrix. This facilitates large-population asymptotics for various graph sequences, with several sparse and dense examples discussed in detail. In particular, the mean field game is the...
-
作者:Hirai, Hiroshi; Sato, Ryosuke
作者单位:University of Tokyo
摘要:In this paper, we present a new model and mechanisms for auctions in two-sided markets of buyers and sellers, where budget constraints are imposed on buyers. Our model incorporates polymatroidal environments and is applicable to a variety of models that include multiunit auctions, matching markets, and reservation exchange markets. Our mechanisms are built on the polymatroidal network flow model by Lawler and Martel. Additionally, they feature nice properties such as the incentive compatibilit...
-
作者:Van Eynde, Rob; Vanhoucke, Mario
作者单位:Ghent University; Vlerick Business School; University of London; University College London
摘要:The resource-constrained project scheduling problem (RCPSP) addresses the problem of constructing a schedule with minimum makespan for a set of activities, subject to precedence and resource constraints. Recent research introduced a data set with small instances that cannot be solved by the state-of-the-art algorithms, revealing a gap in our understanding of instance complexity. We propose a new theoretical framework for the instance complexity for the RCPSP, consecutively incorporating preced...
-
作者:Chen, Xi; Chen, Yunxiao; Li, Xiaoou
作者单位:New York University; University of London; London School Economics & Political Science; University of Minnesota System; University of Minnesota Twin Cities
摘要:A sequential design problem for rank aggregation is commonly encountered in psychology, politics, marketing, sports, etc. In this problem, a decision maker is responsible for ranking K items by sequentially collecting noisy pairwise comparisons from judges. The decision maker needs to choose a pair of items for comparison in each step, decide when to stop data collection, and make a final decision after stopping based on a sequential flow of information. Because of the complex ranking structur...