-
作者:Mou, Wenlong; Pananjady, Ashwin; Wainwright, Martin J.
作者单位:University of California System; University of California Berkeley; University System of Georgia; Georgia Institute of Technology; Massachusetts Institute of Technology (MIT)
摘要:Linear fixed-point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations. We study methods that use a collection of random observations to com-pute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space. First, we prove an instance-dependent upper bound on the mean-squared error for a linear stochastic approximation scheme that exploits Polyak...
-
作者:Tsodikovich, Yevgeny; Venel, Xavier; Zseleva, Anna
作者单位:Centre National de la Recherche Scientifique (CNRS); Aix-Marseille Universite; Bar Ilan University; Luiss Guido Carli University; Maastricht University
摘要:We study repeated zero-sum games where one of the players pays a certain cost each time he changes his action. We derive the properties of the value and optimal strategies as a function of the ratio between the switching costs and the stage payoffs. In particular, the strategies exhibit a robustness property and typically do not change with a small perturbation of this ratio. Our analysis extends partially to the case where the players are limited to simpler strategies that are history indepen...
-
作者:Paul, Alice; Freund, Daniel; Ferber, Aaron; Shmoys, David B.; Williamson, David P.
作者单位:Brown University; Massachusetts Institute of Technology (MIT); University of Southern California; Cornell University
摘要:There is an error in our paper [Paul A, Freund D, Ferber A, Shmoys DB, William-son DP (2020) Budgeted prize-collecting traveling salesman and minimum spanning tree problems. Math. Oper. Res. 45(2):576-590]. In that paper, we consider constrained versions of the prize-collecting traveling salesman and the prize-collecting minimum spanning tree problems. The goal is to maximize the number of vertices in the returned tour/tree subject to a bound on the tour/tree cost. Rooted variants of the probl...
-
作者:Kara, Ali Devran; Yuksel, Serdar
作者单位:University of Michigan System; University of Michigan; Queens University - Canada
摘要:In this paper, for partially observed Markov decision problems (POMDPs), we provide the convergence of a Q learning algorithm for control policies using a finite history of past observations and control actions, and consequentially, we establish near optimality of such limit Q functions under explicit filter stability conditions. We present explicit error bounds relating the approximation error to the length of the finite history window. We establish the convergence of such Q learning iteratio...
-
作者:Graf, Lukas; Harks, Tobias
作者单位:University of Passau
摘要:We consider flows over time within the deterministic queueing model of Vickrey and study the solution concept of instantaneous dynamic equilibrium (IDE), in which flow particles select at every decision point a currently shortest path. The length of such a path is measured by the physical travel time plus the time spent in queues. Although IDE has been studied since the eighties, the worst-case efficiency of the solution concept with respect to the travel time is not well understood. We study ...
-
作者:Zhou, Zhou
作者单位:University of Sydney
摘要:We consider mean-field contribution games, where players in a team choose some effort levels at each time period and the aggregate reward for the team depends on the aggregate cumulative performance of all the players. Each player aims to maximize the expected reward of her own share subject to her cost of effort. To reduce the free-rider issue, we propose some relative performance criteria (RPC), based on which the reward is redistributed to each player. We are interested in those RPCs that i...
-
作者:Klimm, Max; Pfetsch, Marc E.; Raber, Rico; Skutella, Martin
作者单位:Technical University of Berlin; Technical University of Darmstadt
摘要:We consider potential-based flow networks with terminal nodes at which flow can enter or leave the network and physical properties, such as voltages or pressures, are measured and controlled. We study conditions under which such a network can be reduced to a smaller, equivalent network with the same behavior at the terminal nodes. Potential-based flow networks are widely used to model infrastructure networks, such as electricity, gas, or water networks. In contrast to Kron's reduction for elec...
-
作者:Lee, Jon; Paat, Joseph; Stallknecht, Ingo; Xu, Luze
作者单位:University of Michigan System; University of Michigan; University of British Columbia
摘要:We study integer-valued matrices with bounded determinants. Such matrices appear in the theory of integer programs (IPs) with bounded determinants. For example, an IP can be solved in strongly polynomial time if the constraint matrix is bimodular: that is, the determinants are bounded in absolute value by two. Determinants are also used to bound the euro1 distance between IP solutions and solutions of its linear relaxation. One of the first to quantify the complexity of IPs with bounded determ...
-
作者:Garnier, Guillaume; Ziliotto, Bruno
作者单位:Universite Paris Cite; Inria; Sorbonne Universite; Universite PSL; Universite Paris-Dauphine; Centre National de la Recherche Scientifique (CNRS)
摘要:This paper introduces a discrete-time stochastic game class on Zd, which plays the role of a toy model for the well-known problem of stochastic homogenization of Hamilton-Jacobi equations. Conditions are provided under which the n-stage game value converges as n tends to infinity, and connections with homogenization theory are discussed.
-
作者:Bartl, Daniel; Tangpi, Ludovic
作者单位:University of Vienna; Princeton University
摘要:Let rho be a general law-invariant convex risk measure, for instance, the average value at risk, and let X be a financial loss, that is, a real random variable. In practice, either the true distribution mu of X is unknown, or the numerical computation of rho(mu) is not possible. In both cases, either relying on historical data or using a Monte Carlo approach, one can resort to an independent and identically distributed sample of mu to approximate rho(mu) by the finite sample estimator rho(mu N...