-
作者:Russo, Daniel; Van Roy, Benjamin
作者单位:Columbia University; Stanford University; Stanford University
摘要:Much of the recent literature on bandit learning focuses on algorithms that aim to converge on an optimal action. One shortcoming is that this orientation does not account for time sensitivity, which can play a crucial role when learning an optimal action requires much more information than near-optimal ones. Indeed, popular approaches, such as upper-confidence-bound methods and Thompson sampling, can fare poorly in such situations. We consider instead learning a satisficing action, which is n...
-
作者:Gradwohl, Ronen; Hahn, Niklas; Hoefer, Martin; Smorodinsky, Rann
作者单位:Ariel University; Goethe University Frankfurt; Technion Israel Institute of Technology
摘要:The Bayesian persuasion paradigm of strategic communication models interaction between a privately informed sender and an ignorant but rational receiver. The goal is typically to design a (near-)optimal communication (or signaling) scheme for the sender. It enables the sender to disclose information to the receiver in a way as to incentivize her to take an action that is preferred by the sender. Finding the optimal signaling scheme is known to be computationally difficult in general. This hard...
-
作者:Necoara, Ion; Fercoq, Olivier
作者单位:National University of Science & Technology POLITEHNICA Bucharest; Romanian Academy; Institute of Mathematical Statistics & Applied Mathematics of Romanian Academy; IMT - Institut Mines-Telecom; Institut Polytechnique de Paris; Telecom Paris
摘要:This paper deals with constrained convex problems, where the objective function is smooth and strongly convex, and the feasible set is described by a large number of closed convex (possibly nonpolyhedral) sets. In order to deal efficiently with the complicated constraints, we consider a dual formulation of this problem. We prove that the corresponding dual function satisfies a quadratic growth property on any sublevel set, provided that the primal objective function is smooth and strongly conv...
-
作者:Mueller, David; Nesterov, Yurii; Shikhman, Vladimir
作者单位:Technische Universitat Chemnitz; Universite Catholique Louvain
摘要:We derive new prox-functions on the simplex from additive random utility models of discrete choice. They are convex conjugates of the corresponding surplus functions. In particular, we explicitly derive the convexity parameter of discrete choice prox-functions associated with generalized extreme value models, and specifically with generalized nested logit models. Incorporated into subgradient schemes, discrete choice prox-functions lead to a probabilistic interpretations of the iteration steps...
-
作者:Aid, Rene; Possamai, Dylan; Touzi, Nizar
作者单位:Universite PSL; Universite Paris-Dauphine; Swiss Federal Institutes of Technology Domain; ETH Zurich; Institut Polytechnique de Paris; Ecole Polytechnique; ENSTA Paris
摘要:Demand response programs in retail electricity markets are very popular. However, despite their success in reducing average consumption, the random responsiveness of consumers to price events makes their efficiency questionable to achieve the flexibility needed for electric systems with a large share of renewable energy. This paper aims at designing demand response contracts that allow to act on both the average consumption and its variance. The interaction between a risk-averse producer and a...
-
作者:Lykouris, Thodoris; Sridharan, Karthik; Tardos, Eva
作者单位:Massachusetts Institute of Technology (MIT); Cornell University
摘要:We consider the problem of adversarial (nonstochastic) online learning with partial-information feedback, in which, at each round, a decision maker selects an action from a finite set of alternatives. We develop a black-box approach for such problems in which the learner observes as feedback only losses of a subset of the actions that includes the selected action. When losses of actions are nonnegative, under the graph-based feedback model introduced by Mannor and Shamir, we offer algorithms t...
-
作者:Shioura, Akiyoshi
作者单位:Institute of Science Tokyo; Tokyo Institute of Technology
摘要:In this paper, we consider a problem of minimizing an M-convex function under an L1-distance constraint (MML1); the constraint is given by an upper bound for L1-distance between a feasible solution and a given center. This is motivated by a nonlinear integer programming problem for reallocation of dock capacity in a bike-sharing system discussed by Freund et al. (2017). The main aim of this paper is to better understand the combinatorial structure of the dock reallocation problem through the c...
-
作者:Schol, Dennis; Vlasiou, Maria; Zwart, Bert
作者单位:Eindhoven University of Technology; University of Twente
摘要:In this paper, we study an N server fork-join queue with nearly deterministic arrival and service times. Specifically, we present a fluid limit for the maximum queue length as N -> infinity. This fluid limit depends on the initial number of tasks. In order to prove these results, we develop extreme value theory and diffusion approximations for the queue lengths.
-
作者:Chatterjee, Krishnendu; Saona, Raimundo; Ziliotto, Bruno
作者单位:Institute of Science & Technology - Austria; Universite PSL; Universite Paris-Dauphine; Centre National de la Recherche Scientifique (CNRS)
摘要:Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies notably that approximating the long-run value is recursively enumerable, as well as a weak continuity property of the value with respect to the transition function.
-
作者:Campi, Luciano; Fischer, Markus
作者单位:University of Milan; University of Padua
摘要:In the context of simple finite-state discrete time systems, we introduce a generalization of a mean field game solution, called a correlated solution, which can be seen as the mean field game analogue of a correlated equilibrium. Our notion of a solution is justified in two ways: we prove that correlated solutions arise as limits of exchangeable correlated equilibria in restricted (Markov open-loop) strategies for the underlying N-player games, and we show how to construct approximate N-playe...