-
作者:Wang, Jiaqi; Xie, Weijun; Ryzhov, Ilya O.
作者单位:University System of Georgia; Georgia Institute of Technology; University System of Maryland; University of Maryland College Park
摘要:D-optimal experimental design is a classical statistical problem in which one chooses a collection of data vectors, from some available large pool, in order to maximize a measure of predictive quality. In the classical formulation, the only constraint is on the cardinality of the collection, that is, the number of vectors chosen. We study a more general budget-constrained variant in which vectors have heterogeneous costs, and develop four new algorithms (two deterministic and two randomized) w...
-
作者:Zhang, Xilin; Cheung, Wang Chi
作者单位:National University of Singapore
摘要:We study a general reusable resource allocation model under both model uncertainty and nonstationarity. Our study involves a set of heterogeneous customers who arrive sequentially at the decision maker's (DM's) platform, each associated with a different customer type. Each arriving customer's type is drawn from an unknown and time-varying probability distribution. Upon observing the customer's type, the DM selects an allocation decision that generates a random amount of reward and occupies ran...
-
作者:Bayraktar, Erhan; Han, Bingyan
作者单位:University of Michigan System; University of Michigan; Hong Kong University of Science & Technology (Guangzhou)
摘要:Given two probability measures on sequential data, we investigate the transport problem with time-inconsistent preferences in a discrete-time setting. Motivating examples are nonlinear objectives, state-dependent costs, and regularized optimal transport with general f-divergence. Under the bicausal constraint, we introduce the concept of equilibrium transport. Existence is proved in the semidiscrete Markovian case and the continuous non-Markovian case with strict quasiconvexity, whereas unique...
-
作者:Zhong, Xianghui
作者单位:University of Bonn
摘要:One way to speed up the calculation of optimal traveling salesman problem tours in practice is eliminating edges that are certainly not in the optimal tour as a preprocessing step. In order to do so, several edge elimination approaches have been proposed in the past. In this work, we investigate two of them in the scenario where the input consists of n independently distributed random points in the two-dimensional unit square with density function bounded from above and below by arbitrary posi...
-
作者:Gao, Xuefeng; Zhou, Xunyu
作者单位:Chinese University of Hong Kong; Columbia University
摘要:We study reinforcement learning for continuous-time Markov decision processes (MDPs) in the finite-horizon episodic setting. In contrast to discrete-time MDPs, the intertransition times of a continuous-time MDP are exponentially distributed with rate parameters depending on the state-action pair at each transition. We present a learning algorithm based on the methods of value iteration and upper confidence bound. We derive an upper bound on the worst case expected regret for the proposed algor...
-
作者:Wang, Wenyuan; Xu, Ran; Van, Kaixin
作者单位:Fujian Normal University; Fujian Normal University; Fujian Normal University; Fujian Normal University; Fujian Normal University; Xiamen University; Xi'an Jiaotong-Liverpool University
摘要:In this paper, we investigate the optimal dividend problem with capital injection and ratcheting constraint with nondecreasing dividend payout rate. Capital injections are introduced in order to eliminate the possibility of bankruptcy. Under the Cramer-Lundberg risk model, the problem is formulated as a two-dimensional stochastic control problem. By applying the viscosity theory, we show that the value function is the unique viscosity solution to the associated Hamilton-Jacobi-Bellman equation...
-
作者:Mohammadi, Ashkan; Sarabi, Ebrahim
作者单位:Georgetown University; University System of Ohio; Miami University
摘要:This paper is devoted to the study of the second-order variational analysis of spectral functions. It is well-known that spectral functions can be expressed as a composite function of symmetric functions and eigenvalue functions. We establish several secondorder properties of spectral functions when their associated symmetric functions enjoy these properties. Our main attention is given to characterize parabolic regularity for this class of functions. It was observed recently that parabolic re...
-
作者:Zhao, Jingyang; Xiao, Mingyu
作者单位:University of Electronic Science & Technology of China
摘要:The traveling tournament problem (TTP) is a hard but interesting sports scheduling problem inspired by Major League Baseball, which is to design a double round-robin schedule such that each pair of teams plays one game in each other's home venue, minimizing the total distance traveled by all n teams (n is even). In this paper, we consider TTP-2 (i.e., TTP under the constraint that at most two consecutive home games or away games are allowed for each team). In this paper, we propose practical a...
-
作者:Cardoso, Adrian Rivera; Wang, He; Xu, Huan
作者单位:University System of Georgia; Georgia Institute of Technology; Alibaba Group
摘要:We study the online saddle point problem, an online learning problem where at each iteration, a pair of actions needs to be chosen without knowledge of the current and future (convex-concave) payoff functions. The objective is to minimize the gap between the cumulative payoffs and the saddle point value of the aggregate payoff function, which we measure using a metric called saddle point regret (SP-Regret). The problem generalizes the online convex optimization framework, but here, we must ens...
-
作者:Delarue, Francois; Vasileiadis, Athanasios
作者单位:Universite Cote d'Azur; Centre National de la Recherche Scientifique (CNRS)
摘要:The goal of this paper is to demonstrate that common noise may serve as an exploration noise for learning the solution of a mean field game. This concept is here exemplified through a toy linear-quadratic model, for which a suitable form of common noise has already been proven to restore existence and uniqueness. We here go one step further and prove that the same form of common noise may force the convergence of the learning algorithm called fictitious play, and this without any further poten...