-
Authors: Korpeoglu, Ersin; Korpeoglu, C. Gizem; Hafalir, Isa Emin
Affiliations: University of London; University College London; Eindhoven University of Technology; University of Technology Sydney
Abstract: We study multiple parallel contests in which contest organizers elicit solutions to innovation-related problems from a set of solvers. Each solver may participate in multiple contests and exert effort to improve the solution for each contest the solver enters, but the quality of the solver's solution in each contest also depends on an output uncertainty. We first analyze whether an organizer's profit can be improved by discouraging solvers from participating in multiple contests. We show, ...
-
Authors: Kallus, Nathan; Uehara, Masatoshi
Affiliations: Cornell University
Abstract: Off-policy evaluation (OPE) in reinforcement learning is notoriously difficult in long- and infinite-horizon settings due to diminishing overlap between behavior and target policies. In this paper, we study the role of Markovian and time-invariant structure in efficient OPE. We first derive the efficiency bounds and efficient influence functions for OPE when one assumes each of these structures. This precisely characterizes the curse of horizon: in time-variant processes, OPE is only feasible ...
-
Authors: Qu, Guannan; Wierman, Adam; Li, Na
Affiliations: Carnegie Mellon University; California Institute of Technology; Harvard University
Abstract: We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental challenge in this setting is that the state-action space size scales exponentially in the number of agents, rendering the problem intractable for large networks. In this paper, we propose a scalable actor critic (SAC) framework that exploits the networ...
-
Authors: Chang, Yanling; Keblis, Matthew F.; Li, Ran; Iakovou, Eleftherios; White, Chelsea C.
Affiliations: Texas A&M University System; Texas A&M University College Station; University System of Georgia; Georgia Institute of Technology
Abstract: Advanced information technology has changed the landscape of modern warfare, as it facilitates communication among nonconventional actors such as violent extremist groups. This paper examines the value of misinformation and disinformation to a military leader who through investment in people, programs, and technology is able to affect the accuracy of information communicated between other actors. We model the problem as a partially observable stochastic game with three agents, a leader and two...
-
Authors: Li, Kai; Liu, Jun
Affiliations: Macquarie University; Southwestern University of Finance & Economics - China; University of California System; University of California San Diego
Abstract: We explicitly solve for the optimal dynamic trading strategy between a riskless asset and a risky asset with momentum. The optimal portfolio weight depends not only on the momentum, as in Merton's framework, but also on the historical price path; this contrasts with Merton. Because of their path dependence, optimal portfolio weights have a wide distribution for a given level of momentum; for example, investors may short the risky asset if it has rebound price paths but leverage if it has hum...
-
Authors: Zhu, Nan; Bauer, Daniel
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of Wisconsin System; University of Wisconsin Madison
Abstract: This paper presents and applies models for the valuation and management of mortality-contingent exposures. Such exposures include insurance and pension benefits, as well as novel mortality-linked securities traded in financial markets. Unlike conventional approaches to modeling mortality, we consider the stochastic evolution of mortality projections rather than realized mortality rates. Relying on a time series of age-specific mortality forecasts, we develop a set of stochastic models that-unl...
-
Authors: Chen, Louis; Ma, Will; Natarajan, Karthik; Simchi-Levi, David; Yan, Zhenzhen
Affiliations: United States Department of Defense; United States Navy; Naval Postgraduate School; Columbia University; Singapore University of Technology & Design; Massachusetts Institute of Technology (MIT); Nanyang Technological University
Abstract: In this paper, we study linear and discrete optimization problems in which the objective coefficients are random, and the goal is to evaluate a robust bound on the expected optimal value, where the set of admissible joint distributions is assumed to be specified only up to the marginals. We study a primal-dual formulation for this problem, and in the process, unify existing results with new results. We establish NP-hardness of computing the bound for general polytopes and identify two sufficie...
-
Authors: Zhang, Hao
Affiliations: University of British Columbia
Abstract: This paper presents a new methodology to solve a general model of dynamic decision making with a continuous unknown parameter or state. The methodology centers on the continuation-value functions (mappings from the parameter space to the continuation-value space), created by feasible continuation policies. When the model primitives can be described through a family of basis functions (e.g., polynomials), a continuation-value function retains that property and can be represented by a basis weig...
-
Authors: Suen, Sze-chuan; Negoescu, Diana; Goh, Joel
Affiliations: University of Southern California; University of Minnesota System; University of Minnesota Twin Cities; National University of Singapore; Harvard University
Abstract: Premature cessation of antibiotic therapy (nonadherence) is common in long treatment regimens and can severely compromise health outcomes. In this work, we investigate the problem of designing a schedule of incentive payments to induce socially optimal treatment adherence levels in a setting in which treatment adherence can be observed (e.g., through directly observed therapy for tuberculosis), but patient preferences for treatment adherence are heterogeneous and unobservable to a health provi...
-
Authors: Peeters, Yannik; den Boer, Arnoud V.; Mandjes, Michel
Affiliations: University of Amsterdam
Abstract: We consider assortment optimization over a continuous spectrum of products represented by the unit interval, where the seller's problem consists of determining the optimal subset of products to offer to potential customers. To describe the relation between assortment and customer choice, we propose a probabilistic choice model that forms the continuous counterpart of the widely studied discrete multinomial logit model. We consider the seller's problem under incomplete information, propose a st...