-
Authors: Muhle-Karbe, Johannes; Shi, Xiaofei; Yang, Chen
Affiliations: Imperial College London; University of Toronto; Columbia University; Chinese University of Hong Kong
Abstract: We study a risk-sharing economy where an arbitrary number of heterogeneous agents trade an arbitrary number of risky assets subject to quadratic transaction costs. For linear state dynamics, the forward-backward stochastic differential equations characterizing equilibrium asset prices and trading strategies in this context reduce to a coupled system of matrix-valued Riccati equations. We prove the existence of a unique global solution and provide explicit asymptotic expansions that allow us t...
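(Illustrative sketch, not from the paper: the abstract does not reproduce the equations, so the coupled system of matrix-valued Riccati equations is shown here only in its typical schematic shape; the coefficient matrices A, B, C, D and the terminal data F_T, G_T are placeholders.)

\begin{align*}
  \dot F(t) &= F(t)\,A\,F(t) + B^{\top} F(t) + F(t)\,B + C, & F(T) &= F_T, \\
  \dot G(t) &= G(t)\,A\,F(t) + B^{\top} G(t) + D, & G(T) &= G_T.
\end{align*}

The equation for G is coupled to F through the product term G A F; global existence and uniqueness for such coupled systems is the kind of result the abstract refers to.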
-
Authors: Pichler, Alois; Schlotter, Ruben
Affiliations: Technische Universität Chemnitz
Abstract: This paper extends dynamic control problems from a risk-neutral to a risk-averse setting. We establish a limit for consecutive risk-averse decision making by consistently nesting coherent risk measures. This approach provides a new perspective on multistage optimal control problems in continuous time. For the limiting case, we derive a new, risk-averse dynamic programming principle and give risk-averse Hamilton-Jacobi-Bellman equations by generalizing the infinitesi...
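(Illustrative sketch, not from the paper: the time grid, the generic coherent risk measure \rho, and the correction term below are assumptions chosen only to show the shape of the construction.) Over a partition 0 = t_0 < t_1 < \dots < t_n = T, conditional risk measures are nested recursively,

\[
  \rho_{0,T}(Z) \;=\; \rho_{t_0,t_1}\Bigl( \rho_{t_1,t_2}\bigl( \cdots \rho_{t_{n-1},t_n}(Z) \cdots \bigr) \Bigr),
\]

and refining the partition yields a risk-averse dynamic programming principle. Schematically, for a controlled diffusion with drift b, diffusion \sigma, running cost c, and value function V, the limiting equation augments the classical HJB equation by a first-order nonlinear term, e.g.

\[
  \partial_t V + \inf_{u}\bigl\{ b(x,u)^{\top}\nabla V + c(x,u) \bigr\}
  + \tfrac12 \operatorname{tr}\!\bigl( \sigma\sigma^{\top} \nabla^2 V \bigr)
  + \beta \bigl\lvert \sigma^{\top}\nabla V \bigr\rvert = 0,
\]

where \beta \ge 0 encodes the degree of risk aversion.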
-
Authors: Hu, Mingshang; Ji, Shaolin; Xue, Xiaole
Affiliations: Shandong University; Shandong University
Abstract: In this paper, we propose a general modeling framework for optimal control of stochastic fully coupled forward-backward linear-quadratic (FBLQ) problems with indefinite control weight costs, which stem from rational expectations models. We develop a new decoupling technique to obtain the optimal feedback control, which is accompanied by a non-Riccati-type ordinary differential equation (ODE). By applying the completion-of-squares method, we prove the existence of the solutions for the ...
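(Illustrative sketch, not from the paper: the coefficients below are placeholders and the driver of the backward equation is left generic; the point is only to show what "fully coupled" and "indefinite control weight" mean here.) A fully coupled FBLQ problem pairs a forward state X with a backward pair (Y, Z),

\begin{align*}
  dX_t &= \bigl( A X_t + B u_t + C Y_t \bigr)\,dt + \bigl( \bar A X_t + \bar B u_t \bigr)\,dW_t, \\
  dY_t &= -\,g(t, X_t, Y_t, Z_t, u_t)\,dt + Z_t\,dW_t, \qquad Y_T = \Phi X_T,
\end{align*}

with quadratic cost J(u) = \mathbb{E}\int_0^T \bigl( X_t^{\top} Q X_t + u_t^{\top} R u_t \bigr)\,dt + \mathbb{E}\bigl[ X_T^{\top} G X_T \bigr], where the control weight R need not be positive semidefinite. Because the forward dynamics depend on Y, a linear feedback ansatz such as Y_t = P(t) X_t no longer produces a standard Riccati equation for P, which is consistent with the non-Riccati-type ODE mentioned above.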
-
Authors: Walton, Neil; Denisov, Denis
Affiliations: Durham University
Abstract: We consider a policy gradient algorithm applied to a finite-arm bandit problem with Bernoulli rewards. We allow the learning rates to depend on the current state of the algorithm rather than using a deterministic, time-decreasing learning rate. The state of the algorithm forms a Markov chain on the probability simplex. We apply Foster-Lyapunov techniques to analyze the stability of this Markov chain. We prove that, if the learning rates are well chosen, then the policy gradient algorithm is a transient...
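(Minimal sketch under stated assumptions: the arm means, the step-size rule lr = 0.1 * (1 - max(p)), and the horizon are invented for illustration; the abstract does not specify the paper's learning-rate schedules.) A softmax policy over Bernoulli arms is updated by a REINFORCE-style gradient step whose size depends on the current point of the simplex rather than on the iteration count:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Bernoulli success probabilities of the two arms.
arm_means = np.array([0.3, 0.7])

# Softmax policy logits, one per arm; the induced probability vector
# lives on the simplex and is the "state" of the algorithm.
theta = np.zeros(len(arm_means))

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

for t in range(10_000):
    p = softmax(theta)
    arm = rng.choice(len(arm_means), p=p)
    reward = float(rng.random() < arm_means[arm])  # Bernoulli reward

    # State-dependent step size (one illustrative choice): shrink the
    # step as the policy concentrates, instead of decaying with t.
    lr = 0.1 * (1.0 - p.max())

    # REINFORCE update: grad log pi(arm) = e_arm - p for a softmax policy.
    grad_log_pi = -p
    grad_log_pi[arm] += 1.0
    theta += lr * reward * grad_log_pi

print("final policy:", softmax(theta))  # mass should concentrate on arm 1

Because the step size is a function of the current policy vector p alone, the sequence of policies forms a time-homogeneous Markov chain on the simplex, which is the object the Foster-Lyapunov analysis above studies.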