-
Authors: Wen, Min; Topcu, Ufuk
Affiliations: University of Pennsylvania; University of Texas System; University of Texas Austin
Abstract: Reinforcement learning (RL) algorithms have been used to learn how to implement tasks in uncertain and partially unknown environments. In practice, environments are usually uncontrolled and may affect task performance in an adversarial way. In this article, we model the interaction between an RL agent and its potentially adversarial environment as a turn-based zero-sum stochastic game. The task requirements are represented both qualitatively as a subset of linear temporal logic (LTL) specifica...
-
Authors: Bistritz, Ilai; Heydaribeni, Nasimeh; Anastasopoulos, Achilleas
Affiliations: Stanford University; University of Michigan System; University of Michigan
Abstract: We consider an environment where players need to decide whether to buy a certain product (or adopt a technology) or not. The product is either good or bad, but its true value is unknown to the players. Instead, each player has her own private information on its quality. Each player can observe the previous actions of other players and estimate the quality of the product. A classic result in the literature shows that in similar settings, informational cascades occur, where learning stops for th...
-
Authors: Rigo, Damiano; Segala, Chiara; Sansonetto, Nicola; Muradore, Riccardo
Affiliations: University of Verona; RWTH Aachen University
Abstract: Attitude estimation is a core problem in many mobile robotic systems, such as unmanned aerial and ground vehicles. The configuration space of these systems is properly modeled by exploiting the theory of Lie groups. In this article, we propose a second-order-optimal minimum-energy filter on the matrix Lie group TSE(2), the tangent bundle of the special Euclidean group SE(2), where the optimality is with respect to a cost function in the unknown input and output error measurements. In this arti...
-
Authors: Li, Huaqing; Wu, Xiangzhao; Wang, Zheng; Huang, Tingwen
Affiliations: Southwest University - China; University of New South Wales Sydney; Qatar Foundation (QF); Texas A&M University Qatar
Abstract: This article considers the distributed structured optimization problem of collaboratively minimizing the global objective function composed of the sum of local cost functions. Each local objective function involves a Lipschitz-differentiable convex function, a nonsmooth convex function, and a linear composite nonsmooth convex function. For such problems, we derive the synchronous distributed primal-dual splitting (S-DPDS) algorithm with uncoordinated stepsizes. Meanwhile, we develop the asynch...
-
Authors: Qin, Jiahu; Ma, Qichao; Yi, Peng; Wang, Long
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; Tongji University; Tongji University; Peking University
Abstract: In this article, we investigate the interval consensus for a network of agents with flocking dynamics, i.e., second-order multiagent systems, where each agent imposes an interval constraint on its preferred consensus values, with the aim of driving the agent into a favorable interval. Specifically, we work on two different frameworks of interval constraints: in the first, the node states are constrained in their own constraint intervals, and in the second, the node states are co...
-
Authors: Shi, Mingming
Affiliations: Universite Catholique Louvain
Abstract: This article focuses on the distributed estimation of time-varying bias signals in relative state measurements of sensors, where each sensor measures the relative state of neighboring sensors and the measurements contain a time-varying bias signal that is generated by a linear exosystem. Assuming that sensors can communicate with one another, we propose several distributed bias estimators, by which each sensor can reconstruct its own bias signal and the exosystem's state using local information.
-
Authors: Sassano, Mario; Mylvaganam, Thulasi; Astolfi, Alessandro
Affiliations: University of Rome Tor Vergata; Imperial College London; Imperial College London
Abstract: We consider optimal control problems for continuous-time systems with time-dependent dynamics, in which the time-dependence arises from the presence of a known exogenous signal. The problem has been elegantly solved in the case of linear input-affine systems, for which it has been shown that the solution has a remarkable structure: it is given by the sum of two contributions, a state feedback, which coincides with the unperturbed optimal control law, and a purely feedforward term in charge of ...
-
Authors: Tognazzi, Stefano; Tribastone, Mirco; Tschaikowski, Max; Vandin, Andrea
Affiliations: University of Konstanz; IMT School for Advanced Studies Lucca; Aalborg University; Scuola Superiore Sant'Anna; Technical University of Denmark
Abstract: Differential-algebraic equations (DAEs) are a widespread dynamical model that describes continuously evolving quantities defined with differential equations, subject to constraints expressed through algebraic relationships. As such, DAEs arise in many fields, including physics, chemistry, and engineering. In this article, we focus on linear DAEs, and develop a theory for their minimization up to an equivalence relation. We present differential equivalence, which relates DAE variables that ha...
-
Authors: Wang, Ran; Parunandi, Karthikeya S.; Yu, Dan; Kalathil, Dileep; Chakravorty, Suman
Affiliations: Texas A&M University System; Texas A&M University College Station; Nanjing University of Aeronautics & Astronautics; Texas A&M University System; Texas A&M University College Station
Abstract: This article addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system. This problem is subject to the curse of dimensionality associated with the dynamic programming method. This article proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled open-loop/closed-loop approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical ...
-
Authors: Cao, Xuanyu; Basar, Tamer
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: In online decision making, feedback delays often arise due to the latency caused by computation and communication in practical systems. In this article, we study decentralized online convex optimization over a multiagent network in the presence of feedback delays. Each agent is associated with a time-varying local loss function, which is revealed to the agent sequentially with delays. The goal of every agent is to minimize the accumulated total loss function (the sum of the local loss function...