-
Authors: Rizvi, Syed Ali Asad; Lin, Zongli
Affiliations: Tennessee Technological University; University of Virginia
Abstract: This note presents an analysis of the state parameterizations used in output feedback reinforcement learning (RL) control. Output feedback algorithms based on state parameterization involve additional conditions on the state parameterization beyond the standard conditions on the system matrices for their convergence to the optimal solution. It is shown that the state parameterization matrix needs to be of full row rank to guarantee the convergence of the output feedback RL algorithms. We prese...
-
Authors: Geng, Li-Hui; Wills, Adrian George; Ninness, Brett; Schon, Thomas Bo
Affiliations: Tianjin University of Technology & Education; University of Newcastle; University of Newcastle; Uppsala University
Abstract: This article addresses the problem of computing fixed-interval smoothed state estimates of a linear time-varying Gaussian stochastic system. There already exist many algorithms that perform this computation, but all of them impose certain restrictions on system matrices in order for them to be applicable, and the restrictions vary considerably between the various existing algorithms. This article establishes a new sufficient condition for the fixed-interval smoothing density to exist in a Gaus...
-
Authors: Khosravi, M.; Smith, R. S.
Affiliations: Delft University of Technology; Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: In this article, we consider the problem of system identification when side-information is available on the steady-state gain (SSG) of the system. We formulate a general nonparametric identification method as an infinite-dimensional constrained convex program over the reproducing kernel Hilbert space (RKHS) of stable impulse responses. The objective function of this optimization problem is the empirical loss regularized with the norm of RKHS, and the constraint is considered for enforcing the ...
-
Authors: Liu, Pin; Lu, Kaihong; Xiao, Feng; Wei, Bo; Zheng, Yuanshi
Affiliations: North China Electric Power University; North China Electric Power University; Shandong University of Science & Technology; North China Electric Power University; Xidian University
Abstract: This article proposes a new aggregative game (AG) model with feedback delays. The strategies of players are selected from given strategy sets and subject to global nonlinear inequality constraints. Both cost functions and constrained functions of players are time varying, which reflects the changing nature of environments. At each time, each player only has access to its strategy set information, and the information of its current cost function and current constrained function is unknown. Due ...
-
Authors: Zhang, Hao; Yan, Huaicheng; Qiu, Jianbin
Affiliations: Harbin Institute of Technology; East China University of Science & Technology
Abstract: This article addresses the finite-horizon H-infinity consensus problem for general discrete time-varying multiagent systems with energy-bounded external disturbances and limited network bandwidth resource. A dynamic event-based learning approach is proposed to produce a dynamic discrete-time event-triggered scheme and a dynamic distributed H-infinity consensus protocol. Based on the local state information, the dynamic distributed H-infinity consensus protocol is developed to guarantee the H-i...
-
Authors: Cosentino, Francesco; Oberhauser, Harald; Abate, Alessandro
Affiliations: University of Oxford; Alan Turing Institute; University of Oxford
Abstract: This article concerns continuous-time, continuous-space stochastic dynamical systems described by stochastic differential equations (SDE). It presents a new approach to compute probabilistic safety regions, namely sets of initial conditions of the SDE associated to trajectories that are safe with a probability larger than a given threshold. The approach introduces a functional that is minimized at the border of the probabilistic safety region, then solves an optimization problem using techniqu...
-
Authors: Yin, Hao; Jayawardhana, Bayu; Trenn, Stephan
Affiliations: University of Groningen
Abstract: This article studies contraction analysis of switched systems that are composed of a mixture of contracting and noncontracting modes. The first result pertains to the equivalence of the contraction of a switched system and the uniform global exponential stability of its variational system. Based on this equivalence property, sufficient conditions for a mode-dependent average dwell/leave-time based switching law to be contractive are established. Correspondingly, linear matrix inequality (LMI) ...
-
Authors: Mcallister, Robert D.; Rawlings, James B.
Affiliations: University of California System; University of California Santa Barbara
Abstract: In this work, we establish and compare the stochastic and deterministic robustness properties achieved by nominal model predictive control (MPC), stochastic MPC (SMPC), and a proposed constraint tightened MPC (CMPC) formulation, which represents an idealized version of tube-based MPC. We consider three definitions of robustness for nonlinear systems and bounded disturbances: robustly asymptotically stable (RAS), robustly asymptotically stable in expectation (RASiE), and RASiE with respect to t...
-
Authors: Burohman, Azka Muji; Besselink, Bart; Scherpen, Jacquelien M. A.; Camlibel, M. Kanat
Affiliations: University of Groningen; University of Groningen; University of Groningen
Abstract: This article proposes a data-driven model reduction approach on the basis of noisy data with a known noise model. First, the concept of data reduction is introduced. In particular, we show that the set of reduced-order models obtained by applying a Petrov-Galerkin projection to all systems explaining the data characterized in a large-dimensional quadratic matrix inequality (QMI) can again be characterized in a lower-dimensional QMI. Next, we develop a data-driven generalized balanced truncati...
-
Authors: Hassan, Syeda Sakira; Sarkka, Simo
Affiliations: Aalto University
Abstract: In this article, we propose a novel computational method for solving nonlinear optimal control problems. The method is based on the use of Fourier-Hermite series for approximating the action-value function arising in dynamic programming instead of the conventional Taylor-series expansion used in differential dynamic programming. The coefficients of the Fourier-Hermite series can be numerically computed by using sigma-point methods, which leads to a novel class of sigma-point-based dynamic prog...