-
Authors: Kuper, Armin; Waldherr, Steffen
Affiliations: KU Leuven; University of Vienna
Abstract: We present a novel Kalman filter (KF) for spatiotemporal systems called the numerical Gaussian process Kalman filter (NGPKF). Numerical Gaussian processes have recently been introduced as a physics-informed machine-learning method for simulating time-dependent partial differential equations without the need for spatial discretization, while also providing uncertainty quantification of the simulation resulting from noisy initial data. We formulate numerical Gaussian processes as linear Gaussian ...
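As a rough illustration of the filtering machinery the abstract refers to, the sketch below shows one standard linear Kalman predict/update cycle, assuming a propagator matrix A and process-noise covariance Q_gp have already been extracted from a numerical Gaussian process (both are hypothetical stand-ins here, not the paper's construction).

```python
# Minimal Kalman filter cycle, assuming a numerical-GP-derived linear propagator A and
# covariance Q_gp are given (hypothetical placeholders, not the paper's derivation).
import numpy as np

def kalman_step(x, P, y, A, Q_gp, H, R):
    """One predict/update cycle; x, P are the prior mean and covariance."""
    # Predict: propagate mean and covariance with the (assumed) linear propagator.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q_gp
    # Update: fuse the new spatial measurements y through the Kalman gain K.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_post = x_pred + K @ (y - H @ x_pred)
    P_post = (np.eye(len(x)) - K @ H) @ P_pred
    return x_post, P_post
```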
-
Authors: Lopez, Victor G.; Alsalti, Mohammad; Mueller, Matthias A.
Affiliations: Leibniz University Hannover
Abstract: This article introduces and analyzes an improved Q-learning algorithm for discrete-time linear time-invariant systems. The proposed method does not require any knowledge of the system dynamics, and it enjoys significant efficiency advantages over other data-based optimal control methods in the literature. The algorithm can be executed fully offline, as it does not require applying the current estimate of the optimal input to the system, as in on-policy algorithms. It is shown that a persistently exciting (PE) input, ...
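For context, the sketch below is a classical off-policy, data-based Q-learning (policy iteration) loop for discrete-time LQR in the Bradtke style; it is only a generic instance of the idea, not the authors' improved algorithm, and assumes a batch of persistently exciting data and an initial stabilizing gain.

```python
# Hedged sketch of off-policy Q-learning for discrete-time LQR from a batch of data;
# classical Q-function policy iteration, not necessarily the paper's exact scheme.
import numpy as np

def q_learning_lqr(X, U, Xn, Qc, Rc, K0, iters=20):
    """X, U, Xn: columns are x_k, u_k, x_{k+1}; Qc, Rc: stage-cost weights; K0: stabilizing gain."""
    n, m = X.shape[0], U.shape[0]
    K = K0.copy()
    for _ in range(iters):
        # Least-squares system for vec(H) from the Q-function Bellman equation
        # (requires persistently exciting data so the regressor has full rank).
        rows, rhs = [], []
        for k in range(X.shape[1]):
            z = np.concatenate([X[:, k], U[:, k]])
            zn = np.concatenate([Xn[:, k], K @ Xn[:, k]])
            rows.append(np.kron(z, z) - np.kron(zn, zn))
            rhs.append(X[:, k] @ Qc @ X[:, k] + U[:, k] @ Rc @ U[:, k])
        theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        H = theta.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)  # the quadratic form only fixes the symmetric part
        # Policy improvement from the H-blocks: u = -H_uu^{-1} H_ux x.
        K = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H
```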
-
Authors: Tabuada, Paulo; Gharesifard, Bahman
Affiliations: University of California System; University of California Los Angeles
Abstract: In this article, we show that deep residual neural networks have the power of universal approximation by using, in an essential manner, the observation that these networks can be modeled as nonlinear control systems. We first study the problem of using a deep residual neural network to exactly memorize training data by formulating it as a controllability problem for an ensemble control system. Using techniques from geometric control theory, we identify a class of activation functions that allo...
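The core observation that a residual network is the forward-Euler discretization of a controlled ODE can be made concrete with a few lines of code; the layer widths, step size, and tanh activation below are illustrative choices, not the specific activation class identified in the paper.

```python
# Sketch: a deep residual network read as the forward-Euler discretization of the
# control system x' = sigma(W(t) x + b(t)), with weights (W_k, b_k) acting as controls.
import numpy as np

def resnet_as_control_system(x0, weights, biases, h=0.1):
    """Propagate the 'state' x through residual layers x_{k+1} = x_k + h * tanh(W_k x_k + b_k)."""
    x = x0
    for W, b in zip(weights, biases):
        x = x + h * np.tanh(W @ x + b)  # tanh stands in for the activation class in the paper
    return x

# Usage: random "controls" steering a 3-dimensional state through 5 residual layers.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((3, 3)), rng.standard_normal(3)) for _ in range(5)]
x_final = resnet_as_control_system(np.ones(3), *zip(*layers))
```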
-
Authors: Zhao, Feiran; You, Keyou; Basar, Tamer
Affiliations: Tsinghua University; Tsinghua University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach in reinforcement learning, there is little theoretical understanding of its performance. In this article, we focus on the risk-constrained linear quadratic regulator problem via the PO approach, which requires addressing a challenging nonconvex constrained optimization problem. ...
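To illustrate what "policy optimization on the LQR gain" means, the sketch below runs gradient descent on the feedback gain K using the standard closed-form LQR policy gradient; it assumes known (A, B) and only marks where a dual update for the paper's risk constraint would enter, so it is not the authors' primal-dual algorithm.

```python
# Hedged sketch of policy optimization over the LQR gain K (exact gradient via
# discrete Lyapunov equations); the risk constraint is indicated only by a comment.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_gradient(A, B, Q, R, K, Sigma0, step=1e-3, iters=500):
    for _ in range(iters):
        Acl = A - B @ K
        # Value matrix P and state covariance Sig of the current policy u = -Kx.
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        Sig = solve_discrete_lyapunov(Acl, Sigma0)
        grad = 2 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sig
        K = K - step * grad  # a dual ascent step on a risk multiplier would go here
    return K
```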
-
Authors: Mavridis, Christos N.; Baras, John S.
Affiliations: University System of Maryland; University of Maryland College Park; University System of Maryland; University of Maryland College Park
Abstract: In this work, we introduce a learning model designed for applications in which computational resources are limited and robustness and interpretability are prioritized. Learning problems can be formulated as constrained stochastic optimization problems, with the constraints originating mainly from model assumptions that define a tradeoff between complexity and performance. This tradeoff is closely related to overfitting, generalization capacity, and robustness to noise and adv...
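As a generic illustration of "learning as constrained stochastic optimization," the snippet below runs a projected stochastic-approximation (Robbins-Monro) iteration; the loss, sampler, and constraint set are illustrative stand-ins and do not reflect the specific model proposed in the paper.

```python
# Minimal projected stochastic-gradient sketch of constrained stochastic optimization;
# all problem data below are illustrative placeholders.
import numpy as np

def projected_sgd(grad_sample, project, theta0, steps=1000):
    """Stochastic approximation with decreasing step sizes and a projection step."""
    theta = theta0
    for t in range(1, steps + 1):
        theta = project(theta - (1.0 / t) * grad_sample(theta))  # Robbins-Monro 1/t steps
    return theta

# Usage: minimize E[(theta - x)^2] over the interval [-1, 1] from noisy samples x.
rng = np.random.default_rng(0)
theta_hat = projected_sgd(
    grad_sample=lambda th: 2 * (th - (0.3 + 0.1 * rng.standard_normal())),
    project=lambda th: th / max(1.0, abs(th)),
    theta0=5.0,
)
```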
-
Authors: Possieri, Corrado; Sassano, Mario
Affiliations: University of Rome Tor Vergata; Consiglio Nazionale delle Ricerche (CNR)
Abstract: Two data-driven strategies for value iteration in linear quadratic optimal control problems over an infinite horizon are proposed. The two architectures share common features: both consist of a purely continuous-time control architecture and are based on the forward integration of the differential Riccati equation (DRE). They differ profoundly, however, in how the vector field of the underlying DRE is estimated from collected data: the first relies on a characterization o...
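The forward integration of the DRE that both architectures build on can be sketched as follows; here the vector field uses known (A, B), whereas the paper's point is to estimate it from data, so this is only the model-based skeleton.

```python
# Hedged sketch of value iteration via forward integration of the differential Riccati
# equation dP/dt = A'P + PA - P B R^{-1} B' P + Q, starting from the zero value function.
import numpy as np

def dre_value_iteration(A, B, Q, R, dt=1e-3, steps=20000):
    n = A.shape[0]
    P = np.zeros((n, n))           # value iteration starts from P(0) = 0
    Rinv = np.linalg.inv(R)
    for _ in range(steps):
        dP = A.T @ P + P @ A - P @ B @ Rinv @ B.T @ P + Q
        P = P + dt * dP            # forward Euler step of the DRE
    return P, Rinv @ B.T @ P       # P approaches the ARE solution; second output is the gain
```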
-
Authors: Franca, Guilherme; Robinson, Daniel P.; Vidal, Rene
Affiliations: University of California System; University of California Berkeley; Lehigh University; Johns Hopkins University
Abstract: Recently, there has been great interest in connections between continuous-time dynamical systems and optimization methods, notably in the context of accelerated methods for smooth and unconstrained problems. In this article, we extend this perspective to nonsmooth and constrained problems by obtaining differential inclusions associated with novel accelerated variants of the alternating direction method of multipliers (ADMM). Through a Lyapunov analysis, we derive rates of convergence for these...
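For readers unfamiliar with accelerated ADMM, the sketch below shows a generic Nesterov-type momentum variant (in the style of Goldstein et al.) whose continuous-time limits are the kind of differential inclusions the paper studies; the prox-operator interface and step choices are illustrative, not the authors' variants.

```python
# Generic accelerated ADMM sketch for min_x f(x) + g(x) via the splitting x = z;
# prox_f/prox_g are user-supplied proximal maps (illustrative interface).
import numpy as np

def accelerated_admm(prox_f, prox_g, x0, rho=1.0, iters=200):
    x = x0.copy()
    z = x0.copy()
    z_hat = x0.copy()
    u = np.zeros_like(x0)
    u_hat = np.zeros_like(x0)
    t = 1.0
    for _ in range(iters):
        x = prox_f(z_hat - u_hat, rho)
        z_new = prox_g(x + u_hat, rho)
        u_new = u_hat + x - z_new
        # Nesterov-type extrapolation on the auxiliary and dual variables.
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z_hat = z_new + ((t - 1.0) / t_new) * (z_new - z)
        u_hat = u_new + ((t - 1.0) / t_new) * (u_new - u)
        z, u, t = z_new, u_new, t_new
    return x
```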
-
Authors: Yi, Xinlei; Li, Xiuxian; Yang, Tao; Xie, Lihua; Chai, Tianyou; Johansson, Karl Henrik
Affiliations: Royal Institute of Technology; Tongji University; Tongji University; Northeastern University - China; Nanyang Technological University
Abstract: This article considers the distributed online convex optimization problem with time-varying constraints over a network of agents. This is a sequential decision-making problem with two sequences of arbitrarily varying convex loss and constraint functions. At each round, each agent selects a decision from the decision set, and then only a portion of the loss function and a coordinate block of the constraint function at this round are privately revealed to this agent. The goal of the network is t...
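A schematic per-round update helps fix ideas: each agent mixes its decision with its neighbors, takes a projected primal step using its revealed loss and constraint information, and runs dual ascent on the observed violation. The step sizes, ball projection, and mixing matrix W below are illustrative, not the paper's algorithm or analysis.

```python
# Schematic distributed online primal-dual step; all tuning choices are illustrative.
import numpy as np

def distributed_online_step(x, lam, grads_f, g_vals, grads_g, W, radius=1.0,
                            alpha=0.1, beta=0.1):
    """x: (N, d) decisions, lam: (N,) dual variables, one row/entry per agent."""
    x_mix = W @ x                               # consensus averaging over the network
    x_new = np.empty_like(x)
    for i in range(x.shape[0]):
        step = grads_f[i] + lam[i] * grads_g[i]  # revealed loss + constraint block
        xi = x_mix[i] - alpha * step
        x_new[i] = xi * min(1.0, radius / (np.linalg.norm(xi) + 1e-12))  # project to a ball
    lam_new = np.maximum(0.0, lam + beta * g_vals)  # dual ascent on observed violations
    return x_new, lam_new
```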
-
Authors: Zegers, Federico M.; Sun, Runhan; Chowdhary, Girish; Dixon, Warren E.
Affiliations: State University System of Florida; University of Florida; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: This work explores the distributed state estimation problem for an uncertain, nonlinear, and continuous-time system. Given a sensor network, each agent is assigned a deep neural network (DNN) that is used to approximate the system's dynamics. Each agent updates the weights of its DNN through a multiple-timescale approach, i.e., the outer-layer weights are updated online with a Lyapunov-based gradient descent update law, and the inner-layer weights are updated concurrently using a supervised ...
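The two-timescale idea can be sketched as an Euler-discretized adaptive observer: the state estimate is driven by a DNN feature map whose outer-layer weights adapt online from the output error, while the inner-layer features are treated as fixed between slower offline updates. The gains, feature map, and update law below are schematic placeholders, not the paper's Lyapunov-based design.

```python
# Schematic single step of an adaptive observer with online outer-layer weight updates;
# gains, feature map phi, and the gradient-type law are illustrative assumptions.
import numpy as np

def adaptive_dnn_observer_step(xhat, W, y, C, phi, k_gain=5.0, gamma=1.0, dt=1e-3):
    """xhat: state estimate, W: outer-layer weights (features x states), y: measurement."""
    e_y = y - C @ xhat                                  # output estimation error
    xhat_dot = W.T @ phi(xhat) + k_gain * C.T @ e_y     # DNN feedforward + output injection
    W_dot = gamma * np.outer(phi(xhat), C.T @ e_y)      # gradient-type adaptation (schematic)
    return xhat + dt * xhat_dot, W + dt * W_dot
```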
-
Authors: Castellano, Agustin; Min, Hancheng; Bazerque, Juan Andres; Mallada, Enrique
Affiliations: Johns Hopkins University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: This article puts forward the concept that learning to take safe actions in unknown environments, even with probability-one guarantees, can be achieved without the need for an unbounded number of exploratory trials. This is indeed possible, provided that one is willing to navigate tradeoffs between optimality, the level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multiarmed...
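The bandit-style loop below illustrates the tradeoff the abstract describes: arms are explored with a UCB-type rule, but any arm flagged unsafe is permanently discarded, so exposure to unsafe events stays bounded. The environment interface, detection rule, and arm-selection rule are illustrative, not the paper's algorithm or guarantees.

```python
# Schematic "safe" multiarmed bandit loop: unsafe arms are eliminated once detected.
import numpy as np

def safe_bandit(pull, n_arms, horizon):
    """pull(a) -> (reward, unsafe_flag); arms flagged unsafe are never pulled again."""
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    active = np.ones(n_arms, dtype=bool)
    for t in range(1, horizon + 1):
        idx = np.flatnonzero(active)
        if idx.size == 0:                      # every arm was detected unsafe
            break
        ucb = means[idx] + np.sqrt(2.0 * np.log(t) / np.maximum(counts[idx], 1.0))
        a = idx[np.argmax(ucb)]
        reward, unsafe = pull(a)
        counts[a] += 1
        means[a] += (reward - means[a]) / counts[a]
        if unsafe:
            active[a] = False                  # unsafe arm detected: stop exposure to it
    return means, active
```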