IEEE Transactions on Automatic Control, 2023, Issue 5

In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering.

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Language: English

Founded: 1956

Publication frequency: Monthly

eISSN: 1558-2523

Impact factor: 7.0

Latest articles

Compactly Restrictable Metric Policy Optimization Problems

Authors: Dorobantu, Victor D.; Azizzadenesheli, Kamyar; Yue, Yisong

Affiliations: California Institute of Technology; Purdue University System; Purdue University; Nvidia Corporation

Abstract: We study policy optimization problems for deterministic Markov decision processes (MDPs) with metric state and action spaces, which we refer to as metric policy optimization problems (MPOPs). Our goal is to establish theoretical results on the well-posedness of MPOPs that can characterize practically relevant continuous control systems. To do so, we define a special class of MPOPs called compactly restrictable MPOPs (CR-MPOPs), which are flexible enough to capture the complex behavior of robot...

Multiagent Low-Dimensional Linear Bandits

Authors: Chawla, Ronshee; Sankararaman, Abishek; Shakkottai, Sanjay

Affiliations: University of Texas System; University of Texas Austin; University of California System; University of California Berkeley

Abstract: We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected varian...

Near-Optimal Design of Safe Output-Feedback Controllers From Noisy Data

Authors: Furieri, Luca; Guo, Baiwei; Martin, Andrea; Ferrari-Trecate, Giancarlo

Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne

Abstract: As we transition toward the deployment of data-driven controllers for black-box cyberphysical systems, complying with hard safety constraints becomes a primary concern. Two key aspects should be addressed when input-output data are corrupted by noise: how much uncertainty can one tolerate without compromising safety, and to what extent is the control performance affected? By focusing on finite-horizon constrained linear-quadratic problems, we provide an answer to these questions in terms of t...

Safe Value Functions

Authors: Massiani, Pierre-Francois; Heim, Steve; Solowjow, Friedrich; Trimpe, Sebastian

Affiliations: RWTH Aachen University; Max Planck Society; Massachusetts Institute of Technology (MIT)

Abstract: Safety constraints and optimality are important but sometimes conflicting criteria for controllers. Although these criteria are often solved separately with different tools to maintain formal guarantees, it is also common practice in reinforcement learning (RL) to simply modify reward functions by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine the relationship of both safety and optimality to penalties, and formalize sufficient conditions for safe valu...

Hamiltonian Deep Neural Networks Guaranteeing Nonvanishing Gradients by Design

Authors: Galimberti, Clara Lucia; Furieri, Luca; Xu, Liang; Ferrari-Trecate, Giancarlo

Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; Shanghai University

Abstract: Deep neural networks (DNNs) training can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems and include several existing DNN architectures based on ordinary differential equations. Our main result is that a broad set of H-DNNs ensures nonvanishing gradients by design for an arbitrary netw...

PAC Reinforcement Learning Algorithm for General-Sum Markov Games

Authors: Zehfroosh, Ashkan; Tanner, Herbert G.

Affiliations: University of Delaware

Abstract: This article presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, this article extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results dem...

Efficient Learning of a Linear Dynamical System With Stability Guarantees

Authors: Jongeneel, Wouter; Sutter, Tobias; Kuhn, Daniel

Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; University of Konstanz

Abstract: We propose a principled method for projecting an arbitrary square matrix to the nonconvex set of asymptotically stable matrices. Leveraging ideas from large deviations theory, we show that this projection is optimal in an information-theoretic sense and that it simply amounts to shifting the initial matrix by an optimal linear quadratic feedback gain, which can be computed exactly and highly efficiently by solving a standard linear quadratic regulator problem. The proposed approach allows us t...

Deep Neural Network-Based Approximate Optimal Tracking for Unknown Nonlinear Systems

Authors: Greene, Max L.; Bell, Zachary I.; Nivison, Scott; Dixon, Warren E.

Affiliations: Johns Hopkins University; State University System of Florida; University of Florida

Abstract: The infinite horizon optimal tracking problem is solved for a deterministic, control-affine, unknown nonlinear dynamical system. A deep neural network (DNN) is updated in real time to approximate the unknown nonlinear system dynamics. The developed framework uses a multitimescale concurrent learning-based weight update policy, with which the output layer DNN weights are updated in real time, but the internal DNN features are updated discretely and at a slower timescale (i.e., with batch-like u...

Adaptive Composite Online Optimization: Predictions in Static and Dynamic Environments

Authors: Scroccaro, Pedro Zattoni; Kolarijani, Arman Sharifi; Esfahani, Peyman Mohajerin

Affiliations: Delft University of Technology

Abstract: In the past few years, online convex optimization (OCO) has received notable attention in the control literature thanks to its flexible real-time nature and powerful performance guarantees. In this article, we propose new step-size rules and OCO algorithms that simultaneously exploit gradient predictions, function predictions and dynamics, features particularly pertinent to control applications. The proposed algorithms enjoy static and dynamic regret bounds in terms of the dynamics of the refe...

Regret-Optimal Estimation and Control

Authors: Goel, Gautam; Hassibi, Babak

Affiliations: University of California System; University of California Berkeley; California Institute of Technology

Abstract: In this article, we consider estimation and control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal state estimators and causal controllers, which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that regret-optimal filters and regret-optimal controllers can be derived in state space form using operat...