Time-scale invariant contingency yields one-shot reinforcement learning despite extremely long delays to reinforcement

Publication Type:
Article
Authors:
Gallistel, Charles R.; Shahan, Timothy A.
Affiliations:
Rutgers University System; Rutgers University New Brunswick; Utah System of Higher Education; Utah State University
Journal:
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
ISSN/ISBN:
0027-8424
DOI:
10.1073/pnas.2405451121
Publication Date:
2024-07-23
Keywords:
response acquisition; representation; memory; CS
Abstract:
Reinforcement learning inspires much theorizing in neuroscience, cognitive science, machine learning, and AI. A central question concerns the conditions that produce the perception of a contingency between an action and reinforcement: the assignment-of-credit problem. Contemporary models of associative and reinforcement learning do not leverage temporal metrics (measured intervals). Our information-theoretic approach formalizes contingency by time-scale invariant temporal mutual information. It predicts that learning may proceed rapidly even with extremely long action-reinforcer delays. We show that rats can learn an action after a single reinforcement, even with a 16-min delay between the action and reinforcement (15-fold longer than any delay previously shown to support such learning). By leveraging metric temporal information, our solution obviates the need for windows of associability, exponentially decaying eligibility traces, microstimuli, or distributions over Bayesian belief states. Its three equations have no free parameters; they predict one-shot learning without iterative simulation.
Source URL:
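
Note: The paper's three parameter-free equations are not reproduced in this record, so the short Python sketch below only illustrates the general idea named in the abstract, under stated assumptions. If a contingency measure depends solely on the ratio of the contextual inter-reinforcement interval to the action-reinforcement delay (here taken, for illustration, as a log-ratio in bits), then rescaling all intervals by a common factor leaves it unchanged; that is what "time-scale invariant" implies, and it is why a 16-min delay can remain as informative as a 16-s one against a proportionally longer background. The function name temporal_information_bits and the log2 form are illustrative assumptions, not the authors' equations.

    from math import log2

    def temporal_information_bits(context_interval: float, action_delay: float) -> float:
        # Hypothetical time-scale invariant contingency measure: bits of
        # information the action conveys about reinforcement timing,
        # modeled as log2 of the interval ratio (an assumption for
        # illustration, not the paper's actual formalism).
        if context_interval <= 0 or action_delay <= 0:
            raise ValueError("intervals must be positive")
        return log2(context_interval / action_delay)

    # Time-scale invariance: multiplying both intervals by the same
    # factor (here 60, as if stretching every interval 60-fold) leaves
    # the measure unchanged, because only the ratio matters.
    fast = temporal_information_bits(480.0, 16.0)            # ~4.91 bits
    slow = temporal_information_bits(480.0 * 60, 16.0 * 60)  # same ratio
    assert abs(fast - slow) < 1e-12
    print(fast)

Because the measure is ratio-based, it has no characteristic time scale and hence no free temporal parameter such as a decay rate, which is consistent with the abstract's claim that the approach needs no eligibility traces or windows of associability.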