The training process of many deep networks explores the same low-dimensional manifold

Document type:

Article

Authors:

Mao, Jialin; Griniasty, Itay; Teoh, Han Kheng; Ramesh, Rahul; Yang, Rubing; Transtrum, Mark K.; Sethna, James P.; Chaudhari, Pratik

Affiliations:

University of Pennsylvania; Cornell University; University of Pennsylvania; Brigham Young University; University of Pennsylvania

Journal:

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

ISSN:

0027-8424

DOI:

10.1073/pnas.2310002121

Publication date:

2024-03-19

Abstract:

We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures, sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations lie on the same manifold in the prediction space. We study the details of this manifold to find that networks with different architectures follow distinguishable trajectories, but other factors have a minimal influence; larger networks train along a similar manifold as that of smaller networks, just faster; and networks initialized at very different parts of the prediction space converge to the solution along a similar manifold.
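
Illustrative sketch (not taken from the article): one way to probe whether training trajectories in prediction space are effectively low-dimensional is to treat each checkpoint as a probabilistic model over the samples, compute a pairwise divergence between checkpoints, and inspect how quickly the spectrum of the double-centered divergence matrix decays. The abstract does not name a divergence or an embedding method, so the Bhattacharyya divergence, the classical-MDS-style embedding, and the helper names `toy_trajectory`, `pairwise_divergences`, and `embed` are all assumptions made here for illustration, and the trajectories are synthetic.

```python
import numpy as np


def toy_trajectory(labels, n_classes, steps, rate):
    """Synthetic 'training' path: predictions sharpen from uniform toward the labels.

    Hypothetical stand-in for real checkpoints; returns a list of
    (n_samples, n_classes) probability arrays.
    """
    onehot = np.eye(n_classes)[labels]
    trajectory = []
    for t in range(steps):
        logits = rate * t * onehot
        exp = np.exp(logits - logits.max(axis=1, keepdims=True))
        trajectory.append(exp / exp.sum(axis=1, keepdims=True))
    return trajectory


def bhattacharyya(p, q, eps=1e-12):
    """Per-sample Bhattacharyya divergence, averaged over samples."""
    coeff = np.sum(np.sqrt(p * q), axis=1)
    return float(np.mean(-np.log(np.clip(coeff, eps, None))))


def pairwise_divergences(models):
    """Symmetric matrix of divergences between all pairs of checkpoints."""
    n = len(models)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = bhattacharyya(models[i], models[j])
    return D


def embed(D):
    """Double-center D (treated as a squared distance; an assumption) and eigendecompose.

    Eigenvalues may be negative, so they are ranked by absolute value.
    Returns coordinates, eigenvalues, and each component's share of the
    total absolute spectrum.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    W = -0.5 * J @ D @ J
    eigvals, eigvecs = np.linalg.eigh(W)
    order = np.argsort(-np.abs(eigvals))
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    coords = eigvecs * np.sqrt(np.abs(eigvals))
    explained = np.abs(eigvals) / np.sum(np.abs(eigvals))
    return coords, eigvals, explained


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 10, size=200)
    # Two synthetic "networks" that sharpen their predictions at different speeds.
    models = toy_trajectory(labels, 10, 20, 0.3) + toy_trajectory(labels, 10, 20, 0.6)
    coords, eigvals, explained = embed(pairwise_divergences(models))
    print("share of spectrum in top 3 components:", explained[:3].sum())
```

With real networks, `models` would hold the softmax outputs of each checkpoint on a fixed set of samples; a spectrum dominated by a few components would be consistent with an effectively low-dimensional manifold.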