Sampling with flows, diffusion, and autoregressive neural networks from a spin-glass perspective

成果类型:
Article
署名作者:
Ghio, Davide; Dandi, Yatin; Krzakala, Florent; Zdeborova, Lenka
署名单位:
Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
刊物名称:
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
ISSN/ISSBN:
0027-13689
DOI:
10.1073/pnas.2311810121
发表日期:
2024-07-02
关键词:
dynamics MODEL TRANSITION
摘要:
Recent years witnessed the development of powerful generative models based on flows, diffusion, or autoregressive neural networks, achieving remarkable success in generating data from examples with applications in a broad range of areas. A theoretical analysis of the performance and understanding of the limitations of these methods remain, however, challenging. In this paper, we undertake a step in this direction by analyzing the efficiency of sampling by these methods on a class of problems with a known probability distribution and comparing it with the sampling performance of more traditional methods such as the Monte Carlo Markov chain and Langevin dynamics. We focus on a class of probability distribution widely studied in the statistical physics of disordered systems that relate to spin glasses, statistical inference, and constraint satisfaction problems. We leverage the fact that sampling via flow-based, diffusionbased, or autoregressive networks methods can be equivalently mapped to the analysis of a Bayes optimal denoising of a modified probability measure. Our findings demonstrate that these methods encounter difficulties in sampling stemming from the presence of a first-order phase transition along the algorithm's denoising path. Our conclusions go both ways: We identify regions of parameters where these methods are unable to sample efficiently, while that is possible using standard Monte Carlo or Langevin approaches. We also identify regions where the opposite happens: standard approaches are inefficient while the discussed generative methods work well.