Augmentation invariant manifold learning
Document Type:
Article
Author(s):
Wang, Shulei
Affiliation(s):
University of Illinois System; University of Illinois Urbana-Champaign
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISBN:
1369-7412
DOI:
10.1093/jrsssb/qkaf003
Publication Date:
2025
Pages:
978-1000
Keywords:
nonlinear dimensionality reduction
Graph Laplacian
diffusion maps
eigenmaps
kernel
Abstract:
Data augmentation is a widely used technique and an essential ingredient in recent advances in self-supervised representation learning. By preserving the similarity between augmented data, the resulting data representation can improve various downstream analyses and achieve state-of-the-art performance in many applications. Despite their empirical effectiveness, most existing methods lack a theoretical understanding in a general nonlinear setting. To fill this gap, we develop a statistical framework on a low-dimensional product manifold to model the data augmentation transformation. Under this framework, we introduce a new representation learning method called augmentation invariant manifold learning and design a computationally efficient algorithm by reformulating it as a stochastic optimization problem. Compared with existing self-supervised methods, the new method simultaneously exploits the manifold's geometric structure and the invariance property of augmented data, and it comes with an explicit theoretical guarantee. Our theoretical investigation characterizes the role of data augmentation in the proposed method and reveals why and how the data representation learned from augmented data can improve the k-nearest neighbour classifier in downstream analysis; in particular, more complex data augmentation leads to greater downstream improvement. Finally, numerical experiments on simulated and real data sets are presented to demonstrate the merit of the proposed method.
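The abstract gives no implementation details. Purely as illustration of the general idea it describes, namely learning a representation that is invariant to data augmentation via stochastic gradient descent, here is a minimal hypothetical numpy sketch. The linear encoder W, the Gaussian-perturbation augment function, and the orthogonality penalty are assumptions made for this sketch and are not the paper's augmentation invariant manifold learning algorithm.

import numpy as np

rng = np.random.default_rng(0)

def augment(x, noise=0.05):
    # Hypothetical augmentation: a small random perturbation of each sample.
    return x + noise * rng.standard_normal(x.shape)

d, k, n = 20, 2, 256                     # ambient dimension, representation dimension, sample size
X = rng.standard_normal((n, d))          # toy data
W = 0.1 * rng.standard_normal((k, d))    # linear encoder (illustrative stand-in for a nonlinear map)

lr, lam = 0.1, 1.0
for step in range(500):
    Xa = augment(X)                      # a second, augmented view of each sample
    D = (X - Xa).T                       # d x n matrix of differences between the two views
    grad_inv = (2.0 / n) * W @ D @ D.T   # gradient of mean ||W x - W x'||^2 (invariance term)
    G = W @ W.T - np.eye(k)
    grad_reg = 4.0 * G @ W               # gradient of ||W W^T - I||_F^2 (keeps rows near-orthonormal, avoids collapse)
    W -= lr * (grad_inv + lam * grad_reg)

Z = X @ W.T                              # learned k-dimensional representation
# Z could then be fed to a k-nearest-neighbour classifier in a downstream task.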
Source URL: