您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 统计学 > Journal of the American Statistical Association > 2025 > 549期

Data Fission: Splitting a Single Data Point

成果类型：

Article

署名作者：

Leiner, James; Duan, Boyan; Wasserman, Larry; Ramdas, Aaditya

署名单位：

Carnegie Mellon University; Alphabet Inc.; Google Incorporated

刊物名称：

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION

ISSN/ISSBN：

0162-1459

DOI：

10.1080/01621459.2023.2270748

发表日期：

2025

页码：

135-146

关键词：

Post-selection Inference False Discovery Rate models

摘要：

Suppose we observe a random vector X from some distribution in a known family with unknown parameters. We ask the following question: when is it possible to split X into two pieces f(X) and g(X) such that neither part is sufficient to reconstruct X by itself, but both together can recover X fully, and their joint distribution is tractable? One common solution to this problem when multiple samples of X are observed is data splitting, but Rasines and Young offers an alternative approach that uses additive Gaussian noise-this enables post-selection inference in finite samples for Gaussian distributed data and asymptotically when errors are non-Gaussian. In this article, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data fission, as an alternative to data splitting, data carving and p-value masking. We exemplify the method on several prototypical applications, such as post-selection inference for trend filtering and other regression problems, and effect size estimation after interactive multiple testing. Supplementary materials for this article are available online.

来源URL：

访问原文