您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 统计学 > Journal of the American Statistical Association > 2025 > 549期

Selection and Aggregation of Conformal Prediction Sets

成果类型：

Article

署名作者：

Yang, Yachong; Kuchibhotla, Arun Kumar

署名单位：

University of Pennsylvania; Carnegie Mellon University

刊物名称：

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION

ISSN/ISSBN：

0162-1459

DOI：

10.1080/01621459.2024.2344700

发表日期：

2025

页码：

435-447

关键词：

摘要：

Conformal prediction is a generic methodology for finite-sample valid distribution-free prediction. This technique has garnered a lot of attention in the literature partly because it can be applied with any machine learning algorithm that provides point predictions to yield valid prediction regions. Of course, the efficiency (width/volume) of the resulting prediction region depends on the performance of the machine learning algorithm. In the context of point prediction, several techniques (such as cross-validation) exist to select one of many machine learning algorithms for better performance. In contrast, such selection techniques are seldom discussed in the context of set prediction (or prediction regions). In this article, we consider the problem of obtaining the smallest conformal prediction region given a family of machine learning algorithms. We provide two general-purpose selection algorithms and consider coverage as well as width properties of the final prediction region. The first selection method yields the smallest width prediction region among the family of conformal prediction regions for all sample sizes but only has an approximate coverage guarantee. The second selection method has a finite sample coverage guarantee but only attains close to the smallest width. The approximate optimal width property of the second method is quantified via an oracle inequality. As an illustration, we consider the use of aggregation of nonparametric regression estimators in the split conformal method with the absolute residual conformal score. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.