Controlled Discovery and Localization of Signals via Bayesian Linear Programming

Publication type:
Article
Authors:
Spector, Asher; Janson, Lucas
Affiliations:
Stanford University; Harvard University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2024.2347667
Publication date:
2025
Pages:
460-471
Keywords:
genome-wide association; variable selection; variational inference; regression; bounds
Abstract:
Scientists often must simultaneously localize and discover signals. For instance, in genetic fine-mapping, high correlations between nearby genetic variants make it hard to identify the exact locations of causal variants. So the statistical task is to output as many disjoint regions containing a signal as possible, each as small as possible, while controlling false positives. Similar problems arise, for example, when locating stars in astronomical surveys and in changepoint detection. Common Bayesian approaches to these problems involve computing a posterior distribution over signal locations. However, existing procedures to translate these posteriors into credible regions for the signals fail to capture all the information in the posterior, leading to lower power and (sometimes) inflated false discoveries. We introduce Bayesian Linear Programming (BLiP), which can efficiently convert any posterior distribution over signals into credible regions for signals. BLiP overcomes an extremely high-dimensional and nonconvex problem to verifiably nearly maximize expected power while controlling false positives. Applying BLiP to existing state-of-the-art analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased power by 30%-120% in just a few minutes of additional computation. BLiP is implemented in pyblip (Python) and blipr (R). Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
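
The selection problem described in the abstract can be made concrete with a toy example. The sketch below is not BLiP and does not use pyblip or blipr: it replaces the linear program with an exhaustive search, which is only feasible for a handful of candidate regions. The posterior probabilities are made up, and the inverse-size weighting of a region (a discovery of size k counts 1/k) is an assumption used here to encode "each as small as possible". The false-positive criterion is taken to be the Bayesian false discovery rate, that is, the average posterior probability that a selected region contains no signal.

```python
from itertools import combinations

# Hypothetical toy posterior: each candidate region (a set of locations) maps to
# the posterior probability that it contains at least one signal.
candidates = {
    frozenset({0}): 0.98,
    frozenset({1}): 0.50,
    frozenset({2}): 0.45,
    frozenset({1, 2}): 0.95,
    frozenset({0, 1, 2}): 1.00,
}


def select_regions(candidates, q):
    """Brute-force search for disjoint regions that maximize expected
    inverse-size-weighted power subject to Bayesian FDR <= q."""
    regions = list(candidates)
    best, best_power = (), 0.0
    for k in range(1, len(regions) + 1):
        for subset in combinations(regions, k):
            # Regions are disjoint iff their sizes add up to the size of their union.
            if sum(len(g) for g in subset) != len(frozenset().union(*subset)):
                continue
            # Bayesian FDR: expected fraction of selected regions with no signal.
            fdr = sum(1 - candidates[g] for g in subset) / k
            if fdr > q:
                continue
            # Expected power: each region counts 1/|region| if it contains a signal.
            power = sum(candidates[g] / len(g) for g in subset)
            if power > best_power:
                best, best_power = subset, power
    return sorted(sorted(g) for g in best), best_power


selected, power = select_regions(candidates, q=0.1)
print(selected, round(power, 3))
```

With these numbers the search reports location 0 on its own and locations 1 and 2 as a joint region: the posterior cannot tell 1 and 2 apart individually (0.50 and 0.45), but it is confident that one of them carries a signal (0.95). Selecting all three singletons would score higher on power but would push the Bayesian FDR far above 0.1.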