SIMULTANEOUS HIGH-PROBABILITY BOUNDS ON THE FALSE DISCOVERY PROPORTION IN STRUCTURED, REGRESSION AND ONLINE SETTINGS

Publication type:
Article
Authors:
Katsevich, Eugene; Ramdas, Aaditya
Affiliation:
Carnegie Mellon University
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/19-AOS1938
Publication date:
2020
Pages:
3465-3487
Keywords:
fdr control hypotheses number tests
Abstract:
While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (Statist. Sci. 26 (2011) 584-597) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing simultaneous FDP bounds are based on closed testing using global null tests based on sorted p-values, we additionally consider the setting where side information can be leveraged to boost power, the variable selection setting where knockoff statistics can be used to order variables, and the online setting where decisions about rejections must be made as data arrive. Our finite-sample, closed-form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings. These results establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods, and use proof techniques employing martingales and filtrations that are new to both of these literatures. We demonstrate the utility of our results by augmenting a recent knockoffs analysis of the UK Biobank dataset.