The p-filter: multilayer false discovery rate control for grouped hypotheses

Type:
Article
Authors:
Barber, Rina Foygel; Ramdas, Aaditya
Affiliations:
University of Chicago; University of California System; University of California Berkeley
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISBN:
1369-7412
DOI:
10.1111/rssb.12218
Publication date:
2017
Pages:
1247-1268
Keywords:
Abstract:
In many practical applications of multiple testing, there are natural ways to partition the hypotheses into groups by using the structural, spatial or temporal relatedness of the hypotheses, and this prior knowledge is not used in the classical Benjamini-Hochberg procedure for controlling the false discovery rate (FDR). When one can define (possibly several) such partitions, it may be desirable to control the group FDR simultaneously for all partitions (as special cases, the 'finest' partition divides the n hypotheses into n groups of one hypothesis each, and this corresponds to controlling the usual notion of FDR, whereas the 'coarsest' partition puts all n hypotheses into a single group, and this corresponds to testing the global null hypothesis). We introduce the p-filter, which takes as input a list of n p-values and M >= 1 partitions of hypotheses, and produces as output a list of n or fewer discoveries such that the group FDR is provably simultaneously controlled for all partitions. Importantly, since the partitions are arbitrary, our procedure can also handle multiple partitions which are non-hierarchical. The p-filter generalizes two classical procedures: when M = 1, choosing the finest partition into n singletons, we exactly recover the Benjamini-Hochberg procedure, whereas, choosing instead the coarsest partition with a single group of size n, we exactly recover the Simes test for the global null hypothesis. We verify our findings with simulations that show how this technique can not only lead to the aforementioned multilayer FDR control but also lead to improved precision of rejected hypotheses. We present some illustrative results from an application to a neuroscience problem with functional magnetic resonance imaging data, where hypotheses are explicitly grouped according to predefined regions of interest in the brain, thus allowing the scientist to employ field-specific prior knowledge explicitly and flexibly.
Source URL:
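Illustrative note: the abstract names two classical procedures that the p-filter recovers when M = 1. The sketch below is a minimal Python rendering of those two special cases only, written from their standard textbook definitions; it is not the p-filter itself and is not taken from the paper. The function names (`benjamini_hochberg`, `simes_p_value`, `simes_rejects_global_null`) and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(pvals, alpha):
    """Benjamini-Hochberg step-up procedure (the finest-partition special case).

    Returns the sorted indices of the rejected hypotheses.
    """
    n = len(pvals)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / n:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


def simes_p_value(pvals):
    """Simes combined p-value for the global null (the coarsest-partition case)."""
    n = len(pvals)
    ordered = sorted(pvals)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


def simes_rejects_global_null(pvals, alpha):
    """Reject the global null when the Simes p-value is at most alpha."""
    return simes_p_value(pvals) <= alpha


if __name__ == "__main__":
    # Hypothetical p-values, chosen only to exercise the functions.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
    print(simes_p_value(example))
    print(simes_rejects_global_null(example, alpha=0.05))
```

With these definitions, the Simes test rejects the global null whenever Benjamini-Hochberg at the same level makes at least one discovery, which is one way to see why both arise from a single framework at the two extreme partitions.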