False Discovery in A/B Testing

Document type:
Article
Authors:
Berman, Ron; Van den Bulte, Christophe
Affiliation:
University of Pennsylvania
Journal:
MANAGEMENT SCIENCE
ISSN:
0025-1909
DOI:
10.1287/mnsc.2021.4207
Publication date:
2022
Pages:
6762-6782
Keywords:
Statistics; Design of experiments; Decision analysis; Inference; A/B testing; False discovery rate
Abstract:
We investigate what fraction of all significant results in website A/B testing are actually null effects, that is, the false discovery rate (FDR). Our data consist of 4,964 effects from 2,766 experiments conducted on a commercial A/B testing platform. Using three different methods, we find that the FDR ranges between 28% and 37% for tests conducted at 10% significance and between 18% and 25% for tests at 5% significance (two-sided). These high FDRs stem mostly from the high fraction of true null effects, about 70%, rather than from low power. Using our estimates, we also assess the potential of various A/B test designs to reduce the FDR. The two main implications are that decision makers should expect one in five interventions that achieve significance at the 5% level to be ineffective when deployed in the field, and that analysts should consider using two-stage designs with multiple variations rather than basic A/B tests.
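
The abstract's claim that the FDR is driven by the share of true nulls rather than by low power can be illustrated with the standard textbook identity FDR = π0·α / (π0·α + (1 − π0)·power). The sketch below is not one of the three estimation methods used in the article. It takes the 70% null share and the 5% significance level from the abstract, while the function name and the power values are hypothetical choices made only for illustration, and sign errors in two-sided tests are ignored.

```python
def false_discovery_rate(pi0: float, alpha: float, power: float) -> float:
    """Expected share of significant results that are true nulls.

    pi0   : fraction of tested effects that are truly null
    alpha : significance level of each test
    power : probability of detecting a true non-null effect
    """
    false_positives = pi0 * alpha          # nulls that reach significance
    true_positives = (1.0 - pi0) * power   # real effects that reach significance
    return false_positives / (false_positives + true_positives)


PI0 = 0.70    # share of true nulls reported in the abstract
ALPHA = 0.05  # significance level discussed in the abstract

# Power values are hypothetical, chosen only to show the sensitivity.
for power in (0.5, 0.8, 1.0):
    fdr = false_discovery_rate(PI0, ALPHA, power)
    print(f"power={power:.1f}  FDR={fdr:.3f}")
```

Under these assumptions the FDR stays above 10% even when power is perfect, which is consistent with the abstract's point that raising power alone cannot remove most false discoveries when about 70% of tested effects are null.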