Nonstationary A/B Tests: Optimal Variance Reduction, Bias Correction, and Valid Inference
Publication type:
Article
Authors:
Wu, Yuhang; Zheng, Zeyu; Zhang, Guangyu; Zhang, Zuohua; Wang, Chu
Affiliations:
University of California System; University of California Berkeley; Amazon.com
Journal:
MANAGEMENT SCIENCE
ISSN/ISBN:
0025-1909
DOI:
10.1287/mnsc.2022.01205
Publication date:
2025
Keywords:
A/B tests
Nonstationarity
Central Limit Theorem
optimal asymptotic variance
bias
inference
Abstract:
We develop an analytical framework to appropriately model and adequately analyze A/B tests in the presence of nonparametric nonstationarities in the targeted business metrics. A/B tests, also known as online randomized controlled experiments, have been used at scale by data-driven enterprises to guide decisions and test innovative ideas to improve core business metrics. Meanwhile, nonstationarities, such as the time-of-day effect and the day-of-week effect, can often arise nonparametrically in key business metrics involving purchases, revenue, conversions, customer experiences, and so on. First, we develop a generic nonparametric stochastic model to capture nonstationarities in A/B test experiments, where each sample represents a visit or action associated with a time label. We build a practically relevant limiting regime to facilitate analyzing large-sample estimator performance under nonparametric nonstationarities. Second, we show that ignoring or inadequately addressing nonstationarities can cause standard A/B test estimators to have suboptimal variance and nonvanishing bias, thereby leading to a loss of statistical efficiency and accuracy. We provide a new estimator that treats time as a continuous stratification variable and performs poststratification with a data-dependent number of stratification levels. Without making parametric assumptions, we prove a central limit theorem for the proposed estimator and show that the estimator attains the best achievable asymptotic variance and is asymptotically unbiased. Third, we propose a time-grouped randomization that is designed to balance treatment and control assignments at granular time scales. We show that when the time-grouped randomization is integrated into standard experimental designs to generate experiment data, simple A/B test estimators can achieve asymptotically optimal variance. Numerical experiments are briefly reported to illustrate the analysis.
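To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It compares a plain difference in means with a poststratified estimator that bins the time label, and it shows one possible reading of time-grouped randomization. Everything named here is an assumption made for illustration: the function names, the sinusoidal baseline in `simulate`, the equal-width bins, the cube-root rule for the number of bins, and the pairing scheme with `group_size=2`. The abstract does not specify any of these.

```python
import numpy as np


def diff_in_means(y, w):
    """Standard A/B estimator: treated mean minus control mean."""
    return y[w == 1].mean() - y[w == 0].mean()


def time_poststratified(y, w, t, n_bins):
    """Poststratify on equal-width time bins (an illustrative choice)."""
    span = t.max() - t.min()
    if span == 0:
        bins = np.zeros(len(t), dtype=int)
    else:
        scaled = (t - t.min()) / span * n_bins
        bins = np.minimum(scaled.astype(int), n_bins - 1)
    weighted_sum, total = 0.0, 0
    for b in range(n_bins):
        in_bin = bins == b
        y_treat = y[in_bin & (w == 1)]
        y_ctrl = y[in_bin & (w == 0)]
        # Bins missing either arm are skipped (a simplification).
        if len(y_treat) == 0 or len(y_ctrl) == 0:
            continue
        size = int(in_bin.sum())
        weighted_sum += size * (y_treat.mean() - y_ctrl.mean())
        total += size
    return weighted_sum / total


def time_grouped_assignment(t, rng, group_size=2):
    """Hypothetical time-grouped design: balance arms within consecutive
    blocks of `group_size` samples ordered by time."""
    order = np.argsort(t, kind="stable")
    w = np.empty(len(t), dtype=int)
    for start in range(0, len(t), group_size):
        idx = order[start:start + group_size]
        # Alternating labels with a random offset, then shuffled in-block.
        labels = (np.arange(len(idx)) + rng.integers(2)) % 2
        w[idx] = rng.permutation(labels)
    return w


def simulate(n=20000, effect=1.0, seed=0):
    """Synthetic data with an assumed time-of-day style baseline."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 1.0, size=n)
    w = rng.integers(0, 2, size=n)  # independent coin-flip assignment
    baseline = 10.0 + 5.0 * np.sin(2.0 * np.pi * t)
    y = baseline + effect * w + rng.normal(0.0, 1.0, size=n)
    return y, w, t


y, w, t = simulate()
n_bins = int(np.ceil(len(y) ** (1 / 3)))  # assumed data-dependent rule
print("difference in means:", diff_in_means(y, w))
print("poststratified     :", time_poststratified(y, w, t, n_bins))
```

In this synthetic setting the baseline varies strongly with time, so the binned estimator removes most of the time-driven variation that the plain difference in means leaves in. The bin count and the block size are tuning choices in this sketch and should not be read as the rates or designs analyzed in the article.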