
A NEW CENTRAL LIMIT THEOREM FOR THE AUGMENTED IPW ESTIMATOR: VARIANCE INFLATION, CROSS-FIT COVARIANCE AND BEYOND

Document type:

Article

Authors:

Jiang, Kuanhao; Mukherjee, Rajarshi; Sen, Subhabrata; Sur, Pragya

Affiliations:

Harvard University; Harvard University; Harvard T.H. Chan School of Public Health

Journal:

ANNALS OF STATISTICS

ISSN/ISBN:

0090-5364

DOI:

10.1214/24-AOS2476

Publication date:

2025

Pages:

647-675

Keywords:

regularized calibrated estimation; message-passing algorithms; propensity score; Causal Inference; regression-models; robust regression; g-computation; BIAS; UNIVERSALITY; asymptotics

Abstract:

Estimation of the average treatment effect (ATE) is a central problem in causal inference. In recent times, inference for the ATE in the presence of high-dimensional covariates has been extensively studied. Among the diverse approaches that have been proposed, augmented inverse propensity weighting (AIPW) with cross-fitting has emerged as a popular choice in practice. In this work, we study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime where the numbers of features and samples are both large and comparable. Under assumptions on the covariate distribution, we establish a new central limit theorem for the suitably scaled cross-fit AIPW that applies without any sparsity assumptions on the underlying high-dimensional parameters. Our CLT uncovers two crucial phenomena among others: (i) the AIPW exhibits a substantial variance inflation that can be precisely quantified in terms of the signal-to-noise ratio and other problem parameters, (ii) the asymptotic covariance between the pre-cross-fit estimators is nonnegligible even on the root-n scale. These findings are strikingly different from their classical counterparts. On the technical front, our work utilizes a novel interplay between three distinct tools: approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach. We believe our proof techniques should be useful for analyzing other two-stage estimators in this high-dimensional regime. We complement our theoretical results with simulations that demonstrate both the finite sample efficacy of our CLT and its robustness to our assumptions. Finally, we provide some theoretical evidence for the universality of our CLT to the law of the covariates, and explore the effects of certain forms of model misspecification.
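
For readers unfamiliar with the estimator named in the abstract, the following is a minimal sketch of a two-fold cross-fit AIPW estimate of the ATE. It is an illustration only and not the authors' implementation: the abstract does not specify the nuisance estimators, so the ordinary least-squares outcome regressions, the Newton-fitted logistic propensity model (with a tiny ridge term for numerical stability), the propensity clipping, and all function names here are assumptions. The two per-fold values returned in `parts` play the role of the pre-cross-fit estimators whose covariance the abstract refers to.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta


def predict_ols(beta, X):
    return beta[0] + X @ beta[1:]


def fit_logistic(X, a, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton's method (assumed propensity model)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    theta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ theta)))
        w = p * (1.0 - p)
        grad = Xd.T @ (a - p) - ridge * theta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        theta = theta + np.linalg.solve(hess, grad)
    return theta


def predict_logistic(theta, X):
    return 1.0 / (1.0 + np.exp(-(theta[0] + X @ theta[1:])))


def cross_fit_aipw(X, a, y, seed=0, clip=1e-3):
    """Two-fold cross-fit AIPW: fit nuisances on one half, evaluate on the other."""
    n = len(y)
    perm = np.random.default_rng(seed).permutation(n)
    folds = (perm[: n // 2], perm[n // 2:])
    parts = []
    for k in (0, 1):
        train, test = folds[1 - k], folds[k]
        Xtr, atr, ytr = X[train], a[train], y[train]
        # Outcome regressions fitted separately on treated and control units.
        b1 = fit_ols(Xtr[atr == 1], ytr[atr == 1])
        b0 = fit_ols(Xtr[atr == 0], ytr[atr == 0])
        theta = fit_logistic(Xtr, atr)
        m1, m0 = predict_ols(b1, X[test]), predict_ols(b0, X[test])
        # Clipping the propensity is a practical safeguard assumed here.
        e = np.clip(predict_logistic(theta, X[test]), clip, 1.0 - clip)
        at, yt = a[test], y[test]
        # AIPW score: regression contrast plus inverse-propensity-weighted residuals.
        psi = m1 - m0 + at * (yt - m1) / e - (1.0 - at) * (yt - m0) / (1.0 - e)
        parts.append(float(psi.mean()))
    return float(np.mean(parts)), parts
```

Calling `cross_fit_aipw(X, a, y)` with a covariate matrix `X`, a 0/1 treatment vector `a`, and an outcome vector `y` returns the averaged estimate together with the two per-fold estimates. The sketch targets a low-dimensional setting for readability; the regime studied in the paper, where the number of features is comparable to the sample size, is exactly where such plug-in fits behave differently from classical theory.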