Combining Multiple Observational Data Sources to Estimate Causal Effects
Type:
Article
Authors:
Yang, Shu; Ding, Peng
Affiliations:
North Carolina State University; University of California System; University of California Berkeley
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2019.1609973
Publication date:
2020
Pages:
1540-1554
Keywords:
propensity score calibration
doubly robust estimation
large-sample properties
auxiliary information
missing confounders
matching estimators
validation data
regression
inference
2-phase
Abstract:
The era of big data has witnessed an increasing availability of multiple data sources for statistical analyses. We consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators for causal effects, but the big main data can only give error-prone estimators in general. However, by leveraging the information in the big main data in a principled way, we can improve the estimation efficiencies yet preserve the consistencies of the initial estimators based solely on the validation data. Our framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. We also propose appropriate bootstrap procedures, which makes our method straightforward to implement using software routines for existing estimators. Supplementary materials for this article are available online.
Source URL: