Empirical-likelihood-based inference in missing response problems and its application in observational studies

Document type:
Article
Authors:
Qin, Jing; Zhang, Biao
Affiliations:
University System of Ohio; University of Toledo; National Institutes of Health (NIH) - USA; NIH National Institute of Allergy & Infectious Diseases (NIAID)
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISBN:
1369-7412
Publication date:
2007
Pages:
101-122
Keywords:
large-sample theory; nonparametric-estimation; propensity score; distributions; models
Abstract:
The problem of missing response data is ubiquitous in medical and social science studies. In the case of responses that are missing at random (depending on some covariate information), analyses focused only on the complete data may lead to biased results. Various debiasing methods have been extensively studied in the literature, particularly the weighting method that was motivated by Horvitz and Thompson's estimators. To improve efficiency, Robins, Rotnitzky and Zhao proposed augmented estimating equations based on corrected complete-case analyses. A nice feature of the augmented method is its 'double robustness', i.e. the estimator that is derived from the augmented method is asymptotically unbiased if either the underlying missing data mechanism or the underlying regression function is correctly specified. Furthermore, the augmented estimator can achieve full efficiency if both the missing data mechanism and the regression function are correctly specified. In general, however, it is very difficult to specify the regression function correctly, especially when the dimension of the covariates is high; this is the so-called curse of dimensionality. The augmented estimator has much lower efficiency if the 'working regression model' is not close to the true regression model. In this paper, the empirical likelihood method is employed to obtain a constrained empirical likelihood estimator of the mean response under the assumption that responses are missing at random. The empirical-likelihood-based estimators enjoy the double-robustness property. Moreover, it is possible that the empirical-likelihood-based inference can produce asymptotically unbiased and efficient estimators even if the true regression function is not completely known. Simulation results indicate that the empirical-likelihood-based estimators are very robust to a misspecification of the propensity score and dominate other competitors in the sense of having smaller mean-square errors.
The methods that are developed in this paper apply naturally to causal inference in observational studies, where the propensity score is used to adjust for differences in pretreatment variables in the estimation of average treatment effects.
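
To make the contrast between complete-case, weighted and augmented estimators of a mean response concrete, the following is a minimal Python sketch on simulated data. It illustrates only the Horvitz and Thompson type weighting and the augmented estimator that the abstract describes as background; it is not the empirical likelihood estimator that the paper proposes. The data-generating model, the function names (`simulate`, `fit_linear`, `ipw_mean`, `aipw_mean`) and the use of the true propensity score are all assumptions made for illustration.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Hypothetical data: Y = 1 + 2X + noise, missing at random given X."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)      # true mean response is 1
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))       # P(response observed | X)
    r = rng.binomial(1, pi)                     # 1 = observed, 0 = missing
    y = np.where(r == 1, y, np.nan)             # mask the missing responses
    return x, y, r, pi


def fit_linear(x, y, r):
    """Working regression m(X): least squares on the complete cases."""
    obs = r == 1
    design = np.column_stack([np.ones(obs.sum()), x[obs]])
    beta, *_ = np.linalg.lstsq(design, y[obs], rcond=None)
    return beta[0] + beta[1] * x                # fitted values for every unit


def ipw_mean(y, r, pi):
    """Weighted estimator: average of R * Y / pi(X)."""
    y_obs = np.where(r == 1, y, 0.0)            # missing Y never enters the sum
    return np.mean(r * y_obs / pi)


def aipw_mean(y, r, pi, m):
    """Augmented estimator: average of R*Y/pi - (R - pi)/pi * m(X)."""
    y_obs = np.where(r == 1, y, 0.0)
    return np.mean(r * y_obs / pi - (r - pi) / pi * m)


x, y, r, pi = simulate()
m = fit_linear(x, y, r)

complete_case = np.nanmean(y)                   # ignores the missingness mechanism
weighted = ipw_mean(y, r, pi)
augmented = aipw_mean(y, r, pi, m)

# Deliberately misspecified propensity: a constant equal to the observed fraction.
wrong_pi = np.full_like(pi, r.mean())
augmented_wrong_pi = aipw_mean(y, r, wrong_pi, m)

print("complete-case mean:        ", complete_case)
print("weighted (true pi):        ", weighted)
print("augmented (true pi):       ", augmented)
print("augmented (misspecified pi):", augmented_wrong_pi)
```

In this setup the complete-case mean should overshoot the true mean of 1, because units with larger covariate values are both more likely to be observed and have larger responses. The weighted and augmented estimates should land close to 1, and the augmented estimate should remain close to 1 even with the constant propensity, since the working regression is correctly specified here; that is the double-robustness property in miniature.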