Multiple edit/multiple imputation for multivariate continuous data
成果类型:
Article
署名作者:
Ghosh-Dastidar, B; Schafer, JL
署名单位:
RAND Corporation; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
刊物名称:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISSBN:
0162-1459
DOI:
10.1198/016214503000000738
发表日期:
2003
页码:
807-817
关键词:
t-distribution
摘要:
Multiple imputation replaces an incomplete dataset with m > 1 simulated complete versions that are analyzed separately by standard methods. We present a natural extension of multiple imputation for handling the dual problems of nonresponse and response error. This extension, which we call multiple edit/multiple imputation (MEMI), replaces an observed dataset containing missing values and errors with m > 1 simulated versions of the ideal dataset that is complete and error-free. These ideal data sets are analyzed separately, and the results are combined using the same rules as for multiple imputation. The resulting inferences simultaneously reflect uncertainty due to nonresponse and response error. MEMI may be an attractive alternative to deterministic or quasi-statistical edit and imputation procedures used by many data-collecting agencies. Producing MEMI's requires assumptions about the distribution of the ideal data, the nature of nonresponse, and a model for the response error mechanism. However, fitting such a model does not necessarily require data from a follow-up study. In this article we develop and implement MEMI for preliminary data from the Third National Health and Nutrition Examination Survey, Phase I (1988-1991). Raw body measurements for 1,345 children age 2-3 years are imputed under a Bayesian model for intermittent or semicontinuous errors. The resulting population estimates are found to be quite insensitive to prior assumptions about the rates and magnitude of errors.