Inference in Semiparametric Regression Models Under Partial Questionnaire Design and Nonmonotone Missing Data
Publication type:
Article
Authors:
Chatterjee, Nilanjan; Li, Yan
Affiliations:
National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI); NIH National Cancer Institute - Division of Cancer Epidemiology & Genetics; University of Texas System; University of Texas Arlington
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/jasa.2010.tm08756
Publication date:
2010
Pages:
787-797
Keywords:
maximum-likelihood-estimation
2-stage case-control
logistic-regression
parameters
multistage
exposure
2-phase
RISK
Abstract:
In epidemiologic studies, partial questionnaire design (PQD) can reduce the cost, time, and other practical burdens associated with lengthy questionnaires by assigning different subsets of the questionnaire to different, but overlapping, subsets of the study participants. In this article, we describe methods for semiparametric inference for regression models under PQD and other study settings that can generate nonmonotone missing data in covariates. In particular, motivated by methods for multiphase designs, we develop three estimators, namely mean score, pseudo-likelihood, and semiparametric maximum likelihood, each of which has some unique advantages. We develop the asymptotic theory and a sandwich variance estimator for each of the estimators under the underlying semiparametric model, which allows the distribution of the covariates to remain nonparametric. We study the finite-sample performance and relative efficiencies of the methods using simulation studies. We illustrate the methods using data from a case-control study of non-Hodgkin's lymphoma in which the data on the main chemical exposures of interest are collected using two different instruments on two different, but overlapping, subsets of the participants. This article has supplementary material online.
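To make the phrase "nonmonotone missing data" concrete, the following is a minimal sketch, not the authors' method or code, of how a PQD with two questionnaire blocks produces observation patterns that cannot be nested. The function names (`assign_pqd`, `is_monotone`), the two-block setup, and the assignment probabilities are all illustrative assumptions and do not come from the article.

```python
import random


def assign_pqd(n, p_a_only=0.3, p_b_only=0.3, seed=0):
    """Hypothetical PQD: each participant answers block A only, block B only,
    or both. Returns one (saw_A, saw_B) tuple per participant."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n):
        u = rng.random()
        if u < p_a_only:
            patterns.append((True, False))   # block A only
        elif u < p_a_only + p_b_only:
            patterns.append((False, True))   # block B only
        else:
            patterns.append((True, True))    # overlap: both blocks
    return patterns


def is_monotone(patterns):
    """A missingness pattern is monotone when the distinct sets of observed
    blocks can be ordered so that each set is contained in the next."""
    observed = {frozenset(i for i, seen in enumerate(p) if seen) for p in patterns}
    ordered = sorted(observed, key=len)
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


patterns = assign_pqd(1000)
print("monotone:", is_monotone(patterns))
```

Because some participants see only block A and others only block B, neither observed set contains the other, so the check reports a nonmonotone pattern. A design in which everyone answers block A and only a subsample also answers block B would pass the check instead.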