Proper Inference for Value Function in High-Dimensional Q-Learning for Dynamic Treatment Regimes

Document type:
Article
Authors:
Zhu, Wensheng; Zeng, Donglin; Song, Rui
Affiliations:
Northeast Normal University - China; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine; North Carolina State University
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2018.1506341
Publication date:
2019
Pages:
1404-1417
Keywords:
sequenced treatment alternatives; nonconcave penalized likelihood; variable selection; rationale
Abstract:
Dynamic treatment regimes are a set of decision rules and each treatment decision is tailored over time according to patients' responses to previous treatments as well as covariate history. There is a growing interest in development of correct statistical inference for optimal dynamic treatment regimes to handle the challenges of nonregularity problems in the presence of nonrespondents who have zero-treatment effects, especially when the dimension of the tailoring variables is high. In this article, we propose a high-dimensional Q-learning (HQ-learning) to facilitate the inference of optimal values and parameters. The proposed method allows us to simultaneously estimate the optimal dynamic treatment regimes and select the important variables that truly contribute to the individual reward. At the same time, hard thresholding is introduced in the method to eliminate the effects of the nonrespondents. The asymptotic properties for the parameter estimators as well as the estimated optimal value function are then established by adjusting the bias due to thresholding. Both simulation studies and real data analysis demonstrate satisfactory performance for obtaining the proper inference for the value function for the optimal dynamic treatment regimes.