Q-LEARNING WITH CENSORED DATA
Publication type:
Article
Authors:
Goldberg, Yair; Kosorok, Michael R.
Affiliations:
University of North Carolina; University of North Carolina Chapel Hill
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/12-AOS968
Publication date:
2012
Pages:
529-560
Keywords:
2-stage randomization designs
survival distributions
regression
Abstract:
We develop methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite-sample bounds on the generalization error of the policy learned by the algorithm, and show that when the optimal Q-function belongs to the approximation space, the expected survival time for policies obtained by the algorithm converges to that of the optimal policy. We simulate a multistage clinical trial with a flexible number of stages and apply the proposed censored-Q-learning algorithm to find individualized treatment regimens. The methodology presented in this paper has implications for the design of personalized medicine trials in cancer and in other life-threatening diseases.