Inference of dependency knowledge graph for Electronic Health Records
成果类型:
Article; Early Access
署名作者:
Xu, Zhiwei; Gan, Ziming; Zhou, Doudou; Shen, Shuting; Lu, Junwei; Cai, Tianxi
署名单位:
University of Michigan System; University of Michigan; University of Chicago; National University of Singapore; Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard Medical School
刊物名称:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISSBN:
1369-7412
DOI:
10.1093/jrsssb/qkaf061
发表日期:
2025
关键词:
alzheimers-disease
摘要:
The effective analysis of high-dimensional Electronic Health Record (EHR) data, with substantial potential for healthcare research, presents notable methodological challenges. Employing predictive modeling guided by a knowledge graph (KG), which enables efficient feature selection, can enhance both statistical efficiency and interpretability. While various methods have emerged for constructing KGs, existing techniques often lack statistical certainty concerning the presence of links between entities, especially in scenarios where the utilization of patient-level EHR data is limited due to privacy concerns. In this paper, we propose the first inferential framework for deriving a sparse KG with statistical guarantee based on a dynamic log-linear topic model. Within this model, the KG embeddings are estimated by performing singular value decomposition on the empirical pointwise mutual information matrix, offering a scalable solution. We then establish entrywise asymptotic normality for the KG low-rank estimator, enabling the recovery of sparse graph edges with controlled type I error. Our work uniquely addresses the under-explored domain of statistical inference about non-linear statistics under the low-rank temporal dependent models, a critical gap in existing research. We validate our approach through extensive simulation studies and then apply the method to real-world EHR data in constructing clinical KGs and generating clinical feature embeddings.