On the development and validation of large language model-based classifiers for identifying social determinants of health

Document type:
Article
Authors:
Gabriel, Rodney A.; Litake, Onkar; Simpson, Sierra; Burton, Brittany N.; Waterman, Ruth S.; Macias, Alvaro A.
Affiliations:
University of California System; University of California San Diego; University of California System; University of California San Diego; University of California System; University of California Los Angeles
Journal:
Proceedings of the National Academy of Sciences of the United States of America
ISSN/ISBN:
0027-8424
DOI:
10.1073/pnas.2320716121
Publication date:
2024-09-24
Keywords:
Abstract:
The assessment of social determinants of health (SDoH) within healthcare systems is crucial for comprehensive patient care and addressing health disparities. Current challenges arise from the limited inclusion of structured SDoH information within electronic health record (EHR) systems, often due to the lack of standardized diagnosis codes. This study delves into the transformative potential of large language models (LLMs) to overcome these challenges. LLM-based classifiers, using Bidirectional Encoder Representations from Transformers (BERT) and A Robustly Optimized BERT Pretraining Approach (RoBERTa), were developed for SDoH concepts, including homelessness, food insecurity, and domestic violence, using synthetic training datasets generated by generative pre-trained transformers combined with authentic clinical notes. Models were then validated on separate datasets: Medical Information Mart for Intensive Care-III and our institutional EHR data. When training the model with a combination of synthetic and authentic notes, validation on our institutional dataset yielded an area under the receiver operating characteristic curve of 0.78 for detecting homelessness, 0.72 for detecting food insecurity, and 0.83 for detecting domestic violence. This study underscores the potential of LLMs in extracting SDoH information from clinical text. Automated detection of SDoH may be instrumental for healthcare providers in identifying at-risk patients, guiding targeted interventions, and contributing to population health initiatives aimed at mitigating disparities.
Source URL: