Author ORCID Identifier
https://orcid.org/0000-0002-9893-4084
Date Available
8-20-2026
Year of Publication
2025
Document Type
Doctoral Dissertation
Degree Name
Doctor of Philosophy (PhD)
College
Medicine
Department/School/Program
Clinical and Translational Science
Faculty
Daniel Harris
Faculty
Sharon Walsh
Faculty
Claire Clark
Abstract
The ongoing opioid overdose crisis in the United States requires timely and accurate surveillance systems to inform public health responses. Traditional public health surveillance methods rely on hospital discharge data and death certificates, which suffer from significant reporting delays and miss cases where patients refuse hospital transportation. Emergency Medical Services (EMS) data presents a promising alternative with advantages in timeliness and case ascertainment but lacks validated definitions for suspected opioid overdose (SOO).
This dissertation addresses this critical gap through the development, validation, and fairness assessment of machine learning models with natural language processing (ML-NLP) for identifying SOOs in EMS data. The research extends existing expert-driven knowledge-based (KB) definitions by leveraging rich narrative text in EMS records to improve classification accuracy while examining potential algorithmic bias across socioeconomic and demographic groups.
In the first study, a sample of 2,327 Kentucky EMS encounters from 2018-2022 underwent expert review to establish ground-truth SOO labels. Five established KB definitions were evaluated against novel ML-NLP approaches using random forest models. Results demonstrated that ML-NLP models outperformed KB definitions, with the full-featured model (combining structured and unstructured data) achieving the highest F1-score (0.81) compared to the best KB definition (0.77). The ML-NLP model demonstrated superior precision (0.82 vs. 0.69) while maintaining comparable sensitivity, underscoring the value of integrating domain-specific knowledge with advanced analytical techniques to enhance SOO surveillance.
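For readers less familiar with the metrics reported above, the following is a minimal sketch of how precision, sensitivity (recall), and F1-score are derived from expert-reviewed labels and classifier output. The labels and predictions are invented toy values, not data from the dissertation, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Return (precision, sensitivity, F1); each is 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (assumed values): 1 = suspected opioid overdose, 0 = not.
expert_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_output = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, sensitivity, f1 = precision_recall_f1(expert_labels, model_output)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and sensitivity, a definition can only score well on it by keeping both false positives and false negatives low, which is why it is a common single summary when comparing case definitions.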
The second study examined potential algorithmic bias across demographic groups and neighborhood social vulnerability index (SVI) quartiles. Using a demographically balanced dataset with oversampling of Black patients, various model designs were evaluated for fairness. While modest disparities were observed in classification performance, the SVI-inclusive ML-NLP model (incorporating the SVI of the incident location) demonstrated the most balanced performance. Fairness metrics indicated minimal systemic bias in the optimized models, particularly when integrating both race and social vulnerability features. Notably, the performance gains over the highest-performing KB definition were relatively modest, suggesting that well-designed expert-driven approaches remain viable alternatives to computationally intensive methods.
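One common way to quantify the kind of group-level disparity described above is to compare sensitivity across groups, sometimes called the equal opportunity difference. The abstract does not state which fairness metrics were used, so the sketch below is an illustrative assumption rather than the dissertation's method; the group labels and values are invented toy data.

```python
from collections import defaultdict


def sensitivity_by_group(y_true, y_pred, groups):
    """Sensitivity (true positive rate) computed separately for each group."""
    tp = defaultdict(int)
    fn = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:  # only true cases contribute to sensitivity
            if pred == 1:
                tp[group] += 1
            else:
                fn[group] += 1
    return {g: tp[g] / (tp[g] + fn[g]) for g in set(tp) | set(fn)}


def max_gap(rates):
    """Largest difference in a metric between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy example (assumed values): two hypothetical SVI quartiles, five encounters each.
labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
preds = [1, 1, 1, 0, 0, 1, 1, 0, 0, 1]
svi_quartile = ["Q1"] * 5 + ["Q4"] * 5

rates = sensitivity_by_group(labels, preds, svi_quartile)
print(rates, "gap:", max_gap(rates))
```

A gap near zero indicates that true cases are detected at similar rates in every group; the same pattern can be repeated for precision or false positive rate to check other notions of fairness.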
The dissertation concludes by exploring practical implementation, addressing technical infrastructure requirements, workforce training needs, and organizational barriers within public health agencies. A framework for responsible implementation balances improved surveillance capabilities with equity considerations.
This research makes three significant contributions: (1) establishing a methodological framework for validating EMS-based opioid overdose surveillance, (2) demonstrating performance advantages of ML-NLP over traditional rule-based approaches, and (3) providing fairness assessment methodologies essential for responsible implementation. The findings support the integration of advanced analytics into public health surveillance while emphasizing the need to assess algorithmic fairness on an ongoing basis whenever surveillance definitions are evaluated.
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2025.417
Funding Information
This study was supported by the Centers for Disease Control and Prevention Overdose Data to Action Grant (no. NU17CE924971-01-01) from 2019-2022, the National Institutes of Health's National Institute on Drug Abuse grant Rapid Actionable Data for Opioid Response in Kentucky (no. R01 DA057605-01), and an associated supplemental grant focused on artificial intelligence fairness (no. R01 DA057605-01S2).
Recommended Citation
Rock, Peter J., "ENHANCING PUBLIC HEALTH SURVEILLANCE: DEVELOPMENT AND VALIDATION OF MACHINE LEARNING MODELS FOR SUSPECTED OPIOID OVERDOSE DETECTION IN EMERGENCY MEDICAL SERVICES DATA" (2025). Theses and Dissertations--Clinical and Translational Science. 26.
https://uknowledge.uky.edu/cts_etds/26
Included in
Artificial Intelligence and Robotics Commons, Biomedical Informatics Commons, Epidemiology Commons, Translational Medical Research Commons
