Doctor of Philosophy (PhD)
Epidemiology and Biostatistics
Dr. April M. Young
Epidemiological surveillance is key to monitoring and assessing the health of populations. Drug overdose surveillance has become an increasingly important part of public health practice as overdose morbidity and mortality have increased, due in large part to the opioid crisis. Monitoring drug overdose mortality relies on death certificate data, which have several limitations, including timeliness and the coding structure used to identify the specific substances that caused death. These limitations stem from the need to analyze the free-text cause-of-death sections of the death certificate, which are completed by the medical certifier during death investigation. Other fields, including the clinical sciences, have used natural language processing (NLP) methods to gain insight from free-text data, but adoption of NLP methods in epidemiological surveillance has so far been limited. Through a narrative review of NLP methods currently used in public health surveillance and the integration of two NLP tasks, classification and named entity recognition, this dissertation enhances the capabilities of public health practitioners and researchers to perform drug overdose mortality surveillance. It advances both surveillance science and public health practice by integrating methods from bioinformatics into the surveillance pipeline, providing more timely and higher-quality overdose mortality surveillance, which is essential to guiding an effective public health response to the continuing drug overdose epidemic.
This dissertation was supported by the Centers for Disease Control and Prevention's Enhanced State Opioid Overdose Surveillance Grant (no: 5NU17CE924880-03), 2017-2019, and the Centers for Disease Control and Prevention's Overdose Data to Action Grant (no: 1NU17CE924971-01-00), 2019-2021.
Ward, Patrick J., "Enhancing Drug Overdose Mortality Surveillance through Natural Language Processing and Machine Learning" (2021). Theses and Dissertations--Epidemiology and Biostatistics. 27.