Author ORCID Identifier

https://orcid.org/0000-0002-6176-945X

Date Available

9-26-2025

Year of Publication

2025

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Medicine

Department/School/Program

Clinical and Translational Science

Faculty

Jeffery Talbert

Abstract

Inflammatory bowel disease is a chronic relapsing-remitting disease of the intestinal tract affecting approximately 0.5% of the population in the US. Moderate to severe inflammatory bowel disease is managed with biologic therapies to induce and maintain remission; however, nonadherence to biologics occurs in an estimated 16.9% to 45% of patients. Nonadherence to biologics can increase the risk of disease flares, emergent healthcare utilization, and surgical intervention. While numerous factors have been associated with biologic nonadherence, there are limited resources for prospectively assessing patient-specific risk of biologic nonadherence. Accordingly, the objective of this work is to develop machine learning methods for predicting biologic nonadherence in patients with inflammatory bowel disease. This was accomplished by conducting a scoping review of machine learning methods for prediction of medication adherence (Aim 1) and applying its findings in two iterative studies to create models (Aim 2) and subsequently improve predictive performance (Aim 3). Key findings of the scoping review (Aim 1) included the commonplace use of indirect, dispense history-based methods for defining adherence, such as proportion of days covered. Further, numerous important predictors were identified, including sociodemographic factors, comorbidities, medication history, and prior adherence. Finally, effective model training algorithms, techniques, and validation metrics were recognized. In the initial creation of predictive models, 48 machine learning models were developed by training on one year of beneficiary administrative claims data using combinations of 8 algorithms and 6 data preprocessing strategies (Aim 2). Model predictive performance was poor when generalized to data obtained from a clinical electronic medical record (area under the receiver operating characteristic curve [AUC] 0.55). Subsequently, models were trained on per-dispense adherence data from Medicare Fee-for-Service claims data (Aim 3). Including additional sociodemographic factors and time-varying factors, such as prior adherence, in the model improved predictive performance (AUC 0.714). While further study is required to improve sensitivity and translate to clinical benefit, this work provides a blueprint for future clinical decision support tools predicting biologic nonadherence in patients with inflammatory bowel disease.
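
For readers unfamiliar with proportion of days covered (PDC), the dispense history-based adherence measure named in the abstract, the following is a minimal illustrative sketch and not the dissertation's own implementation. The function name, the example fills, the observation window, and the 0.80 adherence cutoff are all assumptions chosen for illustration; the cutoff is a commonly used convention that the abstract does not state. This simple version counts each calendar day at most once and does not shift overlapping supply forward.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, end):
    """Return the fraction of days in [start, end] covered by dispensed supply.

    fills: list of (fill_date, days_supply) tuples (hypothetical input shape).
    Each calendar day is counted once, even if two fills overlap.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    window_days = (end - start).days + 1  # inclusive window length
    return len(covered) / window_days


# Hypothetical dispense history: two 28-day fills with a gap between them.
fills = [(date(2024, 1, 1), 28), (date(2024, 2, 15), 28)]
pdc = proportion_of_days_covered(fills, date(2024, 1, 1), date(2024, 3, 31))

# Assumed cutoff: PDC >= 0.80 treated as adherent (illustrative only).
is_adherent = pdc >= 0.80
print(f"PDC = {pdc:.3f}, adherent = {is_adherent}")
```

In this example, 56 of the 91 days in the window are covered, so the hypothetical patient falls below the assumed cutoff and would be labeled nonadherent.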

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2025.457

Supplemental_Data.Systematic_Review_Data_Collection.txt (29 kB)
Supplemental Data

Supplemental_Figure_3.1_Distribution.tiff (1593 kB)
Supplemental Figure 3.1

Supplemental_Figure_3.2_AUC_Training.tiff (2840 kB)
Supplemental Figure 3.2

Supplemental_Figure_3.3_AUC_Test.tiff (2732 kB)
Supplemental Figure 3.3

Supplemental_Methods_Chapter4.pdf (540 kB)
Supplemental Methods 4.0

Supplemental_Table_2.1_Search_Strategy.pdf (92 kB)
Supplemental Table 2.1

Supplemental_Table_2.2_Data_Dictionary.pdf (109 kB)
Supplemental Table 2.2

Supplemental_Table_3.1_Study_Definitions.pdf (296 kB)
Supplemental Table 3.1

Supplemental_Table_3.2_Model_Hyperparameters.pdf (296 kB)
Supplemental Table 3.2

Supplemental_Table_3.3_Training_Performance.pdf (306 kB)
Supplemental Table 3.3

Supplemental_Table_3.4_Test_Performance.pdf (306 kB)
Supplemental Table 3.4

Supplemental_Table_4.1_Study_Definitions.pdf (425 kB)
Supplemental Table 4.1

Supplemental_Table_4.2_Demographic_OR.pdf (568 kB)
Supplemental Table 4.2

Supplemental_Table_4.3_Diagnoses_OR.pdf (446 kB)
Supplemental Table 4.3
