Theses and Dissertations--Computer Science

Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness

Jing Lucas Liu, University of Kentucky

Author ORCID Identifier

https://orcid.org/0000-0003-2890-9673

Date Available

5-9-2023

Year of Publication

2023

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Engineering

Department/School/Program

Computer Science

Advisor

Dr. Jin Chen

Abstract

Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications that use Electronic Health Record (EHR) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. First, real-world EHR data typically contain numerous missing values. Second, traditional ML/DL models are generally considered black boxes, whereas real-world healthcare applications require interpretability. Third, differences in data distributions may lead to unfairness and performance disparities, particularly across subpopulations.

This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first work proposes an ensemble prediction framework for EHR data with high missing rates that uses multiple subsets with lower missing rates. The second method integrates medical knowledge graphs and a double attention mechanism with a long short-term memory (LSTM) model to enhance interpretability by providing knowledge-based model interpretation. The third method develops an LSTM variant that integrates medical knowledge graphs and additional time-aware gates to handle multivariable temporal missingness and interpretability concerns. Finally, a transformer-based model is proposed to learn unbiased and fair representations of diverse subpopulations using domain classifiers and three attention mechanisms.
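
The abstract does not spell out how the ensemble framework is built, so the following is only an illustrative sketch of the general idea of predicting from multiple subsets with lower missing rates; it is not the dissertation's method. It assumes each sub-model is trained on one feature subset using only the records that are complete for that subset, and that predictions are combined by majority vote. The nearest-centroid learner, the function names, and the toy feature names `a`, `b`, and `c` are all hypothetical choices made for the example.

```python
from statistics import mean


def fit_centroids(rows, labels, features):
    """Fit a nearest-centroid model on rows that are complete for `features`."""
    by_class = {}
    for row, y in zip(rows, labels):
        if all(row[f] is not None for f in features):
            by_class.setdefault(y, []).append([row[f] for f in features])
    # One centroid (per-feature mean) for each class label.
    return {y: [mean(col) for col in zip(*vals)] for y, vals in by_class.items()}


def predict_one(centroids, row, features):
    """Return the label whose centroid is closest to the row (squared distance)."""
    x = [row[f] for f in features]
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def fit_ensemble(rows, labels, feature_subsets):
    """Train one sub-model per feature subset; skip subsets with < 2 classes."""
    models = []
    for features in feature_subsets:
        centroids = fit_centroids(rows, labels, features)
        if len(centroids) >= 2:
            models.append((features, centroids))
    return models


def predict_ensemble(models, row):
    """Majority vote over the sub-models whose features are observed in `row`."""
    votes = [
        predict_one(centroids, row, features)
        for features, centroids in models
        if all(row[f] is not None for f in features)
    ]
    if not votes:
        return None  # no sub-model can be applied to this record
    # Ties are broken in favor of the smallest label.
    return max(sorted(set(votes)), key=votes.count)


# Toy records: every record is missing one of the three hypothetical features.
rows = [
    {"a": 1.0, "b": 1.0, "c": None},
    {"a": 1.2, "b": None, "c": 0.9},
    {"a": 5.0, "b": 5.1, "c": None},
    {"a": 4.8, "b": None, "c": 5.2},
    {"a": None, "b": 0.8, "c": 1.1},
    {"a": None, "b": 5.3, "c": 4.9},
]
labels = [0, 0, 1, 1, 0, 1]
subsets = [("a", "b"), ("a", "c"), ("b", "c")]

models = fit_ensemble(rows, labels, subsets)
print(predict_ensemble(models, {"a": 1.1, "b": None, "c": 1.0}))
print(predict_ensemble(models, {"a": 4.9, "b": 5.0, "c": 5.0}))
```

In this toy data no record is complete across all three features, so a single model trained on complete cases would have no training data, while each two-feature subset still has complete records to learn from. A record for which no subset is fully observed gets no prediction in this sketch; a real system would need a fallback for that case.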

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023.156

Funding Information

This study was supported by Jin Chen's startup funding at the University of Kentucky from 2019 to 2020.

This study was supported by National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) grants R56 DK126930 and P30 DK079337 from 2021 to 2023.

Recommended Citation

Liu, Jing Lucas, "Machine Learning Framework for Real-World Electronic Health Records Regarding Missingness, Interpretability, and Fairness" (2023). Theses and Dissertations--Computer Science. 131.
https://uknowledge.uky.edu/cs_etds/131

Included in

Artificial Intelligence and Robotics Commons, Biomedical Informatics Commons
