Year of Publication


Degree Name

Master of Science (MS)

Document Type

Master's Thesis




Computer Science

First Advisor

Dr. Zongming Fei

Second Advisor

Dr. Sujin Kim


As electronic medical record (EMR) systems grow exponentially, potentially valuable information lies hidden in their complex data. In this thesis, several efficient data mining algorithms were explored to discover hidden knowledge in insurance claims data. The first aim was to cluster chronic rheumatic disease (CRD) patients into three information overload (IO) groups based on clinical events extracted from insurance claims data. The second aim was to discover hidden patterns using three well-known pattern mining algorithms: Apriori, frequent pattern growth (FP-Growth), and sequential pattern discovery using equivalence classes (SPADE). The SPADE algorithm was found to be the most efficient method for the dataset used. Finally, a prototype system named myDietPHIL was developed to manage clinical events for CRD patients and to visualize the relationships among frequent clinical events. The system has been tested, and its visualization of these relationships could facilitate patient education.
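
For readers unfamiliar with the pattern mining algorithms named in the abstract, the following is a minimal Python sketch of Apriori-style frequent itemset mining. It is not the thesis implementation: the function name `apriori`, the event labels, the toy `visits` data, and the support threshold are all hypothetical and chosen only for illustration. Each set stands in for the clinical events recorded in one claim.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Illustrative sketch only; an itemset is frequent when it appears in at
    least `min_support` transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count the support of each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data: one set of clinical events per claim.
visits = [
    {"office_visit", "lab_test", "prescription"},
    {"office_visit", "lab_test"},
    {"office_visit", "prescription"},
    {"lab_test", "imaging"},
]

result = apriori(visits, min_support=2)
for itemset, support in sorted(result.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Apriori ignores the order of events within a record. FP-Growth finds the same frequent itemsets without generating candidates, while SPADE mines ordered sequences of events, which is a different task from the unordered counting sketched here.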

Digital Object Identifier (DOI)