Author ORCID Identifier

https://orcid.org/0009-0000-1817-9641

Date Available

7-27-2026

Year of Publication

2024

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Arts and Sciences

Department/School/Program

Statistics

Advisor

Dr. Chenglong Ye

Abstract

For high-dimensional data, where the number of variables greatly exceeds the number of observations, selecting important variables while maintaining the required heredity conditions can be challenging. This dissertation is structured into three interconnected parts. In the first part, we propose a variable selection method based on a well-known optimization technique, the Genetic Algorithm, and develop an R package to simplify its implementation and use. We then propose another variable selection method by extending the study from the Genetic Algorithm to a different but related optimization technique, Simulated Annealing. We consider three different hierarchical structures in both studies and compare the performance and efficiency of the two proposed algorithms using multiple simulation studies. In the last part of the dissertation, we propose a transfer-learning-inspired algorithm with a specific focus on studying microbiome-metabolome interactions. We compare the proposed method with existing methods in terms of mean squared error, type I error, and power. An application of this method to real-world data reveals biologically significant interactions between gut microbes and various bile acids.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2024.284

Available for download on Monday, July 27, 2026
