Date Available


Year of Publication


Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation


Arts and Sciences



First Advisor

Dr. Chi Wang

Second Advisor

Dr. Arnold J. Stromberg


Carcinogenesis is a complex process involving somatic mutations in a number of key biological pathways. Studying cancer evolution contributes to a better understanding of cancer biology and facilitates the identification of new therapeutic targets. We focus on two important questions in cancer evolution. The first is to delineate the temporal order of pathway mutations during tumorigenesis. The second is to cluster patients into biologically meaningful cancer subtypes. We present new statistical methods to 1) leverage functional annotations of mutations to enhance estimation of the order of pathway mutations during carcinogenesis, 2) incorporate intra-tumoral heterogeneity information in pathway mutation order inference, and 3) identify cancer subtypes and classify patients into them based on the order of pathway mutations.
Our methods use a probabilistic approach to characterize the likelihood of mutational events from different pathways occurring in a certain order, giving greater weight to orders that are better supported by the phylogenetic structures of the tumor samples. The functional impact of each mutation is incorporated so that mutations more integral to tumor development receive greater weight. A maximum likelihood method is used to estimate parameters and infer the probability of one pathway being mutated prior to another. Furthermore, we propose a method similar to penalized model-based clustering (PMBC) that leverages the temporal order of pathway mutations for cancer subtype classification, leading to mechanistically interpretable cancer subtypes. This method uses a penalized likelihood approach and an EM algorithm to simultaneously cluster patients into subtypes and select the temporal order features that are important for discriminating between subtypes. Simulation studies and real data analyses demonstrate the ability of our methods to accurately infer the temporal order of pathway mutations and cluster patients into subtypes.
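
The abstract does not specify the clustering model, so the following is only a generic illustration of the idea behind penalized model-based clustering, not the dissertation's method. It is a minimal sketch that assumes continuous, column-centered features, a Gaussian mixture with a shared diagonal variance, and an L1 penalty on the cluster means. Under those assumptions the EM update for each mean is a soft-thresholding step, and a feature whose mean is shrunk to zero in every cluster no longer helps separate the clusters. The names `penalized_em`, `farthest_point_init`, `soft_threshold`, and `lam` are hypothetical and chosen for this example.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink x toward zero by t; entries with |x| <= t become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def farthest_point_init(X, k):
    """Deterministic start (an assumption of this sketch): k mutually distant rows."""
    centers = [X[np.argmax((X ** 2).sum(axis=1))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers)


def penalized_em(X, k, lam, n_iter=50):
    """EM for a Gaussian mixture with shared diagonal variance and an
    L1 penalty (weight lam) on the cluster means.

    Returns hard cluster labels, the penalized means, and a boolean mask of
    features whose mean is nonzero in at least one cluster.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center columns so that shrinking to 0 is meaningful
    n, p = X.shape

    # Initial hard assignment to the nearest starting center.
    centers = farthest_point_init(X, k)
    hard = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    mu = np.array([X[hard == j].mean(axis=0) for j in range(k)])
    var = ((X - mu[hard]) ** 2).mean(axis=0) + 1e-8
    pi = np.bincount(hard, minlength=k) / n

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each sample.
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2 / var).sum(axis=2)
        log_r = np.log(pi + 1e-300) - 0.5 * sq
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: mixing weights, soft-thresholded means, pooled variance.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / n
        mu_tilde = (resp.T @ X) / nk[:, None]
        mu = soft_threshold(mu_tilde, lam * var[None, :] / nk[:, None])
        diff2 = (X[:, None, :] - mu[None, :, :]) ** 2
        var = (resp[:, :, None] * diff2).sum(axis=(0, 1)) / n + 1e-8

    selected = np.any(mu != 0.0, axis=0)
    return resp.argmax(axis=1), mu, selected
```

Calling `penalized_em(X, k=2, lam=60.0)` on a samples-by-features matrix returns the cluster labels together with the mask of retained features. Larger values of `lam` remove more features, and in practice the penalty weight and the number of clusters would be chosen by a model selection criterion, which this sketch omits.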

Digital Object Identifier (DOI)

Available for download on Friday, May 30, 2025