Author ORCID Identifier

https://orcid.org/0000-0003-4818-5959

Year of Publication

2018

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Arts and Sciences

Department

Statistics

First Advisor

Dr. Xiangrong Yin

Abstract

When scientists know in advance that some features (variables) are important in modeling a data, then these important features should be kept in the model. How can we utilize this prior information to effectively find other important features? This dissertation is to provide a solution, using such prior information. We propose the Conditional Adaptive Lasso (CAL) estimates to exploit this knowledge. By choosing a meaningful conditioning set, namely the prior information, CAL shows better performance in both variable selection and model estimation. We also propose Sufficient Conditional Adaptive Lasso Variable Screening (SCAL-VS) and Conditioning Set Sufficient Conditional Adaptive Lasso Variable Screening (CS-SCAL-VS) algorithms based on CAL. The asymptotic and oracle properties are proved. Simulations, especially for the large p small n problems, are performed with comparisons with other existing methods. We further extend to the linear model setup to the generalized linear models (GLM). Instead of least squares, we consider the likelihood function with L1 penalty, that is the penalized likelihood methods. We proposed for Generalized Conditional Adaptive Lasso (GCAL) for the generalized linear models. We then further extend the method for any penalty terms that satisfy certain regularity conditions, namely Conditionally Penalized Estimate (CPE). Asymptotic and oracle properties are showed. Four corresponding sufficient variable screening algorithms are proposed. Simulation examples are evaluated for our method with comparisons with existing methods. GCAL is also evaluated with a read data set on leukemia.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2018.484

Share

COinS