Author ORCID Identifier

https://orcid.org/0000-0003-4818-5959

Date Available

12-7-2018

Year of Publication

2018

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Arts and Sciences

Department/School/Program

Statistics

Advisor

Dr. Xiangrong Yin

Abstract

When scientists know in advance that some features (variables) are important in modeling a data, then these important features should be kept in the model. How can we utilize this prior information to effectively find other important features? This dissertation is to provide a solution, using such prior information. We propose the Conditional Adaptive Lasso (CAL) estimates to exploit this knowledge. By choosing a meaningful conditioning set, namely the prior information, CAL shows better performance in both variable selection and model estimation. We also propose Sufficient Conditional Adaptive Lasso Variable Screening (SCAL-VS) and Conditioning Set Sufficient Conditional Adaptive Lasso Variable Screening (CS-SCAL-VS) algorithms based on CAL. The asymptotic and oracle properties are proved. Simulations, especially for the large p small n problems, are performed with comparisons with other existing methods. We further extend to the linear model setup to the generalized linear models (GLM). Instead of least squares, we consider the likelihood function with L1 penalty, that is the penalized likelihood methods. We proposed for Generalized Conditional Adaptive Lasso (GCAL) for the generalized linear models. We then further extend the method for any penalty terms that satisfy certain regularity conditions, namely Conditionally Penalized Estimate (CPE). Asymptotic and oracle properties are showed. Four corresponding sufficient variable screening algorithms are proposed. Simulation examples are evaluated for our method with comparisons with existing methods. GCAL is also evaluated with a read data set on leukemia.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2018.484

Share

COinS