Author ORCID Identifier

https://orcid.org/0000-0003-3495-3873

Date Available

1-18-2025

Year of Publication

2024

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Arts and Sciences

Department/School/Program

Statistics

First Advisor

Dr. Katherine Thompson

Abstract

This paper introduces the bar-code variable, a novel method for efficiently processing a sequence of binary explanatory variables in the linear regression modeling framework. Represented as an integer or a sequence of bits, the bar-code variable captures information on the original binary variables and their potential interaction effects. Using the bar-code variable, the study explores streamlined feature selection in linear regression modeling with binary explanatory variables. The paper demonstrates how the bar-code variable, through re-parameterization, facilitates the transition from the cell-means estimates, µ̂, in the cell-means ANOVA model to the coefficient estimates, β̂, in the linear regression model, and vice versa. Most importantly, adopting the bar-code variable improves memory usage and computational efficiency. Furthermore, integrating the bar-code variable with agglomerative clustering and Lasso regression provides a unique perspective on feature selection among all possible interaction effects. Additionally, two novel importance score methods are introduced to further leverage the bar-code variable in identifying interaction effects. These findings contribute to a more efficient and insightful approach to statistical analysis.
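To illustrate the kind of encoding and re-parameterization the abstract describes, here is a minimal Python sketch. It is not taken from the dissertation: the function names (`bar_code`, `cell_means`, `design_matrix`), the bit ordering (first variable as the most significant bit), and the toy data are all assumptions made for illustration. It packs each row of binary variables into one integer, computes the cell means µ̂ per integer value, and recovers the saturated-model coefficients β̂ by solving µ̂ = Mβ̂, where M marks which interaction terms are active in each cell.

```python
import numpy as np

def bar_code(X):
    """Pack each row of a 0/1 matrix into one integer.
    Assumption: the first column is the most significant bit."""
    X = np.asarray(X, dtype=np.int64)
    p = X.shape[1]
    weights = 1 << np.arange(p - 1, -1, -1)  # e.g. [2, 1] for p = 2
    return X @ weights

def cell_means(codes, y, p):
    """Mean response for every bar-code value 0 .. 2**p - 1 (NaN if a cell is empty)."""
    sums = np.bincount(codes, weights=y, minlength=2**p)
    counts = np.bincount(codes, minlength=2**p)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts

def design_matrix(p):
    """Saturated-model design matrix over cells (rows) and terms (columns).
    Entry [c, s] is 1 when every variable in term s is switched on in cell c,
    i.e. when the bits of s are a subset of the bits of c."""
    c = np.arange(2**p)
    return ((c[:, None] & c[None, :]) == c[None, :]).astype(float)

# Toy data (hypothetical): y = 1 + 2*x1 + 3*x2 + 4*x1*x2, no noise.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [1, 1]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + 4.0 * X[:, 0] * X[:, 1]

codes = bar_code(X)                  # one integer per observation
mu_hat = cell_means(codes, y, p=2)   # cell means, indexed by bar-code
M = design_matrix(2)
beta_hat = np.linalg.solve(M, mu_hat)  # cell means -> regression coefficients
mu_back = M @ beta_hat                 # regression coefficients -> cell means
```

In this sketch the entries of `beta_hat` are indexed by bar-code as well: index 0 is the intercept, index 1 (binary 01) is the x2 effect, index 2 (binary 10) is the x1 effect, and index 3 (binary 11) is the x1:x2 interaction. Storing one integer per observation instead of a full matrix of interaction columns is one way the memory savings mentioned in the abstract could arise, although the dissertation's actual implementation may differ.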

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2024.21

Recommended Citation

Park, Lee Sak, "Bar-Code Variable: A Novel Approach to Efficiently Find Interaction Effects" (2024). Theses and Dissertations--Statistics. 74.
https://uknowledge.uky.edu/statistics_etds/74

Available for download on Saturday, January 18, 2025

Included in

Applied Statistics Commons

Theses and Dissertations--Statistics

Bar-Code Variable: A Novel Approach to Efficiently Find Interaction Effects
