Biostatistics Faculty Publications

Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time

Suyan Tian, The First Hospital of Jilin University, China
Chi Wang, University of Kentucky

Abstract

With the rapid evolution of high-throughput technologies, time-series (longitudinal) high-throughput experiments have become feasible and affordable. However, the development of statistical methods for gene expression profiles measured across time points has not kept pace with the explosion of such data. Feature selection is critically important for longitudinal microarray data. In this study, we propose aggregating a gene’s expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection problem to a classic one. Regularized logistic regression models that use pseudogenes (i.e., the sign averages of genes across time) as predictors are then optimized by either the coordinate descent method or the threshold gradient descent regularization method. By applying the proposed methods to simulated data and a traumatic injury dataset, we demonstrate that the proposed methods, especially the combination of sign average and threshold gradient descent regularization, outperform other competing algorithms. In conclusion, we recommend the proposed methods for studies that aim to carry out feature selection on longitudinal gene expression data.
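
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (their R code is in Supplementary File 1), and it rests on assumptions the abstract does not confirm: that the sign for a gene at a time point is the sign of the difference in mean expression between the two classes, and that the sign average is the mean of the sign-weighted values over time. The function names `sign_average` and `fit_l1_logistic` are hypothetical, and a plain proximal gradient solver for L1-penalized logistic regression stands in for the coordinate descent and threshold gradient descent regularization methods named in the abstract.

```python
import numpy as np

def sign_average(X, y):
    """Collapse each gene's time course into one 'pseudogene' value.

    X: array of shape (n_samples, n_genes, n_times); y: binary labels (0/1).
    ASSUMPTION: the sign for gene g at time t is the sign of the
    class-mean difference at that time point (not taken from the paper).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)  # (n_genes, n_times)
    signs = np.sign(diff)
    # Weight each time point by its sign, then average over time.
    return (X * signs).mean(axis=2)                          # (n_samples, n_genes)

def fit_l1_logistic(Z, y, lam=0.1, step=0.1, n_iter=2000):
    """L1-penalized logistic regression by proximal gradient descent.

    A stand-in solver, NOT the coordinate descent or TGDR method
    mentioned in the abstract. The intercept is left unpenalized.
    """
    n, p = Z.shape
    beta, b0 = np.zeros(p), 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(b0 + Z @ beta)))
        resid = prob - y
        b0 -= step * resid.mean()
        beta = beta - step * (Z.T @ resid) / n
        # Soft-thresholding: the proximal operator of the L1 penalty.
        beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)
    return b0, beta

# Synthetic illustration: gene 0 differs between classes at every time
# point but in alternating directions; the other genes are pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 5, 3))
X[y == 1, 0, 0] += 2.0
X[y == 1, 0, 1] -= 2.0
X[y == 1, 0, 2] += 2.0

Z = sign_average(X, y)            # one column per pseudogene
b0, beta = fit_l1_logistic(Z, y)  # nonzero entries mark selected genes
selected = np.flatnonzero(beta)
```

Under these assumptions, the sign weighting aligns effects that point in opposite directions at different time points, so they add up rather than partly cancel as they would in a plain average over time.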

Document Type

Article

Publication Date

3-19-2019

Notes/Citation Information

Published in BioMed Research International, v. 2019, article ID 1724898, p. 1-12.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Digital Object Identifier (DOI)

https://doi.org/10.1155/2019/1724898

Funding Information

This study was supported by funding (No. 31401123) from the Natural Science Foundation of China.

Repository Citation

Tian, Suyan and Wang, Chi, "Feature Selection for Longitudinal Data by Using Sign Averages to Summarize Gene Expression Values over Time" (2019). Biostatistics Faculty Publications. 43.
https://uknowledge.uky.edu/biostatistics_facpub/43

1724898.f1.docx (18 kB)
Supplementary File 1: R codes for the proposed method

Included in

Biostatistics Commons, Computational Biology Commons, Longitudinal Data Analysis and Time Series Commons, Microarrays Commons
