BACKGROUND: Non-small cell lung cancer (NSCLC) is the predominant histological type of lung cancer, accounting for up to 85% of cases. Disease stage is commonly used to determine the eligibility of NSCLC patients for adjuvant treatment; however, it is an imprecise predictor of an individual patient's prognosis. Currently, many researchers use microarray technology to identify relevant genetic prognostic markers, with particular attention to trimming or extending a Cox regression model. Adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histological subtypes of NSCLC. Fundamental differences have been demonstrated in their underlying mechanisms, which motivated us to postulate the existence of specific genes related to the prognosis of each histological subtype.

RESULTS: In this article, we propose a simple filter feature selection algorithm with a Cox regression model as the base. Applying this method to real-world microarray data identifies a histology-specific prognostic gene signature. Furthermore, the resulting 32-gene (32/12 for AC/SCC) prognostic signature for early-stage AC and SCC samples has superior predictive ability relative to two relevant prognostic signatures, and performs comparably to signatures obtained by applying two state-of-the-art algorithms separately to AC and SCC samples.

CONCLUSIONS: Our proposal is conceptually simple and straightforward to implement. Furthermore, it can be easily adapted and applied to a range of other research settings.

REVIEWERS: This article was reviewed by Leonid Hanin (nominated by Dr. Lev Klebanov), Limsoon Wong and Jun Yu.

Notes/Citation Information

Published in Biology Direct, v. 10, article 15, p. 1-17.

© 2015 Tian et al.; licensee BioMed Central.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

s13062-015-0051-z-s1.doc (106 kB)
Supplementary materials

s13062-015-0051-z-s2.jpeg (1020 kB)
Figure S1: Constructed functional protein-protein networks for the 32-, 15-, and 13-gene prognostic signatures.
