BACKGROUND: Genome-wide association studies (GWAS) have proven successful in predicting genetic risk of disease using single-locus models; however, identifying single nucleotide polymorphism (SNP) interactions at the genome-wide scale is limited due to computational and statistical challenges. We addressed the computational burden encountered when detecting SNP interactions for survival analysis, such as age of disease-onset. To confront this problem, we developed a novel algorithm, called the Efficient Survival Multifactor Dimensionality Reduction (ES-MDR) method, which used Martingale Residuals as the outcome parameter to estimate survival outcomes, and implemented the Quantitative Multifactor Dimensionality Reduction method to identify significant interactions associated with age of disease-onset.

METHODS: To demonstrate efficacy, we evaluated this method on two simulation data sets to estimate the type I error rate and power. Simulations showed that ES-MDR identified interactions using less computational workload and allowed for adjustment of covariates. We applied ES-MDR on the OncoArray-TRICL Consortium data with 14,935 cases and 12,787 controls for lung cancer (SNPs = 108,254) to search over all two-way interactions to identify genetic interactions associated with lung cancer age-of-onset. We tested the best model in an independent data set from the OncoArray-TRICL data.

RESULTS: Our experiment on the OncoArray-TRICL data identified many one-way and two-way models with a single-base deletion in the noncoding region of BRCA1 (HR 1.24, P = 3.15 × 10–15), as the top marker to predict age of lung cancer onset.

CONCLUSIONS: From the results of our extensive simulations and analysis of a large GWAS study, we demonstrated that our method is an efficient algorithm that identified genetic interactions to include in our models to predict survival outcomes.

Document Type


Publication Date


Notes/Citation Information

Published in BMC Medical Genomics, v. 13, issue 1, article no. 162.

© The Author(s) 2020

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Digital Object Identifier (DOI)


Funding Information

This publication was funded in part by the following: National Institute of Health (NIH), National Library of Medicine (NLM) Institute research grants R01LM012012 and T32LM012204, National Cancer Institute (NCI) grant U19CA203654, and the Cancer Prevention Research Institute of Texas (CPRIT) RR170048. NIH-NLM R01LM012012, NCI U19CA203654 and CPRIT RR170048 funded the design, data collection, analysis and interpretation of the study and in writing the manuscript. NIH-NLM T32LM012204 funded the analysis and interpretation of the data and in writing the manuscript.

Related Content

The datasets generated and/or analyzed during the current study are available in the NCBI dbGaP repository, dbGaP Study Accession: phs001273.v3.p2, website: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001273.v3.p2.