In the era of big data, researchers developing statistical models face the challenge of achieving parsimony. Usually, some form of dimension reduction strategy is employed. Classic strategies often take the form of traditional inference procedures, such as hypothesis testing; however, increased computing capabilities have led to the development of more sophisticated methods. In particular, sufficient dimension reduction has emerged as an area of broad and current interest. While these dimension reduction strategies have been applied to numerous data problems, they are rarely discussed in the context of analyzing survey data. This paper provides an overview of some classic and modern dimension reduction methods, followed by a discussion of how to use the transformed variables when analyzing survey data. We illustrate some of these methods with an analysis of health insurance coverage using the US Census Bureau’s 2015 Planning Database.

Notes/Citation Information

Published in Journal of Big Data, v. 4, 43, p. 1-19.

© The Author(s) 2017.

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funding Information

JW was supported as a Research Assistant by NSF Grant SES-1562503 for the duration of this research.

Related Content

The 2015 Planning Database (PDB) is a publicly available Census Bureau dataset, accessible at http://goo.gl/LlcwY7. All R code used to analyze the data is available as Additional files 1 and 2.

40537_2017_103_MOESM1_ESM.pdf (730 kB)
Additional file 1.

40537_2017_103_MOESM2_ESM.r (28 kB)
Additional file 2.