The United States is one of the largest per capita water withdrawers in the world, and certain parts of it, especially the western region, have long experienced water scarcity. Historically, the U.S. relied on large water infrastructure investments and planning to solve its water scarcity problems. These large-scale investments, as well as water planning activities, rely on water forecast studies conducted by water managing agencies. These forecasts, while key to the sustainable management of water, are usually produced using historical growth extrapolation, conventional econometric approaches, or legacy software packages, and often do not utilize methods common in the field of statistical learning. The objective of this study is to illustrate the extent to which forecast outcomes for commercial, institutional, and industrial water use may be improved with a relatively simple adjustment to forecast model selection. To do so, we estimate over 352 thousand regression models with retailer-level panel data from the largest utility in the U.S., featuring a rich set of variables to model commercial, institutional, and industrial water use in Southern California. We then compare the out-of-sample forecasting performance of the models that rank within the top 5% based on various in- and out-of-sample goodness-of-fit criteria. We demonstrate that models with the best in-sample fit yield, on average, larger out-of-sample forecast errors and are subject to a significant degree of variation in forecasts. We find that out-of-sample forecast error and the variability in the forecast values can be reduced by an order of magnitude with a relatively straightforward change in the model selection criteria, even when the forecast modelers do not have access to “big data” or utilize state-of-the-art machine learning techniques.
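The contrast between in-sample and out-of-sample model selection can be illustrated with a minimal sketch. It does not reproduce the study's models or data: the data are synthetic, the six candidate regressors are hypothetical, and the function names (`fit_ols`, `rmse`, `rank_models`) are assumptions made for illustration. The sketch fits an ordinary least squares model for every non-empty subset of regressors, then compares the model chosen by in-sample error with the one chosen by holdout error.

```python
import itertools
import numpy as np


def fit_ols(A, y):
    """Least-squares coefficients for y ~ A (A already holds an intercept column)."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rank_models(X_train, y_train, X_valid, y_valid):
    """Fit every non-empty subset of columns; record in-sample and holdout RMSE."""
    n_features = X_train.shape[1]
    results = []
    for k in range(1, n_features + 1):
        for cols in itertools.combinations(range(n_features), k):
            idx = list(cols)
            A_train = np.column_stack([np.ones(len(y_train)), X_train[:, idx]])
            A_valid = np.column_stack([np.ones(len(y_valid)), X_valid[:, idx]])
            coef = fit_ols(A_train, y_train)
            results.append({
                "cols": cols,
                "rmse_in": rmse(y_train, A_train @ coef),
                "rmse_out": rmse(y_valid, A_valid @ coef),
            })
    return results


# Synthetic data (an assumption, not the study's data): only the first two
# of six candidate regressors actually drive the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=1.0, size=100)

# Hold out the last 40 observations to mimic an out-of-sample forecast exercise.
X_train, X_valid = X[:60], X[60:]
y_train, y_valid = y[:60], y[60:]

results = rank_models(X_train, y_train, X_valid, y_valid)
best_in = min(results, key=lambda r: r["rmse_in"])    # selected on in-sample fit
best_out = min(results, key=lambda r: r["rmse_out"])  # selected on holdout error

print("models fitted:", len(results))
print("best in-sample :", best_in["cols"], round(best_in["rmse_out"], 3))
print("best holdout   :", best_out["cols"], round(best_out["rmse_out"], 3))
```

Because in-sample error can only fall as regressors are added to a nested least-squares model, the in-sample criterion tends to favor the largest specification regardless of its forecasting value. In practice, a holdout error used for selection is itself optimistic, so a separate test period would be needed to report forecast accuracy.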

Notes/Citation Information

Published in Sustainability, v. 12, no. 10, 3995, p. 1-21.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland.

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
