Adoption of electronic health records (EHRs) by clinical practices and hospitals in the US has increased substantially since 2009 and offers population health researchers opportunities to access rich structured and unstructured clinical data on large, diverse, and geographically distributed populations. However, because EHRs are intended for clinical and administrative use, the data must be curated for effective use in research. We describe EHRs, examine their use in population health research, and compare the strengths and limitations of these applications with those of traditional epidemiologic methods.

To date, EHR data have primarily been used to validate prior findings, to study specific diseases and population subgroups, to examine environmental and social factors and stigmatized conditions, to develop and implement predictive models, and to evaluate natural experiments. Although primary data collection may provide more reliable data and better population retention, EHR-based studies are less expensive and require less time to complete. In addition, the large patient samples that can be readily identified from EHR data enable researchers to evaluate multiple risk factors and/or outcomes simultaneously while maintaining study power.

Beyond these current advantages, improved capture of social, behavioral, environmental, and genetic data, together with the use of natural language processing, clinical biobanks, and smartphone-based personal sensing, should further enable EHR researchers to understand complex diseases with multifactorial etiologies. Integrating emerging technologies with clinical care could lead to innovative approaches to precision public health, reduce health care spending on individuals, and directly improve population health.
