This paper details the steps needed to assess and control data quality in genome-wide association studies (GWAS), which use genotype information on a large number of genetic markers for a large number of individuals. Because study designs and genotyping platforms vary across sites and projects, and because genotyping errors can occur, it is important to ensure high-quality data. Scripts and directions are provided to help others carry out this process.

Notes/Citation Information

Published in F1000Research, v. 5, article 1889, p. 1-9.

© 2016 Ellingson SR and Fardo DW.

This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Funding Information

This work was supported by the National Institutes of Health (NIH) National Center for Advancing Translational Science grant KL2TR000116 and the University of Kentucky Center for Computational Sciences.

Related Content

Software Availability

Zenodo: GWAS: Automated GWAS QC, doi: 10.5281/zenodo.58228.

GitHub: https://github.com/sallyrose0425/GWAS, https://github.com/sallyrose0425/GWAS/blob/master/LICENSE