Abstract

Recently, structural variation in the genome has been implicated in many complex diseases. Using genomewide single nucleotide polymorphism (SNP) arrays, researchers are able to investigate the impact not only of SNP variation, but also of copy-number variants (CNVs) on the phenotype. The most common analytic approach involves estimating, at the level of the individual genome, the underlying number of copies present at each location. Once this is completed, tests are performed to determine the association between copy number state and phenotype. An alternative approach is to carry out association testing first, between phenotype and raw intensities from the SNP array at the level of the individual marker, and then aggregate neighboring test results to identify CNVs associated with the phenotype. Here, we explore the strengths and weaknesses of these two approaches using both simulations and real data from a pharmacogenomic study of the chemotherapeutic agent gemcitabine. Our results indicate that pooled marker-level testing is capable of offering a dramatic increase in power (> 12-fold) over CNV-level testing, particularly for small CNVs. However, CNV-level testing is superior when CNVs are large and rare; understanding these tradeoffs is an important consideration in conducting association studies of structural variation.

Document Type

Article

Publication Date

4-6-2012

Notes/Citation Information

Published on PLOS One, v. 7, issue. 6, e34262.

© 2012 Breheny et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Digital Object Identifier (DOI)

http://dx.doi.org/10.1371/journal.pone.0034262

Share

COinS