DNA sequence variation can influence disease risk and response to drug therapy by altering gene expression, RNA processing, or the amino acid sequence of proteins. Genome-wide association studies use DNA microarrays to investigate the effects of millions of common DNA sequence variants in the human genome, the most common type of which is the single nucleotide polymorphism (SNP). The genome-wide association approach is statistically powerful [27] and has led to the discovery of many new genetic variants that underlie variation in human traits (www.genome.gov/26525384), including a number of cancers [28]. Several of these disease variants influence the risk of multiple diseases, including shared risk variants for several common cancers [29]. Thus, genome-wide association studies may have important implications for drug development by helping to identify novel therapeutic targets and genetic biomarkers for drug discovery [30].
Genome-wide association studies have become possible because of several recent technological advances. Improvements in DNA microarray technology have rapidly reduced the cost of genotyping SNPs, allowing up to one million SNPs to be tested on a single microarray. At the same time, the HapMap Project validated nearly four million SNPs in multiple diverse populations and determined the extent of linkage disequilibrium (LD) between SNPs [31]. LD refers to the non-random association of alleles at different SNPs, and is typically strongest between SNPs that lie close together. The presence of LD allows SNPs on genotyping platforms to serve as proxies for other nearby SNPs [32]. As a result, current DNA microarrays can assay, directly or by proxy, most common SNPs in the HapMap. In this way, LD reduces both the genotyping costs of genome-wide association studies and the multiple testing burden (see the Biostatistical Analysis section below).
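To make the notion of a proxy SNP concrete, the following minimal Python sketch computes two standard pairwise LD measures, the coefficient D and the squared correlation r², from phased haplotypes. The function name `ld_r2`, the 0/1 allele coding, and the toy haplotypes are assumptions made for illustration only; they are not drawn from the HapMap data or from any particular genotyping platform.

```python
def ld_r2(haplotypes):
    """Return (D, r2) for two biallelic SNPs.

    haplotypes: list of (a, b) pairs, one per phased chromosome,
    with each allele coded 0 or 1 (an illustrative coding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    # Frequency of allele 1 at each SNP, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two SNPs were independent.
    d = p_ab - p_a * p_b

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom


# Toy data: the two SNPs always travel together, so one tags the other.
linked = [(1, 1)] * 4 + [(0, 0)] * 6
# Toy data: every allele combination is equally frequent (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

d_linked, r2_linked = ld_r2(linked)
d_unlinked, r2_unlinked = ld_r2(unlinked)
```

In this sketch an r² near 1 means that genotyping one SNP captures nearly all the information at the other, which is the situation in which a SNP on the microarray can stand in for an untyped neighbor; an r² near 0 means it cannot.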
Genome-wide association studies are better suited to detecting associations between disease and common variants, typically defined as those with a minor allele frequency greater than 5%, than associations with rare variants [33]. Because strongly deleterious alleles are likely to face selection pressure, variants with large effects tend to be rare, whereas common variants tend to have more modest effects on gene function. Most variants associated with disease in genome-wide association studies are common and have modest or small effects. Because of this, individual variants will not serve as strong predictors of disease risk [34]. Because they are common, however, they may account for a large share of disease risk in the population, a quantity known as the population attributable fraction. For this reason, interventions developed to counteract these risk variants could substantially reduce the incidence of disease in a population.
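The contrast between individual risk and population-level impact can be illustrated with Levin's formula for the population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the population carrying the risk variant and RR is the relative risk in carriers. The sketch below is illustrative only: the function name and the example frequencies and relative risks are hypothetical values chosen for the comparison, not results from any study cited here, and Levin's formula is one standard estimator rather than necessarily the one used in the cited work.

```python
def population_attributable_fraction(carrier_frequency, relative_risk):
    """Levin's formula: fraction of cases attributable to a risk factor.

    carrier_frequency: proportion of the population carrying the variant.
    relative_risk: disease risk in carriers divided by risk in non-carriers.
    """
    if not 0 <= carrier_frequency <= 1:
        raise ValueError("carrier_frequency must lie between 0 and 1")
    if relative_risk <= 0:
        raise ValueError("relative_risk must be positive")

    excess = carrier_frequency * (relative_risk - 1)
    return excess / (1 + excess)


# Hypothetical common variant with a modest effect.
paf_common = population_attributable_fraction(0.30, 1.2)
# Hypothetical rare variant with a large effect.
paf_rare = population_attributable_fraction(0.001, 5.0)

print(f"common, modest effect: {paf_common:.1%}")
print(f"rare, large effect:    {paf_rare:.1%}")
```

Under these assumed values the common variant accounts for roughly 6% of cases and the rare variant for well under 1%, even though a carrier of the rare variant faces a much higher individual risk.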