Genet Epidemiol. Author manuscript; available in PMC 2012 April 1.
Published in final edited form as:
Genet Epidemiol. 2011 April; 35(3): 201–210.
Published online 2011 February 9. doi: 10.1002/gepi.20569
PMCID: PMC3076801

Sample Size Requirements to Detect Gene-Environment Interactions in Genome-wide Association Studies


Many complex diseases are likely to be a result of the interplay of genes and environmental exposures. The standard analysis in a genome-wide association study (GWAS) scans for main effects and ignores the potentially useful information in the available exposure data. Two recently proposed methods that exploit environmental exposure information involve a two-step analysis aimed at prioritizing the large number of SNPs tested to highlight those most likely to be involved in a G×E interaction. For example, Murcray et al (2009) proposed screening on a test that models the G-E association induced by an interaction in the combined case-control sample. Alternatively, Kooperberg et al (2008) suggested screening on genetic marginal effects. In both methods, SNPs that pass the respective screening step at a pre-specified significance threshold are followed up with a formal test of interaction in the second step. We propose a hybrid method that combines these two screening approaches by allocating a proportion of the overall genomewide significance level to each test. We show that the Murcray et al. approach is often the most efficient method, but that the hybrid approach is a powerful and robust method for nearly any underlying model. As an example, for a GWAS of 1 million markers including a single true disease SNP with minor allele frequency of 0.15, and a binary exposure with prevalence 0.3, the Murcray, Kooperberg and hybrid methods are 1.90, 1.27, and 1.87 times as efficient, respectively, as the traditional case-control analysis to detect an interaction effect size of 2.0.

Keywords: G×E interaction, case-control, genome-wide association study, efficiency


Many complex diseases (e.g. asthma, diabetes) are likely to be a result of the interplay of genes and environmental exposure[Barrett 2008; Blumenthal 2005; Boks, et al. 2007; Chamberlain, et al. 2006; Cookson 1999; Edwards and Myers 2007; Grarup and Andersen 2007; Hamet, et al. 1998]. Although current genome-wide association studies (GWAS) have identified many potentially important genetic associations that will advance our understanding of human disease, these studies have done little to investigate associations beyond main effects. This is partly due to the lack of powerful methods to identify important genes that interact with other genes or environmental exposures. Recently proposed methods to detect interactions in association studies have shown increased power relative to commonly used approaches[Kooperberg and Leblanc 2008; Li and Conti 2009; Mukherjee and Chatterjee 2008; Murcray, et al. 2009] but have done little to guide investigators in the use of their methods in real studies.

Many GWAS currently underway or completed have been conducted on samples with large amounts of existing environmental data[Herbeck, et al. 2010; Hunter, et al. 2007; Ising, et al. 2009; Scott, et al. 2007; van den Oord, et al. 2008]. The standard GWAS scans for genetic marginal effects and ignores the potentially useful information in the available environmental exposure data. It has been shown that power can be gained by accounting for possible gene × environment (G×E) interactions when scanning for marginal effects[Gauderman and Siegmund 2001; Kraft, et al. 2007]. Therefore, by omitting environmental data in the analysis of a GWAS, investigators might miss possible genetic associations that are modified by some environmental exposure. In addition, as the environmental data are often already available for these large-scale studies, additional testing for interactions to identify novel genetic markers beyond those that would be detected by marginal effect testing alone is very cost effective.

In this paper, we review current methods to detect G×E interactions in case-control studies and propose a hybrid analysis approach that combines two methods shown to be powerful to detect interactions in large-scale GWA studies[Kooperberg and Leblanc 2008; Murcray, et al. 2009]. We assume that a primary marginal genetic effect scan will always be the first analysis for a GWAS and aim to compare methods to detect interactions that would fail to be identified at a strict genome-wide significance level by the marginal effect scan. We compare sample size requirements to achieve adequate power to detect G×E interactions for a wide range of possible scenarios. We make recommendations to investigators about study design and choice of analysis method to optimize power for a GWAS. Finally, we describe freely available software designed to calculate power and sample size for all methods described.


Let D be an indicator of disease status, with D=1 being a case and D=0, a control. Assume E represents an environmental factor of interest. We consider the situations where E is either binary or continuously distributed. For the binary case, E can be either 1 or 0, indicating presence or absence of the environmental exposure, and we define pE as the exposure prevalence, Pr(E=1). For a continuous E, we assume a log-normal distribution, such that ln(E) ~ N(μ = 0, σ2 = 1) to mimic many measures of environmental exposures that are naturally non-negative (e.g. BMI, blood glucose, ambient air pollution). The relative comparisons between methods are similar for normally distributed exposure measures.

Assume S independent single nucleotide polymorphisms (SNPs) have been genotyped on N1 cases and N0 controls (N = N1 + N0). We assume there exists a single disease susceptibility locus (DSL), G, in Hardy-Weinberg equilibrium in the population from which the study subjects are sampled, with minor allele frequency qA. We further assume G is coded additively (0, 1, or 2 minor alleles), but all tests described can easily be extended to other modes of inheritance (e.g. dominant, recessive, co-dominant). We use a logistic model to relate G and E to disease, with form

$$\operatorname{logit}\,\Pr(D = 1 \mid G, E) = \beta_0 + \beta_g G + \beta_e E + \beta_{ge}\, G E \qquad (1)$$


When E is binary, βg represents the log genetic main effect per allele in the unexposed group (E=0), βe is the log environmental main effect in the non-carriers of the susceptible genotype (G=0), and βge is the log of the ratio of the genetic odds ratios comparing exposed to unexposed, i.e. OR_{g|E=1}/OR_{g|E=0}. The baseline probability of disease for unexposed non-carriers (E=0, G=0) is exp(β0)/(1+exp(β0)). For the scenario where E is continuous, βg represents the log genetic main effect per allele in the unexposed group (E=0), βe is the log environmental main effect per unit increase in exposure in the non-carriers of the susceptible genotype (G=0), and βge is the log of the ratio of the genetic odds ratios comparing individuals with exposures differing by one unit of measure, i.e. OR_{g|E=e+1}/OR_{g|E=e}.
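As a small numerical illustration of this parameterization, the sketch below evaluates the penetrance under the logistic model for a binary exposure and confirms that the ratio of per-allele genetic odds ratios (exposed versus unexposed) equals the interaction odds ratio. The function names and default values (taken from the base model used later in the paper) are illustrative assumptions, not part of the authors' software.

```python
import math

def penetrance(g, e, p0=0.01, r_g=1.0, r_e=1.0, r_ge=2.0):
    """Pr(D=1 | G=g, E=e) under the logistic model with odds ratios r_g, r_e, r_ge."""
    b0 = math.log(p0 / (1.0 - p0))  # baseline log odds for E=0, G=0
    eta = b0 + math.log(r_g) * g + math.log(r_e) * e + math.log(r_ge) * g * e
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def genetic_or(e, **params):
    """Per-allele genetic odds ratio at exposure level e (G=1 versus G=0)."""
    return odds(penetrance(1, e, **params)) / odds(penetrance(0, e, **params))

# Ratio of genetic odds ratios, exposed versus unexposed: exp(beta_ge)
interaction_or = genetic_or(1) / genetic_or(0)
print(round(interaction_or, 6))
```

With the defaults above, the baseline penetrance is 0.01 and the recovered interaction odds ratio is the assumed value of 2.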


For case-control data, existing strategies to test for gene-environment interaction in a GWAS include those that test every SNP independently for interaction and apply a multiple testing correction for all S tests performed. Alternatively, it has been shown that one can implement a screening step to highlight SNPs likely to be involved in an interaction and correct the experiment-wise Type I error rate only for those SNPs formally tested for interaction[Kooperberg and Leblanc 2008; Murcray, et al. 2009]. Kooperberg and LeBlanc [Kooperberg and Leblanc 2008] suggested screening on genetic marginal effects in the search for G×G interactions, while Murcray et al [Murcray, et al. 2009] demonstrated that screening on a test of association between gene and environment could increase power to detect G×E interactions. These two-step tests differ in their screening strategy and therefore differ by how potentially important SNPs are identified in the screening step. Following is a description of several analytic approaches to detect G×E interactions in a GWAS.

Case-Control Test (CC)

A widely used method to test for G×E interactions in a case-control study is to apply logistic regression to test for a departure from log-additivity of the genetic and environmental factors. Specifically, one would fit model (1) and test the null hypothesis H0: βge = 0 for each of the S SNPs in turn, for example using a likelihood ratio test (with 1 degree of freedom (DF) for additive coding). To control the genomewide Type I error rate, a correction is necessary for the large number (S) of tests. We assume a Bonferroni correction; however, other corrections could be used (e.g. controlling FDR [Benjamini and Hochberg 1995], P-ACT [Conneely and Boehnke 2007]). The CC approach will be used as the reference analysis for sample size comparisons.
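The decision rule for a single SNP under this one-step scan can be sketched as follows. The likelihood ratio statistic of 32.0 is a made-up value for illustration; the helper names are assumptions. The p-value uses the identity that a chi-square statistic with 1 df is a squared standard normal, so its upper tail is erfc(sqrt(x/2)).

```python
import math

def lrt_pvalue_1df(lr_stat):
    """Upper-tail p-value of a 1-df chi-square statistic: P(Z^2 > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(lr_stat / 2.0))

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 1_000_000)  # 5e-8 for S = 1 million SNPs
significant = lrt_pvalue_1df(32.0) < threshold     # hypothetical LRT statistic
print(threshold, significant)
```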

Case-only Test (CO)

As an alternative to case-control designs, Piegorsch et al [Piegorsch, et al. 1994] proposed testing for gene-environment interaction in cases alone. They showed that the association between gene and environment in the cases alone estimates the interaction effect under the assumption of independence between the genetic and environmental factors in the population from which the study subjects were sampled. Specifically, from the following model

$$\operatorname{logit}\,\Pr(E = 1 \mid G, D = 1) = \beta_{0,co} + \beta_{co} G$$


βco is a consistent estimator of the true log interaction relative risk from a log-linear model when the assumption of G-E independence holds. For a rare disease, βco should be similar to βge from equation (1). This approach has been shown to be more powerful than the case-control analysis but is sensitive to deviations from population-level independence of G and E[Albert, et al. 2001; Chatterjee, et al. 2006; Gatto, et al. 2004; Khoury and Flanders 1996; Li and Conti 2009; Piegorsch, et al. 1994; Wang and Lee 2008]. Although there are likely to be few genes that have a true association with environment in the population, G-E associations can be induced through population structure. For this to occur, the distribution of the environmental exposure must vary across population subgroups (e.g. by ethnic subgroup) in a stratified population or across ancestral origin in an admixed population. If this occurs, then every SNP whose minor allele frequency also varies by subgroup will exhibit G-E dependence in the case sample being studied, thus yielding spurious G×E interaction findings. Adjusting for population stratification may alleviate the impact of population structure [Wang and Lee 2008] but can substantially reduce power of a case-only analysis[Bhattacharjee, et al. 2010]. Nonetheless, for studying an environmental factor that is known not to vary by population subgroups, the case-only analysis can be an efficient choice. We also assume a Bonferroni correction will be used to control the genomewide Type I error rate in a case-only analysis.
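The rare-disease approximation can be checked numerically. Under population-level G-E independence, the genotype and exposure frequencies cancel from the G-E odds ratio among cases, leaving a ratio of penetrances. The sketch below (helper names and parameter values are illustrative assumptions) compares that case-only odds ratio with an interaction odds ratio of 2 for a rare and a more common disease.

```python
import math

def penetrance(g, e, p0, r_ge, r_g=1.0, r_e=1.0):
    """Pr(D=1 | G=g, E=e) under the logistic disease model."""
    eta = (math.log(p0 / (1.0 - p0)) + math.log(r_g) * g
           + math.log(r_e) * e + math.log(r_ge) * g * e)
    return 1.0 / (1.0 + math.exp(-eta))

def case_only_or(p0, r_ge):
    """G-E odds ratio among cases (G=1 vs G=0), assuming G and E independent in the population."""
    num = penetrance(1, 1, p0, r_ge) * penetrance(0, 0, p0, r_ge)
    den = penetrance(1, 0, p0, r_ge) * penetrance(0, 1, p0, r_ge)
    return num / den

rare = case_only_or(p0=0.01, r_ge=2.0)    # close to the interaction odds ratio
common = case_only_or(p0=0.10, r_ge=2.0)  # noticeably attenuated
print(round(rare, 3), round(common, 3))
```

For a baseline risk of 0.01 the case-only odds ratio is about 1.98, while for a baseline risk of 0.10 it drops to about 1.82, which is why the agreement between βco and βge is stated for a rare disease.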

Environment-Gene Two-Step (EG2)

Instead of applying the CC analysis of interaction to all S markers, Murcray et al [Murcray, et al. 2009] proposed a two-step analysis that first screens all S markers for those SNPs most likely to be involved in G×E interactions. The motivation for this 2-step approach comes from the likelihood for case-control data, which can be written as follows:

$$\Pr(D, G, E \mid \mathrm{Asc}) = \Pr(D \mid G, E, \mathrm{Asc})\,\Pr(G, E \mid \mathrm{Asc}) \qquad (2)$$


where Asc denotes the event that a subject was ascertained in the case-control sample. The first factor is the traditional case-control likelihood, which, under a logistic regression model takes the form in equation (1). The second factor models the association between gene and environment in the combined case-control sample and is not typically used in the analysis of G×E interactions. In the presence of G×E interaction, the ascertainment of cases at a higher rate than the disease prevalence in the population from which they were sampled will induce an association between G and E in the study sample. Murcray et al[Murcray, et al. 2009] showed that the information in this part of the likelihood can be used to advantage to screen all markers for G×E interaction. Specifically, their procedure involves the following two steps:

  • Step 1 For a binary exposure, fit the logistic model:

    $$\operatorname{logit}\,\Pr(E = 1 \mid G) = \beta_{0,A} + \beta_A G \qquad (3)$$

    for all S markers to test for G-E association in the combined case-control sample. For a continuous environmental exposure, one can fit the following linear model:

    $$E = \beta_{0,A} + \beta_A G + \varepsilon \qquad (4)$$

    In either model, one can test the hypothesis H0: βA = 0 with a likelihood ratio test (with 1 or 2 df depending on the coding of G) at a pre-specified significance level αA.

  • Step 2 For those sA SNPs that pass Step 1, fit the full logistic regression model in equation (1), and test the hypothesis H0: βge = 0 at significance level α/sA, i.e. adjusting only for the number of markers formally tested in Step 2.

Since these two steps are statistically independent, as suggested by equation (2) and more formally shown in Murcray et al, the overall procedure maintains the nominal Type I error rate even in the presence of G-E association in the population. Murcray et al. assumed αA = 0.05 in most of their power calculations. In this paper we show that additional power can be gained by optimizing the choice of this Step-1 significance threshold and we show the sensitivity of the two-step procedure to the choice of αA.
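The induced G-E association that the screening step exploits can be illustrated by direct calculation. The sketch below computes the expected G-E odds ratio (G=1 versus G=0) in a combined sample with equal numbers of cases and controls, assuming G and E are independent in the population. Function names, the 1:1 sampling fraction, and the parameter defaults are illustrative assumptions.

```python
import math

def penetrance(g, e, p0, r_ge):
    """Pr(D=1 | G=g, E=e) with no main effects and interaction odds ratio r_ge."""
    eta = math.log(p0 / (1.0 - p0)) + math.log(r_ge) * g * e
    return 1.0 / (1.0 + math.exp(-eta))

def combined_sample_ge_or(p0=0.01, r_ge=2.0, q_a=0.15, p_e=0.3, case_frac=0.5):
    """Expected G-E odds ratio (G=1 vs G=0) in the pooled case-control sample."""
    geno = {0: (1 - q_a) ** 2, 1: 2 * q_a * (1 - q_a), 2: q_a ** 2}  # Hardy-Weinberg
    expo = {0: 1 - p_e, 1: p_e}
    joint = {(g, e): geno[g] * expo[e] for g in geno for e in expo}  # G, E independent
    p_d = sum(joint[k] * penetrance(k[0], k[1], p0, r_ge) for k in joint)

    def cell(g, e):
        pen = penetrance(g, e, p0, r_ge)
        in_cases = joint[(g, e)] * pen / p_d
        in_controls = joint[(g, e)] * (1 - pen) / (1 - p_d)
        return case_frac * in_cases + (1 - case_frac) * in_controls

    return cell(1, 1) * cell(0, 0) / (cell(1, 0) * cell(0, 1))

print(round(combined_sample_ge_or(), 3))            # interaction present: OR above 1
print(round(combined_sample_ge_or(r_ge=1.0), 3))    # no interaction: OR equals 1
print(round(combined_sample_ge_or(p0=0.10), 3))     # common disease: weaker induced OR
```

Under these assumptions the induced odds ratio is roughly 1.46 for a baseline risk of 0.01 and shrinks as the disease becomes more common, since case over-sampling is then less extreme relative to the population.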

Disease-Gene Two-Step (DG2)

Kooperberg et al[Kooperberg and Leblanc 2008] showed improved power to detect gene-gene interactions by screening on marginal genetic effects for the individual markers at a liberal screening p-value. Their second step was the same as Step 2 in the EG2 approach, but restricted to either a) those pairs where only one marker or b) both markers passed a marginal test for genetic effect at a pre-specified significance level αM. Either form of Step 2 would substantially reduce the total number of interaction tests conducted, and thus reduce the correction for multiple comparisons. For G×G interactions, Kooperberg and LeBlanc show that choosing αM=5×10−3 yields excellent power for a wide range of scenarios. We extend this methodology to investigate gene-environment interactions. In this scenario, one would test for G×E interaction for only those markers that show some evidence of a marginal genetic effect. The formal analysis would be as follows:

  • Step 1 Fit the logistic model:

    $$\operatorname{logit}\,\Pr(D = 1 \mid G) = \beta_{0,M} + \beta_M G \qquad (5)$$

    for all S SNPs to test for a genetic marginal association. Test the null hypothesis H0: βM = 0 with a likelihood ratio test at a pre-specified significance level αM.

  • Step 2 For those sM markers that pass Step 1, fit the full logistic regression model (equation 1), and test the hypothesis H0: βge = 0 at significance level α/sM.

This approach also allows the investigator to choose the significance level for Step 1, αM. Kooperberg and LeBlanc [Kooperberg and Leblanc 2008] show by simulation that, although Steps 1 and 2 may not be independent, any dependence is small enough that the DG2 maintains acceptable Type I error rates across a wide range of scenarios for gene-gene interactions. We have found that this also holds for G×E interactions.
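Both two-step procedures share the same decision logic once the per-SNP p-values are in hand: screen at a liberal level, then Bonferroni-correct only for the SNPs that pass. A minimal sketch, with a hypothetical function name and made-up p-values:

```python
def two_step_hits(screen_p, interaction_p, alpha_screen, alpha=0.05):
    """Return indices of SNPs declared significant by a screen-then-test procedure."""
    # Step 1: keep SNPs whose screening p-value beats the pre-specified threshold
    passed = [i for i, p in enumerate(screen_p) if p < alpha_screen]
    if not passed:
        return []
    # Step 2: Bonferroni correction only for the SNPs carried forward
    step2_threshold = alpha / len(passed)
    return [i for i in passed if interaction_p[i] < step2_threshold]

# Hypothetical p-values for five SNPs
screen_p = [1e-5, 0.2, 3e-4, 0.8, 2e-3]
interaction_p = [1e-3, 1e-9, 0.04, 0.5, 0.02]
hits = two_step_hits(screen_p, interaction_p, alpha_screen=5e-3)
print(hits)
```

In this toy example three SNPs pass Step 1, so Step 2 uses a threshold of 0.05/3, and only SNP 0 is declared significant. SNP 1 has a very small interaction p-value but is never tested because it fails the screen, which is the price paid for the reduced multiple-testing burden.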

Hybrid Two-Step (H2)

Finally, we explore a novel 2-step approach that combines the EG2 and DG2 methods into a single analysis by allocating a fraction of the total genomewide Type I error rate to each scan (Figure I). Specifically, any SNP that shows a G-E association in the combined case-control sample at an αA significance level, or a genetic marginal effect at an αM significance level will be formally tested for G×E interaction in Step 2. The allocation of the overall α is defined to be ρα for the Environment-Gene Two-Step (EG2) and (1−ρ)α for the Disease-Gene Two-Step (DG2) procedure, where ρ can take any value between 0 and 1. Specifically, if a SNP passes the screening step of the EG2 method but not Step 1 of the DG2, it would be formally tested in equation (1) at ρα/sA. Similarly, if a SNP passes the screening step of the DG2 method but not Step 1 of the EG2, it would be formally tested in equation (1) at (1−ρ)α/sM. Lastly, if this SNP were to pass both the G-E and marginal G screening steps, it would be tested at the more liberal threshold for Step 2, i.e. p < max(ρα/sA, (1−ρ)α/sM). Under the null hypothesis, this overlap will be negligible. A simple Bonferroni inequality argument shows that the hybrid two-step method controls the genomewide Type I error level at α.
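The hybrid rule described above can be sketched in a few lines. The function and argument names are hypothetical, and the p-values are invented for illustration; the thresholds follow the allocation ρα/sA and (1−ρ)α/sM, with the more liberal of the two applied to SNPs that pass both screens.

```python
def hybrid_hits(p_ge_assoc, p_marginal, p_interaction,
                alpha_a, alpha_m, rho=0.5, alpha=0.05):
    """Indices of SNPs declared significant by the hybrid two-step rule."""
    pass_a = {i for i, p in enumerate(p_ge_assoc) if p < alpha_a}  # G-E screen
    pass_m = {i for i, p in enumerate(p_marginal) if p < alpha_m}  # marginal screen
    thr_a = rho * alpha / len(pass_a) if pass_a else 0.0
    thr_m = (1.0 - rho) * alpha / len(pass_m) if pass_m else 0.0
    hits = []
    for i in sorted(pass_a | pass_m):
        # A SNP passing both screens gets the more liberal Step 2 threshold
        thr = max(thr_a if i in pass_a else 0.0, thr_m if i in pass_m else 0.0)
        if p_interaction[i] < thr:
            hits.append(i)
    return hits

# Hypothetical p-values for four SNPs
p_ge = [1e-5, 0.3, 0.5, 2e-5]
p_marg = [0.4, 1e-4, 0.6, 5e-4]
p_int = [0.01, 0.02, 1e-6, 0.009]
print(hybrid_hits(p_ge, p_marg, p_int, alpha_a=1e-4, alpha_m=1e-3, rho=0.5))
print(hybrid_hits(p_ge, p_marg, p_int, alpha_a=1e-4, alpha_m=1e-3, rho=0.1))
```

Changing ρ shifts the budget between the two arms: with ρ = 0.5 SNPs 0 and 3 are detected, whereas with ρ = 0.1 the marginal arm receives most of α and SNPs 1 and 3 are detected instead.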

Figure I
Testing Strategy for Hybrid Two-Step Approach


To compare the methods described above, we will compute the sample size required to achieve 80% power to detect a true G×E interaction effect. We report relative efficiency of each method compared to the ‘Case-Control’ test of interaction, defined as the ratio of required sample sizes (RE = NCC/N1, where NCC = number of cases required for the CC method and N1 = number of cases required for the comparison approach). A ratio greater than 1.0 indicates that the comparison approach makes more efficient use of the available data than the CC approach, or equivalently that the comparison approach requires a lower sample size to achieve the same power.

We compute sample size based on direct calculation of the non-centrality parameter conditional on the expected genotype and exposure distribution given assumed population parameters and disease model parameterization [Greenland 1985]. Specifically, all tests described are based on a likelihood ratio test of the form

$$\Lambda = 2\left[\ell(\hat{\beta}_1) - \ell(\hat{\beta}_0)\right] \qquad (6)$$


where β̂1 maximizes the log likelihood ℓ(β) under the alternative hypothesis and β̂0 maximizes the log likelihood under the null. Power and sample size can be computed from the non-centrality parameter, which can be approximated by [Brown, et al. 1999; Self, et al. 1992]

$$\delta \approx 2 \sum_{D, G, E} f(D, G, E)\left[\ell(\hat{\beta}_1; D, G, E) - \ell(\hat{\beta}_0; D, G, E)\right] \qquad (7)$$


In (7), β̂1 and β̂0 maximize the expected log likelihood under the alternative and null hypotheses, respectively, and f(D, G, E) is the fraction of individuals in the sample with disease status D, genotype G, and exposure status E. Gauderman [Gauderman 2002] provides additional details on how to calculate the non-centrality parameter from the expected cell counts. Sample size for the one-step methods (i.e. CC, CO) is calculated by the simple formula N1 = (z_{a/2} + z_b)²/δ, where a is the significance level, 1−b is the desired power, and z_x denotes the upper x quantile of the standard normal distribution. Because no closed-form solution exists for the required sample size of the two-step methods, these are calculated numerically using a simple secant method [Rheinboldt 1998].
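The calculation described above can be sketched as follows. This is not the authors' software: the per-subject non-centrality values DELTA_TEST and DELTA_SCREEN are hypothetical inputs rather than values derived from the disease model, the two steps are treated as independent so that their powers multiply, and the expected number of SNPs passing Step 1 is approximated by the null SNPs passing by chance plus the disease SNP. All function names are illustrative.

```python
from statistics import NormalDist

NORM = NormalDist()

def power_1df(ncp, alpha):
    """Power of a 1-df chi-square test with non-centrality ncp at level alpha."""
    z = NORM.inv_cdf(1.0 - alpha / 2.0)
    root = ncp ** 0.5
    return NORM.cdf(root - z) + NORM.cdf(-root - z)

def one_step_n(delta, alpha, power=0.80):
    """Closed-form sample size N = (z_{a/2} + z_b)^2 / delta for a one-step test."""
    z_a = NORM.inv_cdf(1.0 - alpha / 2.0)
    z_b = NORM.inv_cdf(power)
    return (z_a + z_b) ** 2 / delta

def two_step_power(n, delta_screen, delta_test, alpha_screen, n_snps, alpha=0.05):
    """Approximate power of a screen-then-test procedure (steps assumed independent)."""
    p_screen = power_1df(n * delta_screen, alpha_screen)
    expected_pass = max(1.0, alpha_screen * (n_snps - 1) + p_screen)
    p_test = power_1df(n * delta_test, alpha / expected_pass)
    return p_screen * p_test

def secant_root(f, x0, x1, tol=1e-6, max_iter=100):
    """Find a root of f by the secant method, starting from x0 and x1."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break
        x2 = max(1.0, x1 - f1 * (x1 - x0) / (f1 - f0))  # keep the sample size positive
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1

S, ALPHA = 1_000_000, 0.05
DELTA_TEST, DELTA_SCREEN = 0.015, 0.020  # hypothetical per-subject non-centrality values

n_cc = one_step_n(DELTA_TEST, ALPHA / S)
n_two = secant_root(
    lambda n: two_step_power(n, DELTA_SCREEN, DELTA_TEST, 1e-4, S) - 0.80,
    n_cc, 0.5 * n_cc)
print(round(n_cc), round(n_two), round(n_cc / n_two, 2))
```

The one-step sample size serves as a natural starting point for the secant iteration because the two-step requirement is expected to lie below it whenever the screen is informative.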

We define a ‘base’ model scenario and modify each parameter setting individually to determine the effects of each parameter on sample size comparisons. For the base model, we assume a binary exposure (E) with exposure prevalence (pE) 0.3 and that S = 1 million SNPs are genotyped. We assume that the true DSL has minor allele frequency (qA) of 0.15 and an additive (0,1,2) coding of allelic effect in the disease model. We assume a baseline disease prevalence of p0 = 0.01, no genetic or environmental main effects (i.e. Rg = Re = 1), and an interaction effect size of Rge = 2. For this base model, we also assume significance thresholds for Step 1 of the EG2 and DG2 approaches to be αA = αM = 1.0E-4. Additionally, we assume equal allocation of α to each arm of the Hybrid method (ρ=0.5).

Power of the Environment-Gene (EG2) and Hybrid Two-Step (H2) tests is dependent on the number of spurious associations that are passed to Step 2. Specifically, as the number of markers passed (sA) increases, the correction for multiple testing is stricter and power declines. To understand the effect this has on sample size requirements for these approaches, we introduce a parameter pge denoting the proportion of the S markers that have a detectable G-E association in the population at the specified significance threshold for Step 1 of the EG2 approach (αA). These detectable G-E associations could arise through true causal relationships, such as genes that predispose individuals to smoke, or through non-causal associations arising from population stratification. Although we would expect few true causal associations, population stratification could induce many non-causal relationships in a GWAS of one million SNPs. We will examine the effect of increasing pge on the required sample size (N1) of the methods. We will also compare the performance of these procedures for a continuous environmental factor.
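The way pge tightens the Step 2 correction can be made concrete with a short calculation. The sketch assumes that every marker with a detectable population-level G-E association passes Step 1 and that the remaining markers pass at the null rate αA; it ignores the single disease SNP. The function names are hypothetical.

```python
def expected_step1_pass(n_snps, alpha_a, p_ge):
    """Expected number of SNPs passing the G-E screen."""
    # Markers with a detectable G-E association pass; the rest pass by chance at rate alpha_a
    return n_snps * (p_ge + (1.0 - p_ge) * alpha_a)

def step2_threshold(n_snps, alpha_a, p_ge, alpha=0.05):
    """Bonferroni-corrected Step 2 significance level."""
    return alpha / expected_step1_pass(n_snps, alpha_a, p_ge)

for p_ge in (0.0, 0.01, 1.0):
    s_a = expected_step1_pass(1_000_000, 1e-4, p_ge)
    print(p_ge, round(s_a), step2_threshold(1_000_000, 1e-4, p_ge))
```

With S = 1 million and αA = 1.0E-4, about 100 SNPs reach Step 2 when pge = 0, about 10,099 when pge = 0.01, and all 1 million when pge = 1, at which point the Step 2 threshold equals the one-step Bonferroni level of 5E-8.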


The Case-Only (CO) analysis was always the most efficient choice to test for G×E interaction across a range of interaction effect sizes (Figure II), which is not surprising given past reports [Khoury and Flanders 1996; Li and Conti 2009; Piegorsch, et al. 1994; Wang and Lee 2008]. Although this approach is potentially biased in practice, we include it in the comparison of relative efficiencies to show a lower bound for sample size requirements for a GWAS. For small interaction effects (1.5–2.0), there are noticeable differences in the required sample sizes to achieve 80% power for the two-step approaches. To detect an Rge = 1.5, the least efficient two-step strategy is to screen on marginal effects, with a required sample size of N1 = 7,932 cases (and N0 = 7,932 controls) that is only slightly more efficient than the standard CC approach. On the other hand, the Environment-Gene Two-Step (EG2) approach requires 4,468 cases under the same assumed parameter settings. As the interaction effect size gets larger, the relative efficiencies (RE) of the methods remain relatively constant compared to the Case-Control (CC) test (RE_DG2 ≈ 1.85, RE_H2 ≈ 1.95, RE_EG2 ≈ 2.08, RE_CO ≈ 2.72).

Figure II
Sample Size Required to Achieve 80% Power for Tests of Gene-Environment Interaction in a Genome-wide Association Study by Interaction Effect Size for Binary Environmental Exposure

The EG2 test was often more efficient than the DG2 across a wide range of parameter settings (Table I). The RE of the latter method was more sensitive to assumptions about exposure prevalence and minor allele frequency than the EG2 or Hybrid (H2) approaches, as these impact the strength of the induced marginal genetic effect tested in Step 1. Specifically, if the exposure is rare (pE = 0.1), the RE of the DG2 test is 0.36, less efficient than the traditional one-step scan (CC). On the other hand, the DG2 is the most efficient of the two-step methods (RE = 2.0) for a more common exposure, pE = 0.5. Minor allele frequencies (qA), exposure prevalence (pE), and main effects (Rg, Re) have little effect on the RE of the EG2 and H2 approaches. However, the EG2 test is sensitive to disease prevalence, with higher RE for rare diseases (RE=1.90 when p0=0.01) than for a more common disease (RE=1.37 when p0=0.10). The relative efficiency of the EG2 decreased dramatically when a large number of markers had a G-E association in the population (pge>0). For example, if 10,000 markers (pge = 0.01) have a detectable G-E association in Step 1, the RE of the EG2 approach falls from 1.90 to 1.34, an absolute decrease of 0.56 (about 29% in relative terms). As the proportion of markers increases to 100%, the relative efficiency of the EG2 approaches 1.0, i.e. requiring equal sample size to that of the CC approach. The H2 has a similar trend, with decreasing efficiency for increasing number of markers with a G-E association; however, the decrease is more gradual (a decrease of 0.28 in RE for pge = 0.01). In general, the H2 is a robust approach that provides either the best or nearly the best efficiency across a wide range of models.

Table I
Sample Size (N) and Relative Efficiency (RE) Required to Achieve 80% Power to Detect a True Gene-Environment Interaction for a Collection of Testing Strategies across a Range of Parameter Settings

Both the EG2 and DG2 scans can be optimized for a set of assumed population parameters as a function of the Step 1 significance thresholds, αA and αM. The EG2 was more efficient for stricter significance thresholds, αA = 1.0E-05 for the base model parameters (RE = 1.89) (Table I). Conversely, the DG2 analysis was more efficient for a more liberal screening threshold, αM = 1.0E-03 (RE = 1.41). There does exist a single optimal choice for both αA and αM for a single set of parameter settings. However, as many of these parameters are unknown at the time of study design and analysis, a robust choice can be made across minor allele frequencies and penetrance models. For the base model, the H2 can be optimized across both significance thresholds, with the required sample size across choices for αM being flatter across a range of αA near the optimal choice (Figure III). Specifically, for αM ∈ (6.0E-04, 1.3E-03) and αA ∈ (8.0E-06, 1.1E-05) the minimum sample size required to achieve 80% power is 1,289 (RE = 2.04). Although these choices of αA and αM define the region of optimal efficiency, the choices of significance thresholds for the screening steps (αA, αM) are robust across a fairly large window around the optimal. For example, for the range considered in Figure III, the least efficient choice for these parameters (αA = 1.0E-06, αM = 5E-03) still had a relative efficiency of 1.93 relative to the traditional case-control (CC) test for interaction (N1 = 1,364, NCC = 2,632).
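The relative efficiencies quoted in this paragraph follow directly from the reported sample sizes, using RE = NCC/N1. A short check (the function name is illustrative):

```python
def relative_efficiency(n_cc, n_method):
    """Ratio of cases required by the case-control test to cases required by the method."""
    return n_cc / n_method

re_optimal = relative_efficiency(2632, 1289)  # H2 at the optimal thresholds
re_worst = relative_efficiency(2632, 1364)    # H2 at the least efficient thresholds considered
print(round(re_optimal, 2), round(re_worst, 2))  # 2.04 1.93
```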

Figure III
Sample Size Required to Achieve 80% Power for the Hybrid Two-Step Test of Gene-Environment Interaction in a Genome-wide Association Study by Step 1 Significance Thresholds for the Disease-Gene and Environment-Gene Two-Step Tests for a Binary Exposure ...

The H2 approach is robust to the choice of ρ, the allocation of the overall Type I error rate to the EG2 method, with the relative efficiency remaining relatively flat across a range of interaction effect sizes (Figure IV). This pattern holds across a wide range of parameter settings (Table II). Generally, the H2 has the highest efficiency when ρ ≥ 0.5, except when there exists a non-zero genetic main effect (Rg>1.0), a common exposure (pE=0.5), or when more controls than cases are sampled from the population (e.g. case:control ratio = 1:3). The H2 is often most powerful when ρ=0.9 and is robust to many unknown population parameters, e.g. minor allele frequency and genetic main effect. Except when there is a sizeable genetic main effect (Rg≥1.3) or a rare exposure (pE=0.1), the H2 approach was always more powerful than the EG2 or DG2 alone for some value of ρ. Although the H2 approach is often most efficient for an unbalanced allocation of α (ρ≠0.5), in practice, a balanced allocation (ρ=0.5) might be a natural choice to implement. For the parameter settings we considered in Table II, the largest difference between the relative efficiencies for the H2 design with ρ=0.5 compared to the optimal choice of ρ is 0.14, when there exists a sizeable genetic main effect (Rg≥1.3).

Figure IV
Relative Efficiency (RE) to Achieve 80% Power to Detect a True Gene-Environment Interaction for the Hybrid Two-Step Analysis for Various Allocations (ρ) of α to the Environment-Gene Two-Step Across a Range of Interaction Effect Sizes.
Table II
Sample Size (N) Required and Relative Efficiency (RE) to Achieve 80% Power to Detect a True Gene-Environment Interaction for the Hybrid Two-Step Analysis for Various Allocations (ρ) of α to the Environment-Gene Two-Step.

All of the tests described can be applied to test for interaction between G and a continuous environmental factor. In general, the relative efficiencies of the methods are similar to the binary E situation. Under our base model parameters, the relative efficiencies of the two-step methods are similar when the interaction effect size is modest (Rge = 1.15) (Figure V). Beyond an interaction effect size of about 1.3, the EG2, DG2 and H2 tests all level off at approximately twice the efficiency of the traditional CC test as the interaction effect grows (RE_EG2 = 2.12, RE_DG2 = 2.02, RE_H2 = 1.95).

Figure V
Sample Size Required to Achieve 80% Power for Tests of Gene-Environment Interaction in a Genome-wide Association Study by Interaction Effect Size for a Continuous Environmental Exposure in the Absence of a Genetic Main Effect (Rg = 1.0).


We demonstrated that increased efficiency for testing G×E interactions can be achieved by screening the large number of markers tested using G-E association or genetic marginal screening steps. We proposed a novel hybrid approach that combines the strengths of both screening procedures and showed that it provides increased efficiency across a wide range of parameters. We described how the optimal power can be determined for the two-step tests as a function of the Step 1 significance thresholds (αA, αM) for individual study designs and parameters.

As has been shown previously[Khoury and Flanders 1996; Kraft, et al. 2007; Li and Conti 2009; Piegorsch, et al. 1994; Wang and Lee 2008], the case-only analysis was always more efficient than other methods for detecting G×E interactions. However, this test relies on the critical assumption of G-E independence for all markers tested. Failure to adhere to this assumption results in inflated Type I error rates, and in a GWAS would lead to a significant number of false positive findings.

For a binary environmental factor, the Environment-Gene Two-Step method was often more efficient than both the Disease-Gene and Hybrid Two-Step approaches. However, the EG2 is more sensitive to the number of markers that have a population-level association with the environmental factor (either real or induced by population stratification) than the DG2 or H2 approaches. As the number of markers with a population-level G-E association increases, the relative efficiency of the EG2 approaches 1.0 compared to the CC design. Adjustment for population structure using STRUCTURE [Pritchard and Rosenberg 1999; Pritchard, et al. 2000] or EIGENSTRAT [Price, et al. 2006] in Step 1 will reduce the number of SNPs passed to Step 2. Although population-level G-E associations do not affect the relative efficiency of the DG2 approach, the total number of markers that are passed through Step 1 is still a potential concern. Admixture bias in marginal effect scans for GWAS has been widely discussed [Pritchard and Rosenberg 1999; Sarasua, et al. 2009; Wang 2009]. If proper correction is not implemented, the relative efficiency of the DG2 will show decreases similar to those of the EG2 approach.

The efficiency of the EG2 approach is sensitive to the population prevalence of the disease being studied. This is because the additional power of the EG2 procedure comes from exploiting independent information provided by over-sampling of cases relative to their prevalence in the population. When the population disease prevalence becomes closer to the case ratio in the study sample, the ascertainment of cases becomes less informative. Therefore, the EG2 method is less desirable for more common diseases, such as asthma. The DG2 method is also sensitive to the baseline disease prevalence but the relative efficiency of the DG2 approach converges to 1.0 more slowly than the EG2. The H2 approach is the most robust choice to baseline disease prevalence and would be a good choice for more common disease outcomes.

It is possible that there exists a SNP that is involved in a G×E interaction that also is associated with the environmental factor of interest. For those SNPs with an association and interaction in the same direction, the combined true and induced associations will increase the power of the screening step for the EG2. However, when the association and interaction effects are in opposite directions, the power of the EG2 approach can be poor, as the pooling of these types of effects will negate any power induced by ascertainment of cases in the first step. Although the likelihood of this phenomenon is low, it is a potential weakness of the EG2. The H2 approach would be a good compromise for this scenario since this phenomenon would have little effect on the DG2 screening approach.

Since it is not possible to know whether a true interaction will be accompanied by marginal effects, it may be advantageous to use our Hybrid approach as it has good power to detect a wide variety of penetrance models. Since a marginal effect scan is likely to be the first analysis applied to GWAS data, the screening p-values should be available without any additional computational time or effort. The incorporation of the EG2 would only require an additional scan of associations between G and E in the combined sample. The Hybrid approach is often a compromise between the efficiencies of the Disease-Gene and Environment-Gene Two-Steps, with required sample sizes often falling within the range of the EG2 and DG2 methods. This is beneficial when little is known a priori about the types of interactions that occur for complex diseases.

The methods compared in this paper are not a full catalog of those available to test for G×E interactions. Kraft et al[Kraft, et al. 2007] proposed a 2-df test to jointly assess the presence of a genetic main effect and the G×E interaction. They showed increased efficiency over the case-control test and a pure marginal test in some models. Some GWAS have collected data on several environmental exposures. Using their 2-df test to scan the genome for G×E interaction with each environmental risk factor would result in retesting the genetic main effect each time, which may not be desirable.

To capitalize on both the efficiency of the case-only analysis and the robustness of the case-control analysis to deviations from G-E independence, work has been done to explore Bayesian alternatives. The goal of these methods is to combine the power of the case-only test with the unbiasedness of the case-control analysis in order to improve power to detect G×E interaction. Li and Conti[Li and Conti 2009] use Bayesian model averaging (BMA) to average over the powerful case-only analysis and the unbiased case-control test. They show increased power for the BMA approach over the traditional CC test for G×E interaction. Their method, however, is vulnerable to bias if the assumption of independence between gene and environment is violated and proper priors are not used. Mukherjee and Chatterjee[Mukherjee and Chatterjee 2008] develop an empirical Bayes-type shrinkage estimator to balance bias and efficiency. Their proposed method achieves smaller mean-squared error (MSE) and reduced bias compared to the CC approach, both under independence between G and E and for modest departures from independence. Like Li and Conti[Li and Conti 2009], they note that for a range of departures from gene-environment independence, their method can still be biased. These methods are also more computationally intensive, so it is unclear whether they are adaptable to large-scale genome-wide association studies.

We have developed a general-use software program to compare sample size requirements for detecting G×E interactions in a GWAS. This tool can be used in the design of future studies and to optimize the analysis of existing data using two-step approaches by choosing significance thresholds appropriate to specific studies. To calculate power and required sample size, this software package requires specification of estimable population parameters (i.e. pE, Re), assumptions about the underlying disease mechanism (i.e. p0, Rg, Rge, qA) and study design characteristics (i.e. S, N1, control:case ratio). Using these inputs, the program outputs include power, required sample size, optimal choices for screening thresholds (αA, αM), and the allocation of the experiment-wise Type I error rate (ρ) for the H2 approach. In addition to the methods described here, this software also allows for comparison of power and required sample size for a pure marginal genetic effect scan for a GWAS as well as the Kraft et al[Kraft, et al. 2007] 2-degree of freedom joint test of the genetic main effect and the interaction effect. Outputs from this package are easily integrated into plotting routines in R[Team 2009] or exported for use in other software programs (e.g. Excel, SAS). This software is written in R[Team 2009] and may be freely downloaded from (


This work was supported by grants P30ES007048, T32ES013678, U01ES015090, and R01ES016813 from the National Institute of Environmental Health Sciences, grants R01HL087680 and 1RC2HL101651 from the National Heart, Lung and Blood Institute, grant U01HG005927 from the National Human Genome Research Institute, and grants R41CA141852 and P30CA014089 from the National Cancer Institute.


  • Albert PS, Ratnasinghe D, Tangrea J, Wacholder S. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol. 2001;154(8):687–93.
  • Barrett JH. Measuring the effects of genes and environment on complex traits. Methods Mol Med. 2008;141:55–69.
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc (B) 1995;57:289–300.
  • Bhattacharjee S, Wang Z, Ciampa J, Kraft P, Chanock S, Yu K, Chatterjee N. Using principal components of genetic variation for robust and powerful detection of gene-gene interactions in case-control and case-only studies. Am J Hum Genet. 2010;86(3):331–42.
  • Blumenthal MN. The role of genetics in the development of asthma and atopy. Curr Opin Allergy Clin Immunol. 2005;5(2):141–5.
  • Boks MP, Schipper M, Schubart CD, Sommer IE, Kahn RS, Ophoff RA. Investigating gene environment interaction in complex diseases: increasing power by selective sampling for environmental exposure. Int J Epidemiol. 2007;36(6):1363–9.
  • Brown BW, Lovato J, Russell K. Asymptotic power calculations: description, examples, computer code. Stat Med. 1999;18(22):3137–51.
  • Chamberlain M, Baird P, Dirani M, Guymer R. Unraveling a complex genetic disease: age-related macular degeneration. Surv Ophthalmol. 2006;51(6):576–86.
  • Chatterjee N, Kalaylioglu Z, Shih JH, Gail MH. Case-control and case-only designs with genotype and family history data: estimating relative risk, residual familial aggregation, and cumulative risk. Biometrics. 2006;62(1):36–48.
  • Conneely KN, Boehnke M. So Many Correlated Tests, So Little Time! Rapid Adjustment of P Values for Multiple Correlated Tests. Am J Hum Genet. 2007;81(6).
  • Cookson W. The alliance of genes and environment in asthma and allergy. Nature. 1999;402(6760 Suppl):B5–11.
  • Edwards TM, Myers JP. Environmental exposures and gene regulation in disease etiology. Environ Health Perspect. 2007;115(9):1264–70.
  • Gatto NM, Campbell UB, Rundle AG, Ahsan H. Further development of the case-only design for assessing gene-environment interaction: evaluation of and adjustment for bias. Int J Epidemiol. 2004;33(5):1014–24.
  • Gauderman WJ. Sample size requirements for matched case-control studies of gene-environment interaction. Stat Med. 2002;21(1):35–50.
  • Gauderman WJ, Siegmund KD. Gene-environment interaction and affected sib pair linkage analysis. Hum Hered. 2001;52(1):34–46.
  • Grarup N, Andersen G. Gene-environment interactions in the pathogenesis of type 2 diabetes and metabolism. Curr Opin Clin Nutr Metab Care. 2007;10(4):420–6.
  • Greenland S. Power, sample size and smallest detectable effect determination for multivariate studies. Stat Med. 1985;4(2):117–27.
  • Hamet P, Pausova Z, Adarichev V, Adaricheva K, Tremblay J. Hypertension: genes and environment. J Hypertens. 1998;16(4):397–418.
  • Herbeck JT, Gottlieb GS, Winkler CA, Nelson GW, An P, Maust BS, Wong KG, Troyer JL, Goedert JJ, Kessing BD, et al. Multistage genomewide association study identifies a locus at 1q41 associated with rate of HIV-1 disease progression to clinical AIDS. J Infect Dis. 2010;201(4):618–26.
  • Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, Wacholder S, Wang Z, Welch R, Hutchinson A, et al. A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet. 2007;39(7):870–4.
  • Ising M, Lucae S, Binder EB, Bettecken T, Uhr M, Ripke S, Kohli MA, Hennings JM, Horstmann S, Kloiber S, et al. A genomewide association study points to multiple loci that predict antidepressant drug treatment outcome in depression. Arch Gen Psychiatry. 2009;66(9):966–75.
  • Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls! Am J Epidemiol. 1996;144(3):207–13.
  • Kooperberg C, Leblanc M. Increasing the power of identifying gene x gene interactions in genome-wide association studies. Genet Epidemiol. 2008;32(3):255–63.
  • Kraft P, Yen YC, Stram DO, Morrison J, Gauderman WJ. Exploiting gene-environment interaction to detect genetic associations. Hum Hered. 2007;63(2):111–9.
  • Li D, Conti DV. Detecting gene-environment interactions using a combined case-only and case-control approach. Am J Epidemiol. 2009;169(4):497–504.
  • Mukherjee B, Chatterjee N. Exploiting gene-environment independence for analysis of case-control studies: an empirical Bayes-type shrinkage estimator to trade-off between bias and efficiency. Biometrics. 2008;64(3):685–94.
  • Murcray CE, Lewinger JP, Gauderman WJ. Gene-environment interaction in genome-wide association studies. Am J Epidemiol. 2009;169(2):219–26.
  • Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med. 1994;13(2):153–62.
  • Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.
  • Pritchard JK, Rosenberg NA. Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet. 1999;65(1):220–8.
  • Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945–59.
  • Rheinboldt WC. Methods for solving systems of nonlinear equations. Philadelphia: Society for Industrial and Applied Mathematics; 1998.
  • Sarasua SM, Collins JS, Williamson DM, Satten GA, Allen AS. Effect of population stratification on the identification of significant single-nucleotide polymorphisms in genome-wide association studies. BMC Proc. 2009;3(Suppl 7):S13.
  • Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, Erdos MR, Stringham HM, Chines PS, Jackson AU, et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science. 2007;316(5829):1341–5.
  • Self SG, Mauritsen R, Ohara J. Power calculations for generalized linear models. Biometrics. 1992;48:31–40.
  • Team RDC. R: A Language and Environment for Statistical Computing. 2009.
  • van den Oord EJ, Kuo PH, Hartmann AM, Webb BT, Moller HJ, Hettema JM, Giegling I, Bukszar J, Rujescu D. Genomewide association analysis followed by a replication study implicates a novel candidate gene for neuroticism. Arch Gen Psychiatry. 2008;65(9):1062–71.
  • Wang K. Testing for genetic association in the presence of population stratification in genome-wide association studies. Genet Epidemiol. 2009;33(7):637–45.
  • Wang LY, Lee WC. Population stratification bias in the case-only study for gene-environment interactions. Am J Epidemiol. 2008;168(2):197–201.