We conducted extensive simulation studies to evaluate the power of different study designs for testing three hypotheses: i) null G-E interaction effect, β_{I} = 0; ii) null genetic effect, β_{g} = β_{I} = 0; and iii) null environmental effect, β_{e} = β_{I} = 0. We assumed the log additive model for *G* and used the Wald statistic for all tests based on the closed-form estimates provided in the above sections. First, we assessed the impact of imposing the HWE constraint on the estimation efficiency and power for testing different sets of association parameters under the standard case-control design. We considered the standard prospective method (“Standard”), the method that imposed the G-E independence constraint but not the HWE constraint (“GE-O”), and the method that imposed both the G-E independence and HWE constraints (“GE-HWE”). The comparison of these methods would shed light on the power improvement incurred by the two constraints. Next, with GE-HWE as the method of analysis, we compared the efficiency of four two-phase sampling strategies for testing the three hypotheses above. We considered a range of penetrance models in the form of (1) by varying the magnitude of OR parameters. For example, *G* may have an effect only in the presence of *E*, or *E* may have an effect only in the presence of *G*. We first generated data for controls, assuming that *E* followed a Bernoulli distribution and SNP genotype data *G* satisfied the HWE. Then we generated (*G, E*) for cases from the conditional distribution *p*(*G, E*|*Y* = 1) where

In all tests, we set the nominal level at 0.0001, assuming that 500 tests were performed. In practice, the test of β_{g} = β_{I} = 0 may be performed at a different significance level than the test of β_{e} = β_{I} = 0. Here we used the same level mainly to facilitate power comparison. The tests for all three hypotheses had type I error rates that were close to the nominal level, as shown in . We generated 5,000 replicates for assessing the power of all tests.
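The data-generating steps described above (controls first, then cases drawn from *p*(*G, E*|*Y* = 1)) can be illustrated with a minimal sketch. It is not the authors' code. It assumes a logistic penetrance model with log-additive *G*, under which the case distribution is proportional to the control distribution times exp(β_{g}*G* + β_{e}*E* + β_{I}*GE*), and it assumes that *G* and *E* are independent in controls. The function names and the parameter values in the usage lines are hypothetical.

```python
import math
import random


def control_joint(p_a, p_e):
    """Joint p(G, E | Y=0): HWE genotype frequencies times Bernoulli exposure.

    Assumes G and E are independent in controls (illustrative assumption).
    """
    geno = [(1 - p_a) ** 2, 2 * p_a * (1 - p_a), p_a ** 2]
    expo = [1 - p_e, p_e]
    return {(g, e): geno[g] * expo[e] for g in range(3) for e in range(2)}


def case_joint(p_a, p_e, b_g, b_e, b_i):
    """Joint p(G, E | Y=1) under an assumed logistic model, log-additive in G.

    Each control cell is reweighted by exp(b_g*G + b_e*E + b_i*G*E)
    and the weights are renormalized to sum to one.
    """
    ctrl = control_joint(p_a, p_e)
    weights = {
        (g, e): p * math.exp(b_g * g + b_e * e + b_i * g * e)
        for (g, e), p in ctrl.items()
    }
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def draw(joint, n, rng):
    """Draw n (G, E) pairs from a discrete joint distribution."""
    cells = list(joint)
    return rng.choices(cells, weights=[joint[c] for c in cells], k=n)


# Hypothetical setup: p_a = 0.2, p_e = 0.15, OR_g = 1.2, OR_e = OR_I = 1.5.
rng = random.Random(0)
controls = draw(control_joint(0.2, 0.15), 1000, rng)
cases = draw(
    case_joint(0.2, 0.15, math.log(1.2), math.log(1.5), math.log(1.5)),
    1000,
    rng,
)
```

Sampling cases directly from the reweighted control distribution avoids simulating a large cohort and discarding most non-cases, which matters when many replicates are needed.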

| **Table 2** Type I Error Rates of GE-HWE at the Nominal Level 0.0001^{a}. |

Relative Power of GE-HWE for the Standard Case-Control Design

Panels A and B in demonstrate the relative power of the three methods for testing β_{I} = 0 and β_{g} = β_{I} = 0, where β_{g} = 0 for Panel A and β_{g} = ln(1.2) for Panel B. For testing β_{I} = 0, the power of GE-HWE appeared to be similar to that of GE-O, and both were higher than that of the standard method, with the difference rising sharply with the magnitude of β_{I}. For example, with β_{I} = ln(1.5), the power difference was around 20%, but with β_{I} = ln(1.8), it was around 60%. For testing β_{g} = β_{I} = 0, the power of GE-HWE and GE-O was very similar and much higher than that of the standard method. For example, the power difference was around 60% at β_{I} = ln(1.8) and β_{g} = 0 (Panel A) and around 20% at β_{I} = ln(1.8) and β_{g} = ln(1.2) (Panel B). These data indicate that imposing the HWE constraint in addition to G-E independence had limited influence on testing genetic effects or G-E interactions under the log-additive model for *G*. Panels C and D display the relative power of the three methods for testing β_{I} = 0 and β_{e} = β_{I} = 0. Regardless of the presence or absence of the main effect of *E* (Panel C: β_{e} = 0; Panel D: β_{e} = log(1.5)), GE-HWE and GE-O had nearly identical power for both tests, and both had higher power than the standard method. This indicates that the HWE constraint has hardly any impact on power for testing β_{e} = β_{I} = 0.

We quantified the relationship between the parameter values and the ratio of the power of GE-HWE to that of the standard method using simulation studies. We first obtained the relative power for a wide range of parameter setups. Then we performed linear regression analysis, using the log relative power as the outcome variable and the true parameter values as explanatory variables. The estimated mean log relative power for testing β_{I} = 0, β_{g} = β_{I} = 0, and β_{e} = β_{I} = 0 was 3.5 − 1.1*p*_{a} − 0.33*p*_{e} + 0.43β_{g} + 0.17β_{e} − 2.88β_{I}, 1.51 − 0.44*p*_{a} − 0.35*p*_{e} − 0.57β_{g} − 0.15β_{I}, and 1.6 − 0.5*p*_{a} − 0.56*p*_{e} + 0.02β_{g} + 0.44β_{e} − 0.30β_{I}, respectively. Therefore, the magnitude of β_{I} plays a dominant role in the relative power for testing G-E interactions, but the magnitudes of β_{g} and β_{e} play a greater role in testing genetic and environmental effects, respectively.
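To make the fitted regressions easier to read, the sketch below evaluates them at a chosen parameter setup. The coefficients are copied from the text. The dictionary layout, the function name, and the reading of *p*_{a} as the minor allele frequency and *p*_{e} as the exposure prevalence are assumptions made for illustration, as is the use of the natural logarithm. The fits are descriptive summaries of the simulated setups, so values outside the simulated range should not be over-interpreted.

```python
import math

# Coefficients of the fitted mean log relative power (GE-HWE vs. standard),
# copied from the text. "const" is the intercept. No b_e term is reported
# for the genetic-effect test, so none is included for it here.
FITS = {
    "interaction": {"const": 3.5, "p_a": -1.1, "p_e": -0.33,
                    "b_g": 0.43, "b_e": 0.17, "b_i": -2.88},
    "genetic": {"const": 1.51, "p_a": -0.44, "p_e": -0.35,
                "b_g": -0.57, "b_i": -0.15},
    "environmental": {"const": 1.6, "p_a": -0.5, "p_e": -0.56,
                      "b_g": 0.02, "b_e": 0.44, "b_i": -0.30},
}


def log_relative_power(test, **params):
    """Evaluate the fitted linear predictor for one of the three tests.

    params must supply every covariate used by that fit; extras are ignored.
    """
    fit = FITS[test]
    return fit["const"] + sum(
        coef * params[name] for name, coef in fit.items() if name != "const"
    )


# Hypothetical setup: p_a = 0.2, p_e = 0.15, OR_g = 1.2, OR_e = OR_I = 1.5.
setup = dict(p_a=0.2, p_e=0.15, b_g=math.log(1.2),
             b_e=math.log(1.5), b_i=math.log(1.5))
for name in FITS:
    ratio = math.exp(log_relative_power(name, **setup))
    print(f"{name}: fitted relative power = {ratio:.2f}")
```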

presents the mean estimates, averaged estimated asymptotic variances, and empirical variances of the three methods, where the data were generated using the same parameter setup as that for Panels A and B in . The mean estimates with GE-HWE appeared to be close to the true parameter values. The averaged estimated asymptotic variances for all parameter estimates appeared to be close to their empirical counterparts. The empirical variances of the main effect parameters estimated with GE-HWE were generally close to those of GE-O but smaller than those under the standard method, and the empirical variance for the interaction parameter β_{I} could be smaller by more than 60%.

| **Table 3** Performance of GE-HWE for Estimation under G-E Independence and HWE |

Power of Design I and Design II for Testing β_{g} = β_{I} = 0 and β_{e} = β_{I} = 0

We investigated efficient two-phase design strategies for testing the genetic effect β_{g} = β_{I} = 0 and the environmental effect β_{e} = β_{I} = 0 using GE-HWE for analysis. In each replicate, we first generated (*Y, G, E*) for 1,000 cases and 1,000 controls. Then we created a two-phase sample by selecting an equal proportion of cases and controls into phase II and deleting the data on either *G* (Design I) or *E* (Design II) for the unselected subjects. For cases, we selected the phase II subset either randomly or following a “balanced design” strategy by stratifying on *E* in Design I or on *G* in Design II. In Design I, the balanced design included all cases with *E* = 1 for a rare exposure. In Design II, it included numbers of cases with *G* = 0, *G* = 1, and *G* = 2 that were as equal as possible; with a small MAF, all cases with *G* = 2 were selected. To further evaluate the impact of control selection on the efficiency of the design, we considered two-phase designs with 300 phase II cases but a varying proportion of phase II controls ranging from 30% to 100%.
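One possible reading of the balanced selection rule is sketched below. It is an illustration, not the authors' procedure. The function name `balanced_sample` and the allocation rule (equal shares per stratum, with strata too small to fill their share taken whole and the shortfall passed to larger strata) are assumptions. The same function covers Design I (stratifying on *E*) and Design II (stratifying on *G*).

```python
import random
from collections import defaultdict


def balanced_sample(strata, n_select, rng):
    """Select n_select indices with stratum counts as equal as possible.

    strata   : list of stratum labels, one per subject (e.g. E or G values)
    n_select : number of subjects to place in phase II
    rng      : random.Random instance used for within-stratum sampling
    """
    if n_select > len(strata):
        raise ValueError("cannot select more subjects than are available")
    groups = defaultdict(list)
    for idx, label in enumerate(strata):
        groups[label].append(idx)

    chosen = []
    remaining = n_select
    # Visit the smallest strata first so any shortfall rolls over to larger ones.
    ordered = sorted(groups.values(), key=len)
    for k, members in enumerate(ordered):
        share = remaining // (len(ordered) - k)  # equal split of what is left
        take = min(len(members), share)
        chosen.extend(rng.sample(members, take))
        remaining -= take
    return sorted(chosen)


# Hypothetical Design I example: 1,000 cases, rare exposure, 300 phase II cases.
rng = random.Random(1)
exposure = [1] * 120 + [0] * 880
phase2 = balanced_sample(exposure, 300, rng)
n_exposed = sum(exposure[i] for i in phase2)  # all 120 exposed cases are kept
```

With a rare stratum, this rule retains every subject in it, which matches the description that all cases with *E* = 1 (or with *G* = 2 under a small MAF) enter phase II.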

displays the power of Design I for testing β_{g} = β_{I} = 0 and β_{e} = β_{I} = 0 as a function of the proportion of phase II cases and/or controls. In general, the power under balanced sampling for testing β_{g} = β_{I} = 0 was much higher than that under random sampling, with the power difference becoming greater at smaller phase II case/control proportions and larger MAF (Panel A). But the difference between the two sampling strategies was small for testing β_{e} = β_{I} = 0 (Panel B). With a fixed subset of phase II cases, the power for testing genetic and environmental effects was nearly identical under stratified and random sampling of controls (Panels C and D); it increased with the proportion of selected controls for testing β_{g} = β_{I} = 0 (Panel C) but remained constant for testing β_{e} = β_{I} = 0 (Panel D). These results suggest that sampling stratified on *E* in cases is generally preferred for testing genetic effects or G-E interactions when data on *E* are available on all subjects. Parameter estimates corresponding to Panel C are presented in .

| **Table 4** Estimation with GE-HWE under Design I. The Parameters were the Same as Those Used in , Where 1,000 Cases and 1,000 Controls had data on *E*, and 300 Cases were Selected into Phase II Stratified on *E*. |

displays the power of Design II for testing β_{g} = β_{I} = 0 and β_{e} = β_{I} = 0 as a function of the proportion of phase II cases and controls. In general, for testing β_{g} = β_{I} = 0, the difference between the two sampling strategies appeared to be small (Panel A), and the power remained constant with a varying proportion of phase II controls when the subset of phase II cases was fixed (Panel C). On the other hand, the power under balanced sampling for testing β_{e} = β_{I} = 0 was much higher than that under random sampling, with the power difference becoming greater at smaller phase II case/control proportions and larger prevalence of *E* (Panel B). With a fixed subset of phase II cases, the power under both balanced and random sampling of controls increased slightly with the proportion of selected controls (Panel D). These results suggest that sampling stratified on *G* in cases for ascertaining data on *E* is generally preferred for assessing environmental effects.

Power of Supplemented Designs I and II

displays the power of Supplemented Design I for testing β_{e} = β_{I} = 0 as a function of the number of supplemented controls *m* at different values of *p*_{e}. The magnitude of the power increase due to supplementing additional control data on *E* increased with β_{e}, β_{I}, and *p*_{e}, particularly when *m* was less than 500. For example, with *p*_{a} = 0.2, *p*_{e} = 0.15, β_{g} = log(1.2), and β_{e} = β_{I} = log(1.5) (Panel A), supplementing *E* from 500 and 2,000 additional controls to data from 500 cases and 500 controls led to around 20% and 40% increases in power, respectively. But with β_{e} reduced to log(1.2), the respective increases were only around 5% and 10%. The power of Supplemented Design I for testing β_{I} = 0 and β_{g} = β_{I} = 0 remained constant regardless of the number of supplemented controls (data not shown).

displays the power of Supplemented Design II for testing β_{g} = β_{I} = 0 as a function of *m*, the number of additional controls with data on *G*. Similar to Supplemented Design I, the power increase at a given *m* appeared to be larger with increasing β_{g}. For example, with *p*_{a} = 0.2, *p*_{e} = 0.15, β_{g} = log(1.2), and β_{I} = log(1.5) (Panel A), supplementing *G* from 500 and 2,000 controls to 300 cases and 300 controls led to 10% and 24% increases in power, respectively. But with *p*_{a} = 0.2, *p*_{e} = 0.15, β_{g} = log(1.2), and β_{I} = log(1.3), the respective increases were only 7% and 16%. In the absence of a genetic main effect (β_{g} = 0), the respective increases became negligible. The increase also became sharper with a greater *p*_{a}. Not surprisingly, the power of Supplemented Design II for testing β_{I} = 0 and β_{e} = β_{I} = 0 remained nearly constant regardless of the number of supplemented controls (data not shown).