|Home | About | Journals | Submit | Contact Us | Français|
The analysis of gene-environment (GxE) interactions remains one of the greatest challenges in the post-genome-wide-association-studies (GWAS) era. Recent methods constitute a compromise between the robust but underpowered case-control and powerful case-only methods. Inferences of the latter are biased when the assumption of gene-environment (G-E) independence fails. We propose a novel empirical hierarchical Bayes approach to GxE interaction (EHB-GE), which benefits from greater power while accounting for population-based G-E dependence. Building on Lewinger et al.'s ( Genet Epidemiol 31:871-882) hierarchical Bayes prioritization approach, the method utilizes posterior G-E association estimates in controls based on G-E information across the genome to adjust for it in resulting test statistics. These posteriori estimates are subtracted from the corresponding G-E association coefficients within cases.
We compared EHB-GE with rival methods using simulation. EHB-GE has similar or greater rank power to detect GxE interactions in the presence of large numbers of G-E associations with weak to strong effects or only a low number of such associations with large effect. When there are no or only a few weak G-E associations, Murcray et al.'s method ( Am J Epidemiol 169:219-226) identifies markers with low GxE interaction effects better. We applied EHB-GE and competing methods to four lung cancer case-control GWAS from the TRICL/ILCCO consortium with smoking as environmental factor. Genes identified by the EHB-GE approach are reasonable candidates, suggesting usefulness of the method.
Genome-wide association studies (GWAS) have identified thousands of genetic risk factors for complex diseases. However, identified variants only explain a part of the total heritability. Unexplained phenotypic variance may be partly due to undetected GxE interactions, which are characterized by a complex interplay of genetic and environmental contributors to the disease. The most effective way to detect GxE interactions in GWAS remains unresolved.
The standard case-control (CC) logistic regression tests for GxE interactions as departure from a multiplicative odds ratio model. Generally, it is underpowered to detect GxE interactions unless genetic and environmental factors are common and the interaction effect is strong [Mukherjee et al., 2012]. Unconditional logistic regression as commonly applied to case-control data does not account for preferential sampling through the cases, which contain most of the information on susceptibility to the disease, and hence underweights the more informative component of the sample. However, unconditional logistic regression is robust to gene-environment association.
The case-only test (CO) [Piegorsch et al., 1994] is a more powerful alternative under the assumptions of a rare disease and G-E independence in the source population. It does not maintain a pre-specified level of Type I error when G-E independence is violated [Albert et al., 2001; Chatterjee and Carroll, 2005; Satten and Epstein, 2004]. Therefore, it cannot be recommended for GWAS, in which G-E association cannot be ruled out but is expected to occur.
The intuitive two-step approach (TWO) first tests all SNPs in controls for G-E independence; based on the result, it employs the case-only or case-control method to test for GxE interaction in the second step. For a study of moderate size, the power to detect G-E dependence is typically low and the approach may still remain significantly biased [Mukherjee and Chatterjee, 2008]. In addition, a procedure failing to account for the two-step uncertainty demonstrates an inflated level of Type I error even when G-E independence is given [Albert et al., 2001]. The correlation between the two steps leads to an inflated Type I error rate overall.
The two-step approach (MUR) by Murcray et al.  combines the power of the case-only test with the protection from bias of the case-control estimator in a two-step procedure with the test statistics being independent from each other. The method screens for G-E association in the whole sample. Then, only SNPs exceeding a given significance threshold are tested for GxE interaction. Unfortunately, the power of the first step highly depends on the case-control ratio (ccr). A large number of controls compared to cases leads to decreasing power of step one and hence loss of power in the overall procedure [Murcray et al., 2011; Kraft et al., 2007].
The empirical Bayes-type shrinkage estimator (MUK-EB) [Mukherjee et al., 2008; Mukherjee and Chatterjee, 2008] combines the robust case-control with the efficient case-only estimator assuming G-E independence. This estimator is approximately robust to the presence of G-E association in the source population and performs comparably to the case-control estimator under large departures from independence. The method does not strictly maintain the correct Type I error rate under G-E dependence and moderate sample size. However, it maintains smaller mean squared error compared to the other estimators mentioned, irrespective of the G-E association.
The aim of this paper is to introduce a novel method to search for GxE interactions in a GWAS, which has the high power of the case-only design for testing GxE interactions, while accounting for population G-E association. We term it the empirical hierarchical Bayes method for GxE interactions (EHB-GE). We compared our method with previously discussed alternative estimators of GxE interaction in a simulation study as well as in real data applications.
Based on the hierarchical mixture model described by Lewinger et al.  we developed a novel empirical hierarchical Bayes method to detect GxE interactions in GWAS. Similarly to Mukherjee et al. , we propose a compromise between the case-control and case-only test of GxE interaction, by subtracting a posterior estimate of the G-E association within controls from that within cases. For an individual, let the indicators D, G and E denote presence (e.g. D = 1) or absence (e.g. D = 0) of disease, minor allele (i.e. carriers) of the considered SNP, or environment, respectively. Estimates of G-E association within cases (likewise for controls with βcontrols) can be obtained from the following logistic regression model
Under the rare disease assumption and G-E independence in the source population, βcontrols = 0. Then testing for βcases = 0 provides a valid case-only test for GxE interaction. However, when such assumptions are not true, the βcontrols coefficients should be estimated and subtracted from the βcases coefficients to result in an unbiased estimate of GxE interaction. We propose the following novel hierarchical test statistic for GxE interaction in GWAS (calculated for each SNP)
where M = 1, ... ,N, are the markers (with total number N), , are the corresponding logistic regression coefficients for G-E association among cases or controls, respectively, with standard deviations , , are the posterior estimators of , and . The corresponding test statistics to test for the association within cases (likewise for controls) are . For G-E associated SNPs , and possibly . The latter can particularly occur in the presence of an environmental main effect or different numbers of cases and controls in the sample. Unequal expected values of the test statistics result in or not being distributed around 0, even if the null hypothesis is true. Thus, association may be misinterpreted as interaction. To overcome this, we use a hierarchical Bayes model for the estimated effects and calculate . The model is given by
where λM are noncentrality parameters of the χ distribution with one degree of freedom (χ1(λM)) and p is the prior probability that marker M is associated with the disease, i.e. λM > 0. Given λM > 0, λM is assumed to have a χ1 distribution with noncentrality parameter θ as the measure of association strength and a scaling parameter σ > 0. Given λM = 0, δ(0) denotes a point mass at zero. The probability density function, the prior probability, marginal distribution, and posterior expected values are derived in Appendix A using an approximation by Kass and Steffey  for the variance. Based on the maximum likelihood estimates (MLE) , the posterior expectation of noncentrality parameter and variance of λM are
In 1000 replications of 3000 individuals each (case-control ratio ccr = 1:1, 1:2 and 2:1) we generated disease as affected/unaffected, environment as exposed/unexposed and SNP genotypes as carrier/non-carrier (all binary) by logistic regression as logit(P(D = 1|E,G)) = pd + βG×EG × E, βG×E = log(ORG×E), where disease prevalence is pd = 0.01, 0.5, 0.1. We simulated 10,000 SNPs in total. A single, so-called “interacting” SNP with ORG×E = 1.2, 1.5, 2, 2.5, and 3 was simulated. Additionally, up to 1000 so-called “G-E” SNPs (NG-E = 0, 1, 5, 10, 25, 50, 75, 100, 200, ..., 1000) with low, medium, and high regression coefficients for G-E association were simulated from a normal distribution N(0, log(1.5)/2) and N(0.7,0.1) for low and medium associations, respectively, and for high associations for the log of the regression coefficients from N(0, log(1.5)/2). We named the other SNPs with no true G-E interaction or association “dummy” SNPs. For the interacting SNP the minor allele frequency (MAF) was pg = 0.1, 0.3, 0.5. For all other SNPs, the MAF was sampled from a Beta distribution B(1,3) truncated to [0.01, 0.5] and the exposure frequency was pe = 0.1, 0.3, 0.5. The frequency of carriers of the minor allele was derived under the assumption of Hardy-Weinberg equilibrium.
We compared EHB-GE and competing methods (CC, CO, TWO, MUK-EB and MUR) by the “rank power”, defined as percentage of simulated replicates where the true interacting SNP is within the top ranking positions, according to absolute value of the corresponding test statistics (top 1, 10, 25, 50 or 100). We plotted rank power of EHB-GE on the x-axis against the difference in rank power between EHB-GE and a competitor on the y-axis. Hence, positive values on the y-axis represent a gain in rank power by EHB-GE. We evaluated rank power of the methods because, especially in the context of GxE interaction, GWAS is often regarded as a screening process, selecting a subset of top SNPs for further investigation [Hirschhorn, 2005; Thomas et al., 2009; Ziegler et al., 2008]. Rank power addresses this issue, evaluating the ability of a particular method to detect the true positive effect for follow-up, even in the situations in which moderate to small effect size does not reach genome-wide significance level.
We first note that, as necessary, rank power for EHB-GE increased with increasing GxE effect size, increasing frequencies pg, pe and decreasing pd. The rank power for different ccrs depended heavily on the number of G-E associations and their effect size. Figure 1 illustrates different ccrs and different G-E association numbers and effect sizes for ORG×E = 2, pg = 0.3, pe = 0.1, pd = 0.5. Overall ccr = 1:1 was the most preferable and ccr = 1:2 the most unfavorable scenario. When only low G-E associations were present (left figure), a larger number of cases than controls was the best. However, for a large number of medium or high G-E associations, the rank power of this unbalanced scenario decreased. The balanced ccr = 1:1 is the best ratio in situations in which a large number of G-E associations in the sample are suspected independently of the G-E association effect size. Although the difference in rank power between the ccrs was larger for larger ORGxE, these tendencies hold when varying the interaction effect size. True GxE interaction is easier to detect for a more frequent environmental factor, but this is counterbalanced by a greater probability of finding a G-E association, which can obscure the true GxE interaction. For ccr = 1:1 and low pe = 0.1, the scenario with low G-E association effect attained the highest power, and for common exposure, pe = 0.3 EHB-GE method even reached the highest power for high association effects (data not shown).
We compared the rank power of EHB-GE with that of five other methods (CC, CO, TWO, MUK-EB and MUR) (see Table 1). In all scenarios without G-E associations (“none”), the method of Murcray et al.  yielded 100% rank power. The power of EHB-GE for these scenarios was nearly the same as for the case-only test. On average MUK-EB and TWO lost around 5 – 10% rank power compared to EHB-GE. If G-E associations were present, EHB-GE always outperformed CC. The performance of MUR depended heavily on the number of population G-E associations and their strength. When they increased, the rank power of MUR decreased rapidly. MUR outperformed others, with nearly 100% rank power when only a small number, e.g. 100, of G-E associations with low effect appeared in the population. The power of EHB-GE and CO was very similar in the absence or presence of few G-E associations with low effects. With increasing number or strength of the G-E associations, EHB-GE retained the power level that lowered for the CO test. Results for ccr = 1:2 and 2:1 are similar (data not shown).
Subsequently for overall comparison of simulated scenarios, we plotted the rank power of EHB-GE versus the difference in rank power of EHB-GE and each of the methods (Figure 2 illustrates these plots for rank power regarding the top 25 ranks and ccr = 1:1). Our EHB-GE approach demonstrated remarkably greater rank power in almost all simulation scenarios compared to the CC test (Figure 2a). The superiority of EHB-GE compared to CO could be confirmed (Figure 2b). A clear triangular structure was observed, in which the diagonal consists of the high association scenarios. Irrespective of the choice of other parameters, CO has almost no rank power in these situations. The points along the horizontal line represent the similarity of CO and EHB-GE given a low number and low strength of G-E associations. The vertical line represents situations in which EHB-GE reached nearly 100% rank power and CO between 0 and 100%. Figure 2c demonstrates the comparison between TWO and EHB-GE. TWO sometimes outperforms EHB-GE in the case of moderate number of G-E associations with medium to high effect size. EHB-GE tends to outperform TWO in the case of low number of such effects. EHB-GE also reached higher power in the presence of G-E associations with low effect size irrespective of their number. The EHB-GE method was remarkably better for almost all the parameter combinations than MUK-EB (Figure 2d), except for some cases of medium G-E association effect. Finally, the rank power of MUR is higher compared to EHB-GE when there are only a small number of G-E associations with low effect size. However, for a larger number of higher G-E association effects, EHB-GE reaches up to 50% more rank power. When there are G-E associations with medium effect size, irrespective of their number, EHB-GE always outperformed MUR for ccr = 1:1 (Figure 2e). The same behavior of these competing tests in terms of the rank power compared to EHB-GE for the top 50 or 100 SNPs and in the presence of an environmental main effect is observed (data not shown).
We applied EHB-GE and the other competing approaches to four lung cancer case-control GWASs from the International Lung Cancer Consortium (ILCCO) / Interdisciplinary Research in Cancer of the Lung (TRICL) consortium. Four participating case-control GWAS were analyzed including the German lung cancer GWAS (GLC) [Holle et al., 2005; Sauter et al., 2008; Wichmann et al., 2005], the Texas lung cancer GWAS (MD Anderson Cancer Center, MDACC) [Amos et al., 2008; Hung et al., 2008; Wang et al., 2008], the Toronto lung cancer GWAS (Samuel Lunenfeld Research Institute, SLRI) [Hung et al., 2008a; Hung et al., 2008b] and the Central Europe lung cancer GWAS (Central Europe IARC, CE-IARC) [Amos et al., 2008]. Some characteristics of all datasets are given in Table 2.
Since tobacco smoking is clearly the main risk factor in lung cancer development, we investigated GxE interactions with smoking in all four GWAS. We defined smoking status of a particular individual by two different approaches, i.e., ever = 1 and never = 0 smokers or heavy = 1 and moderate = 0 smokers. Never smokers were defined as those individuals who consumed no more than 100 cigarettes during their life-time. All others were called ever smokers. The MDACC study collected only ever smokers. We defined moderate smokers as those consuming fewer than or equal to 20 pack-years and heavy smokers as those smoking more than 20 pack-years, where a pack year is calculated as cigarettes per day × years smoked divided by the number of cigarettes (20) per pack [Amos et al., 2008]. Two models with two different smoking variables were applied to each of the GWAS. In the following results, we will use the abbreviations NE for the never versus ever smokers model and MH for the moderate versus heavy model. To guarantee high quality of the data, standard quality control procedures were performed, mainly using PLINK and EIGENSOFT [Price et al., 2006].
For each of the four datasets, the NE and MH models were fitted and each of the competing methods applied. The classical case-control test for GxE interaction did not identify any SNP markers with a genome-wide significant interaction effect, i.e. with p-value < 10−8 in any of the four GWAS. The EHB-GE and case-only test identified one SNP, rs13244987, using the NE model in GLC with genome-wide significant p-value for interaction and one neighboring SNP, rs13438768, with a nearly genome-wide significant p-value slightly larger than 10−8. Both SNPs belong to the miscRNA gene LOC645249. Since this analysis is focused on the methodology and does not contain all available lung cancer GWAS, we shall refrain from in-depth discussions regarding the biology of the findings. To verify that findings by CO approach were not false positive signals, we tested for G-E dependence in controls. None of the markers revealed evidence of strong G-E association for either smoking model (data not shown).
Since in the CC test none of the SNPs demonstrated genome-wide significant interaction, selecting the top SNPs for follow-up is reasonable. In this respect and for each GWAS, we compared the top 100 SNPs yielded by our proposed EHB-GE method to those of the competing methods.
The top 100 SNPs obtained by the different GxE methods within the study and smoking model displayed similar trends. Results of EHB-GE were in close agreement with those of the case-only test. Particularly in SLRI and GLC, the top 100 SNPs ranked by EHB-GE and CO were identical, in CE-IARC and MDACC there were at most three discordant SNPs. Results of both tests moderately coincided with those of the intuitive two-step test and Mukherjee's empirical Bayes-type shrinkage estimator.
As an example, Table 3 displays the results for the top 100 SNPs from the GLC Study. Approximately 75-90 SNPs in the top 100 agreed for EHB-GE and MUK-EB in the NE model, with nearly 60-70 SNPs in the MH model. For TWO, we observed 48-72 common SNPs using NE and 41-64 common ones with MH. With MH we had around 20-40 SNPs in common for EHB-GE and CC. Notably, in SLRI we saw greater discordance of the SNPs for both smoking models compared to the other three GWAS datasets. The intuitive two step approach demonstrated a greater agreement with case-control test of the top 100 SNPs, 40-60 SNPs for MH and around 60 for NE. There were no common SNPs in the top 100 identified by MUR and CC in any of the analyses. For the never versus ever model, the overlap of MUR with other methods was limited to only few SNPs. However, for moderate versus heavy we observed up to 17 common SNPs chosen by EHB-GE and MUR. Please note that, among the top 100 pairs, in GLC at most 25 pairs, in SLRI, CE-IARC and MDACC at most 6 pairs of SNPs were correlated with r2 > 0.8.
Genotyping for the four studies was performed on two different genotyping platforms (see Table 2), leading to availability of some SNPs in one study and not in the other. Therefore, we selected the top 100 genes and not SNPs as previously described, to compare ranking by each GxE method across the GWAS studies. Comparison of the 100 top ranked genes per GxE method between the GWAS studies suggested low concordance. For the CC test and our new hierarchical modeling approach, the results are presented in Table 4. We observed only up to 5 common genes among 100 across studies for any of the methods. The number of common genes occurring for at least two studies within the top 100 ranked genes varied per method from 14 to 26. In total, we noted 73 such genes across the different methods. No two approaches yielded the same sets of genes, indicating a large variability between studies applying these methods.
Recently, much effort has been devoted to developing novel methods to search for GxE interactions capable of reaching high power consistently and maintaining a pre-specified level of Type I error [Mukherjee et al., 2008; Murcray et al., 2009; Murcray et al., 2011, Thomas et al., 2012]. To date, no single method has been found that reliably outperforms others for all sets of parameters [Thomas et al., 2012]. Usually, GxE interactions methods have low power to identify significant interactions, possibly due to the moderate to low true GxE effect sizes, limited sample size, or enormous number of tests performed in a single study. Additionally, the presence of population G-E association and failure of the methods to account for it properly are two of the major reasons for inflated Type I error rate. Although the case-only test has greater power compared to the case-control method, it produces a large number of false positive results when G-E association is present. Thomas et al.  proposed to use the case-only test as a screening tool in an initial discovery sample followed by replication. Unfortunately, this is not always a viable solution, due to the excessive number of false positive findings or lack of additional data. Following up the top ranked SNPs is most often a useful alternative strategy. We evaluated the rank power of different GxE interaction methods. Our simulation study revealed that, in many situations, our novel empirical hierarchical Bayes method for GxE interaction has greater rank power compared to other methods. In all scenarios, the EHB-GE outperformed the traditional CC test. Rank power of EHB-GE declined only slightly with increasing number of G-E associations compared to other methods. The performance of EHB-GE was similar to that of CO if there were no or few weak G-E associations. However, with increasing number or strength of G-E associations, the case-only test lost power. In the situation of no or only a few weak G-E association effects, the method proposed by Murcray et al.  demonstrated the best power. However, the power of this method decreased with increasing number or strength of the G-E associations. MUK-EB and the TWO methods showed slightly lower power than EHB-GE for most scenarios.
We analyzed four lung cancer GWASs and found that the results of EHB-GE closely agree with those of the CO method. Since we expected the CO test to produce false positive findings due to G-E association, this was not anticipated. Hence, we tested for population G-E association in controls. Although with smoking we expected the presence of population G-E association, no such signals were found in controls. Therefore, it is understandable that the results for EHB-GE and CO were similar. To explain the similarity between EHB-GE and CO methods, the hyperparameter estimates of EHB-GE are depicted in Table 5. For GLC and SLRI, the estimates of μ representing the strength of a population G-E association were less than 10−5. Hence, the posterior estimates λ for the single markers M subtracted of the were close to zero; consequently the top 100 SNPs were the same for EHB-GE and CO. For the MDACC and CE-IARC GWAS, the μ estimates were larger, e.g., for the NE model, 0.017 from the CE-IARC and 0.066 from the MH study. Differences across the studies may depend on the actual sample sizes in each of the GWAS, although we cannot prove that. Nevertheless, for CE-IARC and MDACC, we see up to 3 SNPs discordant for CO and EHB-GE in the top 100. Therefore, we can conclude that, when G-E associations are absent, our EHB-GE method performs similarly to the CO test, which is known to have superior power to identify GxE interactions in the absence of distorting G-E associations. Not all the results are displayed. In particular, the inclusion of main effects into the model yielded qualitatively the same results.
The empirical hierarchical Bayes for GxE interactions method demonstrated high power while accounting for population G-E associations. It should be the first choice for follow-up SNP selection, particularly when G-E associations are suspected. Murcray's method is recommended as a reasonable complement to EHB-GE when the G-E effects are expected to be small.
This research was supported by NIH grant TRICL 5U19CA148127-03 and DFG GRK 1034.
Authors of the manuscript have declared no conflict of interest for this work.
The density functions for the hierarchical model and the hyperparameters Θ = (θ, σ, p) are
where (.) is the standard normal density. To obtain the estimates for the hyperparametrs of that model, we need the marginal likelihood function. The marginal distribution is given by
where , . The likelihood of the model is given by and is maximized with respect to Θ. Based on the MLEs , the posterior expectation of λM is straight-forward and the posterior variance can be derived using an approximation by Kass and Steffey 
where is the (j,h)-component of the inverse negative Hessian of the marginal log-likelihood evaluated at the MMLE, , is given by the Jacobian of the posterior expectation with is known and
where the first summand in the above can be calculated and . The Jacobian and Hessian used in the correction term are given by
Finally, . Detailed formulas and program code can be obtained via email from the authors on request.