NIH Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Ann Hum Genet. Author manuscript; available in PMC May 1, 2013.
Published in final edited form as:
PMCID: PMC3334353
Application of a novel hybrid study design to explore gene-environment interactions in orofacial clefts
Øivind Skare,1,2* Astanand Jugessur,1,3* Rolv Terje Lie,2,4 Allen James Wilcox,5 Jeffrey Clark Murray,6 Astrid Lunde,2 Truc Trung Nguyen,4 and Håkon Kristian Gjessing1,2
1Division of Epidemiology, Norwegian Institute of Public Health, Oslo, Norway
2Department of Public Health and Primary Health Care, University of Bergen, Bergen, Norway
3Craniofacial Research, Murdoch Childrens Research Institute, Royal Children’s Hospital, Parkville, Australia
4Medical Birth Registry of Norway, Norwegian Institute of Public Health, Bergen, Norway
5Epidemiology Branch, National Institute of Environmental Health Sciences (NIH/NIEHS), Research Triangle Park, USA
6Departments of Pediatrics, Epidemiology and Biological Sciences, University of Iowa, Iowa City, USA
*These authors contributed equally to this work.
Corresponding author: present address: Department of Biostatistics, University of Oslo, Pb 1122 Blindern, 0317 Oslo, Norway. Phone: +47-98084626; Fax: +47-22851313; oivind.skare/at/
Orofacial clefts are common birth defects with strong evidence for both genetic and environmental causal factors. Candidate-gene studies combined with exposures known to influence the outcome provide a highly targeted approach to detecting GxE interactions. We developed a new statistical approach that combines the case-control and offspring-parent triad designs into a “hybrid design” to search for GxE interactions among 334 autosomal cleft candidate genes and maternal first-trimester exposure to smoking, alcohol, coffee, folic acid supplements, dietary folate, and vitamin A. The study population comprised 425 case-parent triads of isolated clefts and 562 control-parent triads derived from a nationwide study of orofacial clefts in Norway (1996-2001). A full maximum-likelihood model was used in combination with a Wald test statistic to screen for statistically significant GxE interaction between strata of exposed and unexposed mothers. In addition, we performed pathway-based analyses on 28 detoxification genes and 21 genes involved in folic acid metabolism. With the possible exception of the T-box 4 gene (TBX4) and dietary folate interaction in isolated CPO, there was little evidence overall of GxE interaction in our data. This study is the largest to date aimed at detecting interactions between orofacial clefts candidate genes and well-established risk exposures.
Keywords: Birth defects, orofacial cleft, cleft lip, cleft palate, genetic epidemiology
Orofacial clefts comprise a significant proportion of all human birth defects. The extensive surgical, dental, medical and behavioral interventions needed to treat these common craniofacial defects impose a substantial economic and personal health burden (Wehby and Cassell, 2010; Strauss, 1999). First-degree relatives of an affected individual have a 30-40 fold increased risk of clefts compared to the background population (Sivertsen et al., 2008; Grosen et al., 2009). Using data from a large Danish twin study, Grosen and colleagues recently reported heritabilities of 91% for cleft lip with or without cleft palate (CL/P) and 90% for cleft palate only (CPO), with correspondingly small environmental contributions for either type of clefts (9% for CL/P and 10% for CPO) (Grosen et al., 2011). These and other similar studies [reviewed in Dixon et al. (2011); Rahimov et al. (2012)] point to a strong genetic component to clefting. Although the environmental contribution is likely to be smaller, assessing the joint impact of environmental risk factors and susceptibility alleles is important in solving the riddle of why some babies are born with clefts whereas the vast majority are not.
Experiments in animals have long shown that environmental factors can cause clefting (Warkany, 1971). Recognized teratogens in humans include both rare exposures such as phenytoin, valproic acid and thalidomide, as well as more common exposures such as cigarette smoking and heavy alcohol drinking in the first trimester of pregnancy (Murray, 2002). In addition, low dietary intake of B-complex vitamins and deficient or excessive amounts of vitamin A have also been linked to a higher risk of clefting (Hayes, 2002; Munger, 2002). Folic acid on the other hand appears to be protective (Badovinac et al., 2007; Wilcox et al., 2007). However, like vitamin A, the dosage of folic acid appears to be important. Whereas a folate-deficient medium was found to perturb normal cell development of immortalized B-lymphoblasts from five children with CL/P (Bliek et al., 2008), a high dose of folic acid appeared to be equally detrimental to cell development in mice (Pickell et al., 2011). Less data are available on the effects of coffee-drinking on clefting risk. Our own study showed odds ratios of 1.4 (95% CI: 1.0-1.9) for CL/P among mothers who drank less than 3 cups/day and 1.6 (95% CI: 1.1-2.4) for drinking at least 3 cups/day, when compared with mothers who did not consume any coffee (Johansen et al., 2009).
Because environmental exposures are modifiable, unraveling GxE interaction provides an attractive avenue for more targeted interventions in high-risk individuals (Khoury et al., 1995) – a rationale strongly supported by findings in animal models. While the spontaneous CL/P rate among the cleft-susceptible CL/Fr mouse is about 20% compared to less than 10% in the normal C57BL/6J strain (Juriloff, 2002), it can easily be increased to almost 100% at specific doses of 6-aminonicotinamide (a vitamin B3 inhibitor). Just as some mouse strains are more susceptible to external teratogens, human fetuses carrying high-risk alleles may be more sensitive to particular teratogenic agents (Juriloff, 2002; Millicovsky and Johnston, 1981). An interaction between maternal cigarette smoking and the TaqI polymorphism in the gene for transforming growth factor alpha (TGFA) is among the most widely studied GxE interactions in clefting, but several other genes, including MSX1, TGFB3, BCL3, CYP1A1 and GSTT1, have also been examined in conjunction with maternal alcohol consumption, cigarette smoking, medication use, and multivitamin supplementation during pregnancy (these studies are summarized in Supplementary Table S1). Most recently, the first genome-wide search for GxE interaction in CPO revealed suggestive evidence of interaction with MLLT3 and SMC2 on chromosome 9, TBK1 on chromosome 12, ZNF23 on chromosome 18 and BAALC on chromosome 8 (Beaty et al., 2011).
Several methods exist for analyzing GxE interactions, each with its own set of advantages/disadvantages. The case-parent triad design has higher power to detect interactions than the case-control design, and it protects better against the effects of population stratification (Thomas, 2010). However, its main drawback is the difficulty of obtaining DNA samples from parents for phenotypes that are typically late-onset. For rare diseases, the number of available cases may also restrict the power to detect effects, and the lack of independent controls makes it impossible to estimate the main effect of the environmental exposure. To rectify this, additional independent controls can be added to the design, and this approach has been explored by numerous authors (Nagelkerke et al., 2004; Andrieu and Goldstein, 2004; Chatterjee et al., 2005; Goldstein et al., 2006; Dudbridge, 2008; Vermeulen et al., 2009), using variants of case-parent triads or affected/non-affected siblings combined with independent controls.
In this study, we had complete control triads as well as mother-child control dyads. To incorporate all the available information and thus increase power in the current study, we extended the log-linear model for case-parent triads (Vermeulen et al., 2009; Gjessing and Lie, 2006) to include control triads, using a full maximum likelihood model. We focused on six maternal first-trimester exposures (smoking, alcohol, coffee, folic acid supplements, dietary folate, and vitamin A) that had previously shown statistically significant associations with clefting risk in a Norwegian study population (Lie et al., 2008; Deroo et al., 2008; Johansen et al., 2008, 2009; Wilcox et al., 2007). To screen for GxE interaction in the same study population, we applied the above hybrid method to one of the largest available collections of genes and SNPs for orofacial clefts (1315 SNPs in 334 autosomal genes) in a population-based sample of 311 case-parent triads of isolated CL/P, 114 case-parent triads of isolated CPO, and 562 control-parent triads from Norway (1996-2001). This nationwide study represents the largest candidate-gene based scan to date for GxE interactions in orofacial clefts.
Participants in this study derive from a nationwide case-control study of orofacial clefts in Norway (1996-2001). Mothers of babies with clefts were invited to participate in the study through the two surgical clinics that treat all clefts in Norway (in Oslo and Bergen). Participation rate was 88% in the case group. Overall, 377 CL/P and 196 CPO case-parent triads (n=573) were recruited. Of these, 311 were isolated CL/P and 114 were isolated CPO. During the same years, controls were randomly selected from all live births recorded in the Norwegian Medical Birth Registry. Of 1006 eligible control-mothers, 76% (n=763) agreed to participate. Of these, 562 control-parent triads were available for the present analyses. Further details on study design and participants have been provided elsewhere (Wilcox et al., 2007).
After the consent form was returned (approximately one month after the baby’s birth), mothers received a self-administered questionnaire for the assessment of a spectrum of environmental exposures, demographics, reproductive history and maternal health. Details are provided in Wilcox et al. (2007) and the questionnaire is available in its entirety at . The median time from the infant’s delivery to the mother’s completion of the questionnaire was 14 weeks for cases and 15 weeks for controls. After returning the questionnaire, control-mothers were asked to provide cheek swabs from themselves and the proband, as well as from the father and siblings of the proband. Mothers who agreed to provide swabs were mailed a kit containing sterile cotton-tipped sticks and alcohol-containing plastic tubes. Control-fathers were asked to provide buccal swab samples starting in November 1998; thus, a subset of the control families initially recruited consist of mother-child dyads only (n=347).
The study was approved by the Norwegian Data Inspectorate, the Regional Committee on Research Ethics for Western Norway, and the respective Institutional Review Boards of the National Institute of Environmental Health Sciences (NIH/NIEHS) and the University of Iowa. Clinicopathological information from all participating families and biologic specimens for DNA extraction were obtained with the written informed consent of the mothers and fathers. All aspects of this research are in compliance with the tenets of the Declaration of Helsinki for human research.
Maternal exposures and cleft candidate genes
Although the self-administered questionnaires contained a much more comprehensive list of environmental exposures, we opted to focus only on those exposures for which we had prior evidence of association in the same study population. These were maternal first-trimester cigarette smoking, alcohol, coffee, folic acid supplements, dietary folate, and vitamin A (Lie et al., 2008; Deroo et al., 2008; Wilcox et al., 2007; Johansen et al., 2008, 2009). Further, we targeted first-trimester exposures in particular because this window of exposure covers critical stages of craniofacial development [specifically, embryonic lip formation in weeks 5–6 and palatal shelf fusion in weeks 7–10 (Diewert, 1985)]. Table 1 summarizes the main findings from these studies, while Table 2 outlines how the six maternal first-trimester exposures were dichotomized prior to analysis. In addition to these environmental exposures, genotypes for 1315 SNPs in 334 autosomal cleft candidate genes were available from a previous study in which we searched for fetal gene-effects in the same dataset (Jugessur et al., 2009).
Table 1
Synopsis of the main findings from our previous analyses of six maternal exposures on the risk of orofacial clefts.
Table 2
Characteristics of the maternal first-trimester exposures.
Statistical analysis
Hybrid study design
We developed a novel study design that combines case-parent triads and control-parent triads into a single hybrid design. In case-parent triads, the association between alleles and disease can be estimated by contrasting transmitted versus non-transmitted alleles, effectively using the non-transmitted alleles as controls. With the hybrid design, alleles of the case children can now be contrasted with alleles of independent controls from the control-parent triads, providing additional statistical power to detect an association. We developed a complete maximum likelihood model for this combined (hybrid) design. The model is a direct extension of a previously described log-linear model for estimating relative risks from case-parent triads alone (Gjessing and Lie, 2006). The current model incorporates independent controls by using the “rare disease assumption”, so that the odds ratio estimated from the case-control comparison can be assumed equal to the relative risk estimated from the case-parent triads. Further, the expectation-maximization (EM) algorithm was used to impute missing genotypes in the data.
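The role of the rare disease assumption can be illustrated numerically. The following is a minimal sketch, not part of HAPLIN: the function name and the baseline risks are assumptions chosen for illustration only. It shows that the odds ratio implied by a given relative risk stays close to that relative risk when the baseline risk is small, and drifts away from it as the outcome becomes more common.

```python
def odds_ratio(baseline_risk: float, relative_risk: float) -> float:
    """Odds ratio implied by a baseline risk and a relative risk.

    baseline_risk: risk among non-carriers of the allele (illustrative value).
    relative_risk: risk among carriers divided by risk among non-carriers.
    """
    carrier_risk = baseline_risk * relative_risk
    carrier_odds = carrier_risk / (1.0 - carrier_risk)
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    return carrier_odds / baseline_odds


# Assumed baseline risks, from rare to common; none are estimates from this study.
for baseline in (0.002, 0.05, 0.20):
    print(baseline, round(odds_ratio(baseline, 2.0), 3))
```

For a rare outcome the two measures are nearly interchangeable, which is what permits the case-control odds ratio and the triad-based relative risk to share one parameter in the combined likelihood.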
Model implementation
We implemented the log-linear model for the hybrid design in the R statistical library HAPLIN (version 4.0). R can be downloaded from its official site, and HAPLIN is available from our web site. To assess GxE interaction between the six maternal exposures and genetic variants in the 334 cleft candidate genes, we tested whether the fetal relative risk estimates were significantly different across the strata of exposed versus unexposed mothers. For each SNP and maternal exposure combination, the effects of GxE interaction were estimated in the following manner. First, the data were stratified according to exposure levels (Table 2). For each of K exposure strata, we used HAPLIN to compute $\hat{\beta}_k$, the allele relative risk estimate on a log-scale, and the corresponding asymptotic variance estimate $\hat{\sigma}_k^2$, k = 1, 2, …, K. To reduce the number of parameters to be tested, we assumed a multiplicative allele-dose effect. Since the strata are independent, it follows from standard asymptotic theory of log-linear models that asymptotically,
$$\hat{\beta} = (\hat{\beta}_1, \ldots, \hat{\beta}_K)^T \sim N(\beta, \Sigma),$$
$$\text{where } \Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_K^2).$$
To test the difference between stratum estimates, we used a Wald test. Let D be an r × K contrast matrix, with r ≤ K − 1. It follows from the above that asymptotically, $D\hat{\beta} \sim N(D\beta, D \Sigma D^T)$, with $\Sigma$ estimated by $\hat{\Sigma} = \mathrm{diag}(\hat{\sigma}_1^2, \ldots, \hat{\sigma}_K^2)$. The Wald test statistic of GxE interaction is defined as
$$\chi^2 = (D\hat{\beta})^T (D \hat{\Sigma} D^T)^{-1} (D\hat{\beta}), \qquad (1)$$
which under the null hypothesis of Dβ = 0 has a chi-squared distribution with r degrees of freedom.
To be specific, consider the full interaction hypothesis β1 = β2 = … = βK, i.e. that all stratum-specific effects are equal. The appropriate contrast matrix is then the (K − 1) × K matrix
$$D = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 \\ 1 & 0 & -1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & -1 \end{pmatrix}$$
which provides a test for β1 = β2, β1 = β3, etc., with a combined K − 1 degrees of freedom. This is an overall test for any effect differences among strata. Note that if the number of strata K is large, the test may lose power due to the large number of degrees of freedom. If, however, the stratum variable is ordinal, for instance formed by categorizing a continuous variable into K categories, a second alternative might be better. In this situation, it is natural to assume that in the presence of an interaction there would be a dose-response relationship implying that the sequence of genetic effect estimates β1, β2, β3, …, βK would be increasing or decreasing systematically over the strata i = 1, 2, 3, …, K. For instance, one could assume that βi = β1 + α(i − 1) for some α, and test the null hypothesis α = 0. To achieve this, one could use a 1 × K linear contrast matrix
$$D = \left(1 - \tfrac{K+1}{2},\; 2 - \tfrac{K+1}{2},\; \ldots,\; K - \tfrac{K+1}{2}\right)$$
The test statistic in (1) would then have a single degree of freedom, and thus a higher power to detect this particular type of interaction. This approach could be extended to higher-order orthogonal polynomial contrasts, but the linear is presumably the most useful alternative to a full interaction testing.
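The two kinds of contrast described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the HAPLIN implementation: the function names are hypothetical, the linear contrast uses centred scores i − (K + 1)/2 (one standard choice, assumed here), and the strata are taken to be independent so that the covariance matrix is diagonal.

```python
import math


def full_contrast(K):
    """(K-1) x K matrix whose rows compare stratum 1 with strata 2..K."""
    rows = []
    for k in range(1, K):
        row = [0.0] * K
        row[0] = 1.0
        row[k] = -1.0
        rows.append(row)
    return rows


def linear_contrast(K):
    """1 x K centred linear contrast with scores i - (K + 1) / 2, i = 1..K."""
    centre = (K + 1) / 2.0
    return [i - centre for i in range(1, K + 1)]


def trend_wald_test(betas, variances):
    """One-degree-of-freedom Wald test of a linear trend across strata.

    betas: stratum-specific log relative risks.
    variances: their asymptotic variances (independent strata assumed).
    """
    c = linear_contrast(len(betas))
    estimate = sum(ci * b for ci, b in zip(c, betas))
    variance = sum(ci * ci * v for ci, v in zip(c, variances))
    statistic = estimate ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

Because the trend test spends a single degree of freedom, it is only preferable when a monotone dose-response pattern is plausible; a non-monotone interaction can be missed entirely.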
In the present study, the exposures were dichotomized, and hence K = 2. In addition to the Wald test, a likelihood ratio test (LRT) was used to verify the results. The LRT compares two models; the larger one is estimated allowing different gene frequencies and different allele relative risks over all strata, while the smaller one is estimated allowing different gene frequencies over strata but assuming the same relative risk across all strata.
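With K = 2 the contrast reduces to D = (1, −1), and the Wald statistic becomes the squared difference of the two log relative risks divided by the sum of their variances. A minimal sketch follows; the function name and the input values in the usage lines are hypothetical, and the inputs stand in for estimates that a stratified analysis would supply.

```python
import math


def wald_interaction_test(beta_exposed, var_exposed, beta_unexposed, var_unexposed):
    """Wald test of equal log relative risks in two independent exposure strata."""
    diff = beta_exposed - beta_unexposed
    statistic = diff ** 2 / (var_exposed + var_unexposed)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical inputs: relative risk 1.8 among exposed, 1.0 among unexposed.
statistic, p_value = wald_interaction_test(math.log(1.8), 0.04, math.log(1.0), 0.04)
print(statistic, p_value)
```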
Multiple testing
We used quantile-quantile (QQ) plots to screen for significant p-values. This provides a simple graphical procedure for the simultaneous evaluation of many tests. If none of the markers are associated with orofacial clefts, the p-values would fall close to the straight diagonal line. They would otherwise deviate upwards from this line if the markers are associated with the disorder.
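The coordinates of such a QQ-plot are simple to compute. The sketch below assumes that the expected value of the i-th smallest of n p-values under the null is i/(n + 1), which is one common plotting convention; the function name is hypothetical and the plotting itself is left out.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale.

    Under the null hypothesis the p-values are uniform, so the i-th smallest
    of n values is expected to be close to i / (n + 1).
    """
    n = len(p_values)
    points = []
    for i, p in enumerate(sorted(p_values), start=1):
        expected = i / (n + 1.0)
        points.append((-math.log10(expected), -math.log10(p)))
    return points
```

Points with an observed coordinate well above the expected one correspond to p-values smaller than chance alone would predict.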
A related numerical approach is through the control of the false discovery rate (FDR) criterion proposed by Storey and Tibshirani (2003). This leads to a set of “q-values” to replace the original p-values. The q-values thus relate to the false discovery rate, whereas p-values relate to the type I error in standard testing. In the current analyses, we used a q-value below 0.05 to indicate statistical significance. In general, the multiple-testing problem can be approached through an empirical Bayes approach (Efron, 2010).
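As a rough illustration of how p-values are converted to q-values, the sketch below fixes the proportion of true null hypotheses at π0 = 1. This is a simplifying assumption: Storey and Tibshirani estimate π0 from the data, and with π0 = 1 the result coincides with Benjamini-Hochberg adjusted p-values, which is conservative. The function name is hypothetical.

```python
def q_values(p_values, pi0=1.0):
    """q-values for a list of p-values, returned in the original order.

    pi0 is the assumed proportion of true null hypotheses; the default of 1.0
    gives Benjamini-Hochberg adjusted p-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * p_values[idx] * n / rank)
        q[idx] = running_min
    return q
```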
Even if test (1) is asymptotically unbiased, the asymptotic results may not be valid for small datasets, for instance when one exposure group is exceedingly small. The estimated p-values might thus have a noticeable bias for small or moderate sample sizes. A bias of this type would most likely be apparent as a deviation from the sloping line in a QQ-plot of p-values. To assess this in our data, we obtained p-values for 200 simulated datasets in which the exposure status was randomly permuted. The ordered p-values were then averaged and compared against the expected values under H0. Two different “Cleft type–Maternal exposure” combinations were selected according to where one might expect the highest and lowest bias in our data. First, we chose the “Isolated CPO–Alcohol” combination because of the small sample size and unequal partitioning of the two exposure levels, which might possibly produce a bias. Second, we chose the “Isolated CL/P–Coffee” combination because of the large sample size and equal partitioning between exposure levels. Here, one might expect a small bias. These two combinations should thus indicate the extent of bias in our data.
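The averaging step of this bias check can be sketched as follows. The sketch assumes that each permuted dataset has already been analysed and reduced to a list of p-values of equal length; the function names are hypothetical, and i/(n + 1) is used as the expected value of the i-th ordered p-value under H0.

```python
def mean_ordered_p_values(p_value_sets):
    """Average the sorted p-values position by position across datasets."""
    n = len(p_value_sets[0])
    sums = [0.0] * n
    for p_values in p_value_sets:
        for i, p in enumerate(sorted(p_values)):
            sums[i] += p
    return [s / len(p_value_sets) for s in sums]


def expected_ordered_p_values(n):
    """Expected ordered p-values under H0 (uniform order statistics)."""
    return [i / (n + 1.0) for i in range(1, n + 1)]
```

Averaged values that lie systematically above the expected ones indicate p-values that are too large, that is, a conservative test.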
Finally, we performed pathway-based analyses on 28 detoxification genes and 21 genes involved in folic acid metabolism. For gene categorization, information was collated from several databases, including NCBI Biosystems, KEGG Pathway, WikiPathways, and the Panther Classification System. The 28 detoxification-pathway genes are highlighted in red and the 21 folate-pathway related genes in blue in Supplementary Table S2.
Statistical power
We used simulations to assess the statistical power of the Wald test. The simulations took into account the full procedure of computing all p-values and adjusting for multiple-testing by computing the corresponding q-values. We assumed that, among the exposed mothers, a given fraction of the SNPs would be associated with a moderate increase in risk of having an affected child. All data were simulated using the same number of observations, case/control status, exposure groupings and allele frequencies as in the observed data.
A standard simulation procedure would be to repeat the following a sufficient number of times: Simulate a complete dataset with all 1315 SNPs, estimate the p-values, and then compute the corresponding q-values used to retain or reject H0. However, to calculate the power more efficiently, the following two-step procedure was adopted. First, we considered a single SNP with a specified relative risk of 1.0, 1.2, 1.4, 1.6, 1.8 or 2.0; simulated 250,000 complete sets of case-parent and control-parent triads, and computed the Wald test p-value. For each specified relative risk, we thus obtained a sample of 250,000 p-values. The second step involved drawing a sample of 1315 p-values by drawing a fraction 1−γ of p-values from the collection of p-values with RR = 1.0 and a fraction γ from those with RR ≥ 1.0. For each resulting sample of p-values, we adjusted for multiple testing by computing the corresponding q-values, and retained the smallest q-value. This second step was repeated 10,000 times, and our statistical power estimate was then calculated as the proportion of times the smallest q-value was less than 0.05.
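The second step of this procedure can be sketched as below. Several assumptions are made for illustration: the two pools of p-values are supplied by the caller (in the study they come from the single-SNP simulations), sampling is done with replacement, q-values are computed with π0 fixed at 1 (Benjamini-Hochberg), and all function and parameter names are hypothetical.

```python
import random


def smallest_q_value(p_values):
    """Smallest q-value of a sample, using pi0 = 1 (Benjamini-Hochberg)."""
    n = len(p_values)
    ranked = enumerate(sorted(p_values), start=1)
    return min(1.0, min(p * n / rank for rank, p in ranked))


def estimate_power(null_pool, alt_pool, gamma, n_snps=1315,
                   n_rep=10000, alpha=0.05, seed=1):
    """Proportion of replicates in which the smallest q-value is below alpha.

    null_pool: p-values simulated with relative risk 1.0.
    alt_pool: p-values simulated with the relative risk of interest.
    gamma: fraction of the n_snps SNPs drawn from alt_pool.
    """
    rng = random.Random(seed)
    n_alt = round(gamma * n_snps)
    hits = 0
    for _ in range(n_rep):
        sample = (rng.choices(null_pool, k=n_snps - n_alt)
                  + rng.choices(alt_pool, k=n_alt))
        if smallest_q_value(sample) < alpha:
            hits += 1
    return hits / n_rep
```

Reusing precomputed pools avoids simulating and fitting 1315 SNPs in every replicate, which is where the efficiency gain of the two-step scheme comes from.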
The QQ-plots and q-value plots in Figures 1–6 summarize the results of the HAPLIN analyses for isolated CPO, CLO and CL/P by each of the following maternal first-trimester exposures: smoking, alcohol, coffee, folic acid supplement, dietary folate, and vitamin A. In these panels, there is little evidence for p-values in excess of what would be expected by chance alone, except perhaps for a GxE interaction between isolated CPO and dietary folate (Figure 1). This was further confirmed in our false discovery rate (FDR) assessment, where a q-value of 0.0496 was obtained for the interaction between the T-box 4 gene (TBX4) and dietary folate in the isolated CPO category (Figure 4). No other q-values fell below the 0.05 significance level.
Figure 1
Analysis of GxE interaction for isolated cleft palate only (CPO). The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence bounds for the p-values are indicated ...
Figure 2
Analysis of GxE interaction for isolated cleft lip only (CLO). The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence bounds for the p-values are indicated ...
Figure 4
Analysis of GxE interaction for isolated cleft palate only (CPO). The plots show sorted q-values from the false discovery rate (FDR) analysis for each maternal first-trimester exposure. Points falling below the q=0.05 line in the plots would indicate ...
Our pathway-based analyses of 21 folic-acid related genes (Figure 7) and 28 genes involved in detoxification (Figure 8) did not reveal any evidence of GxE interaction. P-values from the HAPLIN analyses were combined using Fisher’s method (Fisher, 1958), and these are provided in Table 3. Again, none of the p-values were statistically significant.
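Fisher's method combines n p-values through the statistic −2 Σ ln p, which has a chi-squared distribution with 2n degrees of freedom when the p-values are independent and the null hypothesis holds. The sketch below is a generic illustration with a hypothetical function name; it uses the closed-form tail probability available for even degrees of freedom, and it assumes independence, which SNPs in linkage disequilibrium within a pathway may not satisfy.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's combined statistic and its p-value.

    Returns (statistic, p_value), where statistic = -2 * sum(log(p)) is
    referred to a chi-squared distribution with 2 * len(p_values) degrees
    of freedom. Independence of the p-values is assumed.
    """
    n = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # For 2n degrees of freedom the upper tail is
    # exp(-x/2) * sum_{j=0}^{n-1} (x/2)^j / j!.
    term = 1.0
    total = 1.0
    for j in range(1, n):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total
```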
Figure 7
Analysis of 21 folate-pathway genes for isolated cleft lip with or without cleft palate (CL/P). The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence bounds ...
Figure 8
Analysis of 28 detoxification-pathway genes for isolated cleft lip with or without cleft palate (CL/P). The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence ...
Table 3
Fisher-combined p-values for pathway-based analyses of 28 detoxification genes and 21 genes involved in folate metabolism.
We assessed possible bias in the Wald test for the two combinations of “Cleft type–Maternal exposure” as described in the Methods section. As expected, we observed the largest bias in the “Isolated CPO–Alcohol” combination (Figure 9), whereas there was an almost perfect match with the p-values expected under H0 in the “Isolated CL/P–Coffee” combination. For either combination, however, the bias was in the direction of having larger p-values than expected.
Figure 9
Simulated data under the null hypothesis of no GxE interaction effects. The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence bounds for the p-values are ...
Table 4 displays the estimated statistical power for detecting a GxE interaction for different relative risks and proportion of significant SNPs. Not surprisingly, the power is adequate to detect all but the smallest relative risks and smallest proportion of significant SNPs.
Table 4
Estimated statistical power for detecting GxE interaction for different relative risks and proportion of significant SNPs.
Studies of GxE interaction are critical in advancing our understanding of the etiology of orofacial clefts (Zhu et al., 2009) – and yet there have been few successes in actually identifying these important interactions (Khoury and Wacholder, 2009). At least part of the problem has been that the development of robust statistical methods and models for identifying GxE interaction has not kept pace with the rapid advances in molecular genetics (Birnbaum et al., 2009; Grant et al., 2009; Beaty et al., 2010; Mangold et al., 2010; Dixon et al., 2011; Rahimov et al., 2012). Furthermore, the combination of inadequate sample size, study heterogeneity and differential assessments of environmental exposures continues to challenge studies of GxE interaction (Clayton and McKeigue, 2001; Thomas, 2010; Weinberg, 2009). Novel study designs that increase power but are less prone to confounding from stratification, such as the hybrid design developed in this study, are important in advancing the study of GxE interaction.
The self-administered questionnaires used in this study contained a much more comprehensive list of environmental exposures than the six maternal exposures examined here. Instead of a haphazard exploration of all possible combinations of genes and exposures, however, we chose to narrow down our search for GxE interaction by studying only those exposures that had already shown an association with clefting in the same study population (Lie et al., 2008; Deroo et al., 2008; Wilcox et al., 2007; Johansen et al., 2008, 2009). Despite this more targeted approach, there was little statistical evidence overall for GxE interactions between the six maternal first-trimester exposures and the 334 cleft candidate genes tested in our data. One possible exception was an interaction between the T-domain transcription factor gene T-box 4 (TBX4; chr 17q21-q22) and dietary folate in isolated CPO. TBX4 is a member of an evolutionarily highly conserved family of genes that regulate key developmental processes (King et al., 2006). The mouse homolog, Tbx4, regulates limb development and specification of limb identity (Duboc and Logan, 2011). This gene also maps to the 17q21 region that has previously shown significant results in a meta-analysis of 13 genome-wide linkage scans for CL/P (Marazita et al., 2004). Finally, this chromosomal region is also syntenic with the region harboring the mouse clf1 mutation (Juriloff, 1996). Although these studies indicate that TBX4 might be more relevant to CL/P than CPO, its connection to maternal dietary folate in our data will need to be verified in other datasets before we can categorically dismiss it as a false positive.
Several studies have previously used the case-parent triad design to investigate GxE interactions in clefting (Jugessur et al., 2003; Shi et al., 2007; Wu et al., 2010). Case-parent triads allow a range of causal scenarios to be investigated with relatively high precision (Gjessing and Lie, 2006). These include fetal and maternal gene-effects, parent-of-origin effects, gene-gene (GxG) interaction, and GxE interaction. For GxE interaction, one compares the transmission of a particular allele or haplotype to an affected offspring between triads of exposed and unexposed mothers. A statistically significant difference between the two transmission patterns would suggest a multiplicative interaction. The use of case-parent triads also overcomes the problem of population stratification by effectively using non-transmitted parental alleles as controls, to be compared with the alleles transmitted to the case child. As both the case and control alleles derive from the same individuals, they are thus guaranteed to be selected from the same population subgroup.
Despite these attractive attributes, a notable limitation of the case-parent triad design is its inability to assess the main effects of an environmental exposure. Comparing genetic effects in the exposed and unexposed triads may reveal interactions, but says nothing about whether the environmental exposure is protective or deleterious. While the case-parent triad design protects against population stratification, it has in general a somewhat lower efficiency than a case-control design. As a consequence, various “hybrid designs” have been proposed to combine the merits of the case-parent triad and case-control design. The full hybrid design involves complete case-parent triads together with complete control-parent triads (not necessarily the same number of controls as cases), but truncated versions that include leaving out the control child and genotyping only his/her parents, leaving out the control-father, or using case-mother dyads together with control-mother dyads have also been proposed (Kazma et al., 2011; Vermeulen et al., 2009; Weinberg et al., 2011; Shi et al., 2008).
Because the hybrid design involves independent controls, it adds more statistical power to the analyses and allows an estimation of the main effects of an exposure. A complete case-parent triad provides two transmitted case alleles and two non-transmitted control alleles, but adding a complete control-parent triad adds four independent control alleles. Since the alleles carried by the control child are already present in his/her parents, a complete control-parent triad thus counts as two full independent controls. Although the hybrid design gains advantages from both the case-parent and the case-control designs, it is also to some extent influenced by population stratification. Since it incorporates a case-control component, the bias in the latter may show up in the overall estimate. While the effect is lower than for the case-control design alone, it may still be noticeable.
The HAPLIN implementation of the hybrid design uses a full maximum-likelihood log-linear model setup, as a direct extension of the case-parent triad model originally implemented in HAPLIN (Gjessing and Lie, 2006). The implementation makes the standard “rare disease assumption” which allows relative risks and odds ratios to be used interchangeably. This assumption is reasonable for orofacial clefts given the relatively low overall risks of CL/P and CPO, thus enabling the relative risk estimates from the case-parent triads to be combined with odds ratio estimates deriving from the case-control comparison. An advantage of the complete maximum-likelihood framework is that imputation of missing genotype data and haplotype reconstruction can be done using the EM algorithm. There is, however, always a price to pay in the form of increased computation time due to more complex model implementation. An additional advantage of the log-linear model setup is that we can obtain explicit estimates for relative risks with asymptotic standard errors. Based on this, interactions may be quantified by computing ratios of each stratum relative risk to a reference stratum. Testing can be performed using the likelihood ratio, Wald, and score tests.
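The quantification of an interaction as a ratio of stratum-specific relative risks can be sketched as follows. This is an illustration, not HAPLIN output: the function name is hypothetical, the strata are assumed independent so that the variances of the two log relative risks add, and the interval is a standard Wald interval built on the log scale.

```python
import math


def relative_risk_ratio(beta_k, var_k, beta_ref, var_ref, z=1.96):
    """Ratio of a stratum relative risk to the reference stratum.

    beta_k, beta_ref: log relative risks in stratum k and the reference.
    var_k, var_ref: their asymptotic variances.
    Returns (ratio, lower, upper), where lower and upper bound an
    approximate 95% Wald interval when z = 1.96.
    """
    diff = beta_k - beta_ref
    se = math.sqrt(var_k + var_ref)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)
```

A ratio of 1 corresponds to no multiplicative interaction, so an interval that excludes 1 agrees with rejection by the two-stratum Wald test at the matching level.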
The Wald test for interaction is a flexible approach that allows any set of estimated parameters to be compared across the strata of exposure. In the present application, we analyzed only a single parameter: the log relative risk for a given SNP variant under the assumption of a multiplicative dose response. If haplotype risks were estimated, one might equally well test relative risks linked to more than one haplotype. Similarly, one might test for different haplotype frequencies across strata. The model setup is also simplified by estimating the same model over all strata independently. As an additional check of the software implementation and estimation, we performed a likelihood ratio test (LRT) for the same interactions. The LRT requires the null model to be estimated explicitly, in this case with a model assuming different SNP frequencies but the same SNP relative risk across strata. The results from the LRT were nearly identical to those from the Wald test, as would be expected in our model framework.
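For the single-parameter, two-stratum case described above, the Wald test reduces to comparing two independently estimated log relative risks. The sketch below is an illustration under that assumption and is not the HAPLIN code; the function name, argument names, and the example estimates are hypothetical.

```python
import math


def wald_interaction_test(log_rr_unexposed, se_unexposed, log_rr_exposed, se_exposed):
    """Wald test of equal log relative risks in two independent exposure strata.

    Returns the ratio of relative risks (exposed vs. unexposed), the
    chi-square statistic on 1 degree of freedom, and its p-value.
    """
    diff = log_rr_exposed - log_rr_unexposed
    # Strata are estimated independently, so the variances add.
    se_diff = math.sqrt(se_unexposed ** 2 + se_exposed ** 2)
    z = diff / se_diff
    # Two-sided normal tail probability, equal to the chi-square(1) tail of z**2.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {"rr_ratio": math.exp(diff), "chi2": z * z, "p_value": p_value}


# Hypothetical stratum estimates, for illustration only.
result = wald_interaction_test(math.log(1.1), 0.15, math.log(1.8), 0.20)
print(result)
```

With more than two strata or more than one parameter, the same idea generalizes to a quadratic form in the vector of contrasts and its covariance matrix, with correspondingly more degrees of freedom.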
When testing interactions with a continuous exposure variable, the variable can be grouped into suitable categories, each one large enough that the asymptotic properties of the estimator would be expected to hold true. To increase power, one might test for a trend-type relationship, assuming systematically increasing or decreasing genetic effects in the exposure categories as described in “Model implementation” under “Statistical analysis” in the Methods section. In our setting, we decided to dichotomize all exposures, using a cut-off value consistent with previously detected exposure effects on the risk of clefts. Table 2 shows the cut-off values used for each type of exposure.
Even though the hybrid design affords more statistical power to detect GxE interaction, power may still be limited by our sample size. Appropriate control for multiple testing places additional constraints on which effect sizes can be detected, even in a candidate-gene setting such as the present study. Our power simulations explore a selection of scenarios with varying size of the relative risk associated with a SNP and varying proportion of SNPs exhibiting this strength of association. The results in Table 4 show that we should be able to detect all but the smallest relative risks and proportions of SNPs. We recently studied the role of maternal smoking and variants in nicotine dependence genes using another novel approach that involves the use of instrumental variables (IV) (Wehby et al., 2011). Under the IV model, maternal smoking before and during pregnancy increased the risk of clefting by about 4-5 times at the sample average smoking rate, substantially higher than that found with classical analytic models. This may be because the usual models cannot account for self-selection into smoking based on unobserved confounders. Therefore, a relative risk of 1.8 to 2.0 in our power calculations may be well within the range of expected risks.
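The way multiple testing erodes power can be sketched with a simple analytic approximation. This is not the simulation procedure used for Table 4: it assumes a two-sided Wald test with a normally distributed estimate and substitutes a Bonferroni threshold as a simple stand-in for multiplicity control. The function name, the standard error, and the number of tests are hypothetical.

```python
import math
from statistics import NormalDist


def wald_power(log_rr_ratio, se, alpha=0.05, n_tests=1):
    """Approximate power of a two-sided Wald test for an interaction.

    log_rr_ratio: true log ratio of relative risks between strata.
    se: standard error of the estimated log ratio (hypothetical input).
    n_tests: number of tests for a Bonferroni-adjusted threshold.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / (2.0 * n_tests))
    shift = log_rr_ratio / se
    # Probability that the test statistic falls in either rejection tail.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical standard error and test count, for illustration only.
for rr_ratio in (1.2, 1.5, 1.8, 2.0):
    single = wald_power(math.log(rr_ratio), se=0.2)
    adjusted = wald_power(math.log(rr_ratio), se=0.2, n_tests=1000)
    print(rr_ratio, round(single, 3), round(adjusted, 3))
```

Under these assumptions, power for a given ratio of relative risks drops sharply once the threshold is adjusted for many tests, which is why only the larger effects remain detectable.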
Comparing the results of GxE interaction across studies is rarely straightforward, not only because of important differences in study design and methodology, but also because of differences that are unique to the specific populations studied (Clayton and McKeigue, 2001; Thomas, 2010; Weinberg, 2009). The availability of case-parent triads and independent controls (or control families) increases the power to detect interactions while retaining a degree of protection against population stratification. Even in a candidate-gene study, correction for multiple testing can lead to non-significant overall results, and statistical power to detect interactions is frequently lost due to rare exposure categories or low allele frequencies. While these challenges are well understood, it is still remarkable that orofacial clefts, a phenotype of supposedly very high heritability, remain so hard to decipher.
In conclusion, identifying GxE interactions in complex traits is still fraught with difficulties, but their identification is essential if the findings are to be applied to improve diagnosis, prognosis and therapies/prevention. As large sample sizes continue to accrue through collaboration and biorepositories, the statistical power to detect the effects of GxE interaction will increase. The application of powerful new methodologies, such as the one outlined herein, will enable investigators to detect those effects more efficiently while balancing the realities of phenotyping and genotyping costs.
Figure 3
Analysis of GxE interaction for isolated cleft lip with or without cleft palate (CL/P). The QQ-plots compare p-values (−log10 scale) with an expected uniform distribution under the null (sloping line). The pointwise 95% confidence bounds for the (more ...)
Figure 5
Analysis of GxE interaction for isolated cleft lip only (CLO). The plots show sorted q-values from the false discovery rate (FDR) analysis for each maternal first-trimester exposure. Points falling below the q=0.05 line in the plots would indicate statistical (more ...)
Figure 6
Analysis of GxE interaction for isolated cleft lip with or without cleft palate (CL/P). The plots show sorted q-values from the false discovery rate (FDR) analysis for each maternal first-trimester exposure. Points falling below the q=0.05 line in the (more ...)
Supplementary Material
Supp Table S1
Supp Table S2
This research was supported by the National Institutes of Health (DE08559, P60 DE13076, NIH P30 ES05605, and RO1 DE-11948-04), the Norwegian Research Council (NFR 177522/V50), and in part by the Intramural Research Program of the National Institute of Environmental Health Sciences (NIH/NIEHS). We thank all participating families who made this study possible, and Dr. Abee L. Boyles for her comments on an earlier draft of this manuscript. Genotyping services were provided by the Center for Inherited Disease Research (CIDR), which is fully funded through a federal contract from the National Institutes of Health (NIH) to The Johns Hopkins University, Contract Number N01-HG-65403. We thank Ivy McMullen, Corinne Boehm, Kim Doheny, and other CIDR staff involved in this project. We also thank the US National Institute of Dental and Craniofacial Research (NIDCR) for underwriting a significant proportion of the genotyping costs by CIDR.
Electronic-Database Information The R statistical library HAPLIN is available from our web site at
  • Andrieu N, Goldstein A. The case-combined-control design was efficient in detecting gene-environment interactions. Journal of Clinical Epidemiology. 2004;57(7):662–671. [PubMed]
  • Badovinac RL, Werler MM, Williams PL, Kelsey KT, Hayes C. Folic acid-containing supplement consumption during pregnancy and risk for oral clefts: a meta-analysis. Birth Defects Res A Clin Mol Teratol. 2007;79:8–15. [PubMed]
  • Beaty TH, Murray JC, Marazita ML, Munger RG, Ruczinski I, Hetmanski JB, Liang KY, Wu T, Murray T, Fallin MD, Redett RA, Raymond G, Schwender H, Jin SC, Cooper ME, Dunnwald M, Mansilla MA, Leslie E, Bullard S, Lidral AC, Moreno LM, Menezes R, Vieira AR, Petrin A, Wilcox AJ, Lie RT, Jabs EW, Wu-Chou YH, Chen PK, Wang H, Ye X, Huang S, Yeow V, Chong SS, Jee SH, Shi B, Christensen K, Melbye M, Doheny KF, Pugh EW, Ling H, Castilla EE, Czeizel AE, Ma L, Field LL, Brody L, Pangilinan F, Mills JL, Molloy AM, Kirke PN, Scott JM, Arcos-Burgos M, Scott AF. A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4. Nat Genet. 2010;42:525–9. [PMC free article] [PubMed]
  • Beaty TH, Ruczinski I, Murray JC, Marazita ML, Munger RG, Hetmanski JB, Murray T, Redett RJ, Fallin MD, Liang KY, Wu T, Patel PJ, Jin SC, Zhang TX, Schwender H, Wu-Chou YH, Chen PK, Chong SS, Cheah F, Yeow V, Ye X, Wang H, Huang S, Jabs EW, Shi B, Wilcox AJ, Lie RT, Jee SH, Christensen K, Doheny KF, Pugh EW, Ling H, Scott AF. Evidence for gene-environment interaction in a genome wide study of nonsyndromic cleft palate. Genet Epidemiol. 2011 Epub ahead of print. [PMC free article] [PubMed]
  • Birnbaum S, Ludwig KU, Reutter H, Herms S, Steffens M, Rubini M, Baluardo C, Ferrian M, Almeida de Assis N, Alblas MA, Barth S, Freudenberg J, Lauster C, Schmidt G, Scheer M, Braumann B, Berge SJ, Reich RH, Schiefke F, Hemprich A, Potzsch S, Steegers-Theunissen RP, Potzsch B, Moebus S, Horsthemke B, Kramer FJ, Wienker TF, Mossey PA, Propping P, Cichon S, Hoffmann P, Knapp M, Nothen MM, Mangold E. Key susceptibility locus for non-syndromic cleft lip with or without cleft palate on chromosome 8q24. Nat Genet. 2009;41:473–7. [PubMed]
  • Bliek BJ, Steegers-Theunissen RP, Blok LJ, Santegoets LA, Lindemans J, Oostra BA, Steegers EA, de Klein A. Genome-wide pathway analysis of folate-responsive genes to unravel the pathogenesis of orofacial clefting in man. Birth Defects Res A Clin Mol Teratol. 2008;82:627–35. [PubMed]
  • Chatterjee N, Kalaylioglu Z, Carroll R. Exploiting gene-environment independence in family-based case-control studies: Increased power for detecting associations, interactions and joint effects. Genet Epidemiol. 2005;28(2):138–156. [PubMed]
  • Clayton D, McKeigue PM. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet. 2001;358(9290):1356–60. [PubMed]
  • Deroo LA, Wilcox AJ, Drevon CA, Lie RT. First-trimester maternal alcohol consumption and the risk of infant oral clefts in Norway: a population-based case-control study. Am J Epidemiol. 2008 [PMC free article] [PubMed]
  • Diewert VM. Development of human craniofacial morphology during the late embryonic and early fetal periods. Am J Orthod. 1985;88:64–76. [PubMed]
  • Dixon MJ, Marazita ML, Beaty TH, Murray JC. Cleft lip and palate: understanding genetic and environmental influences. Nat Rev Genet. 2011;12(3):167–178. [PMC free article] [PubMed]
  • Duboc V, Logan M. Regulation of limb bud initiation and limb-type morphology. Dev Dyn. 2011;240(5):1017–27. [PubMed]
  • Dudbridge F. Likelihood-based association analysis for nuclear families and unrelated subjects with missing genotype data. Hum. Hered. 2008;66(2):87–98. [PMC free article] [PubMed]
  • Efron B. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. Cambridge University Press; 2010.
  • Fisher R. Statistical Methods for Research Workers. Oliver and Boyd; 1958.
  • Gjessing HK, Lie RT. Case-parent triads: estimating single- and double-dose effects of fetal and maternal disease gene haplotypes. Annals of human genetics. 2006:1–15. [PubMed]
  • Goldstein A, Dondon M-G, Andrieu N. Unconditional analyses can increase efficiency in assessing gene-environment interaction of the case-combined-control design. Int J Epidemiol. 2006;35(4):1067–1073. [PMC free article] [PubMed]
  • Grant SF, Wang K, Zhang H, Glaberson W, Annaiah K, Kim CE, Bradfield JP, Glessner JT, Thomas KA, Garris M, Frackelton EC, Otieno FG, Chiavacci RM, Nah HD, Kirschner RE, Hakonarson H. A genome-wide association study identifies a locus for nonsyndromic cleft lip with or without cleft palate on 8q24. J Pediatr. 2009;155:909–13. [PubMed]
  • Grosen D, Chevrier C, Skytthe A, Bille C, Molsted K, Sivertsen A, Murray JC, Christensen K. A cohort study of recurrence patterns among more than 54,000 relatives of oral cleft cases in Denmark: support for the multifactorial threshold model of inheritance. J Med Genet. 2009 [PMC free article] [PubMed]
  • Grosen D, Bille C, Petersen I, Skytthe A, Hjelmborg JB, Pedersen JK, Murray JC, Christensen K. Risk of oral clefts in twins. Epidemiology. 2011;22(3):313–319. [PMC free article] [PubMed]
  • Hayes C. Environmental risk factors and oral clefts. Cleft lip and palate: from origin to treatment. 2002:159–169.
  • Johansen AM, Lie RT, Wilcox AJ, Andersen LF, Drevon CA. Maternal dietary intake of vitamin A and risk of orofacial clefts: a population-based case-control study in Norway. Am J Epidemiol. 2008;167:1164–70. [PubMed]
  • Johansen AM, Wilcox AJ, Lie RT, Andersen LF, Drevon CA. Maternal consumption of coffee and caffeine-containing beverages and oral clefts: a population-based case-control study in Norway. Am J Epidemiol. 2009;169:1216–22. [PMC free article] [PubMed]
  • Jugessur A, Lie R, Wilcox A, Murray J, Taylor J, Saugstad O, Vindenes H, Abyholm F. Cleft palate, transforming growth factor alpha gene variants, and maternal exposures: assessing gene-environment interactions in case-parent triads. Genet Epidemiol. 2003;25:367–74. [PubMed]
  • Jugessur A, Shi M, Gjessing HK, Lie RT, Wilcox AJ, Weinberg CR, Christensen K, Boyles AL, Daack-Hirsch S, Trung TN, Bille C, Lidral AC, Murray JC. Genetic determinants of facial clefting: analysis of 357 candidate genes using two national cleft studies from Scandinavia. PLoS ONE. 2009;4:e5385. [PMC free article] [PubMed]
  • Juriloff D. Mapping studies in animal models. In: DF W, editor. Cleft Lip and Palate: From Origin to Treatment. Oxford University Press; New York: 2002. pp. 265–282.
  • Juriloff DM, Harris MJ, M. D. The clf1 gene maps to a 2- to 3-cM region of distal mouse chromosome 11. Mamm Genome. 1996;7(10):789. [PubMed]
  • Kazma R, Babron MC, Génin E. Genetic association and gene-environment interaction: a new method for overcoming the lack of exposure information in controls. Am J Epidemiol. 2011;173(2):225–35. [PubMed]
  • Khoury M, Wacholder S. Invited commentary: from genome-wide association studies to gene-environment-wide interaction studies-challenges and opportunities. Am J Epidemiol. 2009;169(2):227–230. [PMC free article] [PubMed]
  • Khoury MJ, Beaty TH, Hwang SJ. Detection of genotype-environment interaction in case-control studies of birth defects: how big a sample size? Teratology. 1995;51:336–43. [PubMed]
  • King M, Arnold J, Shanske A, Morrow B. T-genes and limb bud development. Am J Med Genet A. 2006;140(13):1407–13. [PubMed]
  • Lie RT, Wilcox AJ, Taylor J, Gjessing HK, Saugstad OD, Aabyholm F, Vindenes H. Maternal smoking and oral clefts: the role of detoxification pathway genes. Epidemiology. 2008;19:606–15. [PubMed]
  • Mangold E, Ludwig KU, Birnbaum S, Baluardo C, Ferrian M, Herms S, Reutter H, de Assis N, Chawa TA, Mattheisen M, Steffens M, Barth S, Kluck N, Paul A, Becker J, Lauster C, Schmidt G, Braumann B, Scheer M, Reich RH, Hemprich A, Potzsch S, Blaumeiser B, Moebus S, Krawczak M, Schreiber S, Meitinger T, Wichmann HE, Steegers-Theunissen RP, Kramer FJ, Cichon S, Propping P, Wienker TF, Knapp M, Rubini M, Mossey PA, Hoffmann P, Nothen MM. Genome-wide association study identifies two susceptibility loci for nonsyndromic cleft lip with or without cleft palate. Nat Genet. 2010;42:24–6. [PubMed]
  • Marazita M, Murray J, Lidral A, Arcos-Burgos M, Cooper M, Goldstein T, Maher B, Daack-Hirsch S, Schultz R, Mansilla M, Field L, Liu Y, Prescott N, Malcolm S, Winter R, Ray A, Moreno L, Valencia C, Neiswanger K, Wyszynski D, Bailey-Wilson J, Albacha-Hejazi H, Beaty T, McIntosh I, Hetmanski J, Tuncbilek G, Edwards M, Harkin L, Scott R, Roddick L. Meta-analysis of 13 genome scans reveals multiple cleft lip/palate genes with novel loci on 9q21 and 2q32-35. Am J Hum Genet. 2004;75:161–73. [PubMed]
  • Millicovsky G, Johnston MC. Maternal hyperoxia greatly reduces the incidence of phenytoin-induced cleft lip and palate in A/J mice. Science. 1981;212:671–2. [PubMed]
  • Munger RG. Maternal nutrition and oral clefts. Cleft lip and palate: from origin to treatment. 2002:170–192.
  • Murray JC. Gene/environment causes of cleft lip and/or palate. Clin Genet. 2002;61:248–56. [PubMed]
  • Nagelkerke N, Hoebee B, Teunis P, Kimman T. Combining the transmission disequilibrium test and case-control methodology using generalized logistic regression. Eur. J. Hum. Genet. 2004;12(11):964–970. [PubMed]
  • Pickell L, Brown K, Li D, Wang X, Deng L, Wu Q, Selhub J, Luo L, Jerome-Majewska L, Rozen R. High intake of folic acid disrupts embryonic development in mice. Birth Defects Research. Part A, Clinical and Molecular Teratology. 2011;91(1):8–19. [PubMed]
  • Rahimov F, Jugessur A, Murray J. Genetics of nonsyndromic orofacial clefts. Cleft Palate Craniofac J. 2012;49(1):73–91. [PMC free article] [PubMed]
  • Shi M, Christensen K, Weinberg CR, Romitti P, Bathum L, Lozada A, Morris RW, Lovett M, Murray JC. Orofacial cleft risk is increased with maternal smoking and specific detoxification-gene variants. Am J Hum Genet. 2007;80:76–90. [PubMed]
  • Shi M, Umbach DM, Vermeulen SH, Weinberg CR. Making the most of case-mother/control-mother studies. Am J Epidemiol. 2008;168:541–7. [PMC free article] [PubMed]
  • Sivertsen A, Wilcox AJ, Skjaerven R, Vindenes HA, Abyholm F, Harville E, Lie RT. Familial risk of oral clefts by morphological type and severity: population based cohort study of first degree relatives. Bmj. 2008;336:432–4. [PMC free article] [PubMed]
  • Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 2003;100:9440–5. [PubMed]
  • Strauss RP. The organization and delivery of craniofacial health services: the state of the art. Cleft Palate Craniofac J. 1999;36(3):189–95. [PubMed]
  • Thomas D. Gene-environment-wide association studies: emerging approaches. Nat Rev Genet. 2010;11(4):259–72. [PMC free article] [PubMed]
  • Vermeulen SH, Shi M, Weinberg CR, Umbach DM. A hybrid design: case-parent triads supplemented by control-mother dyads. Genet Epidemiol. 2009;33:136–44. [PMC free article] [PubMed]
  • Warkany J. Congenital malformations. Malformations of the face. 1971
  • Wehby GL, Cassell CH. The impact of orofacial clefts on quality of life and healthcare use and costs. Oral Dis. 2010;16:3–10. [PMC free article] [PubMed]
  • Wehby GL, Jugessur A, Murray JC, Moreno LM, Wilcox A, Lie RT. Genes as instruments for studying risk behavior effects: an application to maternal smoking and orofacial clefts. Health Serv Outcomes Res Method. 2011 DOI 10.1007/s10742-011-0071-9, in press. [PMC free article] [PubMed]
  • Weinberg CR. Less is more, except when less is less: Studying joint effects. Genomics. 2009;93(1):10–12. [PMC free article] [PubMed]
  • Weinberg CR, Shi M, Umbach DM. Re.: “genetic association and gene-environment interaction: a new method for overcoming the lack of exposure information in controls” American Journal of Epidemiology. 2011;173(11):1346–1347. author reply 1347-1348. [PMC free article] [PubMed]
  • Wilcox AJ, Lie RT, Solvoll K, Taylor J, McConnaughey DR, Abyholm F, Vindenes H, Vollset SE, Drevon CA. Folic acid supplements and risk of facial clefts: national population based case-control study. Bmj. 2007;334:464. [PMC free article] [PubMed]
  • Wu T, Liang KY, Hetmanski JB, Ruczinski I, Fallin MD, Ingersoll RG, Wang H, Huang S, Ye X, Wu-Chou YH, Chen PK, Jabs EW, Shi B, Redett R, Scott AF, Beaty TH. Evidence of gene-environment interaction for the IRF6 gene and maternal multivitamin supplementation in controlling the risk of cleft lip with/without cleft palate. Hum Genet. 2010;128(4):401–410. [PMC free article] [PubMed]
  • Zhu H, Kartiko S, Finnell RH. Importance of gene-environment interactions in the etiology of selected birth defects. Clinical genetics. 2009;75(5):409–423. [PubMed]