Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Behav Genet. Author manuscript; available in PMC 2016 July 1.
Published in final edited form as:
PMCID: PMC4459947

Fitting Procedures for Novel Gene-by-Measured Environment Interaction Models in Behavior Genetic Designs


For quantitative behavior genetic (e.g., twin) studies, Purcell proposed a novel model for testing gene-by-measured environment (GxM) interactions while accounting for gene-by-environment correlation. Rathouz et al. expanded this model into a broader class of non-linear biometric models for quantifying and testing such interactions. In this work, we propose a novel factorization of the likelihood for this class of models, and adopt numerical integration techniques to achieve model estimation, especially for those without close-form likelihood. The validity of our procedures is established through numerical simulation studies. The new procedures are illustrated in a twin study analysis of the moderating effect of birth weight on the genetic influences on childhood anxiety. A second example is given in an online appendix. Both the exant GxM models and the new non-linear models critically assume normality of all structural components, which implies continuous, but not normal, manifest response variables.

Keywords: Gene-by-measured environment interaction, Gene-by-environment correlation, Likelihood factorization, Adaptive Gauss-Hermite quadrature, Derivative-free optimization, Twin study

1. Introduction

It is now well-established that genetic vulnerabilities are necessary but not sufficient for the expression of many health problems and most mental disorders, and that moderation of genetic influences by non-genetic factors plays an important role in the degree to which such disorders are expressed (Weaver et al., 2004; Rutter, Moffitt, & Caspi, 2006; Bennett, 2008). Accurately identifying such gene-environment interactions, wherein genetic variation influencing the phenotype of interest is greater under certain environmental conditions than under others, is therefore of critical importance for mental health research (Eaves, Silberg, & Erkanli, 2003). To address this need, gene-environment interaction is now being regularly investigated using quantitative behavior genetic (BG) designs. In these designs, genotype is not observed directly; rather the genetic contribution to one or more phenotypes is inferred based on varying degrees of genetic relatedness among individuals such as twins. In contrast, a measured aspect of the environment, together with the phenotype of interest, is observed, and the measured environment is posited as a putative moderator of genetic influences on the phenotype. Models and methods of analysis for testing such gene-by-measured (GxM) environment interaction based on these designs are of great significance in furthering mental health research (Lahey & Waldman, 2003).

Beginning with an article by Dick et al. (2001) and especially popularized by Purcell (2002), non-linear latent variable models have been intensively used for estimating GxM in quantitative BG designs. In Purcell’s paper, he proposed an important extension of the classic bivariate biometric model to allow quantifying and testing GxM between a measured environment and each of the classic biometric variance components, including additive genetic influences (A), and shared (C) and unshared (E) environmental factors. The importance and novelty of this work is that it also accounted for potential correlations between M and those same A, C, or E variance components. Rathouz, Van Hulle, Rodgers, Waldman, and Lahey (2008) and Van Hulle, Lahey, and Rathouz (2013) have examined statistical aspects of Purcell’s approach and found in preliminary studies that, under some plausible conditions, identification of GxM interaction in Purcell’s model is not fully convincing because models not including such interactions may explain the data as well as Purcell’s interaction model. In applied studies, however, such alternative models have not been regularly considered as explanations for the underlying biological mechanism.

A novel class of such models would greatly enlarge the available model space for the data to distinguish between GxM and equally parsimonious non-GxM mechanisms. The methodological hypothesis is that, in this broader class of models, findings of GxM would be more robust because the opportunity for alternative explanations of the data generating process—with different biologic interpretations—is greater. We describe their entire class of models in Section 2 (Non-Linear Biometric Models for GxM). To illustrate the importance of considering such models in real investigations of GxM, we present an analysis of the moderation of genetic influences on anxiety by birth weight in a community sample of twins in Section 3 (Illustrative Application: Birth Weight and Anxiety). Different models from Section 2 are fitted and compared. A second example is given in an online appendix and is also discussed briefly in Section 3.

Whereas the class of models posited by Rathouz et al. (2008) is biologically interesting, one of the main reasons up to now that some of these models have not been fully investigated is that an important subset of them cannot be estimated or tested in standard structural equation modeling software such as Mplus (Muthén & Muthén, 1998-2012). To rectify this situation, the present article concerns new fitting algorithms for all of these proposed models. A major challenge is that some of the models contain latent multiplicative terms, leading to integrals which cannot be expressed analytically, and a likelihood which is inexpressible in closed-form.

Although the likelihood can be expressed in terms of a multiple integral, large dimensional integration is involved in its calculation, and direct application of numerical techniques to obtain an approximation is not feasible. To address this key issue, our strategy, elaborated in Section 4 (Model Testing and Estimation), is to exploit the special structure of this class of models in order to re-express the likelihood via a novel factorization into a closed-form factor and a factor containing a low-dimensional integral, and then to apply high-accuracy numerical integration techniques to approximate the resulting low-dimensional integral. Technical details including adaptive Gauss-Hermite quadrature (AGHQ) and its specific application to our problem are covered in Appendix A. A common value of k as the number of AGHQ nodes for each dimension of numerical integration is set to perform the calculation. Available routines of derivative-free optimization, implemented entirely in freely available packages in the R statistical software environment (R Core Team, 2013), are employed for numerical maximization, and the inverse of the Hessian matrix, also obtained numerically, is used to estimate standard errors of the parameter estimates.

In Section 5 (Numerical Performance of Model Estimation), we first validate our model estimation and testing procedures via comparisons with those in Mplus for some models that are available in that environment. Expanding to models that are not available in Mplus, we examine the numerical accuracy and stability of our model estimates across varying numbers of integration nodes, with the goal of making recommendations on the required number of such nodes for computationally efficient yet accurate approximations. We note here that the aim of our empirical investigation is to examine the numerical—and not the statistical—performance of our proposed procedures. The final section contains conclusions and a discussion. Statistical operating characteristics are examined in a companion paper (Zheng, Van Hulle, & Rathouz, submitted), which also provides an overarching summary of our findings in the form of guidance to the worker using these models in applications.

2. Non-Linear Biometric Models for GxM

In this section, we briefly revisit the proposed models from Rathouz et al. (2008) for testing and estimating GxM; complete explanation and interpretation of these models are available elsewhere (Rathouz et al., 2008; Van Hulle et al., 2013). We denote the measured—or putatively moderating—environmental variable by M, and the response variable by P for phenotype. Both M and P are observable on each individual in a sample of relative pairs, and are modeled together in a joint bivariate latent structural equation model. Whereas variation in relatedness yields model identifiability, for the most part, we present models in terms of a single individual for compactness of exposition.

The standard biometric model for M is given by


where AM, CM and EM are latent variables representing additive genetic influences, and shared and unshared environmental influences on M (Jinks & Fulker, 1970; Neale & Cardon, 1992). As is common in latent variable models, AM, CM and EM are standard normal latent random variables, independent of one another. Parameter θM = (μM, aM, cM, eM)T for M contains the mean μM of M, and non-negative coefficients of (aM, cM, eM) of (AM, CM, EM). Thus, aM2, cM2 and eM2 are the additive genetic, shared, and non-shared environmental components of variance of M respectively, with total var(M)=aM2+cM2+eM2.

When the data from a relative (e.g., twin) pair are combined, M, AM, CM and EM become two-vectors. The model is therefore expressed more completely as:


where Mj, AMj, CMj and EMj represent the corresponding quantities on twin j, j = 1, 2. In a twin study, the model is identified via extra constraints, specifically AM1 = AM2 for monozygotic (MZ) twins, corr(AM1, AM2)=0.5 for dizygotic (DZ) twins, CM1 = CM2, and corr(EM1, EM2)=0 for all twin pairs. Other relationship pairs require different values of corr(AM1, AM2).

With the model for M specified, GxM interactions are explored through various specifications for P. Purcell’s (Purcell, 2002) Cholesky with GxM (CholGxM) model for phenotype P specifies


where AU, CU and EU are also standard normal latent random variables, independent of each other and of (AM, CM, EM), and μP is the intercept for P. Coefficients aC and αC quantify additive genetic effects on P that are common with those on M, cC and κC quantify additive shared environmental effects on P that are common with those on M, and eC and εC quantify additive non-shared environmental effects on P that are common with those on M. In a parallel manner, (aU, αU), (cU, κU) and (eU, εU) quantify additive genetic, shared and non-shared environmental effects on P that are unique to P. Coefficients aU, cU, and eU are assumed non-negative. Greek coefficients αC, κC, εC, αU, κU and εU capture the interaction of moderator M with the various genetic and environmental factors that act on P. In particular, the magnitude of αC (αU) captures the GxM interaction of M with common (unique) genetic factor AM (AU) in determining P. Thus, GxM can be detected via the statistical hypothesis that αC = αU = 0. Without interaction, Model (2) reduces to the classic bivariate Cholesky (Chol) model, specified as


Rathouz et al. (2008) note that Model (2) does not explicitly contain a “main effect” of M on P, but rather captures such effects indirectly through aC, cC and eC on common genetic and environmental influences AM, CM and EM. Those authors have proposed an alternative, more parsimonious, model that does contain direct effects of M on P. This non-linear main effects model with GxM (NLMainGxM), viz.


is nested within Model (2) but yields a different interpretation in terms of GxM. In sub-model (4), the common factors AM, CM and EM only operate through the manifest value M to influence P. Because of that, genetic and environmental influences on P are mediated through M. When αU = 0 in Model (4), this model does not contain GxM. However, in that situation, Rathouz et al. (2008) and Van Hulle et al. (2013) have shown that if Model (2) is fitted to the data without considering Model (4) as an alternative, GxM may be artifactually detected as non-zero αC.

As indicated by the appearance of unique effects AU, CU and EU, Model (2), (3) and to a partial degree Model (4) are based on the bivariate Cholesky parameterization. As discussed by Johnson (2007), the Cholesky parameterization is more interpretable when there is a clear theoretical, causal, or temporal ordering of the variables M and P in the model. In situations without a clear causal ordering, Loehlin (1996) has suggested using either a “common factor” or a “correlated factors” model. These models, which treat M and P on equal footing, might be simpler and more defensible reference (null) models for analysis of GxM. Denoting by AP, CP, and EP the genetic and environmental influences on P, an alternative to Model (2) obtains by extending the correlated factors model to allow for GxM (CorrGxM), viz.,


Model (5) provides a more straightforward model for testing and quantifying GxM than does Model (2) when M and P do not have a clear causal ordering. Moreover, Model (5) allows for multiplicative effects with M with three fewer parameters than Model (2), which may increase the power to detect GxM. We note that Model (5) is nested in Model (2), and that when αP = κP = εP = 0, Model (5) reduces to Model (3).

The foregoing Cholesky and correlated factors models are available in Mplus and have been studied earlier by Van Hulle et al. (2013). There are, however, two important and novel variants on these models which are not available in standard software. These models have different biological interpretations that do not involve GxM in the sense of (2). Rather they posit that genetic and shared and non-shared environmental effects operate additively and independently—but not linearly—in the model for P. The nonlinear Cholesky (CholNonLin) model is specified as:


Unlike model (2), the defining and unique feature of this model is that additive genetic (AM, AU), and shared (CM, CU) and unshared or (EM, EU) environmental influences do not interact or moderate one another, nor are they moderated by measured environment M, even while each of them does act non-linearly. Rather, like the classical ACE model, the three underlying sources of variation combine additively and independently to influence P. Such a model could arise because the genetic factors common to M and P (i.e. AM) and the shared and non-shared environmental counterparts simply operate on a different scale for M than for P. Alternatively, the model could arise out of gene-by-gene interaction of additive genetic effects—either of AM with itself or of AM with AU, with similar terms for common and unique environmental effects.

In an analogous spirit, one can extend the classic bivariate correlated factors model to include non-linear but additive genetic and environmental influences, yielding the model (CorrNonLin):


In this model, the additive genetic effects AM on M moderate the additive genetic effects AP on P. In the following we describe a likelihood factorization and numerical integration routine suitable for fitting and testing all of the foregoing models, including Models (6) and (7).

3. Illustrative Application: Birth Weight and Anxiety

Here we present a real data analysis to illustrate our approach. The data arise from a random sample of 6-17-year-old twin pairs born in Tennessee and living in one of the state’s five metropolitan statistical areas in 2000-2001 (Lahey et al., 2004). Twin pairs were selected stratified on age and geographic area, and psychopathology information was collected. One of the research interests is to examine the relationship between birth weight and childhood anxiety. In this illustration, we use a quantitative measure of twins’ anxiety symptoms and birth weight in ounces as reported by their biological mothers. The analysis presented here extends the results in Van Hulle et al. (2013) to include newly available models.

Our analysis comprises 541 MZ and 887 DZ twin pairs. Measured environment M is the residual for birth weight after linear regression on gender and ethnicity. Similarly, phenotype P is the residual anxiety measure after regression on gender, ethnicity and age. Models are fitted based on numerical likelihood calculation with k=8 AGHQ nodes. The definition of AGHQ nodes k is covered in the Section 4.

Model fit statistics, including Bayesian information criterion (BIC) (Schwarz, 1978; Raftery, 1995), are summarized in Table 1 in terms of differences relative to the basic Chol model; lower values of −2×log-likelihood or BIC imply a better fit to the data. Whereas we recognize there has been some criticism of the BIC for overly favoring simpler models (Weakliem, 1999), there are several features in its favor in this setting. First, with our moderately large sample size, likelihood ratio tests will tend to detect effects that are statistically important, but not scientifically significant. Second, the BIC allows a basis for comparing non-nested models, which is of critical utility in this setting. Third, when examining a construct such as GxM, we seek results which are robust and replicable; it therefore behooves us to take a conservative stance and favor simpler models unless there is very strong evidence to the contrary.

Table 1
Differences of degrees of freedom (df), −2× log-likelihood, and BIC of fitted models in Tennessee Twin Study relative to the fitted Chol model (terms of fitted Chol model serve as subtrahends)

Turning to Table 1, we note that the Chol model is nested in all other models except the NLMainGxM model. Additionally, the CorrGxM (CorrNonLin) model is nested in the CholGxM (CholNonLin) model. Based on likelihood ratio tests (LRT), CholGxM fits better than CorrGxM and Chol, and CorrGxM also fits better than Chol. Ignoring new models NLMainGxM, CholNonLin, and CorrNonLin, one might conclude that the data are supportive of a GxM effect. The new CorrNonLin model is, however, a strong competitor and also fits better than Chol. Based on LRT alone, one might conclude that CorrNonLin is the best fitting model; this model does not contain GxM. Incorporating BIC values, we would conclude that NLMainGxM and CorrNonLin are roughly equivalent in terms of model fit. However, we have previously shown (Van Hulle et al., 2013) that model NLMainGxM without the GxM effects provides a better fit to the data.

Fitted individual models are shown in Table 2. We note that in the CholGxM model, the coefficient of M × AM is significant, especially as compared to the coefficient of AM, as are the coefficients of M × CM and M × EM. Without considering other models, these effects might be scientifically compelling. The CorrNonLin model is notable in that all of its non-linear terms are significant, suggesting the model is well-suited to the data. In the NLMainGxM model, the coefficient −0.104 of M × AU is found significant based on the standard error from inverting the hessian matrix. However, the entries related to μM in that hessian matrix are all very close to zero, which makes the inverse suffer numerical issues. The standard error from inverting the hessian matrix without those entries suggests an insignificant coefficient of M × AU. To resolve this issue, we turn to LRT, comparing models with and without M × AU, resulting in a p-value greater than 0.1 on 1 degree of freedom. As such, whether one concludes that the CorrNonLin or the NLMainGxM models is the best fitting, the evidence for GxM effects, including M × CM and M × EM, is not greatly supported by the data. In contrast, analysis with only the CholGxM and Chol models from Purcell (2002) would have led to a conclusion of GxM in the Tennessee Twins sample.

Table 2
Fitted models for Tennessee Twins Study. Starred (*) coefficients are significant at the 0.05 level based on Wald tests using the inverse of the Hessian matrix to obtain standard errors

In a second example,1 we analyzed the potential moderating effect of quality of infant and toddler family environment (M) on reading ability in middle childhood (P). This example illustrates use of our algorithm in a study wherein the relationships are primarily siblings and half-siblings. The analysis revealed very important non-linear effects which do not include GxM, highlighting the importance of considering alternative models to those containing GxM.

4. Model Testing and Estimation

We return to estimating models from Rathouz et al. (2008) in this section. The strategy is to exploit the special structure of this class of models in order to re-express the likelihood. Denote θ=(θMT,θPT)T, where θP is the parameter vector for the specified structural equation model for P. Likelihood L for θ arising from data on a sample of relative pairs i = 1, … , n, is given by L(θ)=i=1nf(Mi,Pi;θ). This likelihood involves up to ten latent dimensions per relative pair. In what follows, we develop a decomposition to avoid integration over all but three of those dimensions.

The contribution from one individual pair can be written (suppressing subscripts i) using the factorization


The likelihood has two parts: the contribution from data M and the contribution from P given M. The first part, f(M; θM), is a bivariate normal density, expressible in closed-form, with mean μM and variance-covariance matrix based on (aM, cM, eM) and corr(AM1, AM2). The second part, f(P|M; θ), is more complex, for it might involve non-linear terms and is not necessarily available in closed-form.

From Model (1), (AM, CM, EM) are jointly multivariate normal, and there is a linear relationship between M and (AM, CM, EM). As such, M is bivariate normal, and the conditional distribution of (AM, CM|M) is multi(tri)variate normal as well. The variates are AM1, AM2 and CM1, since CM1 and CM2 are equal. Turning to the distribution of (P|AM, CM, EM), because EM is jointly determined by (AM, CM, M), we have that f(P|AM, CM, EM; θ) = f(P|AM, CM, M; θ). Therefore,


Moreover, for Models (2)-(7) for P, (P|AM, CM, EM) is bivariate normal marginally over either (AU, CU, EU) or (AP, CP, EP); we do not need to explicitly integrate over (AU, CU, EU) or (AP, CP, EP) to obtain (P|AM, CM, EM). The reason for this is that, given (AM, CM, EM), P is linear in (AU, CU, EU) or (AP, CP, EP), and either (AU, CU, EU) or (AP, CP, EP) is multivariate normal given (AM, CM, EM). We conclude that f(P|M; θ) reduces to a trivariate integral given on the right hand side of (9). du Toit and Cudeck (2009) use a similar method to calculate marginal distribution by separating random effects in random coefficient model estimation.

Based on considerations similar to those of Klein and Moosbrugger (2000) in estimating latent interaction effects, we turn to numerical integration techniques—specifically AGHQ—for evaluation of this integral. We show in Appendix A that the contribution to the likelihood from (P|M) can be approximated for all proposed models in Rathouz et al. (2008). The procedure involves three-dimensional AGHQ. Even though each dimension can in theory have a different number of nodes for numerical integration, denoted using k1, k2 and k3 in Appendix A, we fix k1 = k2 = k3 = k in our algorithm, and refer to the value k as the number of AGHQ nodes. Larger values of k yield more accurate approximations, but are more computationally intensive. The choice of k aims to balance accuracy and computational cost. In later simulation-based numerical analysis, we show that moderate values of k (e.g., k=8) yield satisfactory results.

With both f(M; θM) and f(P|M; θ) available, individual likelihood contributions obtain through (8). With the aforementioned numerical integration approach, we use maximum likelihood for estimation and inferences in the class of models. Owing to the lack of a closed-form expression for the likelihood, however, analytic derivatives are especially difficult to obtain. We propose to use a derivative-free optimization technique, the BOBYQA algorithm for bound constrained optimization without derivatives (Powell, 2009). This algorithm has been integrated into an R package minqa (Bates, Mullen, Nash, & Varadhan, 2012) and is directly available in the R environment.

A computational barrier in derivative-free optimization is that many more evaluations of the likelihood are required than when analytic derivatives are available. We alleviate this burden through parallel computing using the parallel facility in R, distributing the likelihood calculation across observations at each step to as many cores as are available. We also provide a variety of options for obtaining initial values; these and other options are described in the documentation for our R package entitled GxM, and the package is available on CRAN (Zheng & Rathouz, 2013).

5. Numerical Performance of Model Estimation

5.1. Comparison of GxM Package in R to Mplus

We conducted computational experiments comparing model estimation with our new R package GxM to that in Mplus using scripts from Van Hulle et al. (2013). As the current standard in structural equation modeling, Mplus can estimate a variety of latent variable models, including those with non-linearities. Models involving the product of latent terms—such as CholNonLin and CorrNonLin—are, however, not available in Mplus. As such, we compared fits for the CholGxM model in this investigation. With a moderate number of AGHQ nodes, we hypothesized that our model estimation results would match those from Mplus very closely.

We simulated data under two data generating mechanisms, labeled A and B:




wherein AM1 = AM2 for MZ twins, corr(AM1, AM2)=0.5 for DZ twins, and CM1 = CM2 and corr(EM1, EM2)=0 for all the twins. In each of 2000 replicates, we simulated data either from 250 MZ twin pairs and 250 DZ twin pairs (n=500), or from 1000 MZ pairs and 1000 DZ pairs (n=2000).

For each replicate, we applied both our R package GxM and Mplus to perform model estimation for CholGxM. Eight (k=8) AGHQ nodes were used in the numerical integration routine. Examining the standard deviation of the pair-wise differences, we see that the log-likelihood values from two approaches are almost identical, especially given the scale of the total log-likelihood (Table 3).

Table 3
Simulated means (standard deviations) of log-likelihood values comparing GxM in R to Mplus for 2000 replicates for two sample sizes and two parameter specifications. Pair-wise differences subtract Mplus results from GxM results

To quantify accuracies of individual parameter estimates, we rely on the root-mean-square error (RMSE). For a given scalar element ξ of parameter vector θ, and letting ξ^(s) denote the estimate of ξ in the s-th replicate from the total S=2000 replicates, we compare the average performance of GxM to Mplus via what we refer to as RMSE “Ratio 1”:

Ratio 1=Σs=1S(ξ^R(s)ξ)2Σs=1S(ξ^Mplus(s)ξ)2.

How close Ratio 1 is to unity indicates the accuracy of parameter estimation using GxM relative to that of Mplus. Results are in the first four columns of Table 4 for all four simulation conditions. The ratios for all parameters are very close to unity, strongly suggesting, together with the log-likelihood results (Table 3), that the two computational approaches are equally valid.

Table 4
Ratio 1 and 2 values (defined in text) for estimated parameters from 2000 replicates for two sample sizes (n = 500 and n = 2000) and two parameter specifications (setting A and B)

As with the likelihood values, it is also of interest to quantify the coherence of the two approaches, even as they provide similar results on average across replicates. For this purpose, we compare the RMSE of the pair-wise differences of estimates from Mplus and GxM, relative to RMSE(ξ^Mplus), the reference variability of estimates from Mplus. This yields “Ratio 2”, viz.,

Ratio 2=Σs=1S(ξ^R(s)ξ^Mplus(s))2Σs=1S(ξ^Mplus(s)ξ)2.

The smaller the value of Ratio 2, the closer the results from the two computational approaches. Results are in the last four columns of Table 4. Values corresponding to intercepts and unshared environmental influences are satisfactorily small, suggesting the differences between fitting results for these parameters are negligible compared with the simulated sampling variabilities. Ratio 2 values corresponding to genetic influences and shared environmental influences are larger, suggesting some differences between GxM and Mplus in the optimization algorithm or in the nature of the likelihood surface near the maximum for these parameters. In conclusion, with a moderate number (k=8) of AGHQ nodes for the CholGxM model, some differences exist. Nevertheless, estimation based on numerical integration and optimization using GxM appears equally valid to that in Mplus. Further, in many settings, GxM and Mplus yield highly concordant results for specific data sets.

5.2. Stability and Computational Time of Model Estimation with Respect to Number of AGHQ Nodes

To further assess the performance of our GxM procedure and to examine its performance in models CholNonLin and CorrNonLin models which are not available in standard structural equations modeling packages, we conducted an investigation of its numerical stability with respect to the number of AGHQ nodes. In this experiment, we fit all of the proposed models through maximization of the likelihood computed with numerical integration, with several choices of number k of AGHQ nodes for each data set and model. We expected that, when the number of nodes is adequate, the fitted models will be close to each other. In addition, closed-form likelihood calculation is available for the Chol, NLMainGxM, CholGxM, and CorrGxM models, so for these models, we include results obtained through maximizing the closed-form likelihood in the pool of comparisons.

We simulated 23 data sets corresponding to different scenarios. These scenarios follow the simulation specifications in Van Hulle et al. (2013), with additional scenarios based on the CholNonLin and CorrNonLin models. These scenarios are based on various combinations of factors with different biological meanings; the configurations are listed in Appendix B. For each configuration, we simulated data for 1000 MZ twin pairs and 1000 DZ twin pairs. When numerical integration is used, k=8, 10, and 15 AGHQ nodes are employed to provide estimates for the parameter vector. When k=15, computations become very time-consuming, but are feasible for a small number (e.g., 23) cases. A possible fourth estimate results from maximizing the closed-form likelihood when it is available. Hence, the likelihood and parameter vector in the CholGxM, Chol, NLMainGxM, and CorrGxM model have 4 estimates, and those in the CholNonLin and CorrNonLin models have 3 estimates. Variation among the three or four estimates is an index of the stability of our model estimation across k.

We first compare the log-likelihood values across different values of k. For each data set and fitted model, the three or four log-likelihood values yield an interval covering these values; the range of this interval provides a direct measure of the dispersion of log-likelihood values. For each model, we calculated individual ranges for all 23 data sets, and summarize them in Table 5 via the median, the 3rd largest (21st out of 23), and the largest range. Whereas the ranges for fitted CholGxM, Chol, NLMainGxM and CorrGxM models are uniformly close to zero, those for the new CholNonLin and CorrNonLin models are relatively larger. The differences, however, remain small in an absolute sense. Multiplying by two puts the differences on a comparable scale as the chi-square statistic for comparing two nested models. We see that whereas using numeric integration could occasionally yield different conclusions in a statistical hypothesis test, this is likely a rare event.

Table 5
Summaries of fitting results using median values from 23 data sets. The first two panels show the log-likelihood range values and maximum norm distances of parameter vectors from different numerical integration settings (k = 8; 10 or 15 AGHQ nodes) and ...

For each data set and fitted model, we also compare fitted parameter vectors across values of k via the maximum norm distance. For two parameter vectors a = (a1, …, ap) and b = (b1, …, bp), the maximum norm distance from a to b is defined as ||ab|| = max(|a1b1|, …,|apbp|). The pairwise combinations of model estimates provide a set of distance vectors. For instance, in fitting the CholGxM model, 4 parameter vector estimates are available, and thus, 6 difference vectors and their corresponding maximum norm values (distances) are produced; we computed the maximum of these pairwise distances, and summaries of these maxima are reported in Table 5 as well; there we report the median, the 3rd largest (21st out of 23), and the largest of the maximum pairwise distances across the 23 data generation conditions. Given the foregoing likelihood results, it is not surprising that the difference values from fitting model CholNonLin and CorrNonLin are larger than others. Yet for most cases, differences are quite modest and acceptable.

The computational time of fitting these models with k =8, 10, and 15 AGHQ nodes is shown in the bottom panel of Table 5. The median values of time cost across 23 conditions using our package GxM with a single 2.4 GHz CPU are listed. The time cost increases with k, but the increase is slower than cubic growth.

6. Discussion and Conclusion

This paper continues the work of Rathouz et al. (2008) and Van Hulle et al. (2013) by developing computational algorithms and associated code to fit alternative nonlinear behavior genetic models to twin data as competitors to a now-classic GxM model proposed by Purcell (2002). The focus here is on numerical implementation of models not available in standard structural equation modeling software. To illustrate our approach, we presented an analysis of the putative moderation of genetic influences on anxiety by birth weight in a community sample of twin children. The implementation of new non-linear latent variable models provides alternative interpretations of the underlying biological relationships. Had these models not been considered, a researcher might have concluded that GxM was an important feature in the structural model for anxiety.

We achieved our model estimation via a novel factorization of likelihood and implementation of AGHQ, and maximized the likelihood using derivative-free optimization. We showed via numerical simulation that our fitting algorithm is equally valid to that of standard software, and is stable for a moderate number integration nodes.

The problem addressed by our novel class of models is related but not equivalent to that of scaling issues leading to loss of power to, or false detection of, gene-by-environment interaction. A comprehensive review is beyond the scope of this paper, but some key references include the compelling work by (Eaves, Last, Martin, & Jinks, 1977; Eaves, 2006; Molenaar & Dolan, 2014). That work for the most part deals with unmeasured environments versus our measured M. It also deals with problems of the scale of the response variable. In this and our prior work, we raise a different problem, namely that alternative latent non-linear structural models could be misinterpreted as GxM. Should scaling problems overlay our latent structural models, additional problems could arise. This is an issue which we will address in future work.

A few comments about the numerical results and future work. First, in numerical experiments, the accuracy of our numerical algorithm was very high for models admitting a closed-form likelihood (even with numerical integration in lieu of the closed-form objective function); corresponding results for models CorrNonLin or CholNonLin were not quite as strong, although it appears acceptable. We note, however, that the data used in these studies was generated from mechanisms involving fairly strong departures from the global null Chol model. Our experience thus far is that all of the fitting algorithms perform more accurately and more efficiently when departures from the Chol model are more modest. Second, we note that the GxM package cannot currently handle missing data. For small amounts of missing data, we recommend using an imputation routine.

Finally, the statistical—as differentiated from the numerical—operating characteristics of these new model fitting algorithms are presented in a companion paper in this same issue of Behavior Genetics (Zheng et al., submitted). To briefly summarize, in that paper, Type I error analysis suggests conservative behavior for models based on the bivariate Cholesky behavior genetic model, and liberal behavior for models based on the bivariate correlated factors models; for some models comparisons, asymptotic convergence to the correct Type I error rate is very slow. Simulations of the bias in parameter recovery are very encouraging as, by and large, maximum likelihood parameters estimators exhibit very little bias. In examination of the ability of the data to discriminate among alternative models with and without GxM, we found that it can be difficult to distinguish GxM from other non-linear effects. Please see (Zheng et al., submitted) for full details; in that work, we also provide some practical comments and guidance to applied researchers based on the present work. In addition, taking a longer view, we also summarize all of our results to date on these GxM methods in a technical report on “Lessons Learned”, again with the researcher using these methods in applied work as the intended audience.2


This study was funded by the NIH grant R21 MH086099 from the National Institute for Mental Health.

Appendix A Likelihood Calculation through Numerical Integration

Adaptive Gauss-Hermite Quadrature

In the calculation of a definite integral, even when the formula for the integrand is known, it may be difficult to find an antiderivative which has a closed-form expression. In such circumstances, numerical integration methods are often applied to obtain approximate results. The Gaussian quadrature rule is one of the most widely used numerical integration techniques to approximate the integral of a function g(x) over a specified domain D with a known weighting kernel [var phi](x). If the integrand g(x) can be well approximated by a polynomial of order 2k − 1 or less, then a quadrature with k nodes suffices for a good estimate of the integral,


The nodes xi and weights wi, i = 1, …, k, are uniquely determined by the domain D and the weighting kernel [var phi](x) (Stroud & Secrest, 1966). In the case wherein the integration domain is the real line and the integration kernel is [var phi](x) = ex2, the resulting quadrature rule is known as Gauss-Hermite quadrature (GHQ).

Because of its close relationship to the normal distribution, GHQ is widely used in statistics. Adaptive GHQ (AGHQ) (Liu & Pierce, 1994; Naylor & Smith, 1982) arises by shifting and scaling the kernel for greater numerical accuracy, strategically placing the nodes xi to emphasize the areas of greatest mass in the integrand function. The advantages of AGHQ over traditional GHQ are shown in the estimation of latent models with nonlinear random effects by Pinheiro and Bates (1995) and Rabe-Hesketh, Skrondal, and Pickles (2005). In this work, we relocate the nodes according to the easily obtainable location and scale of the normal density. Specifically, if YN(m,σ2) and g is a known but complicated function, the expectation of g(Y) can be calculated approximately as


using x=(ym)2σ. Whereas this is not “adaptive” in the strictest sense of Liu and Pierce (1994), we still use AGHQ to represent this technique because of the application of the relocation of nodes.

With regard to numerical evaluation of a multiple integral, a natural way forward is to decompose it into a sequence of nested one-dimensional quadratures and to repeatedly apply (10). Taking integration over domain Rp, we could use kj points in the j-th dimension, j = 1, …, p, and obtain a multi-dimensional version of AGHQ. Specifically, if Y is a p-dimensional random vector which follows a multivariate normal distribution with mean vector m and covariance matrix Σ, the expectation of g(Y), where g(·) is now a multivariate function, obtains approximately as


where xi = (x1i1, …, xpip)T; xj1, …, xjkj are the nodes for the j-th dimension; and the product wi1wip is the corresponding weight for node x(i).

AGHQ in Likelihood Calculation

In the application of AGHQ to approximation of likelihood f(P|M; θ), we incorporate distribution functions from specific models into the integration. We denote y = (AM, CM)T to simplify the notation. Because f(AM, CM|M) is a multivariate normal density function, we set m = E(y|M; θM) and Σ = Cov(y|M; θM), so that the function specified by f(P|y, M) = f(P|Am, CM, M) plays the role of g(y) in (11). Therefore, we have


where conditional distribution function f(P|x(i), M; θ) is computable for all proposed models from Rathouz et al. (2008).

Appendix B Argument Options in R package GxM

Model Option

We consider both bivariate Cholesky models and bivariate correlated factors models, including Chol, CholGxM, NLMainGxM, CorrGxM, CholNonLin and CorrNonLin. The routines for fitting these models are provided in our R package, GxM. For models that do not admit a closed-form likelihood, we apply numerical integration techniques; for models that have closed-form likelihood, both fitting with closed formula and numerical techniques are provided. All models exploit derivative-free optimization.

Zero Set Option

This option provides for constraining some parameters to zero, greatly expanding the number of nested sub-models that are available, and allowing testing of specific parameters via likelihood ratio tests or by comparing BIC values. As explained in the Model section, GxM can be detected by testing statistical hypothesis under which certain parameters are zero. We supply an option named “zeroset” to enable users to fit models with chosen parameter(s) constrained to zero.

Initialization & Priority Option

For optimization problem with high dimensional parameters and non-concave surfaces, it is important to have reasonable and multiple starting points. By setting the non-linear latent terms to zero, all of our proposed models except Model (4) reduce to a common trivial model, and direct parameter estimation such as a method of moment estimator can be applied. This set of estimates serves as a desirable starting point. For Model (4), we use polynomial regression technique to eliminate the main effect of M on P. After replacing the original P with regression residuals, the modified model can also be viewed as a case of the common trivial model. For non-linear models, we further add an intermediate update using a small number (k=3) of AGHQ nodes. Lastly, we provide for the option of leaving the initialization to potential users. With priority level equal to 1, the user-specified initialization would be updated in the intermediate stage. By increasing priority level from 1 to 2, the manually specified initialization would ignore the intermediate update.

AGHQ Nodes Number Option

We provide this option to allow a tradeoff between accuracy and computational intensity. As one may expect, a larger number of AGHQ nodes produces more accurate likelihood values. On the other hand, because the integration is 3-dimensional, the computation cost increases in a cubic manner.

Parallel Computing Option

As an interpreted language, the performance of R in terms of computational speed is not as satisfactory as that for compiled languages. This issue is of concern when using computationally intensive numerical integration and derivative-free optimization techniques. Therefore, we embed parallel processing technique in response to the challenge.

Parallel computing with R is directly supported beginning with release 2.14.0. The package parallel provides convenient functions to perform parallel computing in both explicit and implicit modes. For instance, in the calculation of log-likelihood for GxM models, because of the summation over individual observations as l(θ) = log L(θ) = Σi log f(Mi, Pi; θ), the global log-likelihood computation can be performed in a parallel manner. Users are provided the option to use parallel computation, and if so the number of CPU cores to allocate the computations.

Appendix C Configurations for 23 Scenarios

The configurations of simulation settings for 23 scenarios in numerical analysis is shown in Table 6.

Table 6
Configurations of model settings and data generation for 23 scenarios in numerical analysis. In applicable assignment, we set aM=eM=0.45, cM=0.1, cU=0.2, cP=0.1 and cC = κC = κU = κP = 0.01

Contributor Information

Hao Zheng, Department of Statistics, University of Wisconsin-Madison.

Paul J. Rathouz, Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health.


  • Bates D, Mullen KM, Nash JC, Varadhan R. minqa: Derivative-free optimization algorithms by quadratic approximation [Computer software manual] 2012 Retrieved from
  • Bennett A. Gene environment interplay: Nonhuman primate models in the study of resilience and vulnerability. Developmental Pychobiology. 2008;50(1):48–59. [PubMed]
  • du Toit SH, Cudeck R. Estimation of the nonlinear random coefficient model when some random effects are separable. Psychometrika. 2009;74(1):65–82.
  • Eaves L. Genotype× environment interaction in psychopathology: fact or artifact? Twin Research and Human Genetics. 2006;9(01):1–8. [PubMed]
  • Eaves L, Last K, Martin N, Jinks J. A progressive approach to non-additivity and genotype-environmental covariance in the analysis of human differences. British Journal of Mathematical and Statistical Psychology. 1977;30(1):1–42.
  • Eaves L, Silberg J, Erkanli A. Resolving multiple epigenetic pathways to adolescent depression. Journal of Child Psychology and Psychiatry. 2003;44(7):1006–1014. [PubMed]
  • Jinks JL, Fulker DW. Comparison of the biometrical genetical, mava, and classical approaches to the analysis of the human behavior. Psychological Bulletin. 1970;73(5):311–349. [PubMed]
  • Johnson W. Genetic and environmental influences on behavior: Capturing all the interplay. Psychological Review. 2007;114(2):423–440. [PubMed]
  • Klein A, Moosbrugger H. Maximum likelihood estimation of latent interaction effects with the lms method. Psychometrika. 2000;65(4):457–474.
  • Lahey B, Applegate B, Waldman I, Loft J, Hankin B, Rick J. The structure of child and adolescent psychopathology: generating new hypotheses. Journal of Abnormal Psychology. 2004;113(3):358–385. [PubMed]
  • Lahey B, Waldman I. A developmental propensity model of the origins of conduct problems during childhood and adolescence. Causes of Conduct Disorder and Juvenile Delinquency. 2003:76–117.
  • Liu Q, Pierce DA. A note on gaussąlhermite quadrature. Biometrika. 1994;81(3):624–629.
  • Loehlin J. The cholesky approach: A cautionary note. Behavior Genetics. 1996;26(1):65–69.
  • Molenaar D, Dolan CV. Testing systematic genotype by environment interactions using item level data. Behavior genetics. 2014;44(3):212–231. [PubMed]
  • Muthén L, Muthén B. Mplus. User’s Guide. 1998-2012
  • Naylor JC, Smith AF. Applications of a method for the efficient computation of posterior distributions. Applied Statistics. 1982;31(3):214–225.
  • Neale M, Cardon L. Methodology for genetic studies of twins and families. 67. Springer; 1992.
  • Pinheiro JC, Bates DM. Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics. 1995;4(1):12–35.
  • Powell MJ. The bobyqa algorithm for bound constrained optimization without derivatives. Technical report. Department of Applied Mathematics and Theoretical Physics, University of Cambridge; 2009.
  • Purcell S. Variance components models for geneenvironment interaction in twin analysis. Twin Research. 2002;5(6):554–571. [PubMed]
  • R Core Team R: A language and environment for statistical computing [Computer software manual] 2013 Retrieved from
  • Rabe-Hesketh S, Skrondal A, Pickles A. Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics. 2005;128(2):301–323.
  • Raftery AE. Bayesian model selection in social research. Sociological methodology. 1995;25:111–164.
  • Rathouz P, Van Hulle C, Rodgers J, Waldman I, Lahey B. Specification, testing, and interpretation of gene-by-measured-environment interaction models in the presence of gene-environment correlation. Behavior Genetics. 2008;38(3):301–315. [PMC free article] [PubMed]
  • Rutter M, Moffitt T, Caspi A. Gene–environment interplay and psychopathology: multiple varieties but real effects. Journal of Child Psychology and Psychiatry. 2006;47(3-4):226–261. [PubMed]
  • Schwarz G. Estimating the dimension of a model. The Annals of Statistics. 1978;6(2):461–464.
  • Stroud AH, Secrest D. Gaussian quadrature formulas. Vol. 374. Prentice-Hall; Englewood Cliffs, NJ: 1966.
  • Van Hulle C, Lahey B, Rathouz P. Operating characteristics of alternative statistical methods for detecting gene-by-measured environment interaction in the presence of gene–environment correlation in twin and sibling studies. Behavior Genetics. 2013;43(1):71–84. [PMC free article] [PubMed]
  • Weakliem DL. A critique of the bayesian information criterion for model selection. Sociological Methods & Research. 1999;27(3):359–397.
  • Weaver I, Cervoni N, Champagne F, D’Alessio A, Sharma S, Seckl J, Meaney M. Epigenetic programming by maternal behavior. Nature Neuroscience. 2004;7(8):847–854. [PubMed]
  • Zheng H, Rathouz P. GxM: Maximum likelihood estimation for gene-by-measured environment interaction models [Computer software manual] 2013 Retrieved from
  • Zheng H, Van Hulle C, Rathouz PJ. Comparing alternative biometric models with and without gene-by-measured environment interaction in behavior genetic designs: Statistical operating characteristics. Behavior Genetics. submitted. submitted. [PMC free article] [PubMed]