


Behav Genet. Author manuscript; available in PMC 2011 December 1.

Published in final edited form as:

Published online 2009 December 16. doi: 10.1007/s10519-009-9320-x

PMCID: PMC3228846

NIHMSID: NIHMS338515

Correspondence concerning this article should be addressed to Matthew C. Keller, Department of Psychology, Muenzinger Hall, 345 UCB, Boulder, CO, 80309. Email: matthew.c.keller@gmail.com. Webpage: www.matthewckeller.com

The publisher's final edited version of this article is available at Behav Genet


The classical twin design (CTD) uses observed covariances from monozygotic and dizygotic twin pairs to infer the relative magnitudes of genetic and environmental causes of phenotypic variation. Despite its wide use, it is well known that the CTD can produce biased estimates if its stringent assumptions are not met. By modeling observed covariances of twins’ relatives in addition to twins themselves, extended twin family designs (ETFDs) require less stringent assumptions, can estimate many more parameters of interest, and should produce less biased estimates than the CTD. However, ETFDs are more complicated to use and interpret, and by attempting to estimate a large number of parameters, the precision of parameter estimates may suffer. This paper is a formal investigation into a simple question: Is it worthwhile to use more complex models such as ETFDs in behavioral genetics? In particular, we compare the bias, precision, and accuracy of estimates from the CTD and three increasingly complex ETFDs. We find the CTD does a decent job of estimating broad sense heritability, but CTD estimates of shared environmental effects and the relative importance of additive versus non-additive genetic variance can be biased, sometimes wildly so. Increasingly complex ETFDs, on the other hand, are more accurate and less sensitive to assumptions than simpler models. We conclude that researchers interested in characterizing the environment or the makeup of genetic variation should use ETFDs when possible.

The observed covariances of twins, adoptees, and their family members are often used to understand the relative importance of genetic and environmental causes of phenotypic variation. The most commonly used genetically informative design is the Classical Twin Design (CTD), which compares the monozygotic (MZ) twin covariance to the dizygotic (DZ) twin covariance to estimate the variation in a trait due to unique environmental effects (*V _{E}*) as well as any two of the three variance components—additive genetic (

There are several appeals to the CTD. For example, MZ and DZ twins serve as natural controls for one another, their data is relatively simple to collect, and shared environmental effects are not confounded with genetic effects, as they are in non-twin familial studies (Martin, Boomsma, & Machin, 1997). Nevertheless, it has long been understood that the CTD suffers from several important limitations (Eaves, Last, Young, & Martin, 1978). For one, $\widehat{V}_{A}$,

For these and other reasons, in the 1970s researchers began exploring extended twin family designs (ETFDs), which require less stringent assumptions and produce less biased estimates than the CTD (Fulker, 1982). These alternative designs use data on parents of twins (Eaves et al., 1978; Neale & Fulker, 1984) and offspring of twins (Nance & Corey, 1976) to better reveal genetic non-additivity and the role of parental environmental effects, and use parents of twins and spouses of twins (Eaves, 1979) to model the effects of assortative mating. Cloninger, Rice, and Reich (1979) first described how to use all of these relative types together in a single model. Their model is the forerunner of the three ETFDs described in this paper: the Nuclear Twin Family Design (NTFD) (Heath et al., 1985), the *Stealth* design (Truett et al., 1994), and the *Cascade* design (Keller et al., 2009). For a more thorough history of twin and family designs, see Eaves (2009).

ETFDs address the limitations of the CTD described above. Compared to the CTD, ETFDs allow for finer grained descriptions of the causes of phenotypic variation, they produce less biased parameter estimates, and more information (increasing statistical power) is gained per additional subject in ETFDs (Posthuma & Boomsma, 2000). Yet the reduction in bias and the more detailed information associated with ETFDs come at the cost of greatly increased complexity. This complexity is a major problem when instantiating the model in code. For example, such scripts written in Mx (Neale, 1999) can stretch for 50 pages or more, making human errors a virtual certainty regardless of how vigilant the error checking is. We note, however, that a new version of Mx, OpenMx (http://openmx.psyc.virginia.edu/), will be available as a package for the R statistical language in early 2010, and changes in the OpenMx syntax should significantly simplify ETFD code. Nevertheless, the complexity of ETFDs may also obscure logical errors at the heart of the designs; certain expectations may simply have been wrong at the modeling stage. Furthermore, as with all models, ETFDs must make assumptions in order to be identified, and it is possible that they perform as badly as or worse than simpler models when these assumptions are violated. Finally, the complexity of ETFD models and the number of parameters they attempt to estimate may lead to an unacceptable level of imprecision in estimates caused by the high covariation between the large numbers of estimated parameters (multicollinearity problems). For these reasons, some researchers in behavioral genetics remain skeptical of the value of ETFDs and favor the use of simpler, time-tested models such as the CTD, which are easy to use and interpret and require less data collection.

The goal of this paper is to explore these trade-offs. In particular, we use simulations to gauge the bias, precision, and accuracy of parameters estimated using the CTD and three ETFDs in order to understand whether they work as intended, under what circumstances their estimates are biased, whether the increase in information in ETFDs comes at an unacceptable cost in precision, and how violations of assumptions affect parameter estimates. In addition to identifying the central tendency of the parameter estimates, we also explore their spread, covariation, and distributional shapes. Such results can help researchers interpret CTD and ETFD findings with proper circumspection. In summary, this paper is a formal investigation into a simple question: Is it worthwhile to use more complex models such as ETFDs in behavioral genetics?

We seek four properties (the bias, precision, accuracy, and distributions) of parameter estimates derived from the CTD, NTFD, *Stealth*, and *Cascade* designs. Of course, the parameter bias, precision, accuracy, and distributions for a given design change depending on the scenario, so we need to measure these properties under several scenarios that might occur in nature. A given scenario, for example, might simulate specified levels of additive genetic, dominant genetic, and common environmental effects on some hypothetical trait. These scenarios should also violate assumptions of the four designs to check their sensitivities to assumptions. To accomplish these goals, the first author created a program, *GeneEvolve*, that simulates twin family data. The user supplies input for various parameters (e.g., the amount of variation in a phenotype due to various types of genetic and environmental effects) to simulate different scenarios. We obtained simulated twin family data from *GeneEvolve* under several different scenarios that might occur in real life and ran Mx models from the four designs above on this data. We then compared the estimated variance parameters (denoted by $\widehat{V}_{\bullet}$) derived from Mx to the true variance parameters (denoted by *V*_{•}) simulated using *GeneEvolve*. We iterated this process 500 times for each of 10 different scenarios. In total, 20,000 Mx models were fit (500 iterations × 4 models per iteration × 10 scenarios), taking a total of ~13,000 hours of CPU time.
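The bookkeeping behind these totals can be sketched in a few lines of Python. This is only an illustration of the study's factorial layout, not the authors' pipeline; the variable names are hypothetical, and the per-fit time is a crude average over all four designs, which almost certainly differ widely in run time.

```python
from itertools import product

# Layout described in the text: 10 scenarios x 500 iterations x 4 designs.
N_SCENARIOS = 10
N_ITERATIONS = 500
DESIGNS = ("CTD", "NTFD", "Stealth", "Cascade")
TOTAL_CPU_HOURS = 13_000  # approximate figure reported in the text

# Enumerate every (scenario, iteration, design) combination that was fit.
fits = list(product(range(N_SCENARIOS), range(N_ITERATIONS), DESIGNS))
total_fits = len(fits)

# Crude average across all designs (an assumption: no per-design timing is given).
mean_minutes_per_fit = TOTAL_CPU_HOURS * 60 / total_fits

print(f"{total_fits} model fits, about {mean_minutes_per_fit:.0f} CPU minutes each on average")
```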

Table 1 gives the interpretations of the variance parameters discussed in this paper as well as which designs can estimate which variance parameters. For a description of the CTD, see Plomin et al. (2001), and for a more detailed description and explanation of these three ETFDs, including algebraic expectations, see Keller et al. (2009).

The NTFD (Figure 1) uses data on MZ twins, DZ twins, and their parents. These three relative classes provide four pieces of information from which parameters are estimated: the covariance between MZ twins, *C*(*MZ*, *MZ*), the covariance between DZ twins, *C*(*DZ*, *DZ*), the covariance between parents, *C*(*spouse*), and the covariance between parents and children, *C*(*Par*, *Child*). This additional information allows the NTFD to estimate $\widehat{V}_{A}$,

By using data from MZ and DZ twins and their siblings, parents, offspring, and spouses, 88 sex-specific relative covariances can be estimated. Many of these 88 relative classes are identical except for sex-specific pathways. For example, nephew-aunt covariances between sons of DZ females and their female DZ co-twins are differentiated from nephew-aunt covariances between sons of DZ males and their female DZ co-twins. The *Stealth* uses these 88 covariance observations to simultaneously estimate sex-specific $\widehat{V}_{A}$,

Like the *Stealth*, the *Cascade* uses information on twins and their siblings, parents, spouses, and children to model all of the variance components modeled by the *Stealth*. However, a limitation of the *Stealth* is that it models only one type of mating (primary phenotypic mating) and only one type of vertical transmission (from parental phenotype to offspring *F*). The purpose of the *Cascade* is to provide a general framework for relaxing the assumptions regarding mate choice and vertical transmission made by the *Stealth*. This is done through the use of latent phenotypes upon which spouses mate or upon which parents influence their children. To keep the number of model comparisons manageable, we focus here on the mating aspect of the *Cascade* rather than the vertical transmission aspects of it. The only difference between Figure 2 (the *Stealth* model) and Figure 3 (the *Cascade* model) is the addition of the latent phenotype (*P̃*) upon which mates assort. Depending on the type of mating or vertical transmission model being used, the path coefficients to *P̃* are set either to be equal to the path coefficients to *P* or to be equal to zero. For example, to model social homogamy, all genetic path coefficients to *P̃* are set to zero (*ã* = 0 and *d̃* = 0) and all environmental path coefficients to *P̃* are constrained to be equal to the values of the corresponding path coefficients to *P* (*f̃* = *f*, *s̃* = *s*, *t̃* = *t*, and *ẽ* = *e*). To understand whether social homogamy or primary phenotypic mating best fits the data, the fit of this model can be compared to a model of primary phenotypic assortment, in which *ã* = *a*, *d̃* = *d*, *f̃* = *f*, *s̃* = *s*, *t̃* = *t*, and *ẽ* = *e*.
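The constraint pattern described above can be written down compactly. The following Python sketch is only an illustration: the function name, mode labels, and numeric path values are hypothetical, and it is not the Mx specification used by the authors. The `genetic_homogamy` branch (genetic paths kept, environmental paths zeroed) is an assumption made by symmetry with the social homogamy case; the text above spells out only the social homogamy and primary phenotypic constraints.

```python
# Path labels follow the text: a, d are genetic; f, s, t, e are environmental.
GENETIC_PATHS = ("a", "d")
ENVIRONMENTAL_PATHS = ("f", "s", "t", "e")


def latent_paths(paths_to_p, mode):
    """Return path coefficients to the latent mating phenotype.

    Each latent path is either equal to the corresponding path to P
    or fixed to zero, depending on the assumed mode of assortment.
    """
    if mode == "primary_phenotypic":
        kept = GENETIC_PATHS + ENVIRONMENTAL_PATHS
    elif mode == "social_homogamy":
        kept = ENVIRONMENTAL_PATHS
    elif mode == "genetic_homogamy":  # assumed by symmetry, not stated in the text
        kept = GENETIC_PATHS
    else:
        raise ValueError(f"unknown mating mode: {mode!r}")
    return {name: (value if name in kept else 0.0)
            for name, value in paths_to_p.items()}


# Hypothetical path coefficients to the manifest phenotype P.
paths_to_p = {"a": 0.6, "d": 0.3, "f": 0.2, "s": 0.2, "t": 0.1, "e": 0.5}

for mode in ("primary_phenotypic", "social_homogamy", "genetic_homogamy"):
    print(mode, latent_paths(paths_to_p, mode))
```

Comparing the fit of models constrained in these different ways is what allows the mode of assortment to be tested.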

*GeneEvolve* (Keller, 2007) is an open source program written in the R programming language (R Development Core Team, 2009) and available at www.matthewckeller.com. *GeneEvolve* accurately simulates genetically informative data as well as complex dynamics in evolutionary genetics. With complicated scenarios, it is difficult or impossible to find expected equilibrium parameter values analytically (e.g., the equilibrium additive-by-additive epistatic genetic variation in a population mating assortatively). Doing so through simulation, however, is straightforward. Given user input, *GeneEvolve* simulates the effects of alleles and environments on individuals’ traits in a population, and allows this population to evolve (meet, mate, and have offspring, who meet, mate, and have offspring, etc…) for many generations, until parameters reach equilibrium. Currently, *GeneEvolve* allows user input of 48 different parameters, including 21 variance and covariance parameters, 3 different types of assortative mating, and 3 different types of vertical transmission.

*GeneEvolve* has an option to create twin and twin relative phenotypes during the final generation of the simulation. We used this option to write out the phenotypic scores of twins and their siblings, spouses, parents, and offspring to flat files (one row per family), which were then used as input into Mx (see below). Each flat file contained a total of ~15,000 families (6,500 MZ families and 8,500 DZ families). Although there were a total of 18 potential relative types in each family (two twins, two parents, four siblings, one spouse of twin 1, one spouse of twin 2, four children of twin 1, and four children of twin 2), families had an average of about five non-missing phenotypic scores and each flat file contained a total of ~70,000 individuals. These numbers were chosen to reflect the sample sizes and missingness patterns in the combined Australia and Virginia extended twin databases (see Medland & Keller, 2009), which together form the largest extended twin family dataset in existence. Missingness in extended twin datasets arises through difficulties in ascertainment as well as variation in age of death and number of children within families. Sample sizes of this magnitude are necessary for making fine-grained distinctions between parameters, especially with respect to sex-specific pathways (Heath et al., 1985; Medland & Keller, 2009), although more modest datasets are adequate for differentiating models that do not require sex-differentiated pathways.
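As a rough illustration of this one-row-per-family layout, the Python sketch below enumerates the 18 relative slots and builds a single row with missing values. The column names and the example scores are hypothetical; the actual *GeneEvolve* file format is not described here.

```python
# Relative slots per family, as listed in the text (18 in total).
RELATIVE_SLOTS = {
    "twin": 2,
    "parent": 2,
    "sibling": 4,
    "spouse_of_twin1": 1,
    "spouse_of_twin2": 1,
    "child_of_twin1": 4,
    "child_of_twin2": 4,
}

# Hypothetical column names, one per potential relative.
COLUMNS = [f"{role}_{i}"
           for role, count in RELATIVE_SLOTS.items()
           for i in range(1, count + 1)]


def family_row(observed):
    """Build one flat-file row; relatives without a score are None (missing)."""
    unknown = set(observed) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown relative slots: {sorted(unknown)}")
    return [observed.get(column) for column in COLUMNS]


# A made-up family with five non-missing phenotypic scores.
row = family_row({
    "twin_1": 0.31, "twin_2": -0.12, "parent_1": 1.05,
    "sibling_1": 0.40, "child_of_twin1_1": -0.70,
})

n_families = 6_500 + 8_500                      # MZ + DZ families per file
mean_scores_per_family = 70_000 / n_families    # "about five" per the text

print(len(COLUMNS), n_families, round(mean_scores_per_family, 2))
```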

Table 2 shows how each of the ten scenarios examined in this project was defined. *V _{E}* was set to .3 for each scenario, and all other variance parameters not shown in Table 2 were set to zero. The variance components inherited by offspring—

Simulated variance parameters associated with 10 different scenarios. Numbers in parentheses are variance parameters at the first generation, which may change by the final (here 20^{th}) generation if vertical transmission or assortative mating occurs (see **...**

We simulated three different modes of assortative mating (see rows 5–8, Table 2). Phenotypic homogamy (also called “primary phenotypic assortment”) occurs when ‘like mates with like’ based on the manifest phenotype. For example, if tall people choose other tall people because they are tall, this would be classified as phenotypic homogamy. This is the most commonly modeled type of assortative mating in the behavioral genetics and evolutionary genetics literatures.

Social homogamy refers to mate similarity arising from similar environmental backgrounds. For example, if people marry within religions and choice of religion is not heritable, then any similarity between spouses due to religion (e.g., similar views on abortion) would be due to social homogamy rather than primary phenotypic assortment.

A third possibility, genetic homogamy, occurs if mates choose each other based on the heritable aspect of their phenotypes rather than on their manifest phenotypes (Fisher, 1918; Thiessen & Gregg, 1980). Although seemingly implausible, there are two ways this might occur. The first is if people attempt to control for the effects of the environment when making mate choices (e.g., “He/she is really smart given the environment they come from”). The second is if people base mate choice on some third variable (e.g., overall mate value) that is related to the phenotype of interest purely genetically. This would be an extreme form of ‘good genes’ theories of human mate choice (Miller & Todd, 1998). Consider, for example, assortative mating for intelligence. If people choose mates solely based on mate value (e.g., the first principal component of traits such as health, athleticism, height, facial attractiveness, bodily attractiveness, intelligence, and so forth), and if the inter-relationship between these mate value components is genetic in nature, then similarity between spouses on intelligence would be due to genetic homogamy. Our point is not to argue that genetic homogamy is or is not a likely mode of mate similarity, but rather to note that it is a viable option that should be tested empirically. Of the four twin-family designs discussed here, only the *Cascade* can model genetic and social homogamy.

We also simulated two scenarios that include parameters that could not be estimated in any model (rows 9–10, Table 2). These two scenarios allowed us to test the sensitivity to assumptions for all designs, including the *Stealth* and *Cascade*.

The authors wrote Mx scripts for the CTD (137 lines of code), the NTFD (189 lines of code), and the *Cascade* design (2717 lines of code); the script for the *Stealth* design (2780 lines of code) was written by H. Maes (Maes et al., 2009). These scripts are available at http://www.matthewckeller.com/html/cascade.html. An advantage of the *Stealth* script, not yet instantiated in the *Cascade* script, is that it is set up to fit multivariate data. The advantage of the *Cascade* design, and its original purpose, is the additional flexibility in modeling assortative mating and vertical transmission.

For each simulated dataset run using the NTFD, *Stealth*, and *Cascade* scripts, both a full and a reduced model were fit (no reduced models were necessary for the CTD). The full NTFD model estimated $\widehat{V}_{A}$,

We compared the parameters estimated from Mx for each design to the true parameters from *GeneEvolve* for each simulation run. This allowed us to empirically determine the bias, precision, and accuracy of the parameter estimates, as well as their distributional shapes and covariances (Casella & Berger, 1990). The *bias* of a statistic is generally defined as *E*($\widehat{V}_{\bullet}$ − *V*_{•}), the expected (i.e., mean) difference between the estimated parameter, $\widehat{V}_{\bullet}$, and the true parameter, *V*_{•}. An alternative is to use the median difference rather than the expected difference, *M*($\widehat{V}_{\bullet}$ − *V*_{•}), which is less influenced by outlier estimates. We chose this latter measure of bias because several outlier $\widehat{V}_{\bullet}$ values in our data are probably artifactual due to the automated way the models were run. Although we discarded estimates from models that gave a “Code Red” (IFAIL=6) in Mx, which occurs when constraints cannot be satisfied and is symptomatic of poorly performing estimation, inspection of Mx output led us to conclude that occasionally (~2–8% of the time, depending on the scenario), Mx poorly recreated the expected covariance matrix and gave bad estimates even when no “Code Red” occurred. Such estimates are artifactual in the present context because they likely could have been averted in most real life modeling contexts by providing different start values, dropping parameters, or taking other remedial measures to improve the fit.

The *precision* of estimates measures the spread of the estimates around their center, and is typically measured by the standard deviation or variance of the parameter estimates, e.g., $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\widehat{V}_{\bullet i}-E(\widehat{V}_{\bullet})\right)^{2}}$. An alternative, which we use for the same reason mentioned above (namely, that we wish to downweight outliers that are likely to be artifactual), is the median absolute deviation, or MAD, which is equal to *M*(|$\widehat{V}_{\bullet i}$ −

The *accuracy* of a statistic combines information on both bias and precision to gauge how far away from the true value an estimate typically is. Thus, an estimate can be precise but nevertheless inaccurate if it is biased, or can be unbiased but inaccurate if it is imprecise. As with precision, accuracy is often measured using the variance or standard deviation, except that estimates are judged by how far away they are from the *value of the true parameter* rather than the values of the mean estimates, e.g., $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\widehat{V}_{\bullet i}-V_{\bullet}\right)^{2}}$. In this situation, *accuracy*^{2} = *bias*^{2} + *precision*^{2} using the first of each of the definitions above. In the present study, we use the median absolute error, *M*(|$\widehat{V}_{\bullet i}$ −
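The three median-based summaries can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and the example estimates are hypothetical, and the MAD is assumed to take its standard form, the median absolute deviation of the estimates from their own median.

```python
from statistics import mean, median


def median_bias(estimates, true_value):
    """Median difference between the estimates and the true parameter."""
    return median([est - true_value for est in estimates])


def median_absolute_deviation(estimates):
    """Precision: median absolute deviation from the median estimate (assumed form)."""
    center = median(estimates)
    return median([abs(est - center) for est in estimates])


def median_absolute_error(estimates, true_value):
    """Accuracy: median absolute distance of the estimates from the true parameter."""
    return median([abs(est - true_value) for est in estimates])


# Made-up estimates of a variance component whose true value is 0.40;
# the last value mimics an artifactual outlier from a poorly converged fit.
true_v = 0.40
estimates = [0.38, 0.41, 0.40, 0.43, 0.95]

print("mean bias   :", round(mean(estimates) - true_v, 3))
print("median bias :", round(median_bias(estimates, true_v), 3))
print("MAD         :", round(median_absolute_deviation(estimates), 3))
print("median |err|:", round(median_absolute_error(estimates, true_v), 3))
```

In this toy example the single outlier pulls the mean-based bias far above the median-based one, which is the motivation for the median-based measures.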

The distributions of four of the parameter estimates for each of the ten scenarios described in Table 2 are shown in Figures 4–13. These figures do not show $\widehat{V}_{T}$,

Results for the ADE and ASE scenarios, which did not violate assumptions in any of the four designs, are shown in Figures 4 and 5. A few things should be noted. First, when assumptions of the CTD are not violated (i.e., *V _{C}* = 0 in the ADE scenario and

Figures 6, 7, and 8 show results for three scenarios in which CTD assumptions are violated because both shared environmental and non-additive genetic effects influence a trait simultaneously and, in the final scenario, because assortative mating exists. However, these scenarios do not violate assumptions for any ETFD. The CTD estimates are highly biased in the expected directions (Grayson, 1989; Keller & Coventry, 2005), with additive genetic effects being overestimated by about 50% in these examples and non-additive genetic effects ignored because, for reasons of identifiability, they could not be estimated. Shared environmental effects are underestimated by the CTD in the ADSE scenario, but are overestimated in the ADFE and ADFE & Primary Assortative Mating scenarios. This overestimation is also predictable, and occurs because of the substantial *CV*(*A*, *F*) that is induced by vertical transmission, which mimics shared environment in the CTD (Eaves, Eysenck, & Martin, 1989). As expected, the reduced ETFD models do not show bias whereas the full ETFD models show slight biases for the same reason discussed above. The *Stealth* and *Cascade* estimates are quite accurate in these scenarios, typically being within .05 points of the true parameters. NTFD estimates are less accurate when both $\widehat{V}_{A}$ and
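The direction of the CTD bias when shared environmental and dominance effects coexist can be reproduced with textbook twin expectations. The Python sketch below is a simplified illustration under stated assumptions: random mating, no epistasis, no vertical transmission, and simple moment estimators rather than the maximum-likelihood fitting done in Mx. The function names and variance values are hypothetical and are not the Table 2 scenarios.

```python
def twin_covariances(v_a, v_d, v_c):
    """Expected MZ and DZ covariances under the simplifying assumptions above."""
    c_mz = v_a + v_d + v_c
    c_dz = 0.5 * v_a + 0.25 * v_d + v_c
    return c_mz, c_dz


def ctd_estimates(c_mz, c_dz, v_p):
    """Moment estimates from the CTD, which can fit only ACE or ADE, never both C and D."""
    v_e = v_p - c_mz
    if c_mz <= 2 * c_dz:
        # ACE model: dominance is fixed to zero.
        return {"A": 2 * (c_mz - c_dz), "C": 2 * c_dz - c_mz, "D": 0.0, "E": v_e}
    # ADE model: shared environment is fixed to zero.
    return {"A": 4 * c_dz - c_mz, "C": 0.0, "D": 2 * c_mz - 4 * c_dz, "E": v_e}


# Hypothetical trait with both dominance and shared environmental variance.
true = {"A": 0.40, "D": 0.15, "C": 0.15, "E": 0.30}
c_mz, c_dz = twin_covariances(true["A"], true["D"], true["C"])
est = ctd_estimates(c_mz, c_dz, v_p=sum(true.values()))

for name in ("A", "D", "C", "E"):
    print(f"V_{name}: true = {true[name]:.3f}, CTD estimate = {est[name]:.3f}")
```

With these made-up values, the additive genetic estimate comes out well above its true value, the shared environmental estimate is about half its true value, and dominance is set to zero, which is the same qualitative pattern as the CTD bias described above.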

Figure 9 shows results for a complicated scenario in which *V _{A}*,

Figures 10 and 11 show results for scenarios identical to scenario 5 except that spousal similarity is due to social homogamy (Figure 10) or genetic homogamy (Figure 11). Thus, these two scenarios violate assumptions for every design except for the *Cascade*, and as expected, all designs other than the *Cascade* produce estimates that are biased to varying degrees. In particular, if spousal similarity is due to social homogamy rather than primary phenotypic assortment, the *Stealth* overestimates *V _{D}* and

Figures 12 and 13 show results for scenarios in which assumptions were violated in every design. When genetic non-additivity is due to additive-by-additive epistasis rather than dominance (Figure 12), ETFD models tend to overestimate *V _{A}* and slightly underestimate

Non-scalar gene-by-age interactions (Figure 13) can be conceptualized as different genes ‘turning on’ at different ages, and as opposed to scalar gene-by-age interactions, tend to decrease genetic covariation between relatives as a function of the age difference between them. Because siblings and twins tend to be close in age to one another, it is sensible that non-scalar gene-by-age interactions lead to overestimation of *V _{T}* (not shown) and

The information required to estimate parameters is often partially redundant. For example, both *V _{A}* and

A linear regression model predicting $\widehat{V}_{A}$ in the *Cascade* from

It is useful to have a sense of how observed covariance estimates translate into estimated parameters. In the CTD, it is obvious that the difference between *C*(*MZ*, *MZ*) and *C*(*DZ*, *DZ*) provides all the information needed to estimate *V _{A}* and

Despite these difficulties, Table 3 provides some insight into how observed covariances are used to estimate parameters in the *Cascade* and *Stealth* models. The table is not exhaustive; for certain parameters (especially $\widehat{V}_{A}$ and

Our results show that ETFDs work as designed. They are generally unbiased when assumptions are met, and unlike the CTD, they are not overly sensitive to violations of assumptions so long as $\widehat{V}_{D}$ is interpreted broadly, as an estimate of genetic non-additivity in general (including gene-by-age interaction effects) rather than as dominance in particular. Our results also highlight that the key trade-off in using ETFDs is one of complexity versus accuracy. Because ETFDs attempt to estimate a large number of parameters, many of which use overlapping information, the precision of their estimates suffers (see the full ETFD model estimates in Figures 4–13 and parameter covariances in Figure 14). The ETFD estimates in Figure 8, for example, are much less precise than those from the CTD. Nevertheless, ETFD estimates tend to be unbiased under a much wider range of scenarios than CTD estimates, and because of this, are almost universally more accurate than are CTD estimates. This improved accuracy can be quantified by empirical researchers using ETFDs by comparing a goodness of fit index of an ETFD only estimating a few parameters (e.g.,

The trend of increasing accuracy with increasing complexity repeats itself within the ETFD models: *Stealth* estimates are accurate across a wider range of scenarios than are NTFD estimates (Figure 6), and *Cascade* estimates are accurate across a wider range of scenarios than are *Stealth* estimates (Figures 10–11). For example, the mean accuracy values (lower being more accurate) across the ten scenarios for $\widehat{V}_{A}$ were .140 for the CTD, .069 for the NTFD, .049 for the

Nevertheless, the question remains: given the increased difficulty in fitting the models and collecting the requisite data, is it worth it to use ETFDs? Our results cannot provide an answer to this question, but they do provide guidance. For all the problems associated with the CTD, the combined CTD parameters of $\widehat{V}_{A}$ +

For researchers who already have the data needed to fit the *Stealth* or *Cascade* models, our results suggest the *Cascade* model should be used over existing ETFD models. However, an argument could be made from our results that the NTFD represents a good compromise between the accuracy of the *Cascade* and the simplicity of the CTD. NTFD estimates tended to be less precise and slightly more biased than *Cascade* estimates, but these differences were minor compared to the difference between the ETFD estimates as a group and the CTD estimates. Of course, the major limitation of the NTFD is that the source of shared environmental effects (due to sibling effects or vertical transmission from parents) cannot be discerned, and when both shared environmental sources are present, estimated parameters will be biased. In a separate piece (Medland & Keller, 2009), we discuss which relative types provide the most power for detecting different parameters in the *Cascade*, which should be of service to investigators interested in collecting new data for any ETFD (see also Heath et al., 1985).

Hill, Goddard, and Visscher (2008) recently argued that most genetic variance in most traits is additive in nature. If *V _{D}* ~ 0 for most traits, then CTD estimates of

There are several limitations with the current approach to understanding the bias, precision, and accuracy of parameter estimates from twin-family designs. First, as mentioned above, our procedure for automating model fitting meant that the results from reduced ETFD models were optimistic. However, as we argued in the *Methods* section, this probably produced a negligible degree of bias in our results. A more important source of bias in our results, which worked in the opposite direction, is that a human could not guide each fitting process interactively due to the automated way models were fit. A non-negligible number (around 2–8%) of model runs produced outlier estimates, poorly reproduced the observed covariance matrices, and probably failed to find the true maximum likelihood estimates. An experienced modeler could have detected these situations and taken remedial measures, such as changing start values, to improve the fit of the model. This suggests that the ETFD results presented in this paper appear less precise than they will be when fit interactively on real data.

Another limitation of the current approach was that we investigated only a very small portion of the space of possible parameters that might exist in the real world. For example, we did not investigate alternative modes of vertical transmission or spousal similarity due to convergence, both of which can be modeled in the *Cascade*. We also did not investigate any number of alternative scenarios that might occur and cause bias in all the models investigated here, such as mixed models of assortative mating (Reynolds, Baker, & Pedersen, 2000), additional types of gene-environment interactions and correlations, higher-order epistasis, in utero effects, and special MZ-twin environments. This latter issue is particularly important. At the heart of all twin models, including ETFDs, is the comparison between MZ and DZ twins. If some non-genetic factor such as in utero effects increases MZ twin resemblance, all models described in this paper will overestimate $\widehat{V}_{A}$ and especially

We have argued that the most commonly used design in behavioral genetics, the CTD, is inadequate for understanding the relative magnitude of shared environmental effects or the ratio of additive to non-additive genetic variation. Our results demonstrate that, irrespective of power or sample size, estimates of these two quantities from CTDs cannot be interpreted with any degree of confidence unless strong assumptions (no assortative mating, no gene-environment covariance, and that either non-additive genetic variance or shared environmental variance is zero) have been verified. ETFDs, on the other hand, provide unbiased and fairly accurate estimates of this information. More complex ETFDs, such as the *Cascade*, are unbiased under an even wider range of scenarios and provide additional details on the makeup of shared environmental effects that may themselves be of interest. The principal reasons why ETFDs are rarely used in behavioral genetics are that they are more difficult to use and that little extended twin family data suitable for their use exists. We hope that the current paper clarifies the rationale for using ETFDs and encourages researchers to collect extended twin family data when circumstances warrant their use.

^{1}We follow the convention that $\widehat{V}_{\bullet}$ is the estimate of the population parameter *V*_{•}.

^{2}Strictly speaking, *C* (*A*, *F*) is a nonlinear constraint and is not freely estimated in ETFDs. It is determined by, and helps to determine, estimated parameters by constraining their inter-relationships in a way that keeps the entire model internally consistent.

- Carey G. Cholesky problems. Behavior Genetics. 2005;35:653–665. [PubMed]
- Casela G, Berger RL. Statistical Inference. Belmont, CA: Wadsworth; 1990.
- Cloninger CR, Rice J, Reich T. Multifactorial inheritance with cultural transmission and assortative mating II: A general model of combined polygenic and cultural inheritance. American Journal of Human Genetics. 1979;31:176–198. [PubMed]
- Coventry WL, Keller MC. Estimating the extent of parameter bias in the classical twin design: A comparison of parameter estimates from extended twin-family and classical twin designs. Twin Research and Human Genetics. 2005;8:214–223. [PubMed]
- Crnokrak P, Roff DA. Dominance variation: Associations with selection and fitness. Heredity. 1995;75:530–540.
- Eaves LJ. The use of twins in the analysis of assortative mating. Heredity. 1979;43:399–409. [PubMed]
- Eaves LJ. Putting the ‘human’ back in genetics: Modeling the extended kinship of twins. Twin Res Hum Genet. 2009;12:1–7. [PubMed]
- Eaves LJ, Eysenck HJ, Martin JM, editors. Genes, culture, and personality: An empirical approach. Londong: Academic Press; 1989.
- Eaves LJ, Last KA, Young PA, Martin NG. Model-fitting approaches to the analysis of human behavior. Heredity. 1978;41:249–320. [PubMed]
- Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh. 1918;52:399–433.
- Fulker DW. Human genetics, part A: The unfolding genome (Progres in clinical and biological research 103A) New York: Alan R Liss; 1982. Extension of the classical twin method; pp. 395–406.
- Grayson DA. Twins reared together: Minimizing shared environmental effects. Behavior Genetics. 1989;19:593–604. [PubMed]
- Haldane JBS. The causes of evolution. Princeton, N.J: Princeton University Press; 1932.
- Heath AC, Kendler KS, Eaves LJ, Markell D. The resolution of cultural and biological inheritance: Informativeness of different relationships. Behavior Genetics. 1985;15:439–465. [PubMed]
- Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLos Genetics. 2008;4:1–10. [PMC free article] [PubMed]
- Keller MC. PedEvolve: A simulator of genetically informative data implemented in R. Annual Meeting of the Behavior Genetics Association; Amsterdam, NL. 2007.
- Keller MC, Coventry WL. Quantifying and addressing parameter indeterminacy in the classical twin design. Twin Research and Human Genetics. 2005;8:201–213. [PubMed]
- Keller MC, Medland SE, Duncan LE, Hatemi PK, Neale MC, Maes HMM, et al. Modeling extended twin family data I: Description of the Cascade model. Twin Res Hum Genet. 2009;12:8–18. [PubMed]
- Maes HMM, Neale MC, Medland SE, Keller MC, Martin NG, Heath AC, et al. Flexible Mx specifications of various extended twin kinship designs. Twin Res Hum Genet. 2009;12:26–34. [PMC free article] [PubMed]
- Martin NG, Boomsma DI, Machin G. A twin-pronged attach on complex traits. Nature Genetics. 1997;17:387–392. [PubMed]
- Medland SE, Keller MC. Modeling extended twin family data II: Power associated with different family structures. Twin Res Hum Genet. 2009;12:19–25. [PubMed]
- Miller G, Todd PM. Mate choice turns cognitive. Trends in Cognitive Science. 1998;2:190–198. [PubMed]
- Nance WE, Corey LA. Genetic models for the analysis of data from the families of identical twins. Genetics. 1976;83:811–826. [PubMed]
- Neale MC. MX: Statistical modelling. 5. Richmond, VA: Department of Psychiatry; 1999.
- Neale MC, Fulker DW. A bivariate path analysis of fear data on twins and their parents. Acta Genetica Medica Gemellol (Roma) 1984;33:273–286. [PubMed]
- Operario D, Tschann J, Flores E, Bridges M. Brief report: associations of parental warmth, peer support, and gender with adolescent emotional distress. J Adolesc. 2006;29(2):299–305. [PubMed]
- Plomin R, DeFries JC, McClearn GE, McGuffin P. Behavioral genetics. 4. New York: Worth Publishers; 2001.
- Posthuma D, Boomsma DI. A note on the statistical power in extended twin designs. Behavior Genetics. 2000;30:147–158. [PubMed]
- R Core Development Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2009.
- Reynolds CA, Baker LA, Pedersen NL. Multivariate models of mixed assortment: phenotypic assortment and social homogamy for education and fluid ability. Behav Genet. 2000;30(6):455–476. [PubMed]
- Thiessen DD, Gregg B. Human assortative mating and genetic equilibrium: An evolutionary perspective. Ethology and Sociobiology. 1980;1:111–140.
- Truett KR, Eaves LJ, Walters EE, Heath AC, Hewitt JK, Meyer JM, et al. A model system for analysis of family resemblance in extended kinships of twins. Behavior Genetics. 1994;24:35–49. [PubMed]
- Wahlberg P. Chicken Genomics-Linkage and QTL mapping. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 2009
- Wright S. Fisher’s theory of dominance. American Naturalist. 1929;63:274–279.
