Using large twin, family, and adoption studies conducted at the Minnesota Center for Twin and Family Research, we describe our efforts to develop measures of substance use disorder (SUD) related phenotypes for targets in genome wide association analyses. Beginning with a diverse set of relatively narrow facet-level measures, we identified 5 constructs of intermediate complexity: nicotine, alcohol consumption, alcohol dependence, illicit drug, and behavioral disinhibition. The 5 constructs were moderately correlated (mean r = .57) reflecting a general externalizing liability to substance abuse and antisocial behavior. Analyses of the twin and adoption data revealed that this general externalizing liability accounted for much of the genetic risk in each of the intermediate-level constructs, though each also exhibited significant unique genetic and environmental risk. Additional analyses revealed substantial effects for age and sex, significant shared environmental effects, and that the mechanism of these shared environmental effects operates via siblings rather than parents. Our results provide a foundation for genome wide association analyses to detect risk alleles for SUDs as well as novel insights into SUDs.
Substance use disorders (SUDs) are one of the world’s leading public health problems, with the World Health Organization (WHO, 2004, 2008) estimating tens of millions of alcohol and drug abusers and more than a billion smokers worldwide (see related WHO reports available online at http://www.who.int/topics/substance_abuse/en/). In economic terms, hundreds of millions of dollars are lost due to health care costs and lost productivity each year (Harwood, Fountain, & Livermore, 1998; WHO, 2004, 2008). Substance abuse also shortens lives, increases risk for chronic illness, and contributes to broken families, ruined careers, and violent victimization, underscoring the need to mitigate the consequences and prevent the development of SUDs (WHO, 2004, 2008). Notably, SUDs remain a major problem despite evidence that public policies can reduce the prevalence of SUDs (WHO, 2008) and that effective treatments have been devised for addictions (Nathan & Gorman, 1998). Given that SUDs also exhibit substantial heritable influences (Goldman, Oroszi, & Ducci, 2005), many researchers have focused their efforts on understanding the etiological contribution of genetic factors in the development of SUDs.
Among the many challenges associated with identifying genes that increase risk for SUDs is that these conditions exhibit high levels of comorbidity, that is, co-occurrence at greater than chance levels (Compton, Thomas, Stinson, & Grant, 2007; Hasin, Stinson, Ogburn, & Grant, 2007). For example, a person who exhibits alcohol use-related problems also tends to have similar problems with their use of nicotine and illicit drugs. This creates the potential for interpretative confounds as any association between a risk allele and a SUD could be accounted for by a comorbid SUD. Comorbidity also results in complications when designing the appropriate recruitment strategy such as inclusion/exclusion criteria for case-control studies (e.g., should a gene association study of alcohol dependence exclude participants who also meet criteria for nicotine or illicit drug dependence?). Another issue in genetic studies of substance abuse is how best to define the phenotype for gene association studies, for example, whether to use diagnostic categories, dimensional measures of use, or composite measures of use and abuse (Agrawal et al., 2009). Further, it is important to determine the comparability of diagnostic measures of SUDs and non-diagnostic measures of substance use, as the latter are available in many genetic studies not specifically designed to investigate SUDs. If substance use measures can serve as adequate proxies for the more labor-intensive diagnostic measures, this would provide additional opportunities for replication and better meta-analytic studies (Grant et al., 2009).
One solution to dealing with these challenges is to employ a dimensional approach that recognizes the importance of SUD comorbidity and leverages it to enhance genetic identification efforts. This approach posits that a general liability to all SUDs and related conditions such as antisocial behavior and disinhibited personality traits underlies their comorbidity and should be the target phenotype for gene association studies (Hicks et al., 2004; Krueger et al., 2002). Rather than noise, comorbidity becomes the signal as people with multiple SUDs are the most likely to carry the greatest genetic risk and so become the most informative members of the sample. The different SUDs are then conceptualized as alternative manifestations of this general inherited liability with the final phenotypic expression determined by disorder-specific genetic (e.g., pharmacological sensitivities to specific substances) and environmental risk factors (e.g., availability of specific substances).
There is substantial evidence that supports conceptualizing the comorbidity among SUDs and related conditions as being indicative of a highly heritable, general liability that underlies these commonly co-occurring disorders. Factor analyses of childhood disruptive behaviors (aggression, rule breaking; Achenbach & Edelbrock, 1984), adolescent problem behaviors (delinquency, substance use; Jessor & Jessor, 1977), and diagnoses of adult psychiatric disorders (SUDs, antisocial personality disorder; Krueger, 1999; Markon, 2010) all indicate a single, common factor best accounts for their covariance. Further, structural models of temperament and personality typically include a dimension characterized by traits related to behavioral undercontrol (impulsivity, sensation seeking, aggression) (Markon, Krueger, & Watson, 2005). These strong phenotypic associations have prompted several researchers to posit a general liability that is commonly referred to as externalizing.
Twin studies have demonstrated that the general externalizing factor (i.e., the common variance across measures of SUDs, antisocial behavior, and disinhibited personality traits) is highly heritable (h² = .80 to .85; higher than any specific SUD), and accounts for much of the genetic risk for SUDs and related phenotypes (Kendler et al., 2003; Krueger et al., 2002; Young et al., 2000). However, each SUD is distinguished by disorder-specific genetic and environmental risk factors. For example, genes involved in the metabolism of alcohol such as ADH and ALDH are associated with risk for alcohol dependence (Higuchi et al., 2004), but do not seem to increase risk for other SUDs and antisocial behavior (Irons et al., 2006). Further, twin-family studies have shown that the similarity between parents and offspring on SUDs, antisocial behavior, and childhood disruptive disorders is best accounted for by the transmission of a highly heritable, general externalizing liability rather than disorder-specific liabilities (Bornovalova et al., 2010; Hicks et al., 2004). Moreover, research on endophenotypes of SUDs, specifically measures of brain electrophysiology such as reduced P3 amplitude (P3-AR), supports a common genetic liability across externalizing phenotypes. For example, P3-AR is heritable (h² = .60 to .65; Carlson & Iacono, 2006; Hicks et al., 2007), associated with a diverse set of externalizing phenotypes (Iacono et al., 2002; Iacono & McGue, 2007; Yoon et al., 2006), and predicts the onset of new SUDs (Carlson et al., 2007; Iacono et al., 2002). Additionally, the common variance across externalizing phenotypes accounts for the link with P3-AR (Patrick et al., 2006), and twin studies have shown that common genetic effects underlie the association between P3-AR and externalizing (Hicks et al., 2007).
Finally, candidate gene studies have linked specific genes (GABRA2, CHRM2) to multiple externalizing phenotypes (Dick, 2007), with effects strongest for comorbid phenotypes (Dick et al., 2007) or composite measures that more directly index the general externalizing liability (Dick et al., 2008; Stallings et al., 2005). These last findings provide proof of concept for the utility of employing a dimensional approach in gene association studies for SUDs.
The first step in implementing this approach is to delineate the psychometric structure among measures of substance use and abuse, including measures of different substances (alcohol, nicotine, illicit drugs) and related conditions such as antisocial behavior and disinhibited personality traits. The goal of such an analysis is to determine which measures are well suited to index the general propensity to use and abuse substances (e.g., number of different drug classes ever tried), and which measures best index risk for a specific substance (e.g., substance-specific withdrawal symptoms). Twin and family data can augment these analyses by parsing the genetic and environmental variance of these measures. Such analyses can more precisely determine the extent to which different measures (e.g., frequency of drinking and symptoms of alcohol dependence) index the genetic risk of a particular SUD construct (e.g., alcohol use-related problems) versus the general externalizing liability.
We report results from such analyses utilizing data from large twin, family, and adoption studies conducted at the Minnesota Center for Twin and Family Research (MCTFR; Iacono, McGue, & Krueger, 2006). The sample of each study is composed of nuclear families that include the parents and two adolescent offspring who are twins, non-twin biological siblings, or adoptive (i.e., unrelated) siblings. Using our extensive assessment battery of SUDs and related externalizing phenotypes, we first examine the psychometric structure of these measures. We assume a hierarchical structure such that by analyzing the patterns of covariance among many relatively narrow or facet-level measures, we will identify a smaller number of broader SUD constructs. Next, we will select the best measures of these SUD constructs for confirmatory and biometric analyses taking advantage of the genetically informative nature of the data. Specifically, we will examine the heritability of the facet measures and SUD constructs, how well each measure taps the genetic risk for the general externalizing liability, and the amount of measure-specific genetic and environmental variance.
Participants were members of ongoing longitudinal family studies conducted at the MCTFR. Each study is designed to include two parents and two offspring. In terms of age, parents are typically in middle adulthood (30 to 50 years) when initially assessed. Offspring differ in age when families are recruited into the studies. However, each study includes an assessment with a target age for offspring in late adolescence (16 to 19 years). To maximize our sample size and minimize complexity, for the present analyses we only used offspring data collected during their assessment in late adolescence. Each study is briefly described below and Table I provides the sample characteristics of each study.
The MTFS is a longitudinal-epidemiological study of twins born in the state of Minnesota from 1972 to 1982 in the case of male twins and 1975 to 1984 in the case of female twins (Iacono et al., 1999). Eligible families are identified using public birth records and located using publicly available databases. Families are recruited to participate the year the members of the twin pair turn 11 or 17 years old, and invited to participate in follow-up assessments every 3–4 years. All biological parents are also invited to participate, as are stepparents who have taken a significant role in rearing the twin offspring. All twin pairs are same-sex. The only eligibility requirements were that families lived within a day’s drive of the University of Minnesota laboratories and that neither twin had a mental or physical impairment that would preclude full participation in the assessment. For any given birth year, over 90% of eligible families were located, and over 80% agreed to participate. For the present analyses, all data for parents were lifetime measures assessed at their initial visit. For offspring in the 17-year-old cohort, data were used from their intake assessment. For offspring in the younger cohort, data were used from their 2nd follow-up assessment at roughly 17 years old (89.6% retention rate).
The MTFS ES was designed to augment the original MTFS samples by increasing the representation of childhood disruptive disorders (Keyes et al., 2009). This was accomplished by using a screening procedure that required half of the participating families to include at least one twin offspring who exhibited elevated symptoms of attention deficit/hyperactivity disorder or conduct disorder. The remaining families were recruited in the same manner as the original MTFS samples, that is, a birth cohort design that included all twins born in Minnesota from 1988 to 1994 with few exclusionary criteria. Eighty-two percent of eligible families were located, with slightly more than 80% recruited into the study the year the twins turned 11 years old. Consistent with the original MTFS, all twins were same-sex and families were invited to participate in follow-up assessments every 3–4 years. Though all parents are included in the present analyses, only about half of MTFS ES twin participants are included, as data had to be available from the ongoing 2nd follow-up (age 17) assessment.
SIBS is an adoption study composed of 409 adoptive families and 208 non-adoptive families (McGue et al., 2007). All families include two siblings and one or both parents. Adoptive families were ascertained from the three largest private adoption agencies in Minnesota. Non-adoptive families were obtained from Minnesota state birth records and selected to include a sibling pair comparable in age and gender to the adoptive sibling pairs. Families were located using the names of parents obtained from the adoption agency or from birth records using publicly available databases. Eligibility requirements for adoptive families included (1) an adoptive adolescent currently between the ages of 11 and 21 who had been permanently placed with the family prior to 2 years of age, and (2) a second adolescent in the home who was not biologically related to the adopted adolescent. The second adolescent could have been biologically related to one or both parents, or like the first adolescent, adopted into the family prior to age 2. The mean age of placement for all adopted adolescents was 4.7 months (SD = 3.4). Eligibility for the non-adoptive families required having a pair of full biological adolescent siblings. Additional eligibility requirements that applied to all families included living within driving distance of the assessment labs, neither adolescent having a mental or physical disability that would preclude full participation in the assessment, and siblings being no more than 5 years apart in age. For both adoptive and non-adoptive families, siblings could be either same-sex (60.8%) or opposite-sex (39.2%).
Participation rates were 63.2% for adoptive families and 57.3% for non-adoptive families. To assess recruitment bias, demographic information was obtained from 73% of non-participating families, including parental education and occupation, percentage of original parents who remained married, and behavioral problems in the offspring. The only significant difference was greater educational attainment for mothers of the non-adoptive families, suggesting the families are broadly representative of the populations from which they were drawn. Siblings are invited to return for follow-up assessments every 3–4 years. Data ascertained at the siblings’ late adolescent assessment (typically either the intake or the 1st follow-up assessment that had a 94.5% retention rate) were used in the present analyses.
All substance-related measures were derived from data obtained using the Substance Abuse Module (SAM) of the Composite International Diagnostic Interview (Robins et al., 1987). The SAM interview covers DSM symptoms of alcohol, nicotine, and drug dependence, questions regarding quantity and frequency of use, and other problematic substance use behaviors not captured by diagnostic systems. DSM-III-R (American Psychiatric Association, 1987) was the current diagnostic system when the MTFS began, and so is the common diagnostic measure across all studies and assessments. The SAM assesses 11 categories of illicit drugs including: marijuana, amphetamines, barbiturates, tranquilizers, cocaine, heroin, opiates, PCP, psychedelics, inhalants, and gas. Diagnostic kappa reliabilities for all SUDs exceeded .91.
Various preliminary analyses eventually led us to select 17 relatively narrow or facet-level measures that were included in the current analyses. While space limitations preclude a detailed description of this process, it was informed by findings from previous studies (Agrawal et al., 2009; Dawson et al., 2010; Derringer et al., 2008; Grant et al., 2009; Krueger et al., 2002, 2004, 2007; Malone et al., 2002; Markon, 2010; Markon & Krueger, 2005; McGue & Iacono, 2005; Young et al., 2001) and additional measurement and practical considerations including: availability across samples including both parents and offspring; non-redundancy with other measures; adequate prevalence; representation of distinct content domains; inclusion of at least some dimensional measures to differentiate among people low on externalizing and substance abuse; and other psychometric characteristics such as distributional properties and pattern of correlations with other measures. The facet measures were originally grouped into the following content categories:
These measures include DSM symptoms of nicotine dependence and frequency and quantity of nicotine use including cigarettes, cigars, pipes, and chewing tobacco during the period of heaviest use. Frequency and quantity are dimensional items inquiring about typical use for a 1-month period of time during the period of heaviest use, specifically, “How many days per month did you smoke (use tobacco)?” and “How much did you usually smoke/chew per day (i.e., number of cigarettes, cigars, pipefuls, or chews)?”
These variables are dimensional items of drinking ranging from no drinking to high levels of frequency and binge drinking. The measures are number of lifetime intoxications (capped at 999), maximum number of drinks consumed in 24 hours (capped at 50), and frequency of alcohol use as measured in number of drinking occasions (not number of drinks) during the period of heaviest use (0 = never to 10 = 3 or more times a day).
The SAM covers multiple diagnostic systems (DSM-III-R, DSM-III, Research Diagnostic Criteria, Feighner Criteria) and non-diagnostic behaviors of alcohol-related problems; therefore, several facet-level scales were constructed to represent the various content domains of problematic alcohol use. The scales were calculated by taking the mean for the SAM items (scored 0 or 1) that constituted each scale. The content scales for alcohol abuse/dependence included in the current analyses were social and occupational problems (e.g., drinking and driving, arrested because of drinking), withdrawal and tolerance (e.g., withdrawal symptoms such as shakes, delirium tremens, etc.), and compulsive drinking and impairment in major life activities (e.g., wanted to stop but couldn’t, little time for anything but drinking).
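The scale scoring described above (the mean of items scored 0 or 1) can be illustrated with a minimal sketch. The item responses are hypothetical, and the handling of unanswered items (skipping them) is an assumption made for illustration; the text does not state how missing items were treated.

```python
from statistics import mean

def scale_score(items):
    """Mean of 0/1 item responses.

    None marks an unanswered item and is skipped; this missing-data rule
    is an assumption for illustration, not a rule stated in the text.
    """
    answered = [x for x in items if x is not None]
    if not answered:
        return None
    return mean(answered)

# Hypothetical responses to a four-item content scale
# (e.g., social and occupational problems)
responses = [1, 0, 0, 1]
print(scale_score(responses))
```

Because every item is scored 0 or 1, the scale score is simply the proportion of endorsed items and always falls between 0 and 1.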
The SAM covers the use of 11 different drug classes. Marijuana was by far the most widely used illicit drug, while most other drugs had relatively low rates of use. Further, preliminary analyses showed that other than marijuana use, specific drug use variables were mostly redundant with a count of the number of different drug classes a person had ever tried. Therefore, our drug use measures included total number of lifetime marijuana uses (capped at 999) and a count of the number of different drug classes ever tried.
The SAM assesses DSM symptoms of drug abuse/dependence for each of the different drug classes. As drug abuse/dependence is much less prevalent than alcohol abuse/dependence and the SAM incorporates a less variegated assessment of drug use problems, the substance for which the participant reported the most symptoms was used as their measure of symptoms of drug abuse/dependence.
We refer to the non-substance related externalizing phenotypes as behavioral disinhibition. The behavioral disinhibition construct was assessed using the following measures: total number of symptoms of conduct disorder (i.e., antisocial behavior before age 15); symptoms that constitute the adult criteria for antisocial personality disorder (i.e., antisocial behavior after age 15; the symptom of reckless disregard for the safety of self and others was excluded due to its overlap with the SUD criteria of driving under the influence of alcohol or illicit drugs); instances of dissocial behavior assessed through a life events stress interview consisting of the sum of antisocial and non-normative behavior not necessarily captured by diagnostic criteria, which includes ever being suspended or expelled from school, ever being arrested, and early age of sexual intercourse (1 = before age 15, 0.5 = age 15 to 17, 0 = age 18 or older); total score on the Delinquent Behavior Inventory, a 21-item (α = .95) self-report measure inquiring about the commission of various antisocial acts during childhood and adolescence (Taylor et al., 2000); total score on a personality measure called aggressive undercontrol derived using 20 items (α = .84) from the aggression and constraint scales of the Multidimensional Personality Questionnaire (Tellegen & Waller, 2008). The symptom measures of antisocial behavior were assessed using the Structured Clinical Interview for DSM-III-R Axis II (First et al., 1997) and coded as absent (0), present at subthreshold level (0.5), or present at full threshold (1).
Analyses proceeded in three phases. The first was an exploratory phase whereby we sought to delineate the psychometric structure of the 17 facet-level measures and to identify intermediate latent constructs (e.g., nicotine, alcohol, illicit drug, behavioral disinhibition) in the hierarchy of externalizing-related phenotypes using hierarchical cluster analysis. Second, we fit confirmatory factor analytic (CFA) models to the data based on results of the exploratory phase. Finally, we used the twin and sibling data to fit biometric models to estimate the heritability and the genetic architecture among the facet measures, intermediate-level, and higher-order constructs identified in the phenotypic analyses.
Due to the high skew and kurtosis of several facet measures, we applied a log(x + 1) transformation to all variables prior to analyses (except for aggressive undercontrol scores, which exhibited a sufficiently normal distribution). Hierarchical cluster analyses were conducted on the facet-level correlation matrix using the ICLUST algorithm (Revelle, 1979). Cluster analysis has been successfully used in structural analyses of psychopathology (Krueger et al., 2007; Markon, 2010), especially in exploratory stages, and relative to exploratory factor analysis more clearly delineates the hierarchical relationships among highly correlated variables (Bacon, 2001). Clustering proceeds by first combining the two highest correlating items (i.e., facets). Next, the highest correlating pair of remaining items (including the new two-item cluster) forms the next cluster. Items and clusters continue to combine as long as the general factor saturation (Revelle’s β, defined as the minimum split-half correlation of a set of items/facets) of the resulting higher-order cluster increases, though this default stopping criterion can be adjusted to increase flexibility in delineating the hierarchical structure among the facet measures. To further verify the group structure, we also repeated the cluster analysis after removing the variance due to the general factor using ωh, a slightly more accurate estimate of general factor saturation than β (Zinbarg, Revelle, Yovel, & Li, 2005). The loadings of the facets on the general factor were estimated by first factoring the data, rotating a minimum of three factors obliquely, and applying a Schmid-Leiman transformation (Schmid & Leiman, 1957; Zinbarg, Yovel, Revelle, & McDonald, 2006). Cluster analyses and general factor loading estimates were completed using the ICLUST and omega functions in the psych package (Revelle, 2009) available in R (R Core Development Team, 2009).
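As a rough illustration of the first two steps described above, the following sketch applies the log(x + 1) transformation and then finds the highest correlating pair of facets, which is the first merge in the clustering procedure. It is not the ICLUST implementation from the psych package: the β-based stopping rule is not implemented, and the raw scores, facet names, and correlations are hypothetical.

```python
import math

def log_transform(values):
    """Apply log(x + 1) to each raw score; a score of zero stays at zero."""
    return [math.log1p(v) for v in values]

def highest_correlating_pair(names, corr):
    """Return the pair of distinct items with the largest correlation."""
    best_pair, best_r = None, -math.inf
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if corr[i][j] > best_r:
                best_pair, best_r = (names[i], names[j]), corr[i][j]
    return best_pair, best_r

# Hypothetical raw counts of lifetime intoxications (capped at 999)
raw = [0, 3, 25, 999]
print([round(x, 2) for x in log_transform(raw)])

# Hypothetical correlations among three facets (not values from the study)
names = ["frequency", "max_drinks", "conduct_symptoms"]
corr = [
    [1.00, 0.72, 0.35],
    [0.72, 1.00, 0.41],
    [0.35, 0.41, 1.00],
]
print(highest_correlating_pair(names, corr))
```

The transformation compresses the long right tail of count variables (a capped score of 999 maps to about 6.9), which is why it reduces skew and kurtosis before correlations are computed.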
CFA models were fit in Mplus (Muthen & Muthen, 2007) using a maximum likelihood estimator with robust standard errors and the cluster option to account for the family structure of the sample. Full information maximum likelihood was used to accommodate missing data. Model fit was evaluated using standard fit indices including a χ² fit statistic adjusted for non-normal data, the Comparative Fit Index (CFI; > .90 adequate fit, > .95 very good fit), the Root Mean Square Error of Approximation (RMSEA; < .08 adequate fit, < .05 very good fit), the Standardized Root Mean Square Residual (SRMR; < .08 adequate fit, < .05 very good fit), and the Bayesian Information Criterion (BIC; lower values indicate better fit).
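The cutoffs quoted above can be written as a small lookup, shown here as a sketch rather than as part of the Mplus analysis. The function name and the label "below threshold" are introduced for illustration only; the example values are the CFI, RMSEA, and SRMR reported for the 5-factor model in the results.

```python
def rate_fit_index(name, value):
    """Rate one fit index against the cutoffs quoted in the text."""
    if name == "CFI":  # higher values indicate better fit
        if value > 0.95:
            return "very good"
        return "adequate" if value > 0.90 else "below threshold"
    if name in ("RMSEA", "SRMR"):  # lower values indicate better fit
        if value < 0.05:
            return "very good"
        return "adequate" if value < 0.08 else "below threshold"
    raise ValueError(f"no cutoff defined for {name}")

# Fit indices reported for the 5-factor CFA model
five_factor = {"CFI": 0.942, "RMSEA": 0.070, "SRMR": 0.042}
for name, value in five_factor.items():
    print(name, rate_fit_index(name, value))
```

BIC is omitted because it has no absolute cutoff; it is only meaningful when comparing competing models fit to the same data.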
Following analyses to delineate the psychometric structure of the measures, we used the offspring data to fit standard biometric models to examine the heritability and genetic architecture of the facet and higher-order externalizing measures. These models parse the phenotypic variance into additive genetic (A), shared environmental (C), and nonshared environmental (E) variance. Additive genetic variance refers to genetic effects summed across loci and is inferred based on greater phenotypic similarity for sibling pairs that share a greater number of segregating alleles (rmz > rdz and rbio > radopt). Shared environmental variance is due to environmental effects that contribute to phenotypic similarity between siblings and is inferred if rdz > ½rmz and radopt > 0. Nonshared environmental variance is due to environmental effects that contribute to differences among siblings (including measurement error) and is inferred if rmz < 1.0. These models can be easily extended to the bivariate case using a Cholesky decomposition to parse the variance that is shared between two measures versus the variance that is unique to each measure. These models also yield genetic, shared environmental, and nonshared environmental correlations that index the extent of overlap between the two measures on their respective variance component. We had 2 goals: (1) estimate the genetic and environmental variance components for the intermediate-level externalizing constructs, and (2) estimate the extent to which variance in the facet measures and intermediate-level constructs was attributable to the general externalizing liability versus measure-specific variance. All biometric models were fit to the sibling and twin data using the computer program Mx (Neale et al., 2004) with full information maximum likelihood estimation to accommodate missing data.
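The inference rules above can be made concrete with the classic Falconer approximation, which converts MZ and DZ twin correlations into rough A, C, and E proportions. This is a back-of-the-envelope sketch, not the maximum likelihood models fit in Mx, and it uses only twin correlations (not the adoptive sibling data). The correlations in the example are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate standardized variance components from twin correlations.

    a2 = 2 * (r_mz - r_dz)   additive genetic variance
    c2 = 2 * r_dz - r_mz     shared environmental variance
    e2 = 1 - r_mz            nonshared environment plus measurement error
    """
    a2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return a2, c2, e2

# Hypothetical twin correlations (not values from the study)
a2, c2, e2 = falconer_ace(r_mz=0.80, r_dz=0.50)
print(round(a2, 2), round(c2, 2), round(e2, 2))
```

Note that c2 is positive exactly when the DZ correlation exceeds half the MZ correlation, which matches the condition for inferring shared environmental variance given in the text, and that the three components always sum to 1.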
First, we fit a cluster analytic model for the 17 facets using the default stopping criterion (i.e., the β for the resulting cluster must increase beyond the minimum of the two subclusters). This revealed a hierarchical structure of 3 intermediate-level clusters whose correlations were accounted for by a single broad cluster (β = .71, ωh = .78; values indicative of substantial general factor variance). The largest of the intermediate-level clusters included all the legal substance use and abuse measures (alcohol and nicotine), with additional intermediate-level clusters defined by the behavioral disinhibition and illicit drug facets. When setting stricter clustering criteria, the facets also exhibited coherent 4- and 5-cluster structures, with the nicotine facets separating from the alcohol facets in the 4-cluster model, and the alcohol use and alcohol dependence facets further splitting in the 5-cluster model. Table II lists the facet to cluster loadings (corrected item-to-total correlations) for the 5-cluster model as it provides the most well differentiated structure. Though the facets exhibit positive correlations with each cluster, each facet also exhibits an especially high loading on a single cluster, with each cluster defined by 3 to 5 facets. The one exception is intoxications, which has virtually equivalent loadings (.77 and .78) on clusters defined by alcohol consumption and alcohol dependence facets, respectively. Correlations among the 5 clusters were moderate to high (mean r = .52).
To further examine group structure, we removed variance from the correlation matrix due to the general factor. Using the omega function in the R psych package, we estimated the general factor loadings of the 17 facets, and repeated the ICLUST technique on the residual correlation matrix. The 5-cluster solution was clearly apparent, though intoxications clustered with alcohol consumption (.57) rather than alcohol dependence (.47). Forcing a solution of fewer than 5 clusters led to decreases in β of .2 or more. The correlations among the 5 residual clusters were low (mean r = .07). This indicates that while the facets are linked by a general externalizing factor, their latent structure is best characterized by 5 constructs differentiated by significant independent variance. Therefore, a 3-level hierarchy of 17 facets, 5 intermediate constructs, and a general externalizing factor provides the most appropriate structure.
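One common way to remove general factor variance from a correlation matrix is to subtract the product of the general factor loadings and rescale the result to a unit diagonal. The sketch below assumes that approach; the text does not spell out the exact computation used, and the loadings and correlations here are hypothetical.

```python
import math

def residual_correlations(corr, g_loadings):
    """Remove general factor variance and rescale to a correlation matrix.

    Assumes residual covariance = corr[i][j] - g[i] * g[j]; this is an
    illustrative choice, not necessarily the computation used in the study.
    """
    n = len(g_loadings)
    resid = [
        [corr[i][j] - g_loadings[i] * g_loadings[j] for j in range(n)]
        for i in range(n)
    ]
    sd = [math.sqrt(resid[i][i]) for i in range(n)]
    return [[resid[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Hypothetical example: facets 0 and 1 share variance beyond the general
# factor, whereas facet 2 relates to them only through the general factor.
corr = [
    [1.00, 0.70, 0.42],
    [0.70, 1.00, 0.42],
    [0.42, 0.42, 1.00],
]
g = [0.6, 0.6, 0.7]
for row in residual_correlations(corr, g):
    print([round(r, 2) for r in row])
```

In this toy case the residual correlation between facets 0 and 1 stays sizable while their residual correlations with facet 2 fall to about zero, which is the pattern that lets clusters remain distinguishable once the general factor is removed.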
Based on the results of the exploratory analyses, we next fit a 5-factor CFA model using the 17 facet-level measures. The 5-factor model provided a good fit to the data, χ²(108) = 4901, RMSEA = .070, CFI = .942, SRMR = .042, BIC = −36,909. The factor correlations ranged from .50 to .65 (mean = .57). Lifetime intoxications was the only facet that required a cross-loading on multiple factors, loading equally on the two alcohol-related factors. A 4-factor model that combined the two alcohol-related factors failed to achieve standard thresholds indicative of an adequate fit (χ² = 11,079, RMSEA = .103, CFI = .868, SRMR = .055, BIC = −28,891), and models with fewer factors yielded fit statistics that were much worse. A model that accounted for the correlations among the 5 factors with a single higher-order externalizing factor also provided a good fit to the data, χ²(113) = 5173, RMSEA = .070, CFI = .939, SRMR = .047, BIC = −36,614. Figure 1 provides a graphical depiction of the 3-level hierarchical factor model with the 17 facets, 5 intermediate-level factors, and the higher-order externalizing factor. To ensure our results were robust, we also fit exploratory and confirmatory models separately to offspring and parent subsamples, using raw scores rather than log transformed measures, and after regressing out the effect of age and sex on the facet-level measures. In all cases, the structure and model fitting results were comparable to those reported here (details available upon request to B. M. Hicks).
Table III reports the effects of age and sex on each of the intermediate-level and the higher-order externalizing phenotypes as estimated in structural equation models fit in Mplus using the same estimator with robust standard errors that was used in the CFA models. As substance use and abuse tends to increase from adolescence into young adulthood and then decline beginning in middle adulthood (Bachman et al., 1997; Chassin et al., 2004), we included both linear and quadratic age effects as well as their interactions with sex. Due to the large sample size, only effects with a p-value < .001 are reported as significant. To provide a visual depiction of the effects of age and sex, Figures 2 and 3 plot the mean values for alcohol consumption (largest age effect) and behavioral disinhibition (largest sex effect) factor scores by age, separately for men and women (as all variables exhibited a similar pattern, only two are plotted as exemplars). The size of the circles is proportional to the number of observations for each age.
Men had higher scores than women on each externalizing phenotype with moderate to large effects. The one exception was nicotine, which exhibited only a small, non-significant gender difference. For each substance-related phenotype, the linear age effect was positive and the quadratic age effect was negative, indicating an initial increase in substance use-related behaviors from adolescence to young adulthood followed by a decline from middle to older adulthood. For behavioral disinhibition, both the linear and quadratic age effects were negative, indicating an initial decline from adolescence to young adulthood followed by an accelerated decline from middle to older adulthood.¹ The age × sex interaction was positive for each externalizing phenotype, indicating that men exhibited a greater initial increase in substance use-related behaviors. The age² × sex interaction was significant for nicotine (+) and illicit drugs (−), indicating that the gender difference on nicotine was greater at the older ages, while the gender difference on illicit drugs was smaller at the older ages. As these are cross-sectional data based on lifetime assessments, both of these trends are likely due to historical changes in societal use of nicotine (the gender gap has narrowed) and illicit drugs (low prevalence in both genders for the oldest members of the sample). Overall, the combined effects of age and sex were moderate to large for each externalizing phenotype (R² = .090 to .288), indicative of the importance of developmental stage in the emergence of SUDs as well as their greater prevalence in men.
The correlations among family members for the externalizing phenotypes are listed in Table IV. Correlations were estimated for the latent substance abuse and behavioral disinhibition variables using CFA models fit in Mplus. All facet measures were first regressed on sex, age, age², and their interactions. Parent-child correlations were not significantly different for mothers and fathers, so we report the mean of these effects in the table. Parent-offspring correlations were constrained to be the same for all family types with biological offspring (MZ twins, DZ twins, non-twin offspring) and allowed to differ for families with adoptive offspring. The mother-father correlation was constrained to be the same across all family types. The sibling correlation was allowed to vary across all family types.
The pattern of correlations was fairly consistent for each externalizing phenotype. First, the mother-father correlation was moderate in magnitude (mean = .44). Second, the parent-offspring correlation was small to moderate (mean = .25) for biological offspring, and near zero (mean = .06) for adoptive offspring. In terms of effect size, the sibling correlations followed a consistent pattern with the following order from largest to smallest: MZ twins (mean = .77), DZ twins (mean = .48), non-twin biological siblings (mean = .33), and adoptive siblings (mean = .28). A purely additive genetic model of family resemblance posits that heritability is equal to the MZ correlation and twice the correlation between 1st degree relatives (i.e., parents and biological offspring, DZ twins, and non-twin biological siblings), and that the correlation among non-biological relatives is zero (i.e., parents and adopted offspring, adoptive siblings). The family correlations in our samples indicate deviations from this additive genetic model primarily in regards to the presence of shared environmental effects with adoptive sibling correlations greater than zero being especially strong evidence of these effects.
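To make the logic of these comparisons concrete, the following Python sketch applies the classical Falconer approximations to the mean MZ and DZ twin correlations given above. This is a back-of-the-envelope calculation under the standard assumptions of the twin design (additive genetic effects, equal environments for MZ and DZ pairs, no assortative mating), not the biometric model fitting used in the analyses; the function name is hypothetical.

```python
def falconer(r_mz, r_dz):
    """Approximate ACE components from MZ and DZ twin correlations.

    a2 (additive genetic)      = 2 * (r_mz - r_dz)
    c2 (shared environment)    = 2 * r_dz - r_mz
    e2 (nonshared environment) = 1 - r_mz
    """
    a2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return a2, c2, e2

# Mean twin correlations reported in the text.
a2, c2, e2 = falconer(r_mz=0.77, r_dz=0.48)
print(round(a2, 2), round(c2, 2), round(e2, 2))

# Under a purely additive model the DZ correlation would be half the
# MZ correlation, which would leave no shared environmental component.
_, c2_additive, _ = falconer(r_mz=0.77, r_dz=0.77 / 2)
print(round(c2_additive, 2))
```

With the mean correlations above, these approximations give values of about .58, .19, and .23 for the genetic, shared environmental, and nonshared environmental components, respectively. The DZ correlation exceeding half the MZ correlation is what yields a nonzero shared environmental component, and under the same assumptions the adoptive sibling correlation provides a separate, direct estimate of that component.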
Next, we fit biometric models to the MZ twin, DZ twin, non-twin biological sibling, and adoptive sibling correlations for each externalizing phenotype.² As these analyses are primarily meant to be descriptive in regard to partitioning heritable and non-heritable variance rather than an attempt to establish causal influences, we do not report fit indices or compare the fit of reduced or alternative models (e.g., models with non-additive genetic or special twin environment parameters), though these results are available upon request. ACE components were estimated for latent intermediate-level externalizing phenotypes by fitting biometric factor models in Mx. For the higher-order externalizing phenotype, we also fit a biometric factor model using factor scores on intermediate-level phenotypes (scores estimated in Mplus) as indicators of the latent externalizing factor. Consistent with the family correlations, each externalizing phenotype exhibited moderate to large heritability (mean = .56), and significant though smaller shared environmental (mean = .19) and nonshared environmental effects (mean = .25). Only the alcohol dependence phenotype failed to exhibit significant shared environmental effects.
Finally, we examined the extent to which individual measures indexed the general externalizing liability versus measure-specific variance. To do so, we fit a series of bivariate Cholesky models in Mx with factor scores on the higher-order externalizing factor entered first and each intermediate and facet-level measure entered second. Figure 4 provides a graphical depiction of the Cholesky model using the higher-order externalizing factor and the lower-order nicotine factor as example phenotypes. This allowed us to estimate the phenotypic, genetic, and environmental correlations between the general externalizing liability and each measure. We then estimated the amount of variance in each measure that was attributable to the general externalizing liability and the amount of variance that was measure-specific.
The results of these analyses are reported in Table V. Each intermediate-level phenotype exhibited substantial overlap with the general externalizing liability as indexed by the phenotypic (mean = .86), genetic (mean = .89), and shared environmental (mean = .95) correlations. The general externalizing liability accounted for the majority of genetic variance in each intermediate-level phenotype (mean = 80%), though each also exhibited significant measure-specific genetic effects (mean = 20%). The general externalizing liability also accounted for virtually all the shared environmental variance in each intermediate-level phenotype. Nonshared environmental effects primarily differentiated the intermediate-level phenotypes from each other and the general externalizing liability.
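The link between the genetic correlations and the percentages of shared versus measure-specific genetic variance can be sketched as follows. In a bivariate Cholesky decomposition, the genetic variance of the second measure is split into a part transmitted from the first measure (path a21) and a residual part (path a22), so the proportion shared with the first measure equals the squared genetic correlation. The Python function below is a minimal illustration, not the Mx model used in the analyses; the function name and path values are hypothetical, with the paths chosen only so that the genetic correlation is close to the mean reported above.

```python
import math

def cholesky_genetic_split(a11, a21, a22):
    """Genetic correlation and variance split from bivariate Cholesky paths.

    a11: genetic path on measure 1 (here, the general liability)
    a21: genetic path from measure 1's factor to measure 2
    a22: genetic path specific to measure 2
    """
    var1 = a11 ** 2
    var2 = a21 ** 2 + a22 ** 2           # total genetic variance of measure 2
    cov = a11 * a21                      # genetic covariance
    r_g = cov / math.sqrt(var1 * var2)   # genetic correlation
    shared = a21 ** 2 / var2             # proportion due to measure 1
    specific = a22 ** 2 / var2           # measure-specific proportion
    return r_g, shared, specific

# Hypothetical path values, for illustration only.
r_g, shared, specific = cholesky_genetic_split(a11=0.75, a21=0.6, a22=0.3)
print(round(r_g, 3), round(shared, 2), round(specific, 2))
```

With these illustrative paths the genetic correlation is about .89 and the shared proportion is .80. Because the mean of squared correlations generally differs from the square of the mean correlation, the averages reported across phenotypes need not satisfy this identity exactly.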
The facet-level measures exhibited a similar pattern though the correlations with the general externalizing liability were slightly lower than the intermediate-level phenotypes. Again, the general externalizing liability accounted for the majority of genetic variance (mean = 69%), and virtually all the shared environmental variance in the facet measures. However, there was greater distinctiveness for three of the behavioral disinhibition measures, specifically, conduct disorder, the Delinquent Behavior Inventory, and aggressive undercontrol. For each of these variables, the general externalizing liability accounted for less than half the genetic variance, and for conduct disorder less than half the shared environmental variance. Notably, two of these measures assess child and adolescent antisocial behavior rather than late adolescent and adult behaviors while the third has a unique assessment modality (i.e., self-report of current personality functioning rather than a lifetime behavioral measure assessed by interview). Finally, the general externalizing liability accounted for only a small portion of the nonshared environmental variance (mean = 28%) in the facet-level measures. This suggests greater measurement error at the facet-level compared to the intermediate-level phenotypes though substantive nonshared environmental effects might also be contributing to the specific phenotypic manifestation as measured by the facets of the general externalizing liability.
The impetus for this report is an upcoming genome wide association study of SUDs using the MCTFR samples. Relative to genotyping and statistical genetic analyses, the measurement strategy for genome wide association studies is often given less emphasis despite important complexities involved in phenotype definition. For SUDs, this includes how best to conceptualize comorbidity, or the co-occurrence among different SUDs. Here, we propose a dimensional approach whereby the overlap among measures, that is, the risk factors that underlie the comorbidity among SUDs, becomes the target phenotype for genetic association analyses. Therefore, we undertook a comprehensive review of our phenotypic measures to delineate their psychometric structure. That is, we examined the patterns of overlap across many substance use measures to identify and measure a smaller set of constructs that assessed both the general liability to SUDs and disorder-specific risk.
Our analyses identified 5 constructs at an intermediate level of complexity: nicotine, alcohol consumption, alcohol dependence, illicit drug, and behavioral disinhibition. Four of these index manifest risk for SUDs. The fifth construct, behavioral disinhibition, was defined by non-substance use characteristics that are highly associated with and often precede SUDs, specifically, measures of antisocial behavior and disinhibited personality traits. As such, behavioral disinhibition provides an alternative measure of underlying risk for SUDs. This is an especially useful measure for people who carry risk alleles for SUDs, but who either have yet to manifest SUDs (e.g., children and adolescents) or will not manifest SUDs. These 5 intermediate-level constructs were highly correlated, and their correlations were well accounted for by a single higher-order externalizing factor. These measures exhibited moderate to high heritability and small to moderate shared environmental variance. The use of multiple measures typically reduces measurement error, and the highly familial nature of our measures (i.e., combined additive genetic and shared environmental effects, mean = .77) and the low nonshared environmental variance suggest that our measures have low measurement error and good psychometric properties.
Each intermediate-level construct was highly saturated with the general externalizing liability, though each also exhibited measure-specific genetic and nonshared environmental variance. The substantial overlap between the intermediate-level factors and higher-order externalizing factor suggests that a composite measure of any SUD or antisocial behavior can serve as an adequate index of the genetic risk associated with the general liability to SUDs. Also, based on the phenotypic and genetic correlations with the externalizing factor, dimensional measures of substance use appear to be comparable to diagnostic measures in terms of indexing the general liability to SUDs.
Alcohol was an exception, however, as measures of use and dependence split into separate factors. This could be due to including a greater number of alcohol measures in the analyses relative to nicotine and illicit drugs, or to the large number of adolescents in the sample, few of whom exhibit the more severe symptoms of alcohol dependence. More substantive reasons could also play a role. For example, few people who are regular smokers fail to exhibit symptoms of dependence, and the relatively low prevalence of illicit drug use in a general population sample likely induces a large correlation between use and abuse/dependence symptoms. In contrast, large numbers of people regularly use alcohol but do not exhibit abuse/dependence symptoms. Also, the biometric analyses revealed important distinctions as alcohol consumption exhibited substantial shared environmental effects while alcohol dependence was more heritable with no shared environmental effects. Future analyses that examine the external correlates of the 5 intermediate-level constructs will be needed to further establish their convergent and discriminant validity.
A particularly interesting finding from our analysis of correlations among family members was the robust correlations between adoptive siblings, which provide strong evidence of shared environmental effects. Notably, the parent-adoptive offspring correlations were near zero, indicating that the mechanism of family environmental influence is through siblings rather than parents. In contrast, the small to moderate parent-biological offspring correlations indicate that parent-child similarity on externalizing phenotypes is primarily a function of genetic transmission.
The significant shared environmental effects for each externalizing phenotype except alcohol dependence were somewhat unanticipated. Previous studies using a subset of the current sample (the 17-year-old cohort of the MTFS) and a slightly different set of facet measures reported the heritability of the general externalizing factor at .81 with no shared environmental effects (Hicks et al., 2004; Krueger et al., 2002). One reason for these differences may be the inclusion of the adoptive siblings, which enhances power to detect shared environmental effects. However, when we estimated the heritability of the externalizing factor with the same measures used in previous studies (adult antisocial behavior, conduct disorder, alcohol and drug dependence, and aggressive undercontrol), our findings were very similar to previous reports (a² = .76, c² = .06, e² = .18). This suggests that rather than the inclusion of adoptive siblings, the greater shared environmental effects are due to the content of the facet-level measures used to define the factors, in particular the dimensional measures of substance use and measures of child/adolescent antisocial behavior. Specifically, previous studies tend to find that shared environmental effects contribute to the initiation of substance use and moderate use (Kendler et al., 2008; McGue, Elkins, & Iacono, 2000; Rhee et al., 2003) and to child/adolescent antisocial behavior (Jacobson et al., 2000; Lyons et al., 1995), but few studies detect shared environmental effects for SUDs (Heath et al., 1997; Kendler et al., 1992; Prescott & Kendler, 1999) or adult antisocial behavior (Lyons et al., 1995). Using measures of substance use and child/adolescent antisocial behavior to define intermediate and higher-order factors will then also likely result in the detection of significant shared environmental effects for the factor-level constructs.
A final noteworthy aspect of our findings of shared environmental effects is that the general externalizing factor accounted for virtually all the shared environmental effects on the intermediate-level factors and facet-level measures (see Table V). This finding is consistent with previous studies using the MTFS twins that have found large shared environmental effects on a general pattern of adolescent problem behavior that includes early initiation of substance use, police contact, and precocious sexual behavior (McGue, Iacono, & Krueger, 2006). This general shared environmental effect likely reflects environmental factors associated with adolescent substance use initiation. Interestingly, conduct disorder exhibited substantial measure-specific shared environmental effects, suggesting there may be distinct shared environmental effects on child antisocial behavior and adolescent substance use.
The potential role of shared environmental effects on substance use initiation and moderate use raises the question of how best to conceptualize substance exposure in genetic studies of SUDs. This is a challenging issue, and while the ideal solution is not yet known, there are multiple approaches that are all based on sound arguments. One approach is to exclude participants without sufficient exposure to substances (e.g., smoking at least 100 cigarettes, moderate levels of regular drinking), the logic being that it is impossible to determine whether a person will exhibit genetic sensitivity or resistance to addiction without a minimal level of exposure (Bierut et al., 2007). Another approach is to use complex stage models that specify the progression from initiation to regular use to dependence, and incorporate overlapping and unique genetic and environmental effects at each stage (Agrawal et al., 2005; Heath et al., 2002). Alternatively, a dimensional approach holds that it is important and informative to measure the full range of severity, and simply conceptualizes lack of initiation and infrequent use as the low end of this severity continuum.
The first approach can be problematic when conducting a genome wide association study with existing population-representative samples such as the MCTFR that do not employ a case-control design with minimal substance exposure requirements. Specifically, excluding participants based on a lack of adequate exposure can substantially reduce sample size and statistical power. Also, an approach of excluding people based on lack of adequate exposure is more likely to detect risk alleles that are specific for a given substance. In contrast, our approach uses the general externalizing liability as the target phenotype. As the general externalizing liability is highly related to substance use initiation (McGue et al., 2001), people with low levels of exposure become informative regarding the low end of the severity continuum. Finally, stage models add complexity and provide more of an etiological model than phenotypic scores for gene association. Also, these studies have tended to find substantial overlap between initiation and progression especially for nicotine and marijuana use (Agrawal et al., 2005; Fowler et al., 2007), suggesting a modest distinction that can likely be incorporated using a composite measure. Notably, the overlap between initiation and progression of alcohol use tends to be more moderate (Fowler et al., 2007), but again a composite approach can likely incorporate these distinctions especially if separate composites are employed for initiation/regular use and more severe dependence symptoms, an approach consistent with our structural analyses.
We also found that age and sex accounted for substantial portions of variance in our externalizing measures. This is a function of two well-replicated epidemiological findings. The first is the greater prevalence of substance use-related problems and antisocial behavior in men (Compton et al., 2007; Hasin et al., 2007; Kessler et al., 1994). The other is the natural history of substance use and antisocial behavior. Specifically, substance use tends to begin in middle-to-late adolescence, increase sharply and peak in the early 20s, and then plateau and decline in the mid to late 20s (Bachman et al., 1997; Chassin et al., 2004). Though we analyzed cross-sectional data, we observed age-related changes in mean-level substance use remarkably consistent with this pattern. That is, each substance use measure exhibited rapid increases in mean levels from adolescence to adulthood (linear age effect), followed by a peak and plateau around age 30 with a modest decline thereafter (quadratic age effect). The slightly older age for the peak and plateau of substance use in our data is due to the large gap in the ages we assessed (i.e., there were almost no observations between ages 21 and 29). Behavioral disinhibition exhibited a slightly different pattern (i.e., negative linear and quadratic age effects) than the substance use measures, likely because mean levels of antisocial behavior and disinhibited personality traits tend to peak in late adolescence rather than young adulthood (Moffitt, 1993; Roberts et al., 2006).
The continued decline in mean levels of substance use and behavioral disinhibition with increasing age is somewhat surprising given that the participants are providing lifetime reports. This effect is likely due to either cohort effects or retrospective bias in reporting. That is, the greater the passage of time between a person’s period of heaviest use and the assessment, the more likely he or she is to underreport actual use. Given a period of heaviest use in early adulthood and this retrospective bias, one would expect the patterns of mean-level differences we observed.
These age-related changes are a signal of important developmental processes that underlie SUDs, and an important direction for future research is to begin to delineate how genetic risk for SUDs unfolds over the course of development. An important initial step will be to directly incorporate the effects of age and sex into the scores of phenotypic measures that exhibit notable sex- and age-graded changes, such as substance use, rather than using age and sex as covariates. This is key, because the same measure may have a different meaning depending upon the age and sex of the person. For example, ever consuming 20 alcoholic drinks on one occasion is indicative of greater severity and more predictive of alcoholism for a 17-year-old girl than for a 45-year-old man. Recently, researchers have made important advances in psychometric techniques, such as moderated nonlinear factor analysis, that directly incorporate sample heterogeneity (e.g., age and sex) into the measurement of latent constructs (Bauer & Hussong, 2009). Such techniques make it possible to directly compare the scores of all participants in a sample regardless of age and gender. Another future direction in regard to incorporating development will be to utilize the longitudinal data that cover key developmental transitions for substance use. For example, the MCTFR offspring participants are assessed regularly from age 11 (prior to substance initiation) to age 30 (when patterns of persistent versus desistent SUDs are emerging). Potential approaches include using trajectory groups or individual growth parameters such as the intercept (initial status) and slope (increase over time) as target phenotypes for gene association analyses (Dick et al., 2009).
Another direction for future research is to develop additional phenotypes of underlying risk rather than manifest risk for SUDs for use in the genome wide association studies, thereby avoiding confounds related to substance exposure or age and sex differences in phenotypic expression. The MCTFR approach will include two such measures of underlying risk: a pre-morbid liability index and psychophysiological endophenotypes. The pre-morbid liability index will include traits and behaviors present prior to the initiation of substance use that best predict later SUDs (r = .55 with the higher-order externalizing factor; Hicks et al., 2010). This pre-morbid liability index has the additional utility of providing a benchmark of individual-level risk to investigate gene-environment interplay over the course of adolescence, a key developmental period for environmental processes associated with the initiation and escalation of substance use. The psychophysiological endophenotypes will include measures of brain electrophysiology including P3-AR, its delta and theta time-frequency components, and resting EEG activity in the beta frequency band (Gilmore, Malone, & Iacono, 2010). The endophenotypes have the unique advantage of being laboratory-based measures (i.e., they do not rely on self or informant reports) that putatively tap the neurological basis of risk for SUDs. The logic of this strategy is that augmenting manifest measures of substance abuse with additional measures of underlying risk will provide a comprehensive rather than redundant strategy in the attempt to identify risk alleles from genome wide association analyses.
Finally, it is essential that environmental risk be incorporated into these designs to detect gene × environment interactions between substance abuse measures and risk alleles identified in genome wide association analysis. Pursuant to this goal, we have undertaken structural analyses of our extensive assessment of environmental risk. Similar to our analyses of the substance use and behavioral disinhibition measures, we have identified a small set of intermediate-level constructs (deviant influences, family dysfunction, family instability, protective factors) that are all correlated and well captured by a single composite measure of overall environmental risk, which is consistent with the notion of cumulative risk exposure across time and multiple domains (Sameroff et al., 1993). Additionally, using twin models of gene × environment interaction, we found that genetic risk for externalizing increases in the context of environmental adversity regardless of the domain of environmental risk (Hicks et al., 2009), indicating a nonspecific mechanism of gene-environment interplay in risk for externalizing disorders, and additional support for using a composite measure of environmental risk. This work then provides the empirical and theoretical framework for integrating the environment and specific risk alleles to delineate gene-environment interplay in the development of SUDs.
To conclude, our psychometric analyses identified 5 constructs of intermediate complexity to serve as targets in genome wide association analyses. These 5 constructs were all correlated, and our biometric analyses revealed each provides a good index of the genetic risk for the general externalizing liability to SUDs and antisocial behavior though each also exhibited specific genetic and environmental risk. It should be noted the study has important limitations. Though we have the advantage of a very large sample, it is not representative of the greater United States population, and so there remain questions about generalizability to other demographic groups. Also, for the sake of simplicity and the need for brevity, we have restricted our results to relatively straightforward analyses that most efficiently met our goals of delineating the psychometric and genetic architecture of SUD measures for genetic association analyses. Certainly, more complex analyses could be conducted that might garner additional insights into the measures and constructs. Finally, as discussed in future directions, much more work needs to be completed to incorporate environmental risk and developmental processes along with genetic risk into understanding the etiology of SUDs. We hope our findings provide another incremental step in achieving that ultimate goal.
This work was supported in part by USPHS grants U01 DA024417, R01 DA005147, R01 DA013240, R01 AA009367, R01 AA011886, and R01 MH066140. Brian M. Hicks was supported by K01 DA025868. Stephen M. Malone was supported by K01 AA015621.
¹ The analysis reported in the text differs slightly from the figure depicting mean behavioral disinhibition scores as a function of age. Specifically, the analysis discussed in the text fits a linear regression model to the full sample and detects overall negative effects for both age and age². In contrast, for the purpose of better visual presentation, the figure depicts separate linear regression models fit to offspring participants (showing a linear increase) and parent participants (showing a linear decrease).
² To ensure the biometric analyses would not be affected by any potential group differences in mean levels of externalizing behaviors, we tested for group differences between offspring from the MTFS and SIBS samples on the lower-order factors and higher-order externalizing factor. Offspring from the MTFS and SIBS exhibited comparable scores with the mean Cohen’s d = −.03 (range −.13 to .09).