Cannabis dependence is the third leading contributor to admissions to chemical dependency treatment settings (Treatment Episode Data Set, 2003), and rates of past-year cannabis dependence in the U.S. population have increased by 18% since 1991–1992 (Compton et al., 2004). Several studies report strong heritable influences on cannabis dependence [h² of 50–70%] (Agrawal and Lynskey, 2006). In an effort to identify contributors to this composite heritability, multiple linkage and association studies have been conducted. Several promising linkage regions (e.g. chromosome 3) exist. Candidate gene studies have largely focused on variants in the gene encoding the cannabinoid receptor, CNR1; results remain equivocal (Agrawal and Lynskey, 2009). To our knowledge, there are currently no published genomewide association studies (GWAS) of cannabis dependence. The goal of this investigation is to conduct association analysis between 948,142 single nucleotide polymorphisms (SNPs) and lifetime DSM-IV cannabis dependence using data on 708 cases with DSM-IV cannabis dependence and 2346 controls who report lifetime cannabis use but do not meet criteria for DSM-IV dependence.
Data for this study come from the Study of Addiction: Genes and Environment (SAGE) (Bierut et al., 2010), which was one of the 8 Phase 1 studies of the Gene Environment Association (GENEVA) consortium (Cornelis et al., 2010). The study was designed primarily to study DSM-IV alcohol dependence as well as other correlated addiction phenotypes. All phenotypic data were collected using the Semi-Structured Assessment for the Genetics of Alcoholism Interview (SSAGA) (Bucholz et al., 1994b). Reliability of SSAGA diagnoses of substance use disorders is good (kappas of 0.7 and higher) (Bucholz et al., 1994a). Case status for the analyses reported here was defined as a lifetime history of DSM-IV cannabis dependence, modified to include cannabis withdrawal (i.e. 3 or more of 7 criteria clustering within a 12-month period). Controls had used cannabis at least once in their lifetime but did not meet criteria for DSM-IV cannabis dependence (however, those meeting criteria for dependence on other psychoactive substances, including alcohol, were not excluded). Those who had never used cannabis even once in their lifetime (N=946) were omitted (however, repeating the analyses with them included as unaffected showed similar results).
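The case and control definitions above can be summarized in a short sketch. This is an illustrative Python rendering of the stated rule only, not the study's actual scoring code; the function name `classify_subject` and its two inputs are hypothetical, and the sketch assumes the count of criteria clustering within a 12-month period has already been derived from the SSAGA interview.

```python
from typing import Optional

N_CRITERIA = 7       # DSM-IV criteria plus cannabis withdrawal, as described in the text
CASE_THRESHOLD = 3   # 3 or more criteria within a 12-month period


def classify_subject(ever_used_cannabis: bool, criteria_in_12_months: int) -> Optional[str]:
    """Return 'case', 'control', or None for subjects omitted from the analysis.

    Hypothetical helper: the inputs are assumed to be pre-computed per subject.
    """
    if not ever_used_cannabis:
        # Lifetime never-users were omitted from the primary analysis.
        return None
    if not 0 <= criteria_in_12_months <= N_CRITERIA:
        raise ValueError("criteria count must be between 0 and 7")
    if criteria_in_12_months >= CASE_THRESHOLD:
        return "case"
    # Exposed but not cannabis dependent; dependence on other substances
    # does not exclude a subject from the control group.
    return "control"
```

For example, a subject who has used cannabis and endorses two criteria is a control, regardless of alcohol dependence status.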
DNA samples were genotyped on the Illumina Human 1M beadchip by the Center for Inherited Diseases Research (CIDR) at Johns Hopkins University. A total of 948,658 SNPs passed data cleaning procedures; further within-sample filtering for autosomal and X-chromosome markers yielded 948,142 markers. HapMap genotyping controls, duplicates, related subjects, and outliers were removed from the sample set. A total of 2,019 subjects reported being European American and 1,035 reported being African American. Further details are available in a related publication (Bierut et al., 2010).
Logistic regression analyses were conducted in PLINK (Purcell et al., 2007), controlling for sex, age (defined, using quartiles, as 3 dummy measures representing 34 years or younger, 35–39 years, 40–44 years, with 45 years and older as the reference group), study source (see Bierut et al., 2010 for details) and two principal components (computed in EIGENSTRAT/EIGENSOFT; Price et al., 2006) indexing continuous variation in ethnicity. The overall genomic inflation factor was 1.014 after controlling for principal components representing ethnicity, suggesting minimal stratification effects. Genotypes (917,694 autosomal SNPs and 30,448 SNPs on the X-chromosome) were coded as 0, 1 or 2 copies of the reference allele, so that each odds-ratio reflects the change in risk associated with each additional copy of the reference allele.
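Two pieces of the analysis described above lend themselves to a brief illustration: the dummy coding of age and the genomic inflation factor. The Python sketch below is a minimal illustration using only the standard library, not the PLINK or EIGENSOFT implementation. The function names are hypothetical, and the inflation factor is computed with the common median-based definition (median observed 1-df chi-square divided by its expected median under the null), which is assumed here rather than confirmed by the text.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def age_dummies(age_years):
    """Three 0/1 indicators for the age bands in the text.

    Subjects aged 45 and older are the reference group (all zeros).
    """
    return (
        int(age_years < 35),
        int(35 <= age_years < 40),
        int(40 <= age_years < 45),
    )


def chi2_from_p(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Median-based genomic inflation factor (lambda) for a set of p-values."""
    return median(chi2_from_p(p) for p in p_values) / CHI2_1DF_MEDIAN
```

Under this definition, p-values that are uniformly distributed give a value close to 1, so a reported factor of 1.014 indicates test statistics only slightly larger than expected by chance.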
DSM-IV cannabis dependent cases were significantly more likely to be male (68% vs. 43%) and were less likely to report completion of high school (18% vs 10% who did not complete). Consistent with ascertainment, cases were significantly more likely than controls to meet criteria for alcohol (94% vs 46%), nicotine (78% vs 46%) and cocaine (73% vs 21%) dependence and to report use of other illicit drugs (88% vs 47%); they also reported a younger age at initiation of cannabis use (14.5 vs 17.4 years) and more recent cannabis involvement (44% vs 19% using in the past year). Thus, cases show marked phenotypic similarity to the highly heritable Type II/B cluster of cannabis dependent subjects (Ehlers et al., 2009).
The lowest p-values were obtained for polymorphisms on chromosome 17 (rs1019238 and rs1431318, p-values < E−7; rs8065311, p-value < E−6; rs9894332 and rs10521290, p-values < E−5). The top SNP lies in the ankyrin-repeat and fibronectin type III domain containing 1 (ANKFN1) gene, which was previously identified in a genomic study of general vulnerability to substance use disorders (Johnson et al., 2009). The genomic control p-value for rs1019238 was 7.3 × 10⁻⁷, suggesting adequate control for ethnic variation. The remaining SNPs (except rs10521290) are non-genic but are in strong LD (r² of 0.75–0.88) with SNPs in ANKFN1. Fibronectin repeats are important constituents of a variety of proteins (Bloom and Calabro, 2009) and ankyrin repeats mediate protein interactions (Li et al., 2006). While there are links between ANKFN1 and PTEN (phosphatase and tensin homolog) in tumorigenesis (Syed et al., 2010), the specific role that ANKFN1 might play in the etiology of cannabis dependence is unknown.
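As an aside on the genomic control p-value mentioned above, the following Python sketch shows one common way such a correction is applied: the 1-df chi-square statistic is divided by the genomic inflation factor and converted back to a p-value. This is an assumption about the general technique, not a description of how PLINK produced the reported value, and the function name `genomic_control_p` is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def genomic_control_p(p_value, inflation_factor):
    """Rescale a two-sided p-value by a genomic inflation factor (lambda).

    Assumes a 1-df test: chi2 = z**2, corrected chi2 = chi2 / lambda.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    if inflation_factor <= 0.0:
        raise ValueError("inflation_factor must be positive")
    z = _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)
    z_corrected = z / inflation_factor ** 0.5
    # Upper-tail probability taken via cdf(-z) to limit rounding error.
    return 2.0 * _STD_NORMAL.cdf(-z_corrected)
```

With an inflation factor close to 1, as reported here, the corrected p-value differs only modestly from the uncorrected one.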
Table 1 presents odds-ratios for the full sample, as well as results stratified by ethnicity, for SNPs that yielded the most promising results. A number of genes on chromosome 12 (on both the p- and q-arms) were associated, including carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11). None of the SNPs that emerged as our top findings have known functional significance. Association results across the full autosomal genome and the X chromosome are shown in the accompanying Manhattan plot.
Table 1. Association results of top 30 SNPs from a genomewide association study of 708 DSM-IV cannabis dependence cases and 2346 controls from SAGE. (All allele frequencies and odds-ratios reflect analysis of increasing risk associated with the reference allele.)
Manhattan plot showing genomewide association results for autosomal SNPs and SNPs on the X chromosome for DSM-IV cannabis dependence in SAGE.
While our primary analyses accounted for population stratification, we also show, in Table 1, findings within each ethnic group. A number of the SNPs showed ethnic variations in allele frequency. In a majority of instances, odds-ratios were comparable across the European- and African-American samples. However, 5 SNPs were found to have exceedingly low minor allele frequencies (MAFs) in the European-American sample, suggesting that their overall significance in the full sample may be attributable to their effects in the smaller African-American sample.
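To make the allele-frequency comparison concrete, a minor allele frequency can be computed from genotype counts within each ethnic group. The Python sketch below is illustrative only; the function name is hypothetical and it assumes a biallelic autosomal SNP with complete genotype counts.

```python
def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Minor allele frequency from genotype counts at a biallelic autosomal SNP.

    Each subject carries two alleles: homozygotes contribute two copies of
    one allele, heterozygotes contribute one copy of each.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped subjects")
    ref_freq = (2 * n_hom_ref + n_het) / total_alleles
    # The minor allele is whichever of the two alleles is less common.
    return min(ref_freq, 1.0 - ref_freq)
```

Applying such a function separately to each group shows why a SNP that is nearly monomorphic in one sample contributes little information to the association test in that sample.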
A limitation of this study is that SAGE was ascertained for alcohol dependence. This led to a high level of comorbidity in the cannabis dependent cases and exposed controls. However, the use of controls with other forms of substance dependence, but not cannabis dependence, protected against signals that may have been less specific. In addition, a series of analyses was conducted that supported the specificity of these findings. First, SNPs in ANKFN1 were not significant when case status was defined as alcohol (or nicotine or cocaine) dependence without comorbid cannabis dependence. Second, we created a polydrug dependence diagnosis, and results were not significant when cannabis dependent individuals were excluded from those with other polydrug dependence. While a sample ascertained for cannabis dependence would be ideal, there are challenges associated with the recruitment of such a sample (Agrawal and Lynskey, 2009), and to our knowledge, no such samples currently exist. Further, while cannabis dependence is highly heritable, none of our association signals reaches genomewide significance, nor does the composite of the top 30 signals explain a considerable proportion of the heritable variance in cannabis dependence. The Benjamini-Hochberg False Discovery Rate (Benjamini and Hochberg, 1995) for the top signal was 0.325. Power computations reveal that at MAFs ranging from 15–40%, association signals with odds-ratios exceeding 1.45, such as our top SNP, would be detected. Thus, both replication and meta-analysis are required to confirm or refute these findings.
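For readers unfamiliar with the Benjamini-Hochberg procedure cited above, the following Python sketch shows the standard step-up adjustment. It is a generic illustration, not the code used in this study; the function name and the example p-values in the usage note are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    For the p-value of rank k among m tests the raw adjustment is p * m / k;
    a running minimum from the largest rank downward keeps the adjusted
    values monotone in the rank.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For instance, `benjamini_hochberg([0.01, 0.04, 0.03, 0.005])` gives adjusted values of approximately 0.02, 0.04, 0.04 and 0.02. With close to a million tests, even a very small p-value for the top-ranked SNP can correspond to a substantial false discovery rate.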
Characterizing the genetic underpinnings of cannabis dependence remains elusive – to this end, results from GWAS might provide clues into alternative biological pathways that shape the etiology of cannabis dependence.