HHS Public Access author manuscript; accepted for publication in a peer-reviewed journal.
Addict Biol. Author manuscript; available in PMC 2012 July 1.
Published in final edited form as:
PMCID: PMC3117436

A Genomewide Association Study of DSM-IV Cannabis Dependence


Despite twin studies showing that 50–70% of variation in DSM-IV cannabis dependence is attributable to heritable influences, little is known of specific genotypes that influence vulnerability to cannabis dependence. We conducted a genomewide association study of DSM-IV cannabis dependence. Association analyses of 708 DSM-IV cannabis dependent cases with 2,346 cannabis exposed nondependent controls were conducted using logistic regression in PLINK. None of the 948,142 SNPs met genomewide significance (p < E−8). The lowest p-values were obtained for polymorphisms on chromosome 17 (rs1019238 and rs1431318, p-values at E−7) in the ANKFN1 gene. While replication is required, this study represents an important first step towards clarifying the biological underpinnings of cannabis dependence.

Cannabis dependence is the third leading contributor to admissions to chemical dependency treatment settings (Treatment Episode Data Set, 2003) and rates of past year cannabis dependence in the U.S. population have increased by 18% since 1991–1992 (Compton et al., 2004). Several studies report strong heritable influences on cannabis dependence [h² of 50–70%] (Agrawal and Lynskey, 2006). In an effort to identify contributors to this composite heritability, multiple linkage and association studies have been conducted. Several promising linkage regions (e.g. chromosome 3) exist. Candidate gene studies have largely focused on variants in the gene encoding the cannabinoid receptor, CNR1; results remain equivocal (Agrawal and Lynskey, 2009). To our knowledge, there are currently no published genomewide association studies (GWAS) of cannabis dependence. The goal of this investigation is to conduct association analysis between 948,142 single nucleotide polymorphisms (SNPs) and lifetime DSM-IV cannabis dependence using data on 708 cases with DSM-IV cannabis dependence and 2,346 controls who report lifetime cannabis use but do not meet criteria for DSM-IV dependence.

Data for this study come from the Study of Addiction: Genetics and Environment (SAGE) (Bierut et al., 2010), which was one of the 8 Phase 1 studies of the Gene Environment Association Studies (GENEVA) consortium (Cornelis et al., 2010). The study was designed primarily to study DSM-IV alcohol dependence as well as other correlated addiction phenotypes. All phenotypic data were collected using the Semi-Structured Assessment for the Genetics of Alcoholism Interview (SSAGA) (Bucholz et al., 1994b). Reliability of SSAGA diagnoses of substance use disorders is good (kappas of 0.7 and higher) (Bucholz et al., 1994a). Case status for the analyses reported here was defined as a lifetime history of DSM-IV cannabis dependence, modified to include cannabis withdrawal (i.e. 3 or more of 7 criteria clustering within a 12-month period). Controls had used cannabis at least once in their lifetime but did not meet criteria for DSM-IV cannabis dependence (however, those meeting criteria for dependence on other psychoactive substances, including alcohol, were not excluded). Those who had never used cannabis even once in their lifetime (N=946) were omitted, although repeating the analyses with them included as unaffected yielded similar results.

DNA samples were genotyped on the Illumina Human 1M beadchip by the Center for Inherited Disease Research (CIDR) at Johns Hopkins University. A total of 948,658 SNPs passed data cleaning procedures; further within-sample filtering for autosomal and X-chromosome markers yielded 948,142 markers. HapMap genotyping controls, duplicates, related subjects, and outliers were removed from the sample set. A total of 2,019 subjects reported being European American and 1,035 reported being African American. Further details are available in a related publication (Bierut et al., 2010).

Logistic regression analyses were conducted in PLINK (Purcell et al., 2007), controlling for sex, age (defined, using quartiles, as 3 dummy measures representing 34 years or younger, 35–39 years, 40–44 years, with 45 years and older as the reference group), study source (see Bierut et al., 2010 for details) and two principal components (computed in EIGENSTRAT/EIGENSOFT; Price et al., 2006) indexing continuous variation in ethnicity. The overall genomic inflation factor was 1.014 after controlling for principal components representing ethnicity, suggesting minimal stratification effects. Genotypes (917,694 autosomal SNPs and 30,448 SNPs on the X-chromosome) were coded as 0, 1 or 2 copies of the reference allele so that risk associated with genotype was modified with each additional copy of the reference allele.
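The additive coding described above can be illustrated with a minimal sketch. It is not the PLINK implementation and does not use SAGE data: it simulates genotypes coded as 0, 1 or 2 copies of a reference allele plus a single sex covariate, fits a logistic regression by Newton-Raphson in plain Python, and reports the per-allele odds ratio. The sample size, allele frequency, intercept, sex effect and the simulated odds ratio of 1.45 are all assumptions chosen for illustration, and the age, study source and principal component covariates are omitted for brevity.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(X, y, iterations=15):
    """Fit a logistic regression by Newton-Raphson; each row of X starts with an intercept term."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * row[i]
                for j in range(k):
                    hess[i][j] += w * row[i] * row[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulated data only; every parameter below is an illustrative assumption.
rng = random.Random(0)
TRUE_LOG_OR = math.log(1.45)  # assumed per-allele effect on the log-odds scale
X, y = [], []
for _ in range(5000):
    genotype = (rng.random() < 0.3) + (rng.random() < 0.3)  # 0, 1 or 2 copies
    male = 1 if rng.random() < 0.5 else 0
    eta = -1.0 + TRUE_LOG_OR * genotype + 0.8 * male
    case = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
    X.append([1.0, float(genotype), float(male)])
    y.append(case)

beta = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])  # multiplicative change in odds per extra allele copy
print(f"estimated per-allele odds ratio: {odds_ratio:.2f}")
```

Because the genotype enters as a single numeric term, the fitted coefficient describes the change in log-odds for each additional copy of the reference allele, and exponentiating it gives an odds ratio of the kind reported in Table 1.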

DSM-IV cannabis dependent cases were significantly more likely to be male (68% vs. 43%) and were less likely to report completion of high school (18% vs 10% who did not complete). Consistent with ascertainment, cases were significantly more likely than controls to meet criteria for alcohol (94% vs 46%), nicotine (78% vs 46%) and cocaine (73% vs 21%) dependence, to report greater use of other illicit drugs (88% vs 47%), and to report a younger age at initiation of cannabis use (14.5 vs 17.4 years) and more recent (44% vs 19% using in the past year) cannabis involvement. Thus, cases show marked phenotypic similarity to the highly heritable Type II/B cluster of cannabis dependent subjects (Ehlers et al., 2009).

The lowest p-values were obtained for polymorphisms on chromosome 17 (rs1019238 and rs1431318, p-values < E−7; rs8065311, p-value < E−6; rs9894332 and rs10521290, p-values < E−5). The top SNP lies in the ankyrin-repeat and fibronectin type III domain containing 1 (ANKFN1) gene, which was previously identified in a genomic study of general vulnerability to substance use disorders (Johnson et al., 2009). The genomic control p-value for rs1019238 was 7.3 × 10⁻⁷, suggesting adequate control for ethnic variation. The remaining SNPs (except rs10521290) are non-genic but are in strong LD (r² of 0.75–0.88) with SNPs in ANKFN1. Fibronectin repeats are important constituents of a variety of proteins (Bloom and Calabro, 2009) and ankyrin repeats mediate protein interactions (Li et al., 2006). While there are links between ANKFN1 and PTEN (phosphatase and tensin homolog) in tumorigenesis (Syed et al., 2010), the specific role that ANKFN1 might play in the etiology of cannabis dependence is unknown.
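For readers unfamiliar with genomic control, the following sketch shows one common way to compute a genomic inflation factor and genomic control p-values from 1-df chi-square association statistics. It assumes the standard median-based definition (observed median divided by the median of a 1-df chi-square distribution); whether the values reported here were derived in exactly this way is not stated in the text. The test statistics are simulated under the null and all function names are hypothetical.

```python
import math
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(stat / 2.0))


def inflation_factor(stats):
    """Median-based genomic inflation factor (lambda)."""
    return statistics.median(stats) / CHI2_1DF_MEDIAN


def genomic_control_pvalues(stats):
    """Divide each statistic by lambda (never deflating below 1) and recompute p-values."""
    lam = max(1.0, inflation_factor(stats))
    return [chi2_1df_pvalue(s / lam) for s in stats]


# Simulated null statistics: squares of standard normal draws (illustration only).
rng = random.Random(1)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20000)]
lam = inflation_factor(null_stats)
print(f"lambda under the simulated null: {lam:.3f}")

# Inflating every statistic by 20% scales lambda by the same factor,
# and genomic control then undoes that inflation.
inflated_stats = [1.2 * s for s in null_stats]
corrected = genomic_control_pvalues(inflated_stats)
```

A lambda close to 1, such as the 1.014 reported above, means that dividing by lambda changes the test statistics very little, so genomic control p-values remain close to the unadjusted ones.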

Table 1 presents odds-ratios for the full sample, as well as results stratified by ethnicity, for SNPs that yielded the most promising results. A number of genes on chromosome 12 (including the p- and q-arm) were associated, including carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11). None of the SNPs that emerged as our top findings have known functional significance. Association results across the full autosomal genome and the X chromosome are shown in Figure 1.

Figure 1
Manhattan plot showing genomewide association results for autosomal SNPs and SNPs on the X chromosome for DSM-IV cannabis dependence in SAGE.
Table 1
Association results of top 30 SNPs from a genomewide association study of 708 DSM-IV cannabis dependence cases and 2346 controls from SAGE. (All allele frequencies and odds-ratios reflect analysis of increasing risk associated with the reference allele). ...

While our primary analyses accounted for population stratification, we also show, in Table 1, findings within each ethnic group. A number of the SNPs showed ethnic variations in allele frequency. In a majority of instances, odds-ratios were comparable across the European- and African-American samples. However, 5 SNPs were found to have exceedingly low MAFs in the European-American sample suggesting that their overall significance in the full sample may be attributable to their effects in the smaller African-American sample.

A limitation of this study is that SAGE was ascertained for alcohol dependence. This led to a high level of comorbidity in the cannabis dependent cases and exposed controls. However, the use of controls with other forms of substance dependence, but not cannabis dependence, protected against signals that may have been less specific. In addition, a series of analyses was conducted that supported the specificity of these findings. First, SNPs in ANKFN1 were not significant when case status was alcohol (also nicotine and cocaine) dependence without comorbid cannabis dependence. Second, we created a polydrug dependence diagnosis, and results were not significant when cannabis dependent individuals were excluded from those with other polydrug dependence. While a sample ascertained for cannabis dependence would be ideal, there are challenges associated with the recruitment of such a sample (Agrawal and Lynskey, 2009), and to our knowledge, no such samples currently exist. Further, while cannabis dependence is highly heritable, none of our association signals reach genomewide significance, nor does the composite of the top 30 signals explain a considerable proportion of this heritable variance in cannabis dependence. The Benjamini-Hochberg False Discovery Rate (Benjamini and Hochberg, 1995) for the top signal was 0.325. Power computations reveal that at MAFs ranging from 15–40%, association signals with odds-ratios exceeding 1.45, such as our top SNP, would be detected. Thus, both replication and meta-analysis are required to confirm or refute these findings.
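As a brief illustration of the Benjamini-Hochberg procedure cited above, the sketch below computes adjusted p-values for a short list of raw p-values. The function name and the example p-values are hypothetical and are not taken from the study; the sketch implements the standard step-up adjustment, in which each sorted p-value is multiplied by the number of tests divided by its rank, and monotonicity is enforced from the largest p-value downward.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked immediately after it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four tests (illustration only).
example_pvalues = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_pvalues)
print(example_adjusted)
```

For the smallest p-value among m tests, the adjusted value can be no larger than m times the raw p-value, which is why even a p-value near 10⁻⁷ can correspond to a sizeable false discovery rate when close to a million SNPs are tested.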

Characterizing the genetic underpinnings of cannabis dependence remains elusive – to this end, results from GWAS might provide clues into alternative biological pathways that shape the etiology of cannabis dependence.


Funding support for the Study of Addiction: Genetics and Environment (SAGE) was provided through the NIH Genes, Environment and Health Initiative [GEI] (U01 HG004422). SAGE is one of the genome-wide association studies funded as part of the Gene Environment Association Studies (GENEVA) under GEI. Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by the GENEVA Coordinating Center (U01 HG004446). Assistance with data cleaning was provided by the National Center for Biotechnology Information. Support for collection of datasets and samples was provided by the Collaborative Study on the Genetics of Alcoholism (COGA; U10 AA008401), the Collaborative Genetic Study of Nicotine Dependence (COGEND; P01 CA089392), and the Family Study of Cocaine Dependence (FSCD; R01 DA013423). Funding support for genotyping, which was performed at the Johns Hopkins University Center for Inherited Disease Research, was provided by the NIH GEI (U01HG004438), the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, and the NIH contract "High throughput genotyping for studying the genetic contributions to human disease" (HHSN268200782096C). Other support includes DA23668 (A.A.), DA18860 & DA18267 (M.T.L.).


Financial Disclosures

Drs. LJ Bierut, J. Rice, A. Goate and S. Saccone are listed as inventors on the patent "Markers for Addiction" (US 20070258898), covering the use of certain SNPs in determining the diagnosis, prognosis, and treatment of addiction. Dr. Bierut acted as a consultant for Pfizer, Inc. in 2008. All other authors report no competing interests.

Reference List

  • Agrawal A, Lynskey M. The Genetic Epidemiology of Cannabis Use, Abuse and Dependence: A Review. Addiction. 2006;101:801–812. [PubMed]
  • Agrawal A, Lynskey MT. Candidate genes for cannabis use disorders: findings, challenges and directions. Addiction. 2009;104:518–532. [PMC free article] [PubMed]
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. 1995;57:289–300.
  • Bierut L, Agrawal A, Bucholz K, Doheny KF, Laurie CC, Pugh EW, Fisher S, Fox L, Howells B, Bertelsen S, Hinrichs A, Almasy L, Breslau N, Culverhouse R, Dick D, Edenberg H, Foroud T, Grucza RA, Hatsukami D, Hesselbrock V, Johnson E, Kramer J, Krueger R, Kuperman S, Lynskey MT, Mann K, Neuman R, Nothen M, Nurnberger J, Jr, Porjesz B, Ridinger M, Saccone NL, Schuckit M, Tischfield J, Wang JC, Rietschel M, Goate A, Rice JP. A Genome-wide Association Study of Alcohol Dependence. Proc Natl Acad Sci U S A. 2010 In Press. [PubMed]
  • Bloom L, Calabro V. FN3: a new protein scaffold reaches the clinic. Drug Discov Today. 2009;14:949–955. [PubMed]
  • Bucholz KK, Cadoret R, Cloninger CR, Dinwiddie SH, Hesselbrock VM, Nurnberger JI, Jr, Reich T, Schmidt I, Schuckit MA. A new, semi-structured psychiatric interview for use in genetic linkage studies: a report on the reliability of the SSAGA. J Stud Alcohol. 1994a;55:149–158. [PubMed]
  • Bucholz KK, Cadoret RJ, Cloninger RC, Dinwiddie SH, Hesselbrock V, Nurnberger JI, Reich T, Schmidt I, Schuckit MA. A New, Semi-Structured Psychiatric Interview For Use In Genetic Linkage Studies. J Stud Alcohol. 1994b;55:149–158. [PubMed]
  • Compton WM, Grant BF, Colliver JD, Glantz MD, Stinson FS. Prevalence of marijuana use disorders in the United States: 1991–1992 and 2001–2002. JAMA. 2004;291:2114–2121. [PubMed]
  • Cornelis MC, Agrawal A, Cole JW, Hansel NN, Barnes KC, Beaty TH, Bennett SN, Bierut LJ, Boerwinkle E, Doheny KF, Feenstra B, Feingold E, Fornage M, Haiman CA, Harris EL, Hayes MG, Heit JA, Hu FB, Kang JH, Laurie CC, Ling H, Manolio TA, Marazita ML, Mathias RA, Mirel DB, Paschall J, Pasquale LR, Pugh EW, Rice JP, Udren J, van Dam RM, Wang X, Wiggs JL, Williams K, Yu K. The gene, environment association studies consortium (GENEVA): maximizing the knowledge obtained from GWAS by collaboration across studies of multiple conditions. Genet Epidemiol. 2010 [PMC free article] [PubMed]
  • Ehlers CL, Gilder DA, Gizer IR, Wilhelmsen KC. Heritability and a genome-wide linkage analysis of a Type II/B cluster construct for cannabis dependence in an American Indian community. Addict Biol. 2009;14:338–348. [PMC free article] [PubMed]
  • Johnson C, Drgon T, McMahon FJ, Uhl GR. Convergent genome wide association results for bipolar disorder and substance dependence. Am J Med Genet B Neuropsychiatr Genet. 2009;150B:182–190. [PubMed]
  • Li J, Mahajan A, Tsai MD. Ankyrin repeat: a unique motif mediating protein-protein interactions. Biochemistry. 2006;45:15168–15178. [PubMed]
  • Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–909. [PubMed]
  • Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, Sham PC. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–575. [PubMed]
  • Syed AS, D'Antonio M, Ciccarelli FD. Network of Cancer Genes: a web resource to analyze duplicability, orthology and network properties of cancer genes. Nucleic Acids Res. 2010;38:D670–D675. [PMC free article] [PubMed]
  • Treatment Episode Data Set. Substance Abuse Treatment Admissions by Primary Substance of Abuse, According to Sex, Age Group, Race, and Ethnicity, funded by the Substance Abuse and Mental Health Services Administration, DHHS. 2003. The latest data are available at 800-729-6686 or online at