Alcohol dependence is a complex disease, and although linkage and candidate gene studies have identified several genes associated with the risk for alcoholism, these explain only a portion of the risk.
We carried out a genome-wide association study (GWAS) on a case-control sample drawn from the families in the Collaborative Study on the Genetics of Alcoholism. The cases all met diagnostic criteria for alcohol dependence according to the Diagnostic and Statistical Manual of the American Psychiatric Association Fourth Edition (DSM-IV); controls all consumed alcohol but were not dependent on alcohol or illicit drugs. To prioritize among the strongest candidates, we genotyped most of the top 199 SNPs (p ≤ 2.1 × 10−4) in a sample of alcohol dependent families and performed pedigree-based association analysis. We also examined whether the genes harboring the top SNPs were expressed in human brain or were differentially expressed in the presence of ethanol in lymphoblastoid cells.
Although no single SNP met genome-wide criteria for significance, there were several clusters of SNPs that provided mutual support. Combining evidence from the case-control study, the follow-up in families, and gene expression provided the strongest support for the association of a cluster of genes on chromosome 11 (SLC22A18, PHLDA2, NAP1L4, SNORA54, CARS, and OSBPL5) with alcohol dependence. Several SNPs nominated as candidates in earlier GWAS studies replicated in ours, including SNPs in or near CPE, DNASE2B, SLC10A2, ARL6IP5, ID4, GATA4, SYNE1 and ADCY3.
We have identified several promising associations that warrant further examination in independent samples.
Alcohol dependence (alcoholism) is a major health and social issue affecting 4-5% of the United States population at any given time (Li et al., 2007), with a lifetime prevalence of 12.5% (Hasin et al., 2007). Alcohol dependence is characterized by serious problems in multiple domains. Family, twin and adoption studies have consistently demonstrated a substantial genetic contribution to disease etiology (Kendler et al., 1994; McGue, 1999; Nurnberger et al., 2004; Pickens et al., 1991). Alcohol dependence is a complex disease in which both genetic and environmental factors affect susceptibility.
Strategies for identifying genes in which variations contribute to the risk for alcohol dependence have employed linkage analysis or candidate gene approaches (Edenberg and Foroud, 2006). These methods have led to the identification of several genes associated with alcohol dependence, including GABRA2 (Covault et al., 2004; Drgon et al., 2006; Edenberg et al., 2004; Enoch et al., 2008; Enoch et al., 2006; Fehr et al., 2006; Lappalainen et al., 2005; Lind et al., 2008; Soyka et al., 2008), ADH4 (Edenberg and Foroud, 2006; Edenberg et al., 2006; Guindalini et al., 2005; Kuo et al., 2008; Luo et al., 2006; Luo et al., 2005b), GABRG3 (Dick et al., 2004), CHRM2 (Luo et al., 2005a; Wang et al., 2004), NFKB1 (Edenberg et al., 2008b), OPRK1 (Edenberg et al., 2008a; Xuei et al., 2006), PDYN (Xuei et al., 2006), NPY2R (Wetherill et al., 2008), ANKK1/DRD2 (Dick et al., 2007b); CHRNA5 (Wang et al., 2009); GRM8 (Chen et al., 2009), TACR3 (Foroud et al., 2008) and GABRR2 (Xuei et al., 2010). However, the effect of variation in each of these genes is small, and much of the genetic contribution to the risk for alcoholism remains to be discovered.
We completed a genome-wide association study (GWAS) to identify genes contributing to alcohol dependence, defined according to DSM-IV criteria (American Psychiatric Association, 1994). Both the alcohol dependent cases and the controls completed a rigorous clinical evaluation, with controls being drinkers who are free from alcohol dependence or abuse by several diagnostic classification systems (DSM-IIIR, DSM-IV, ICD-10) (American Psychiatric Association, 1987; American Psychiatric Association, 1994; World Health Organization, 1993), and also free of dependence on illicit drugs. To prioritize among the most promising single nucleotide polymorphisms (SNPs) identified from the GWAS, we genotyped the top SNPs in a sample of alcohol dependent families and performed family-based association analysis. We also tested whether the SNPs fell within or near genes that were expressed in brain or whose expression is affected by alcohol exposure.
Alcohol dependent probands were ascertained through alcohol treatment programs and evaluated at seven centers in the United States: Indiana University, State University of New York Health Science Center Brooklyn, University of Connecticut, University of Iowa, University of California/San Diego, Washington University/St. Louis and Howard University as part of the Collaborative Study on the Genetics of Alcoholism (Begleiter et al., 1995; Edenberg, 2002; Foroud et al., 2000; Reich et al., 1998). The institutional review boards of all participating institutions approved the study. After providing informed consent, the probands and their relatives were administered a validated poly-diagnostic instrument, the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) (Bucholz et al., 1994; Hesselbrock et al., 1999), which allows assessment of alcohol dependence under several diagnostic systems. Details of the ascertainment and assessment have previously been published (Begleiter et al., 1995; Foroud et al., 2000; Reich et al., 1998) and are available at zork.wustl.edu/niaaa/coga_instruments/resources.html.
The same seven centers also recruited community probands through driver's license records, random mailings to employees and students at a university, and attendees at medical and dental clinics. After providing informed consent, the community probands and their relatives were administered the same SSAGA interview as the alcohol dependent probands and their families.
For the GWAS, a sample of genetically unrelated cases and controls was selected from this pool of alcohol dependent and community ascertained families. Cases were all alcohol dependent by DSM-IV criteria at some time during their lives, and were selected from the families ascertained through alcohol dependent probands. The selected controls were required to have consumed alcohol but not to have a diagnosis of alcohol abuse, dependence or harmful use by any of the 4 diagnostic systems assessed in the SSAGA (Feighner, DSM-IIIR, DSM-IV, ICD-10) at any time in their lives. Because there might be shared genetic vulnerability with other substance use disorders, the selected controls also could not meet criteria for DSM-IIIR or DSM-IV diagnoses of abuse or dependence on cocaine, marijuana, opioids, sedatives, or stimulants. Controls were selected from both the community recruited families and the families recruited through an alcohol dependent proband, but could not share a known common ancestor with a case. Controls older than 25 years were preferentially selected, to increase the probability that they were beyond the peak age of risk for developing alcohol dependence. A subset of the COGA participants were assessed on multiple occasions; a subject who was assessed more than once was selected only if the diagnosis, or lack thereof, was consistent across time.
Genotyping was performed by the Center for Inherited Disease Research (CIDR). DNA sources included blood (n=1453) and lymphoblastoid cell lines (LCL, n=492). Genotyping was performed using the Illumina Infinium II assay protocol (Gunderson et al., 2006) with hybridization to Illumina HumanHap1M BeadChips (Illumina, San Diego, CA, USA), which contain 1,199,187 markers with a mean spacing of 2.4 kb. Allele cluster definitions for each SNP were determined using Illumina BeadStudio Genotyping Module version 3.1.14 and the combined intensity data from 96% of study samples. The resulting cluster definition file was used on all study samples to determine genotype calls and quality scores. Genotype calls were made when a genotype yielded a quality score (Gencall value) of 0.15 or higher. The final raw dataset released by CIDR to the investigators and to dbGaP contained 1,041,465 SNPs for 1,945 unique DNA samples from case and control subjects. Twenty seven samples were removed due to poor sample quality before release of the dataset on dbGaP (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap; Accession number: phs000125.v1.p1). Blind duplicate reproducibility was 99.97%.
Starting with the dbGaP data, additional quality control measures were applied to both the samples and the SNPs. Samples having genotypes for at least 98% of the SNPs were considered for inclusion in analyses. These samples were rigorously checked for cryptic relatedness, population stratification, and related issues. Thirteen additional samples were removed from further analyses due to poor sample quality (n=4) or cryptic relatedness among subjects (n=9). A principal component-based analysis was performed in PLINK (Purcell et al., 2007) to cluster these samples along with HapMap reference samples (CEU, YRI, CHB, and JPT) to assign the study subjects to groups of predominantly European and African ancestry. The final European American (EA) sample included 847 alcohol dependent cases and 552 controls (n=1,399 individuals). The African American (AA) sample contained 345 cases and 140 controls (n=485 individuals). The remaining 21 individuals did not cluster with either of the two samples and were not analyzed.
SNPs with a call rate of 98% or greater in the EAs (n=1,015,550) were included and subjected to further quality control. From these, SNPs were removed if the minor allele frequency was less than 0.01 in the combined case and control dataset (n=161,048) or if among those with minor allele frequency ≥ 0.01 there was significant deviation from Hardy Weinberg equilibrium (p<10−4) in the controls (n=1,127). The final dataset for analysis of EAs consisted of 853,375 SNPs that passed all quality control measures. Similar quality control thresholds were applied to the 1,010,230 SNPs with a call rate of 98% or greater in the AAs. This resulted in removal of 68,600 SNPs with a minor allele frequency below 0.01 in the combined AA case and control sample, and 332 SNPs that demonstrated significant deviation from Hardy Weinberg equilibrium in AA controls. The final dataset for the AA sample consisted of 941,298 SNPs. Positions of all SNPs are from dbSNP build 130, genome build 36.3.
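The SNP filters described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names (`hwe_chi2_pvalue`, `passes_qc`) are hypothetical, and the 1-df chi-square test is a simple stand-in for whatever Hardy Weinberg test the analysis software applied (which may be an exact test). The thresholds are those stated in the text, and the closing assertions only check that the reported SNP counts are arithmetically consistent.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for deviation from Hardy Weinberg equilibrium.

    Assumption: a chi-square test is used here for illustration only.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(call_rate, maf, control_genotypes,
              min_call_rate=0.98, min_maf=0.01, hwe_alpha=1e-4):
    """Apply the three filters in the order given in the text."""
    if call_rate < min_call_rate:
        return False
    if maf < min_maf:                         # MAF in combined cases and controls
        return False
    # HWE is evaluated in controls only; SNPs with p < 1e-4 are removed
    return hwe_chi2_pvalue(*control_genotypes) >= hwe_alpha

# Consistency check of the SNP counts reported in the text
assert 1_015_550 - 161_048 - 1_127 == 853_375   # European American sample
assert 1_010_230 - 68_600 - 332 == 941_298      # African American sample
```

A SNP with genotype counts in perfect Hardy Weinberg proportions among the controls, such as `passes_qc(0.99, 0.25, (138, 276, 138))`, passes all three filters in this sketch.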
Given the limited number of non-EA subjects in the sample, and that self-defined ethnicity carries non-genetic as well as genetic differences, we focused our primary analyses on the larger EA subsample. Logistic regression covarying for gender was employed to test for the association of each SNP with alcohol dependence, using an additive model. Odds ratios and p-values were computed to assess the strength of the association. All analyses were performed using PLINK (Purcell et al., 2007).
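To make the association model concrete, the following is a minimal sketch of a single-SNP logistic regression with an additive genotype term and a sex covariate. It is not the PLINK implementation: the function names (`solve`, `snp_association`) are hypothetical, the fit uses a plain Newton-Raphson iteration, and the p-value is a Wald test, all chosen here for illustration. Genotypes are assumed to be coded as the number of minor alleles (0, 1 or 2), so that the exponentiated coefficient is the per-allele odds ratio.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def _score_and_information(rows, y, beta):
    """Gradient of the log-likelihood and the Fisher information at beta."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi in zip(rows, y):
        eta = sum(b * xi for b, xi in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for i in range(k):
            grad[i] += (yi - mu) * x[i]
            for j in range(k):
                info[i][j] += mu * (1.0 - mu) * x[i] * x[j]
    return grad, info

def snp_association(genotypes, sexes, status, iterations=25):
    """Return (per-allele odds ratio, Wald p-value) for the genotype term."""
    rows = [[1.0, float(g), float(s)] for g, s in zip(genotypes, sexes)]
    beta = [0.0, 0.0, 0.0]                    # intercept, genotype, sex
    for _ in range(iterations):
        grad, info = _score_and_information(rows, status, beta)
        beta = [b + d for b, d in zip(beta, solve(info, grad))]
    _, info = _score_and_information(rows, status, beta)
    variance = solve(info, [0.0, 1.0, 0.0])[1]   # variance of the genotype coefficient
    z = beta[1] / math.sqrt(variance)
    return math.exp(beta[1]), math.erfc(abs(z) / math.sqrt(2.0))

# Made-up toy data: (minor-allele count, sex, case status, number of subjects)
cells = [(0, 0, 0, 40), (0, 0, 1, 20), (1, 0, 0, 30), (1, 0, 1, 30),
         (2, 0, 0, 5), (2, 0, 1, 10), (0, 1, 0, 35), (0, 1, 1, 25),
         (1, 1, 0, 25), (1, 1, 1, 35), (2, 1, 0, 4), (2, 1, 1, 12)]
geno = [g for g, s, y, n in cells for _ in range(n)]
sex = [s for g, s, y, n in cells for _ in range(n)]
case = [y for g, s, y, n in cells for _ in range(n)]
odds_ratio, p_value = snp_association(geno, sex, case)
print(f"per-allele OR = {odds_ratio:.2f}, p = {p_value:.3g}")
```

In a genome-wide scan this fit would be repeated for each SNP that passed quality control; the sketch omits handling of missing genotypes and of perfectly separated data.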
Previously, stronger association results were often obtained when the alcohol-dependent cases were limited to those with greater correlates of disease severity, such as earlier age of onset, comorbid drug dependence or higher likelihood of conduct disorder symptoms (Agrawal et al., 2006; Dick et al., 2007a; Edenberg et al., 2008b; Foroud et al., 2008). We performed secondary analyses to identify potential associations with early onset alcohol dependence. A median split of our sample (Table 1) was performed and those alcohol dependent cases with an onset of 22 years or younger (n=454) were classified as having earlier onset of alcohol dependence. Logistic regression using the additive model with sex as a covariate was employed to test for association between the genotyped SNPs and the phenotype of early onset alcohol dependence.
Previously, a set of 262 genetically informative, multiplex alcohol dependence pedigrees, each with at least three first degree relatives who met lifetime criteria for both DSM-IIIR alcohol dependence (American Psychiatric Association, 1987) and Feighner definite alcoholism (Feighner et al., 1972), were selected from the larger COGA sample and used in linkage and family based association studies (Edenberg et al., 2004; Edenberg and Foroud, 2006; Foroud et al., 2000; Reich et al., 1998); 59 additional trios were also genotyped. After reviewing the results from the primary analysis of DSM-IV alcohol dependence, the 199 SNPs with the smallest p values for association (p ≤ 2.1 × 10−4) were selected for genotyping in this family-based sample. Genotypes were completed on 180 of these SNPs; the others did not design into assays or did not provide useable data.
Some members of these families, primarily probands, were included in the GWAS case control sample. There were 426 individuals included in the EA GWAS sample (326 cases; 100 controls) who were also members of the families used for replication. The analytic methods applied to detect association in the case control and family-based tests of association are statistically independent (Abecasis et al., 2000; Vogel et al., 2009; Zuo et al., 2009).
Genotyping assays were designed for the Sequenom MassArray system (Sequenom, San Diego, CA) using MassArray Assay Design Software. Genotyping used iPLEX Gold assays (Sequenom, San Diego, CA), with alleles discriminated by mass spectrometry. Assays were tested on two independent groups, 40 unrelated EA individuals and 40 unrelated African-American (AA) individuals obtained from Coriell Cell Repositories. SNPs that were not in Hardy Weinberg equilibrium in both groups were not genotyped in the COGA families. All SNPs were tested for Mendelian inheritance using the program PEDCHECK (O'Connell and Weeks, 1998).
Family based association analyses were performed using the Pedigree Disequilibrium test (PDT) (Martin et al., 2000) as implemented in the program UNPHASED (version 2.404) (Dudbridge, 2003). Affected individuals were defined as those meeting criteria for DSM-IV alcohol dependence. The PDT utilizes data from all available trios in a family, as well as discordant sibships. We report results using the PDTaverage option, which weights each family equally in computing the overall test statistic. Analysis was also performed classifying as affected only those DSM-IV alcohol dependence cases who also met criteria for early onset, defined, as in the case control sample, as onset at age 22 or younger. After examining results in the family sample, five additional SNPs (cSNPs or SNPs in 3′ UTRs) from regions with evidence for replication were selected for genotyping in the family sample.
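The way the averaged form of the PDT weights each family equally can be illustrated with a short sketch. This follows the general form of the statistic as commonly described (a per-family average of trio and discordant sib pair allele-count differences, summed over families and divided by the square root of the sum of squares); it is not the UNPHASED implementation, and the function name `pdt_average` and the input layout are hypothetical. Each trio contributes the number of copies of the test allele transmitted minus the number not transmitted, and each discordant sib pair contributes the count in the affected sib minus the count in the unaffected sib.

```python
import math

def pdt_average(families):
    """Return (z statistic, two-sided p-value) for a list of families.

    Each family is a dict with two lists of allele-count differences:
      'trios'    : transmitted minus non-transmitted copies, one value per trio
      'sibpairs' : affected minus unaffected copies, one value per discordant pair
    """
    d_values = []
    for fam in families:
        units = list(fam["trios"]) + list(fam["sibpairs"])
        if units:
            # Averaging within the family gives every family the same weight
            d_values.append(sum(units) / len(units))
    denom = math.sqrt(sum(d * d for d in d_values))
    if denom == 0.0:
        return 0.0, 1.0                       # no informative families
    z = sum(d_values) / denom
    return z, math.erfc(abs(z) / math.sqrt(2.0))

# Made-up example: a large family does not dominate the statistic
families = [
    {"trios": [1, 1, 0, 1], "sibpairs": [1]},
    {"trios": [1], "sibpairs": []},
    {"trios": [-1], "sibpairs": [1]},
]
z, p = pdt_average(families)
print(f"z = {z:.3f}, p = {p:.3f}")
```

Under the null hypothesis of no association the statistic is treated as approximately standard normal, which is how the p-value is obtained in this sketch.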
In a separate study, we are analyzing the effects of ethanol exposure on gene expression in immortalized lymphoblastoid cell lines (LCLs). Details of the gene expression studies will be reported separately; here we note whether there was an effect of alcohol exposure on the expression of any of our top candidate genes. Briefly, 42 LCLs from subjects in the COGA study were either treated for 24 h with 75 mM ethanol or left untreated. The RNA was extracted and purified, and aliquots were analyzed on Affymetrix GeneChip® Human Genome U133 Plus 2.0 microarrays following standard procedures. Results were analyzed by ANOVA with factors of treatment, phenotype (cells from alcoholics or controls) and sex (Edenberg et al., in preparation). False discovery rates were calculated using the Storey qvalue package (Storey and Tibshirani, 2003).
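As a rough illustration of how false discovery rates are attached to a list of p-values, the sketch below computes Benjamini-Hochberg adjusted p-values. This is a named substitute, not the Storey qvalue package: Storey q-values additionally estimate the proportion of true null hypotheses, and the two coincide only when that proportion is taken to be 1. The function name `bh_qvalues` is hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

# Made-up p-values for four probe sets
print(bh_qvalues([0.001, 0.5, 0.02, 0.03]))
```

A gene would then be called differentially expressed at a given FDR level when its adjusted value falls below that level.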
Gene expression was assayed using standard Affymetrix procedures on samples from 9 different brain regions (prefrontal cortex, cerebral cortex, thalamus, visual cortex, hippocampus, amygdala, caudate, putamen and cerebellum) of 4 individuals (1 each male alcoholic, female alcoholic, male control, female control), obtained from the NIAAA-supported brain bank at the Tissue Resource Center located in the Neuropathology Unit of the Department of Pathology, University of Sydney, Australia. Hybridization was to GeneChip® Human Gene 1.0 ST Arrays. Genes were presumed expressed in brain if the average expression level of the 4 arrays from any region was higher than the average expression level of the negative probe sets on the arrays (Edenberg et al., in preparation). Genes from the expression studies were matched to genes from the SNP microarrays by matching gene symbols based on Affymetrix annotations.
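The rule for presuming a gene expressed in brain can be written out as a small sketch. The text does not say whether the negative probe set average was computed per region or across all arrays; the version below assumes a per-region comparison, and the function name `expressed_in_brain` and the dictionary layout are hypothetical.

```python
from statistics import mean

def expressed_in_brain(gene_by_region, background_by_region):
    """True if, in any region, the mean signal of the gene across that region's
    arrays exceeds the mean signal of the negative probe sets (assumed per region).
    """
    return any(
        mean(gene_by_region[region]) > mean(background_by_region[region])
        for region in gene_by_region
    )

# Made-up signal values for two of the nine regions, four arrays each
gene = {"amygdala": [5.1, 4.8, 5.3, 5.0], "cerebellum": [2.0, 2.2, 1.9, 2.1]}
background = {"amygdala": [3.0, 3.1, 2.9, 3.0], "cerebellum": [3.0, 3.2, 2.8, 3.0]}
print(expressed_in_brain(gene, background))
```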
The final EA case-control analytic sample included 1399 individuals. Cases all met lifetime DSM-IV criteria for alcohol dependence; many were also dependent upon other drugs (Table 1). Controls all had consumed alcohol, but were not affected by alcohol or drug abuse or dependence; most were well past the primary age of risk for alcohol or illicit drug disorders (Table 1). Thus the controls provide a strong phenotypic contrast to the cases.
Association of the SNPs with alcohol dependence was carried out employing an additive model, with sex as a covariate. The λ value (inflation factor) after these adjustments was 1.049; the Q-Q plot for this analysis is shown in Figure 1. None of the SNPs analyzed met conventional criteria for genome-wide significance. However, eleven SNPs had p-values <10−5, and a total of 93 SNPs had p-values ≤10−4 (Supplemental Table 1 shows data from all SNPs with p≤10−3). There were many genes or regions in which surrounding SNPs provided additional support for association (Supplemental Table 1). A region of chromosome 11 (Figure 2A) had many SNPs among those with lowest p values. This was not due to a single SNP in high LD with other SNPs; including the most significant SNP (rs4758533) as a covariate in the logistic regression model decreased the evidence of association with several SNPs near the right end but did not substantially alter the evidence of association with most of the SNPs in the region (see also Figure 2B, which shows that most of the significant SNPs in the chromosome 11 region are not in high LD with rs4758533).
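For readers unfamiliar with the inflation factor, the sketch below shows one common way to compute λ from a set of p-values: convert each p-value to a 1-df chi-square statistic and divide the median by the median of the 1-df chi-square distribution (about 0.455). Whether the reported value was computed in exactly this way is an assumption. The Bonferroni threshold is likewise only one conventional choice of genome-wide criterion, shown for scale; the text does not state which criterion was applied. Function names are hypothetical.

```python
from statistics import NormalDist, median

_ND = NormalDist()
# Median of a chi-square distribution with 1 df, about 0.455
CHI2_1DF_MEDIAN = _ND.inv_cdf(0.75) ** 2

def genomic_inflation(pvalues):
    """Median observed 1-df chi-square divided by its expected median."""
    # A two-sided p-value p corresponds to chi-square z**2 with z = inv_cdf(p / 2)
    chi2 = [_ND.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    return median(chi2) / CHI2_1DF_MEDIAN

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold under a Bonferroni correction (illustrative only)."""
    return alpha / n_tests

# Evenly spaced p-values mimic the null distribution, so lambda is close to 1
uniform = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(uniform), 3))
print(bonferroni_threshold(853_375))
```

A value of λ close to 1, as reported here, indicates little overall inflation of the test statistics.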
Secondary analysis tested for association within the subset of cases that met criteria for early onset of alcohol dependence (≤22 years; n=454). No SNP met genome-wide significance. The SNPs with the smallest p values in the two analyses substantially overlap: 25 of the 100 SNPs with p ≤10−4 for alcohol dependence also had p ≤10−4 for early onset alcohol dependence, despite the smaller sample size (Supplemental Table 1). Notably, the odds ratios for the early onset phenotype and the dependence phenotype are very similar: of the top 199 SNPs for dependence, 166 had odds ratios for both phenotypes within ± 0.10 of each other.
Data from the AA case control sample provided nominal evidence of association for DSM-IV alcohol dependence for six SNPs among the top 985 from European Americans; these were in TMEM132C, EPHA7, OPA3, KCNMA1 (encoding a potassium large conductance calcium-activated channel important in neuronal excitability), DMRTA2 and SPTA1 (Supplementary Table 1). EPHA7 encodes ephrin receptor A7; ephrin receptors are tyrosine kinases implicated in the development of the nervous system. A secondary analysis of the smaller AA sample showed a different set of SNPs that provided evidence for association with alcohol dependence (Supplementary Table 2). Among the 790 SNPs at p≤ 10−3, 41 were nominally significant in the EA group; prominent among them were 7 SNPs (several, but not all, in high LD) in the region of SELL and SELE, two selectins (adhesion/homing receptors), 4 in LOC91431, encoding the protein prematurely terminated mRNA decay factor-like, and SNPs in PPARG (peroxisome proliferator-activated receptor gamma) and CTNNA2 (catenin, alpha 2). Also of note is a SNP between the leptin receptor (LEPR) and PDE4B, a cAMP-specific phosphodiesterase that has been implicated in schizophrenia and bipolar disorder.
Ten of the 180 SNPs genotyped in the sample of 262 genetically informative, alcohol dependent families provided nominally significant (p ≤ 0.05) evidence of association with alcohol dependence (Table 2, Supplementary Table 1); 8 of these were also nominally significant for the early onset phenotype. Five of these SNPs lie in one region of chromosome 11 (Figure 2A), which contained CARS (all 3 tested SNPs replicated), OSBPL5 and NAP1L4. The other SNPs were in BBX, SLC9A8, OPA3, TOX2, and near CD53. Several additional SNPs near other clusters of significant SNPs were also genotyped and also provided evidence for association (Supplementary Table 1), including additional SNPs in and near CARS, OSBPL5, SLC9A8, SLC2A14, SOX5, COL8A1 and ADCK1.
Fifteen of the SNPs with the smallest p values for alcohol dependence (p ≤ 2.1 × 10−4) provided nominal evidence of association in the family sample with the early onset alcohol dependence phenotype. In addition to the SNPs noted above, these included SNPs in SLC37A3, KCNMA1, CDH8, ZNF608 and API5. A few candidate genes in which a SNP had p≤2.1 × 10−4 in the GWAS, or in which a cluster of SNPs had p≤10−3, were also tested in families, and significant support was found for CAT and GRIN2C (Supplementary Table 1).
We also tested whether the genes associated with alcohol dependence were differentially affected by ethanol exposure in LCLs. Forty of the SNPs listed in Table 2 were in genes expressed in brain and also significantly affected by ethanol in LCLs. One of the SNPs in CARS, rs729662 (synonymous, Pro706 in exon 19), was ranked 80 in the GWAS and was also significant in the family data (Supplemental Table 1).
Although the overall number of subjects was modest for a GWAS, the sample studied here had extensive phenotypic characterization and a strong contrast in substance dependence between affected and unaffected subjects. The affected subjects all met DSM-IV criteria for alcohol dependence, whereas the controls did not meet criteria for alcohol dependence or abuse or harmful use, nor were they dependent on other illicit drugs. Many of the affected subjects (56.5%) were also dependent upon an illicit drug; while it is possible that risk for a more general substance dependence contributed to the association signal, testing for this was not judged to be feasible, given the sample size and reduction in power that would result. None of the SNPs tested met conventional criteria for genome-wide significance, probably due to the modest size of the sample. Therefore, it was important to use other lines of evidence to prioritize potential candidates. Regions in which several SNPs provided evidence for association were found. Evidence of association, with similar estimates of effect size (odds ratios), was found for many SNPs when the sample of alcohol dependent individuals was dichotomized and only those with early onset alcohol dependence were defined as cases. Early onset generally marks a group of more severe cases, often with comorbid psychiatric problems. Analyses of this subgroup have yielded smaller p values, despite the reduced sample size, in some previous studies (Agrawal et al., 2006; Dick et al., 2007a; Edenberg et al., 2008b; Foroud et al., 2008). We assessed segregation within alcohol dependent families as a means to prioritize among our SNP results. We also used gene expression data as additional lines of support, testing whether a gene in or nearest the SNP was expressed in human brain samples. 
We also tested whether the expression of the gene in LCLs was altered by ethanol exposure; 80% of the genes differentially expressed in LCLs after ethanol exposure are also expressed in at least one of the 9 brain regions we analyzed (Edenberg et al., in preparation).
Overall, the convergence of evidence supports a region on chromosome 11 as the strongest candidate. It contained many of the top SNPs from the case control sample, and several SNPs genotyped in the family sample support the association of this region with alcohol dependence and the early onset of dependence (Table 2; Figure 2A,B; Supplementary Table 1). This region contains 6 genes: SLC22A18 (solute carrier family 22, member 18, a poly-specific organic cation transporter), PHLDA2 (pleckstrin homology-like domain, family A, member 2, important in regulating placental growth), NAP1L4 (nucleosome assembly protein 1-like 4, a likely histone chaperone), CARS (cysteinyl-tRNA synthetase), OSBPL5 (oxysterol-binding protein-like protein 5, an intracellular lipid receptor that interacts with retinoic acid and estradiol), and SNORA54 (small nucleolar RNA, H/ACA box 54). The four genes that were assayed (PHLDA2, NAP1L4, CARS, OSBPL5) are expressed in the human brain. In LCLs, expression of SLC22A18 and PHLDA2 is increased 9-14% by ethanol exposure (FDR < 0.05), while expression of NAP1L4 is decreased 8% by ethanol (FDR <10−5; Edenberg et al., in preparation). These lines of evidence suggest that one or more of these genes is a good candidate for affecting the risk for alcoholism. Although not previously considered candidates for affecting the risk for alcoholism, several of these genes have functions that might relate to alcoholism, such as growth regulation, cation transport and lipid signaling. Careful dissection of a Quantitative Trait Locus (QTL) hotspot on mouse distal chromosome 1 (Mozhui et al., 2008) that includes QTLs affecting alcohol dependence (Buck et al., 2002), alcohol withdrawal (Crabbe, 1996), and alcohol induced locomotor activity (Downing et al., 2003) showed that genes located in the QTL had a trans effect on expression in nervous tissue of a set of aminoacyl tRNA synthetases (Mozhui et al., 2008). As noted by Mozhui et al. (2008), the nervous system depends heavily on finely-tuned protein metabolism in dendrites and axons (Chang et al., 2006; Malgaroli et al., 2006). Thus, CARS is a reasonable candidate.
One of the top-ranked SNPs was in the promoter region of BBX (Table 2); this SNP was also significant in the family sample, for both alcohol dependence and early onset (Table 2). Several additional SNPs just upstream of BBX were associated with both phenotypes in the GWAS (Supplementary Table 1). BBX is widely expressed in human tissues, and encodes the human homolog of Drosophila Bobbysox, an HMG-BOX transcription factor. The potential role of BBX in affecting risk for alcohol dependence is likely to be complex; BBX mRNA levels are themselves significantly increased by alcohol exposure (13% increase in lymphoblastoid cell lines, FDR = 9 × 10−5; Table 2). A SNP in intron 1 of KCNMA1 provided evidence of association in subjects of both European and African ancestry, and in the family sample (Table 2). KCNMA1 is expressed in brain and encodes potassium large conductance calcium-activated channel, subfamily M, alpha member 1, a protein important in controlling neuronal excitability.
We examined results in our case control sample for genes that were previously associated with alcoholism in our family-based sample. Among the best-supported findings was the ANKK1/DRD2 region (Dick et al., 2007b), in which 24 SNPs were nominally associated with dependence and 22 with early onset; SNPs extending through TTC12 were also significant, supporting results from Yang et al. (Yang et al., 2007). Three GWAS SNPs in NFKB1 were nominally associated with early onset alcoholism, as were more extending upstream, consistent with our previous work (Edenberg et al., 2008). Thirty-two SNPs in GRM8 (Chen et al., 2009), encoding metabotropic glutamate receptor 8, were associated with early onset of dependence. There was also nominal support for PDYN (Xuei et al., 2006), CHRNA3 and CHRNA5 (Wang et al., 2009), and (among AA only) CHRM2 (Wang et al., 2004). In the ADH gene cluster (Edenberg et al., 2006), SNPs in ADH4 were not significant in this GWAS, although 15 SNPs (particularly in the region between ADH1B and ADH1A) were. Genes encoding GABAA receptors continued to provide evidence for association with alcoholism. A SNP in GABRR2 was among those with the smallest p value in this GWAS (6 × 10−5); that SNP was not itself significant in the earlier family study, although other SNPs in the gene were (Xuei et al., 2010). Although GABRA2 has been associated with alcohol dependence (Edenberg et al., 2004) in many samples (Bauer et al., 2007; Covault et al., 2004; Drgon et al., 2006; Enoch et al., 2008; Enoch et al., 2006; Fehr et al., 2006; Lappalainen et al., 2005; Soyka et al., 2008), there was no evidence in this case-control GWAS, although 5 SNPs in GABRG1 were nominally associated with alcoholism, and 11 with early onset (cf Covault et al., 2008). Six SNPs in GABRG3 (Dick et al., 2004) were associated with alcohol dependence in the EA sample, and 17 SNPs in the much smaller AA sample. 
In GABRA1 (Dick et al., 2006), we found 3 SNPs associated with alcohol dependence and 4 with early onset. Many SNPs in and near GABRG2 were also associated with dependence (and early onset). Thus, overall, there continues to be evidence that variations in GABAA receptors affect alcohol dependence.
Several SNPs nominated as candidates in earlier GWAS studies replicated in ours. Johnson et al. (Johnson et al., 2006) reported 51 regions, containing 181 SNPs, that provided the strongest evidence for association in a GWAS of alcohol dependence using pooled samples of 120 cases and 160 controls drawn from the COGA study (we do not know the overlap with the present study). Among the 68 of those SNPs for which we had data, 4 provided evidence of replication (p<0.02; Table 3). CPE encodes carboxypeptidase E, present in the central nervous system (Lynch et al., 1990), which catalyzes an important step in the processing of peptide hormones and neurotransmitters (Hook et al., 2008). A pair of closely linked SNPs in DNASE2B, encoding deoxyribonuclease II beta, provided evidence for replication, and other SNPs in that region provided further support. SLC10A2, which encodes a sodium/bile acid cotransporter, also replicated; expression of SLC10A2 is regulated in part by retinol (Neimark et al., 2004). Three other SNPs replicated when the subgroup with early-onset alcoholism was analyzed: rs35164 just downstream of CDH11 (a type II classical cadherin), rs1927384, which lies between FGF14 (fibroblast growth factor 14) and TPP2 (tripeptidyl peptidase II), and rs6729553 in DNAH6 (axonemal dynein heavy chain 6, a microtubule-associated motor protein important in retrograde axonal organelles (Schnapp and Reese, 1989)).
Treutlein et al. (2009) recently reported results from a GWAS of German alcoholics. Their cases were male alcoholics who had been hospitalized for treatment or prevention of severe withdrawal. We had data for 114 of their top 140 SNPs (121 at p<10−4 in their study and 19 they nominated from rodent studies); 14 were significant in our primary analysis of alcohol dependence; 11 of these were also significant in our smaller subset of early-onset alcoholics (Table 4). Only one of these, rs13273672 in GATA4, was among the 15 SNPs for which Treutlein et al. reported confirmation in their follow-up (Treutlein et al., 2009). Among the SNPs for which we had replication, 6 have the same risk allele; these lie in or near ARL6IP5, ID4, GATA4, SYNE1, ADCY3 and PRKCA; three of these (GATA4, ADCY3, SYNE1) were among the top 6 with our early onset phenotype. These are all good candidate genes; two regulate transcription (ID4, GATA4), two (ADCY3, PRKCA) regulate important second messenger systems, one (ARL6IP5) inhibits the glutamate transporter EAAC1 and one (SYNE1) is associated with autosomal recessive spinocerebellar ataxia 8. PRKCA expression is lower in the nucleus accumbens of alcohol-preferring P rats after operant ethanol self-administration (Rodd et al., 2008), and is reduced by chronic alcohol in vertebrae of Sprague-Dawley rats after chronic binge exposure to ethanol (Himes et al., 2008). We have confirming evidence for three of these (PRKCA, ADCY3, ARL6IP5) in our African-American sample.
A recently submitted manuscript (Bierut et al., in press) describes results from the Study of Addiction: Genetics and Environment consortium, which has performed a GWAS using the phenotype of alcohol dependence in a sample of cases and controls ascertained through three different studies (dbGaP accession phs000092.v1.p1). One of the three studies was COGA, with 612 EA alcohol dependent individuals and 413 EA controls included in both our analysis and the SAGE analyses. Two other studies also provided EA cases: one recruited cases with nicotine dependence through a population screening design (n=343 alcohol dependent EA cases) and the other recruited cases with cocaine dependence through treatment centers (n=278 alcohol dependent EA cases). While 52% of EA COGA subjects also reported another substance dependence, the inclusion of cases in the SAGE analysis recruited for different primary diagnoses will likely introduce a number of novel genes contributing to alcohol dependence and another comorbid condition, either nicotine dependence or cocaine dependence. For this reason, while we might hope for some commonality between the results from the COGA and SAGE studies due to the overlapping samples, it is not surprising that in practice results of GWAS from each study are quite different.
Overall, although we did not detect any SNP that met genome-wide significance, we have assembled several different lines of evidence to prioritize SNPs and genes from among the results for further study. In a multi-stage follow-up to the GWAS, we: a) tested a set of top SNPs for transmission in a family sample, which provided additional support for some of the SNPs; b) determined which genes harboring the top SNPs were expressed in human brain; and c) determined which of these genes were differentially expressed after exposure to ethanol in human LCLs. The convergence of evidence from the GWAS, family-based association analyses and the response of genes to ethanol exposure provides a set of interesting candidate genes for further analyses in larger samples.
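As an illustration only, the idea of ranking candidates by converging evidence can be sketched in a few lines of Python. This is not the procedure used in the study: the equal weighting of the four lines of evidence, the `GeneEvidence` record, the `prioritize` helper and the placeholder gene labels (`GENE_A`, `GENE_B`, `GENE_C`) are all assumptions made for the example, and no real results are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEvidence:
    """Hypothetical record of the lines of evidence available for one gene."""
    gene: str
    gwas_support: bool        # a top SNP in the case-control GWAS
    family_support: bool      # support from the family-based association test
    brain_expressed: bool     # gene expressed in human brain
    ethanol_responsive: bool  # expression altered by ethanol in LCLs

    def score(self) -> int:
        # Assumption: each line of evidence counts equally.
        return sum((self.gwas_support, self.family_support,
                    self.brain_expressed, self.ethanol_responsive))


def prioritize(records):
    """Sort genes by number of supporting lines of evidence, then by name."""
    return sorted(records, key=lambda r: (-r.score(), r.gene))


# Placeholder genes with made-up flags, used only to exercise the code.
example = [
    GeneEvidence("GENE_B", True, False, True, False),
    GeneEvidence("GENE_A", True, True, True, True),
    GeneEvidence("GENE_C", True, True, False, False),
]
ranked = prioritize(example)
for record in ranked:
    print(record.gene, record.score())
```

A simple count like this treats all evidence as interchangeable; in practice one might weight statistical association more heavily than expression data, which is a design choice the sketch does not address.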
SNPs that were significant for alcohol dependence in the European-American GWAS study at p ≤ 10−3 are listed in order along the genome (build 36.3). Genes within 15 kb of the SNP are listed (blank means no gene is within 15 kb); gene name is based upon the gene symbol. MAF = minor allele frequency. Alcohol dependence is the p value for association in the GWAS. Early onset alcohol dependence is the p value for the GWAS with affected status defined as having an onset of alcohol dependence at or below the median (22 years). Analyses in the Family sample use these same phenotypic definitions but are analyzed by PDT; only some SNPs were tested (see text). The MAF, p-value and odds ratio for alcohol dependence in the smaller African-American sample of the GWAS are presented for comparison.
SNPs that were significant for alcohol dependence in the African-American GWAS study at p ≤ 10−3 are listed in order along the genome (build 36.3). Genes within 15 kb of the SNP are listed (blank means no gene is within 15 kb); gene name is based upon the gene symbol. MAF = minor allele frequency. Alcohol dependence is the p value for association in the GWAS. Because of the small number of subjects, we did not analyze early onset. The MAF and p-values for alcohol dependence in the European-American sample of the GWAS are presented for comparison.
We thank Kim Doheny and Elizabeth Pugh from CIDR and Justin Paschall from the NCBI dbGaP staff for valuable assistance with genotyping and quality control in developing the dataset available at dbGaP.
The Collaborative Study on the Genetics of Alcoholism (COGA), Principal Investigators B. Porjesz, V. Hesselbrock, H. Edenberg, L. Bierut, includes ten different centers: University of Connecticut (V. Hesselbrock); Indiana University (H.J. Edenberg, J. Nurnberger Jr., T. Foroud); University of Iowa (S. Kuperman, J. Kramer); SUNY Downstate (B. Porjesz); Washington University in St. Louis (L. Bierut, A. Goate, J. Rice, K. Bucholz); University of California at San Diego (M. Schuckit); Howard University (R. Taylor); Rutgers University (J. Tischfield); Southwest Foundation (L. Almasy), and Virginia Commonwealth University (D. Dick). A. Parsian and M. Reilly are the NIAAA Staff Collaborators. We continue to be inspired by our memories of Henri Begleiter and Theodore Reich, founding PI and Co-PI of COGA, and also owe a debt of gratitude to other past organizers of COGA, including Ting-Kai Li (currently a consultant with COGA), P. Michael Conneally, and Raymond Crowe, for their critical contributions. This national collaborative study is supported by NIH Grant U10AA008401 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the National Institute on Drug Abuse (NIDA).
Funding support for GWAS genotyping, which was performed at the Johns Hopkins University Center for Inherited Disease Research, was provided by the National Institute on Alcohol Abuse and Alcoholism, the NIH GEI (U01HG004438), and the NIH contract “High throughput genotyping for studying the genetic contributions to human disease” (HHSN268200782096C). Family-based genotyping was performed using the facilities of the Center for Medical Genomics at Indiana University School of Medicine, which is supported in part by the Indiana Genomics Initiative of Indiana University (INGEN®); INGEN is supported in part by The Lilly Endowment, Inc.
Brain tissues were received from the New South Wales Tissue Resource Centre, which is supported by the National Health and Medical Research Council of Australia, The University of Sydney, Prince of Wales Medical Research Institute, Neuroscience Institute of Schizophrenia and Allied Disorders, National Institute on Alcohol Abuse and Alcoholism (Grant R01 AA12725) and NSW Department of Health.