CHRNA4, the gene that encodes the nicotinic acetylcholine receptor α4 subunit, is a potential candidate gene for nicotine dependence (ND). However, studies of the association of CHRNA4 with smoking behavior have shown inconsistent results. Our meta-analysis of linkage studies of smoking behavior identified a genome-wide significant linkage of the phenotype maximum number of cigarettes smoked in a 24-hour period to a region (20q13.12-q13.32) harboring CHRNA4. This motivated us to examine the association of CHRNA4 with smoking behavior in two independent samples. In this study, we examined five single nucleotide polymorphisms (SNPs) within CHRNA4 and three smoking-related behaviors: one quantitative trait [cigarettes smoked per day (CPD)], and two binary traits [DSM-IV diagnosis of ND and dichotomized Fagerstrom test of ND (FTND)], in 1,249 unrelated European-Americans (EAs) and 1,790 unrelated African-Americans (AAs). Using the combined sample with sex, age and race as covariates, the synonymous SNP rs1044394 was significantly associated with ND (P = 0.001) and FTND (P = 0.01). Rs2236196, which has a low correlation with rs1044394, was also significantly associated with CPD (P = 0.003). The pattern of association for these SNPs was similar in AAs and EAs. After correction for multiple testing, the association between rs1044394 and ND in the combined sample remained significant (P = 0.033). In summary, our study supports association between CHRNA4 common variation and ND in AA and EA samples. Additional studies will be necessary to evaluate the role of rare variants at CHRNA4 for ND.
Cigarette smoking is the most preventable cause of mortality in the world. Despite increasing awareness of the risks associated with smoking, the World Health Organization (2002) estimated that around 1.2 billion people worldwide were active smokers. Nicotine contained in cigarettes is a rewarding drug and the primary substance responsible for continued smoking and addiction (Benowitz 2008). Thus, an understanding of the various factors that influence the risk of nicotine dependence (ND) is critical to the prevention and cessation of smoking. Although the etiology of ND is complex, a genetic contribution to ND has been well established from twin studies. The heritability was estimated to be 0.56 for all smokers on average in a meta-analysis of 17 twin studies (Li et al., 2003). Numerous ND candidate loci or genes have been identified by genome-wide linkage studies (for review see Li et al., 2008), candidate gene association studies (Gelernter et al., 2006; Saccone et al., 2007; Conti et al., 2008; Weiss et al., 2008; Bergen et al., 2009; Saccone et al., 2009a,b) and recent genome-wide association studies (GWAS) (Berrettini et al., 2008; Thorgeirsson et al., 2008; Caporaso et al., 2009; Liu et al., 2009; Vink et al., 2009; Tobacco and Genetics Consortium 2010; Liu et al., 2010; Thorgeirsson 2010).
The gene CHRNA4, which encodes the nicotinic acetylcholine receptor α4 subunit, is a candidate gene for ND risk. The gene maps to 20q13.3 and contains 6 exons spanning ~17 kb of genomic DNA. CHRNA4 is highly expressed in the central nervous system, and its protein product is part of high affinity receptors (α4β2) for nicotine. It plays a major role in tolerance, reward, and the modulation of mesolimbic dopamine function, all of which are critical to the development of ND (Tapper et al., 2004). Although some genetic association studies support the role of CHRNA4 in smoking-related behaviors (Feng et al., 2004; Li et al., 2005; Hutchison et al., 2007; Breitling et al., 2009), there have been conflicting results (Ehringer et al., 2007; Weiss et al., 2008; Etter et al., 2009). GWAS of smoking behavior have not identified CHRNA4 as a risk locus (Berrettini et al., 2008; Thorgeirsson et al., 2008; Caporaso et al., 2009; Liu et al., 2009; Vink et al., 2009; Tobacco and Genetics Consortium 2010; Liu et al., 2010; Thorgeirsson 2010). Our meta-analysis of linkage scans of smoking behavior identified a genome-wide significant linkage for the maximum number of cigarettes smoked in a 24-hour period in a region (20q13.12-q13.32) harboring CHRNA4 (Han et al., 2010). This motivated us to evaluate the association of CHRNA4 with smoking behavior in two independent samples.
Subjects were recruited for participation in studies of the genetics of cocaine, opioid, and alcohol dependence from the communities around five sites in the United States: the University of Connecticut Health Center (UConn), Yale University School of Medicine (Yale), the Medical University of South Carolina (MUSC), the University of Pennsylvania (UPENN) and McLean Hospital (McLean). Subjects were interviewed using an electronic version of the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA) instrument (Gelernter et al., 2005; Pierucci-Lagha et al., 2005). Most controls were recruited at the same recruitment sites (excluding McLean) and were screened to exclude those with a diagnosis of DSM-IV substance abuse or dependence, or a major Axis I psychiatric disorder. The study protocol was approved by the institutional review board at each site. A complete description of the study was provided to all subjects, who then gave written informed consent. Subjects were paid for their study participation.
A total of 1,249 unrelated European-Americans (EAs) and 1,790 unrelated African-Americans (AAs) were genotyped and included in the analysis. The numbers of included subjects across the five recruiting sites were 1233, 1226, 310, 249 and 21 for UConn, Yale, MUSC, UPENN and McLean, respectively. Subjects were considered “affected” based on smoking behavior as defined by a DSM-IV diagnosis of ND and the Fagerstrom Test of ND (FTND). Lifetime, as opposed to current, measures were considered. Cigarettes smoked per day (CPD) was also tested for genetic association as a quantitative trait. Never-smokers were included in the current study. The diagnosis of DSM-IV ND was made using a computerized algorithm that applied the DSM-IV diagnostic criteria for the disorder (American Psychiatric Association, 1994). The FTND score (Heatherton et al., 1991) (range 0–10) was derived from the SSADDA, in which the FTND questions were embedded. Based on prior work (Saccone et al., 2009b), we dichotomized FTND scores as follows: an FTND score ≥4 was defined as a “case”, an FTND score ≤1 was defined as a “control”, and subjects with scores between those values were considered to have an “unknown” phenotype. Most ND/FTND cases were diagnosed with alcohol dependence (AD), cocaine dependence (CD), and/or opioid dependence (OD) using the SSADDA instrument. The basic demographic information for participants in this study for the analysis of each of the smoking behaviors is summarized in Table 1.
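The FTND dichotomization rule described above can be written as a small function. This is only an illustrative sketch: the function name `dichotomize_ftnd` and the use of `None` for the "unknown" phenotype are assumptions made here, not part of the SSADDA or the study's software.

```python
from typing import Optional


def dichotomize_ftnd(score: int) -> Optional[str]:
    """Map an FTND score (0-10) to 'case', 'control', or None (unknown).

    Hypothetical helper: thresholds follow the text (>= 4 case, <= 1 control).
    """
    if not 0 <= score <= 10:
        raise ValueError("FTND score must be between 0 and 10")
    if score >= 4:
        return "case"
    if score <= 1:
        return "control"
    return None  # scores of 2 or 3: phenotype treated as unknown


# Labels for every possible score, 0 through 10
labels = [dichotomize_ftnd(s) for s in range(11)]
```

Subjects mapped to `None` would simply be left out of the FTND case/control comparison.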
Five SNPs (rs2236196, rs3787138, rs1044396, rs1044394 and rs6010918) across CHRNA4 were selected for genotyping. This included the 2 SNPs (rs2236196, rs1044396) that were associated with smoking behavior in previous studies (Feng et al., 2004; Li et al., 2005; Hutchison et al., 2007; Breitling et al., 2009). The proportions of HapMap Phase II common SNPs (MAF > 0.05) of CHRNA4 tagged using an r2 of 0.6 by the five SNPs are 0.83 and 0.80 for CEU and YRI, respectively. SNPs were genotyped with a fluorogenic 5’ nuclease assay method, i.e., the TaqMan technique, using the ABI PRISM 7900 Sequence Detection System (ABI, Foster City, CA, USA). For genotyping quality control, 8% of samples were re-genotyped with 100% concordance.
A Pearson Chi-square test was used to test departures from Hardy–Weinberg equilibrium (HWE) expectations in controls defined by the absence of a DSM-IV diagnosis of ND and an FTND score ≤1 and in cases of ND or FTND. Only the SNPs in HWE in controls (P > 0.01, Bonferroni corrected) were included in subsequent analyses. The CPD phenotype was log transformed to achieve approximate normality and the association between log (CPD) and each SNP was tested using linear regression. The association between each SNP and the two binary traits was estimated in a logistic regression framework. We tested the genetic effects under a log-additive model with age, sex and race as covariates in the combined sample, and age and sex were included as covariates in each population-specific subgroup analysis. All of the above analyses were performed with the R package “SNPassoc” (González et al., 2007).
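For readers unfamiliar with the HWE check, a minimal sketch of a Pearson Chi-square test on genotype counts follows. It is not the SNPassoc implementation; the function name and the one-degree-of-freedom P-value formula via `math.erfc` are choices made here for a self-contained illustration.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Pearson Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are observed counts of the three genotypes at a biallelic SNP.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: an excess of homozygotes relative to HWE expectations
chi2, p_value = hwe_chi_square(30, 40, 30)
```

With these made-up counts the expected genotype counts are 25, 50 and 25, so the statistic is 4 and the P-value is just under 0.05.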
LD plots were drawn using the R package “snp.plotter” (Luna et al., 2007) for the unrelated controls. We used r2 as the measure of LD strength. The haplotype association test was performed by WHAP (Purcell et al., 2007). WHAP calculates likelihood estimates, likelihood ratios and P-values, and takes into account the loss of information due to haplotype phase uncertainty and missing genotypes. To disentangle the correlation structure in the highly correlated regions, conditional haplotype analysis was performed using WHAP. To test the hypothesis that all significant associations surrounding a particular assumed “causal” SNP reflected the high correlation between each candidate SNP and the “causal” SNP, we performed a two-SNP haplotype analysis pairing the “causal” SNP with every other SNP one at a time and conditional on that SNP. Disappearance of the global haplotype association was interpreted to mean that the conditional SNP explained the entire association.
To control for possible confounding due to population stratification, we estimated individual ancestry proportions for each subject, which were then used as covariates in the regression model. A panel of 41 ancestry informative markers (AIMs), including 36 highly ancestry-informative short tandem repeat markers and 5 SNPs (rs1540771, rs2814778, rs1805007, rs1426654 and rs12896399), was genotyped in a majority of subjects (93.4%), among which 1592 AAs and 1093 EAs with genotyping rate > 50% were further included to estimate the individual admixture proportion. The exact sample size for each phenotype analyzed with admixture proportion as a covariate is provided in Supplementary Table 1. Genotyping methods for 37 of the 41 AIMs (including an FY SNP) have been described in detail previously (Yang et al., 2005). The remaining four SNPs were genotyped by the same TaqMan technique as was used for the CHRNA4 SNPs. We used the STRUCTURE program (Pritchard et al., 2000) to estimate the admixture proportion for each individual based on the AIMs. The log-likelihood of each analysis at a different number of population groups (k) was estimated from the average of 3 independent runs (20,000 burn in and 30,000 iterations) and, as expected, the result favored a two-ancestry population model. We included the admixture proportion of each individual as a covariate in the combined analysis and in each population specific subgroup analysis.
We tested the genetic associations between five SNPs and three smoking-related behaviors in three groups (the combined sample, AAs and EAs), requiring adjustment of the α level to avoid inflation of the type 1 error rate. Because traditional Bonferroni correction is too conservative when different tests are correlated, we used a permutation-based correction procedure. Briefly, we randomly reshuffled the phenotypes and genotypes while keeping the phenotype and genotype correlation structure unchanged. In each randomly permuted sample, we ran the association test in exactly the same way as in the real dataset and a minimum P-value was recorded. This process was repeated 10,000 times and 10,000 minimum P-values were obtained to get the empirical distribution of the minimum P-values. For each point-wise association test P-value, the empirical P-value corrected for multiple testing was estimated by counting the proportion of the minimum P-values less than or equal to the observed one across the 10,000 randomly permuted samples.
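The permutation procedure can be sketched as follows. This is a simplified illustration, not the authors' code: it uses the squared Pearson correlation between genotype and phenotype as a stand-in test statistic (with complete data and equal sample size for every test, the largest r² corresponds to the smallest P-value), it omits covariates and the population subgroups, and all function names are hypothetical. Phenotype rows are reshuffled as a block, so the correlations among phenotypes and among genotypes are preserved.

```python
import random
from typing import List, Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def max_statistic(genotypes: List[List[float]],
                  phenotypes: List[List[float]],
                  order: List[int]) -> float:
    """Largest r^2 over all SNP-by-trait pairs after reordering subjects' phenotypes."""
    best = 0.0
    for trait in phenotypes:
        shuffled = [trait[i] for i in order]  # same reordering for every trait
        for snp in genotypes:
            best = max(best, r_squared(snp, shuffled))
    return best


def minp_corrected(genotypes: List[List[float]],
                   phenotypes: List[List[float]],
                   snp_idx: int, trait_idx: int,
                   n_perm: int = 1000, seed: int = 0) -> float:
    """Proportion of permutations whose best statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = r_squared(genotypes[snp_idx], phenotypes[trait_idx])
    order = list(range(len(phenotypes[0])))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if max_statistic(genotypes, phenotypes, order) >= observed:
            hits += 1
    return hits / n_perm


# Toy data (entirely made up): snp1 strongly predicts the quantitative trait
rng = random.Random(1)
snp1 = [i % 3 for i in range(60)]
snp2 = [rng.choice([0, 1, 2]) for _ in range(60)]
cpd = [g + rng.gauss(0, 0.1) for g in snp1]
nd = [rng.choice([0, 1]) for _ in range(60)]
p_corr = minp_corrected([snp1, snp2], [cpd, nd], snp_idx=0, trait_idx=0, n_perm=200)
```

Because the observed statistic is compared with the best statistic across all SNP-by-trait tests in each permuted dataset, the returned value is already adjusted for the number of correlated tests performed.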
We used QUANTO software (Gauderman 2002) to calculate the power of the current association study for each investigated phenotype at the 0.05 level of significance with a two-sided test. In the AA sample, for the quantitative log transformed CPD trait with a mean of 2.38 and standard deviation of 0.70, under a log-additive genetic model with MAF = 0.1 (0.3), the CPD sample (N = 1385) had 62% (93%) power to detect a main effect of 0.1. For the two binary traits in AAs, under a log-additive genetic model with MAF = 0.1 (0.3), our study had 71% (96%) and 68% (95%) power to detect an effect with a relative risk of 1.5 for ND (990 cases/237 controls) and FTND (897 cases/225 controls), respectively. In the EA sample, for the quantitative log transformed CPD trait with a mean of 2.78 and standard deviation of 0.62, under a log-additive genetic model with MAF = 0.1 (0.3), the CPD sample (N = 915) had 54% (89%) power to detect a main effect of 0.1. For the two binary traits in EAs, under a log-additive genetic model with MAF = 0.1 (0.3), our study had 66% (94%) and 63% (93%) power to detect an effect with a relative risk of 1.5 for ND (716 cases/223 controls) and FTND (705 cases/208 controls), respectively.
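The quantitative-trait power figures can be approximated with a short normal-approximation calculation. This is a sketch under stated assumptions, not QUANTO's algorithm: genotypes are assumed to be in HWE (so the additive genotype variance is 2·MAF·(1 − MAF)), the reported trait standard deviation is used as the residual standard deviation, and the function name is hypothetical.

```python
from statistics import NormalDist


def power_additive_qt(n: int, beta: float, sd: float, maf: float,
                      alpha: float = 0.05) -> float:
    """Approximate two-sided power to detect an additive effect on a quantitative trait.

    n: sample size; beta: per-allele effect; sd: trait standard deviation
    (used here as the residual SD, an assumption); maf: minor allele frequency.
    """
    norm = NormalDist()
    var_g = 2 * maf * (1 - maf)  # genotype variance under HWE
    # Expected value of the Wald z statistic: beta / SE(beta)
    z_expected = abs(beta) * (n * var_g) ** 0.5 / sd
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z_expected - z_crit) + norm.cdf(-z_expected - z_crit)


# Inputs taken from the text: AA sample (N = 1385, SD = 0.70), EA sample (N = 915, SD = 0.62)
power_aa = [power_additive_qt(1385, 0.1, 0.70, maf) for maf in (0.1, 0.3)]
power_ea = [power_additive_qt(915, 0.1, 0.62, maf) for maf in (0.1, 0.3)]
```

Under these assumptions the approximation gives values close to the reported 62% and 93% for AAs and 54% and 89% for EAs. The binary-trait power values depend on additional inputs (such as baseline risk) and are not reproduced here.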
Marker information and related summary statistics are presented in Table 2. For each SNP, the genotyping rate was >95% and the genotype distribution in the control sample was consistent with HWE. However, three SNPs (rs2236196, rs1044394, rs6010918) in ND cases and four SNPs (rs2236196, rs3787138, rs1044394, rs6010918) in FTND cases had genotype distributions that were not in HWE (0.004 < P < 0.05). Considering all the SNPs are in HWE in control samples, we included all the SNPs for association tests in subsequent analysis. In addition, there were large differences in allele frequencies between EAs and AAs for each SNP tested. Therefore, to avoid the spurious associations that can arise from stratification in the analysis of the combined sample, race was included as a categorical covariate in the association test.
Table 3 illustrates the association results in detail. In the combined analysis, the strongest signal came from the association of a synonymous SNP (rs1044394) with ND; the minor allele “A” at this locus was associated with a lower risk of ND (P = 0.001, OR = 0.73, 95% CI = 0.61–0.89). Subgroup analysis demonstrated a similar pattern of association in AAs (P = 0.009, OR = 0.74, 95% CI = 0.60–0.93) and EAs (P = 0.05, OR = 0.66, 95% CI = 0.43–1.00). Rs1044394 was also significantly associated with FTND (P = 0.01), and showed non-significant evidence for association with CPD (P = 0.083) in the combined samples. Another SNP (rs2236196), which has a low correlation with rs1044394 (r2 = 0.26 and 0.22 for AAs and EAs, respectively), was significantly associated with CPD (P = 0.003) in the combined sample, as well as in AAs (P = 0.022) and EAs (P = 0.049) separately. Note that the coefficient for rs2236196 is opposite in sign for CPD in AAs and EAs; this is attributable to the fact that the minor allele is opposite for AAs and EAs. That is, the allele effects are congruent. However, only the association between rs1044394 and ND in the combined sample survived the correction for multiple testing (P = 0.033).
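As a consistency check on the reported estimates, the P-value of a Wald test can be recovered approximately from an odds ratio and its 95% confidence interval. The sketch below assumes the interval is a symmetric Wald interval on the log-odds scale, which the text does not state; the function name is hypothetical, and the rounded inputs limit the precision of the result.

```python
import math
from statistics import NormalDist


def wald_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                      level: float = 0.95) -> float:
    """Approximate two-sided Wald P-value implied by an OR and its confidence interval."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(0.5 + level / 2)
    # Width of the CI on the log scale equals 2 * z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return 2 * norm.cdf(-abs(z))


# rs1044394 and ND in the combined sample, values as reported in the text
p_combined = wald_p_from_or_ci(0.73, 0.61, 0.89)
```

For the combined sample this yields a P-value of about 0.001, in line with the reported value. The same calculation for the AA and EA subgroups agrees only roughly with the reported P-values, which is expected given that the odds ratios and interval bounds are rounded to two decimals.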
The LD map and each SNP-associated P-value are depicted in Figure 1. Rs1044394 and rs6010918 are in strong LD in both AAs and EAs (r2 = 0.91 and 0.81 for AAs and EAs, respectively), which explained the evidence for association of both SNPs with ND (FTND) in the combined sample and in each population group (Table 3). To exclude the possibility that multiple observed effects were caused by LD with a single true risk allele, we performed conditional haplotype analysis on the combined sample with race, age and sex as covariates. As expected, both SNPs showed a significant global association with ND (P = 0.0046). The global association disappeared when conditioned on rs1044394 (P = 0.24), but remained significant when conditioned on rs6010918 (P = 0.047). This analysis suggests that the significant association of rs6010918 with ND arises from its high correlation with rs1044394. When a similar analysis was used for the FTND binary trait, the effects of rs1044394 and rs6010918 could not be disentangled.
Additional analyses were conducted to define more clearly the association of the CHRNA4 polymorphisms and the three smoking-related behaviors. First, an association test was performed in the subset of individuals with AIMs data available. The admixture proportion was estimated for each individual and included as a covariate with age and sex in the regression model. The association results remained similar, arguing against stratification as a confounder (Supplementary Table 2). Second, we examined whether recruitment site was a significant confounder by including site as a categorical covariate, and the association results did not change substantially (Supplementary Table 3). Third, haplotype analysis was performed using different SNP combinations but did not yield more significant results (data not shown).
The biological link between CHRNA4 and nicotine addiction phenotypes in animal models (Ross et al., 2000; Labarca et al., 2001) and humans (Breese et al., 1997; Fenster et al., 1999; Buisson et al., 2001) has made this gene a notable candidate in genetic association studies of smoking behavior. However, prior association studies in small samples have shown inconsistent results, and recent GWAS, as reflected by three meta-analyses of GWAS of smoking behavior, also did not identify CHRNA4 as a susceptibility gene for nicotine dependence. There are many possible explanations for this lack of association, including genetic heterogeneity, phenotypic heterogeneity, and the limited power of GWAS arising from the necessity to correct for multiple comparisons.
In the first report, Feng et al. (2004) identified two synonymous SNPs in exon 5 (rs1044396, rs1044397), which were significantly associated with reduced risk for ND in Chinese men. Li et al. (2005) replicated and extended the association of CHRNA4 with smoking behavior in EAs and AAs, but the association results were not entirely consistent with those published by Feng. For example, rs1044396 showed a significant association with ND in the Chinese male sample but showed no association in the EA or AA samples and rs2236196 showed significant association in AA females but not in Chinese males. More recently, an association study in 5500 Germans showed that rs2236196 was significantly associated with FTND and the “G” allele was associated with a higher risk of nicotine addiction and higher odds of being a smoker (Breitling et al., 2009). The accumulating evidence from genetic association studies plus the additional findings on clinical effects (Hutchison et al., 2007) suggest that rs2236196 could be one of the most promising risk variants for smoking-related behaviors.
Consistent with this prior work, we detected a significant association of rs2236196 with a quantitative trait (CPD) in the combined sample, as well as in AAs and EAs. Specifically, we found that the rs2236196 “G” allele was associated with a higher CPD, a finding consistent in magnitude and direction with the findings from previous studies (Hutchison et al., 2007; Saccone et al., 2007; Breitling et al., 2009). Although the association between rs2236196 and CPD in the combined sample did not survive correction for multiple testing (Pcorrected = 0.075), given the prior findings of association between rs2236196 and smoking behavior, a nominally significant P-value could be considered of interest and would provide evidence of replication of the finding that rs2236196 plays a role in smoking behavior. However, no significant evidence was observed for association between rs2236196 and the two binary traits of ND or dichotomized FTND. Because the SSADDA does not include an assessment of the maximum number of cigarettes smoked in a 24-hour period (the trait that we identified through the linkage signal with CHRNA4 in the meta-analysis), we could not directly test the association of CHRNA4 variants with this phenotype.
In addition, we found that rs1044394, another synonymous SNP in exon 5, was nominally significantly associated with ND and FTND in the combined sample, as well as in AAs and in EAs separately. However, only the association of rs1044394 with ND in the combined sample remained significant after correction for multiple testing (Pcorrected = 0.033). We are aware of only one other study that investigated the role of rs1044394 in smoking-related behaviors; that yielded a negative result (Etter et al., 2009). Interestingly, the correlation between rs2236196 and rs1044394 was low in both AAs and EAs, suggesting that multiple independent variants within this gene could be associated with different dimensions of smoking behavior. However, all of the published studies investigated a very limited number of variants in this gene, thus we would speculate that other independent variants associated with smoking behavior may have been missed.
An exploration of the correlation among the three investigated phenotypes could be helpful to interpret the consistency (or lack of consistency) among the association results. The correlation between CPD and ND was low for both AAs (Spearman correlation coefficient ρ = 0.14) and EAs (ρ = 0.19). A low correlation between CPD and FTND was also observed in AAs (ρ = 0.16) and EAs (ρ = 0.26). The low correlation between CPD and ND/FTND could explain the inconsistent results for rs2236196, which demonstrated association for CPD but not for ND or FTND. There is a strong correlation between the ND and FTND phenotypes (ρ = 0.83 and 0.88 in AAs and EAs, respectively), which could explain the similar pattern of association for ND and FTND (Table 3).
Since the ND or FTND cases included in our samples are affected with other substance dependence disorders, including cocaine dependence (CD), alcohol dependence (AD) and/or opioid dependence (OD), we also investigated whether the signals associated with smoking behavior could be driven by other substance dependence disorders. To control for the effect of other substance dependence disorders, we reanalyzed the data by including CD, AD and OD status as covariates in the regression model. As a result, rs2236196 remained significant for CPD (P = 0.0078) and was less significant for ND (P = 0.042). However, the signal disappeared for rs1044394 for both ND and FTND. We speculate that this loss of a signal for rs1044394 reflects either random fluctuation or an association between rs1044394 and ND (or FTND) that is driven by other substance dependence disorders. It will be necessary to examine the role of CHRNA4 in other substance dependence disorders to determine the basis for this finding.
This study also has limitations. The positive linkage finding by genome scan meta-analysis in our previous study is consistent with a role for multiple rare (or less common) variants mapped to this region for ND; however, these were not investigated in the present study. Further, there is evidence that rare missense variants at CHRNA4 are associated with sporadic amyotrophic lateral sclerosis (Sabatelli et al., 2009). The role of rare variants at this locus for ND and related behaviors thus needs to be evaluated. With increasing evidence for a role of rare variants in psychiatric disorders (Carroll et al., 2010; Knight et al., 2009), we believe that sequencing the whole gene will be necessary to discover all of the causal variants in CHRNA4, especially rare variants, that are associated with smoking behavior. In addition, the AA and EA samples differed significantly on MAF for the five SNPs. There is a possibility that adjustment for self-reported race in the combined analysis may not be sufficient to address this confounding by race. However, in a subgroup of samples with AIMs available, the association results remained similar when using admixture proportion as a covariate, arguing against stratification as a confounder. Finally, the control samples used in the binary traits analysis include many “never-smokers” who might have become dependent if they had been exposed to nicotine, and this could arguably have reduced the power.
In summary, our study confirms the previously reported genetic association between CHRNA4 polymorphisms and smoking behavior in independent samples of AAs and EAs. Although rs2236196 could be one of the causal variants for smoking behavior, other independently associated functional variants, especially rare variants, may have been missed due to the limitations of the current study design. To identify the full range of CHRNA4 variants and to evaluate their role in smoking behavior, complete sequencing of the gene is necessary.
The authors are grateful to the volunteer families and individuals who participated in this research study. We are also grateful to Ann Marie Lacobelle and Greg Kay for their excellent technical assistance and to the many SSADDA interviewers who obtained the phenotypic data used in this study. Kathleen Brady, M.D., Ph.D. oversaw recruitment of some of the subjects from the Medical University of South Carolina. Roger Weiss, M.D. of McLean Hospital and Harvard Medical School oversaw study recruitment at that site. This work was supported by the U.S. National Institutes of Health (R01 AA11330, R01 AA017535, R01 DA12849, R01 DA12690, R01 DA018432, K01 DA024758, and M01 RR06192) and by the US Department of Veterans Affairs (VA CT REAP and New England MIRECC Center; VA CT Alcohol Research Center).
Drs. Yang, Han, and Gelernter report no competing interests. Dr. Kranzler reports consulting arrangements with Gilead Sciences and Alkermes, Inc. and research support from Merck & Co. Dr. Kranzler also reports a current association with the following pharmaceutical companies: Eli Lilly, Janssen, Schering Plough, Lundbeck, Alkermes, GlaxoSmithKline, Abbott, and Johnson & Johnson, as these companies provide support to the Alcohol Clinical Trials Initiative (ACTIVE) and Dr. Kranzler receives support from ACTIVE.