Tobacco smoking continues to be a leading cause of preventable death. Recent research has underscored the important role of specific cholinergic nicotinic receptor subunit (CHRN) genes in risk for nicotine dependence and smoking. To detect and characterize the influence of genetic variation on vulnerability to nicotine dependence, we analyzed 226 SNPs covering the complete family of 16 CHRN genes, which encode the nicotinic acetylcholine receptor (nAChR) subunits, in a sample of 1050 nicotine-dependent cases and 879 non-dependent controls of European descent. This expanded SNP coverage has extended and refined the findings of our previous large scale genome-wide association and candidate gene study. After correcting for the multiple tests across this gene family, we found significant association for two distinct loci in the CHRNA5-CHRNA3-CHRNB4 gene cluster, one locus in the CHRNB3-CHRNA6 gene cluster, and a fourth, novel locus in the CHRND-CHRNG gene cluster. The two distinct loci in CHRNA5-CHRNA3-CHRNB4 are represented by the non-synonymous SNP rs16969968 in CHRNA5 and by rs578776 in CHRNA3, respectively, and joint analyses show that the associations at these two SNPs are statistically independent. Nominally significant single-SNP association was detected in CHRNA4 and CHRNB1. In summary, this is the most comprehensive study of the CHRN genes for involvement with nicotine dependence to date. Our analysis reveals significant evidence for at least four distinct loci in the nicotinic receptor subunit genes that each influence the transition from smoking to nicotine dependence and may inform the development of improved smoking cessation treatments and prevention initiatives.
Tobacco use continues to be a leading cause of preventable death. Smoking causes an estimated 21% of yearly cancer deaths worldwide (Danaei et al., 2005), currently kills approximately 5 million people a year, and is projected to be responsible for 10% of all deaths globally by the year 2015 (Mathers and Loncar, 2006). Despite heightened public awareness of the negative health consequences, smoking remains prevalent in the United States, and rates are increasing in some developing countries (Mackay et al., 2006). In 2005, 20.9% of US adults were current cigarette smokers, the same percentage as in 2004 (Centers for Disease Control and Prevention, 2006). This lack of decline underscores the limitations of current treatments and policy approaches aimed at curbing tobacco use.
Nicotine dependence is a leading predictor of smoking continuation (Breslau et al., 2001; Johnson et al., 2002). Thus an improved understanding of the biological underpinnings of nicotine dependence may be key to the success of further smoking cessation efforts. Twin studies find significant genetic influences on nicotine dependence (Lessov et al., 2004; Maes et al., 2004; Pergadia et al., 2006; Prescott and Kendler, 1995), motivating the search for genetic variants that influence risk for nicotine dependence.
Nicotine is a naturally occurring alkaloid in tobacco. Nicotine is highly addictive (Davis, 1988), and the nicotinic acetylcholine receptors, in part, mediate the effects of nicotine in the body (Lindstrom, 2003; Tapper et al., 2004). These receptors are pentameric molecular assemblies of nicotinic acetylcholine receptor (nAChR) subunits, which are coded by a family of distinct cholinergic nicotinic receptor (CHRN) genes. These subunits, and the receptors assembled from them, fall into two classes, neuronal and neuromuscular. In humans the neuronal subunits consist of α2 through α7, α9, α10, and β2 through β4; the muscle nAChRs are α1, β1, δ, ε and γ. The α8 subunit has been detected in avian tissue but not in mammals.
The muscle-type nicotinic receptor is composed of two alpha subunits, one beta, one delta and either one gamma or one epsilon subunit (Karlin, 2002). Neuronal-type receptors can be homopentamers or heteropentamers. The predominant heteropentameric nAChRs are α4β2* in brain and α3β4* in the peripheral nervous system (Gotti et al., 2006; Mao et al., 2006). In some instances, other subunits (as indicated by the asterisk), principally α5, also contribute to these two major heteromeric receptor subtypes. The α7 subunit is the only nAChR subunit confirmed to form homomeric nAChRs in mammalian brain (Chen and Patrick, 1997). The dopamine reward pathway, which is thought to play a critical role in addiction, is modulated by multiple nAChR subtypes, including α4β2, α4β2α5, α6β2β3, α4α6β2β3, and α7 (Klink et al., 2001; Pidoplichko et al., 2004; Salminen et al., 2004; Wooltorton et al., 2003; Wu et al., 2004).
Early genetic association studies of nicotine dependence and smoking have looked at some of the nicotinic subunit genes (Greenbaum et al., 2006; Lou et al., 2006), with CHRNA4 and CHRNB2 (coding α4 and β2 respectively) receiving the most attention (Ehringer et al., 2007; Feng et al., 2004; Li et al., 2005; Lou et al., 2007; Lueders et al., 2002; Silverman et al., 2000) until our reports of strong association within the CHRNA5-CHRNA3-CHRNB4 cluster and the CHRNB3-CHRNA6 cluster (Bierut et al., 2007; Saccone et al., 2007a). More recently, our association with rs16969968, a nonsynonymous coding SNP in CHRNA5, has been replicated in several independent datasets using the same SNP (Bierut et al., In press) (Stevens et al., unpublished data) or proxy SNPs having very high r2 with it (Berrettini et al., 2008; Thorgeirsson et al., 2008). Furthermore, this SNP or its r2 proxies have now been associated with lung cancer (Amos et al., 2008; Hung et al., 2008; Thorgeirsson et al., 2008). These findings underscore the nicotinic receptors as important targets for further in-depth study.
Here we present a comprehensive association study of nicotine dependence with SNP-based variation in the complete family of 16 nicotinic receptor subunits. This targeted study of the CHRN genes expands upon the coverage we previously attained in NICSNP, a large-scale case-control nicotine dependence association study consisting of the first high-density genome-wide association screen (GWAS) for nicotine dependence (Bierut et al., 2007) paired with an extensive candidate gene (CG) survey of 348 genes (Saccone et al., 2007a).
All individuals in this case/control sample have reported smoking at least 100 cigarettes lifetime, the threshold classically used to define a smoker (Centers for Disease Control and Prevention, 2006). Cases have lifetime FTND scores of 4 or greater, while controls have never exhibited symptoms of dependence (lifetime FTND = 0). This sample therefore is designed to examine the transition from nicotine use to dependence. All subjects (1050 cases and 879 controls) were recruited by two ongoing studies: the Collaborative Genetic Study of Nicotine Dependence (COGEND), a US-based sample (N = 797 cases, N= 813 controls), and the Nicotine Addiction Genetics (NAG) study (Saccone et al., 2007b), an Australian-based sample (N= 253 cases, N= 66 controls). All subjects were self-identified as being of European ancestry. Table I gives demographic details including the distribution of cases and controls, by gender, and mean age and FTND, in the US and Australian samples. The table shows that the mean FTND in cases is very similar across the two recruiting sites (US and Australia), although the proportion of cases and controls recruited differs. Also, the proportion of women differs in cases and controls. Therefore the variables site and sex are included as covariates in the genetic association analyses. Analyses using STRUCTURE (Pritchard et al., 2000) demonstrating no evidence of confounding population substructure are published elsewhere (Bierut et al., 2007). The study was carried out in compliance with the Code of Ethics of the World Medical Association and obtained informed consent from all participants and approval from the appropriate institutional review boards.
226 SNPs covering the 16 CHRN genes were analyzed in this study. Initial genotyping was carried out by Perlegen Sciences using custom arrays as previously detailed (Bierut et al., 2007; Saccone et al., 2007a), leading to 119 CHRN SNPs passing quality control (QC) measures including reliability of genotype calls and Hardy-Weinberg equilibrium (HWE) > 0.01. Additional genotyping was then performed by the Center for Inherited Disease Research (CIDR) using Illumina Golden Gate technology (http://www.illumina.com/downloads/GOLDENGATEASSAY.pdf). These additional SNPs were selected for the following purposes. First, SNPs with poor or marginal genotype call rates (<98%) in our initial data were selected for regenotyping; these included some of the 119 SNPs and also SNPs that had already been attempted but did not meet the prior QC requirements. Second, SNPs were selected using our custom tag SNP programs which use r2 bin tagging (Carlson et al., 2004) to enhance coverage of the CHRN genes. Third, we chose additional SNPs to fine-map the strong association signals already seen in the CHRNA5-CHRNA3-CHRNB4 and CHRNB3-CHRNA6 clusters. Prior to analysis, all SNPs were required to have call rates ≥ 98%. SNPs having r2 ≥ 0.8 with CHRN SNPs (e.g. in IREB2, LOC123688, and PSMA4 which flank the CHRNA5-CHRNA3-CHRNB4 cluster) were also included in the final analysis set of 226 SNPs, of which 118 are newly genotyped by CIDR. Map positions and genomic annotations were obtained from the National Center for Biotechnology Information (NCBI) Human Reference Build 36.2 and dbSNP build 127.
Linkage disequilibrium (LD) between SNPs was calculated for cases and controls using the verbose option of ldmax (Abecasis and Cookson, 2000). Signed D’ coefficients were calculated to retain LD phase information (Saccone et al., 2006). LD plots were generated using a custom program.
Our primary single SNP association analyses of case-control status use logistic regression models to correct for significant covariates of gender and sample source site (US or Australia) (Table I). The non-genetic base model is $\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1 g + \beta_2 s$, where P is the probability of being a case, g is gender (0=male, 1=female) and s is site (0 for US, 1 for Australia). Genotype status at each marker, coded as an ordinal variable, is then added to the model and tested for significance by the standard likelihood ratio chi-square statistic with one degree of freedom (df). The ordinal coding corresponds to a log-additive (multiplicative) model for the number of copies of the risk allele, defined as the allele more common in cases than controls. This primary test differs from the test used in (Bierut et al., 2007; Saccone et al., 2007a), which included a genotype by gender interaction term to allow detection of loci having differential effect in males versus females. In our focused nAChR study here, we first tested for SNP effect only; then, for SNPs significant in that primary test, we investigated possible gender effects and alternative modes of inheritance.
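As an illustration of the one-df test described above, the following minimal sketch converts the log-likelihoods of two nested logistic models into a likelihood ratio chi-square statistic and p-value. The function name and the log-likelihood values are hypothetical and are not taken from our data; the sketch assumes the models have already been fitted elsewhere.

```python
import math


def lr_test_1df(loglik_base, loglik_full):
    """Likelihood ratio test for one added parameter (1 df).

    loglik_base: maximized log-likelihood of the base model (gender + site).
    loglik_full: maximized log-likelihood after adding the genotype term.
    Returns (chi-square statistic, p-value).
    """
    # The statistic is twice the gain in log-likelihood; clamp tiny negatives
    # that can arise from numerical noise in model fitting.
    stat = max(2.0 * (loglik_full - loglik_base), 0.0)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lr_test_1df(loglik_base=-1300.0, loglik_full=-1292.5)
print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

The closed form via the complementary error function holds only for one df; the two-df "best model" tests would need the general chi-square survival function instead.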
To determine a “best model” for nominally significant SNPs, we evaluated whether an added gender-by-genotype interaction term is itself significant and also whether modeling a log-linear trend effect of the genotype is sufficient. If the gender-by-genotype interaction term was significant at the p < 0.05 level, the logistic regression was repeated with that term, and the “best model” p-value reported for that two df test. To determine whether modeling a non-multiplicative genotype effect gives a better fit than the primary model, we added a single dichotomous variable coding for heterozygote status, allowing deviation from an ordinal model. When this variable was significant, we evaluated other modes of inheritance: recessive (1 df), dominant (1 df), or genotype coded with 2 degrees of freedom to allow an arbitrary model, and report the “best model” p-value for the appropriate alternative model.
To interpret results, we ordered SNPs by primary test p-value and then formed r2 bins across the 226 SNPs, by chromosome, as follows. The correlation coefficients were computed separately in cases (r2case) and controls (r2control). The most significant SNP and all SNPs having min(r2case , r2control) ≥ 0.8 with it defined the first bin; these were removed from the binning process, which continued with the next most significant SNP that was not already binned. By ordering the SNPs by significance to define the bins, we enable the bins to better reflect the correlations with the most significant SNPs. Using min(r2case , r2control) is slightly more conservative (potentially leading to more bins) than using r2 from the combined sample, but in practice makes little difference. Alternative algorithms may be efficient for selecting tag SNPs, for example the greedy algorithm of (Carlson et al., 2004). However, such an algorithm might then relegate two SNPs into separate bins even though they are both correlated with a very significant SNP, giving the illusion that they are two distinct signals.
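The binning procedure above can be sketched as follows. This is a minimal illustration rather than our production code: the function name, the data structures (dictionaries keyed by SNP name and by unordered SNP pair), and the example SNP names and values are all assumptions made for the sketch. Pairs with no recorded r2 (for example, SNPs on different chromosomes) are treated as uncorrelated.

```python
def significance_ordered_bins(p_values, r2_case, r2_control, threshold=0.8):
    """Form r2 bins seeded by the most significant unbinned SNP.

    p_values: dict mapping SNP name -> primary test p-value.
    r2_case, r2_control: dicts mapping frozenset({snp_a, snp_b}) -> r2,
        computed separately in cases and in controls.
    Returns a list of bins; each bin lists its lead SNP first.
    """
    def min_r2(a, b):
        key = frozenset((a, b))
        # Missing pairs are assumed uncorrelated (r2 = 0).
        return min(r2_case.get(key, 0.0), r2_control.get(key, 0.0))

    # Order SNPs from most to least significant.
    unbinned = sorted(p_values, key=p_values.get)
    bins = []
    while unbinned:
        lead = unbinned[0]
        members = [lead] + [s for s in unbinned[1:]
                            if min_r2(lead, s) >= threshold]
        bins.append(members)
        # Remove the binned SNPs before seeding the next bin.
        unbinned = [s for s in unbinned if s not in members]
    return bins


# Hypothetical SNPs and values, for illustration only.
p = {"snpA": 1e-5, "snpB": 1e-3, "snpC": 0.2, "snpD": 0.04}
cases = {frozenset(("snpA", "snpB")): 0.95,
         frozenset(("snpA", "snpC")): 0.30,
         frozenset(("snpC", "snpD")): 0.85}
controls = {frozenset(("snpA", "snpB")): 0.82,
            frozenset(("snpA", "snpC")): 0.25,
            frozenset(("snpC", "snpD")): 0.90}
print(significance_ordered_bins(p, cases, controls))
```

Because each bin is seeded by the most significant remaining SNP, every member of a bin is guaranteed to be correlated with that lead SNP, which is the property the text relies on when counting distinct signals.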
A Bonferroni correction for our 226 SNPs is overly conservative due to LD among SNPs and would require an individual p-value of 0.000221 for an experiment-wide significance level of 0.05. A more precise Sidak correction would require a p-value of α′ = 0.000227, so that $0.05 = 1 - (1 - \alpha')^{226}$.
However, we may take LD into account by selecting an r2 tagging subset with an r2 threshold of 0.8 as described above. This approach results in 111 proxy SNPs, corresponding to no more than 111 distinct tests and a Sidak corrected p-value of 0.000462.
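The thresholds quoted above follow directly from the Bonferroni and Sidak formulas; the short sketch below reproduces the arithmetic. The function names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under a Bonferroni correction."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test threshold a' solving alpha = 1 - (1 - a')**n_tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_tests)


# 226 SNPs tested, or 111 r2 bins once LD is taken into account.
print(round(bonferroni_threshold(0.05, 226), 6))
print(round(sidak_threshold(0.05, 226), 6))
print(round(sidak_threshold(0.05, 111), 6))
```

The Sidak threshold assumes independent tests, which is why it is applied to the 111 tag SNPs rather than to all 226 correlated SNPs.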
We performed selected multilocus analyses of case-control status. The purpose of these analyses is to help clarify and distinguish among the effects of multiple significant SNPs within an associated region but not in strong LD. The non-synonymous SNP rs16969968 in CHRNA5 is a biologically promising finding in our analysis, and other studies have shown evidence for replication via our analysis of rs16969968 in independent samples of families (Bierut et al., In press) and heavy smokers and controls (Stevens et al., unpublished data in review), and via analyses of strongly correlated SNPs by independent groups studying cigarettes-per-day (Berrettini et al., 2008) and cigarettes-per-day and nicotine dependence (Thorgeirsson et al., 2008). Furthermore, a new functional study has demonstrated that the risk allele at rs16969968 decreases response to a nicotine agonist (Bierut et al., In press). Therefore we targeted rs16969968 for these joint analyses; results for SNPs having high r2 with it will necessarily be similar.
Extensive LD across the CHRNA5-CHRNA3-CHRNB4 cluster led to multiple significant SNPs in our data. After accounting for the LD structure in this region, we identified two distinct, significant associations tagged by rs16969968 and rs578776. Therefore, genotypes at these SNPs were entered together into a multivariate logistic regression model to evaluate the significance of the effect of one locus when controlling for genotype at the other locus. Additionally, the combined influence of the two-locus genotype was examined in a 3×3 table.
The full logistic model has the form $\ln\left(\frac{P}{1-P}\right) = \alpha + \beta_1 g + \beta_2 s + \beta_3 G_1 + \beta_4 G_2$, where $G_1$ codes genotype status at rs16969968 with a recessive model (determined from the best model analysis) and $G_2$ codes the second SNP with the primary, log-additive model. We test for the effect of each SNP by removing it from the model and calculating the likelihood ratio chi-square with one df.
The covariates recorded included sex, age, and site (US or Australia). Forward selection using logistic regression indicated that sex and site were significant predictors of case/control status, as expected from Table I. Thus the covariate-corrected genetic analyses include these variables in the baseline model. With these main effects in the model, additional interaction terms such as age*sex were not significant. Figure 1 gives the distribution of FTND scores in male and female cases.
Twenty-one SNPs were significant after multiple test correction (p < 0.000462), using the primary 1 df logistic regression test. These 21 SNPs form five distinct r2 bins in the following genes or gene clusters: CHRNA5-CHRNA3-CHRNB4 (3 bins), CHRNB3-CHRNA6 (1 bin), and CHRND (1 bin). Of the three bins in the CHRNA5-CHRNA3-CHRNB4 cluster, two are still correlated with each other as detailed below, leading to two distinct signals in the cluster.
Eight of the significant SNPs are on chromosome 8 in the β3-α6 gene cluster, twelve are on chromosome 15 in or near the α5-α3-β4 cluster, and one is on chromosome 2 in CHRND. These significant results are in bold in Table II, which presents association results for all SNPs in genes or gene clusters that contained at least one SNP with primary p-value ≤ 0.05. Association results for all 226 SNPs sorted by chromosome and position are in supplementary table S1, so that the coverage for every nAChR gene may be seen. To aid interpretation, the tables also include the LD bin membership of the SNPs, as described in the methods. Of the 226 SNPs, 84 are nominally significant. For the nominally significant SNPs, supplementary table S1 gives p-values for the “best model” determined as described in the methods.
Results were entered into our custom installation of the Generic Genome Browser (Stein et al., 2002). Panels A, B and C of Figure 2 display plots of the negative logarithm of the primary (log-additive model) p-values for the three gene clusters that contained SNPs significant after multiple test correction (α5-α3-β4 and β3-α6, δ-γ). Figure 3A, 3B and 3C display the pairwise LD patterns (signed D’ and r2) for these gene clusters in cases and controls, labeled by rs number and primary p-value. Supplementary figure S2 extends the CHRNA5-CHRNA3-CHRNB4 LD plot to include the neighboring genes PSMA4, LOC123688 and IREB2.
The two most significant SNPs by best model p-value are in the CHRNA5-CHRNA3-CHRNB4 cluster: rs17487223 (best model p = 0.0000266) in CHRNB4 and rs16969968 (best model p = 0.0000284) in CHRNA5. The latter is a non-synonymous SNP in CHRNA5, and we consider it the most biologically promising finding. Both SNPs are in the same r2 bin that includes 7 additional SNPs (5 surpassing the multiple test criterion of 0.000462) extending both upstream into the gene LOC123688 and downstream through CHRNA3 and CHRNB4 (Table II and Figure 2A). These seven SNPs therefore represent a single signal. The additional 4 significant SNPs in the neighboring gene IREB2 form a separate r2 ≥ 0.8 bin but have r2 ≥ 0.68 with rs16969968 (0.78 for rs1504549, 0.694 for rs17405217, 0.690 for rs17484235, 0.697 for rs17483548, and 0.679 for rs17483686); thus these are still highly correlated and likely constitute the same signal. In this bin, an intronic SNP in IREB2, rs17484235, has a primary p-value of 0.0001244 compared to 0.0001298 for the non-synonymous SNP rs16969968.
In contrast, the remaining significant SNP in the CHRNA5-CHRNA3-CHRNB4 cluster, rs578776, is not correlated with rs16969968 or its bin. The LD between rs578776 and rs16969968 is r2 = 0.20 overall (0.196 in cases, 0.205 in controls) with D’ = −1.0, due to the LD being in repulsion phase. We thus observed two distinct loci in the CHRNA5-CHRNA3-CHRNB4 gene cluster whose significant association with nicotine dependence is not simply explained by LD between these loci. Table II, column 7 clarifies that for rs16969968 the “risk” allele (meaning the allele more common in cases than in controls) is the minor allele, while for rs578776 the risk allele is the major allele. Thus, recoding the genotypes with the major allele as the reference allele would give an odds ratio of 1.31 for rs16969968 versus a “protective” odds ratio of 0.75 for rs578776.
The CHRNB3-CHRNA6 gene cluster on chromosome 8 contains eight SNPs surpassing the multiple test criterion. These constitute a single r2 bin and a single association signal after LD is taken into account: rs13277254 is in nearly perfect LD with the other SNPs (Figure 3B), with each pairwise r2 exceeding 0.99. Even after the increased CIDR SNP coverage of this gene cluster, all significant SNPs are in the same LD bin as the previous associated SNPs we reported in (Saccone et al., 2007a), and no new distinct signals emerged.
The CHRND gene contains one SNP significant after multiple test correction. This SNP, rs12466358 (p = 0.00027), is a singleton bin in our data, although other SNPs have r2 ≥ 0.6 (but less than 0.8) with it (Figure 3C). It is noteworthy that the next most significant SNP in the CHRND-CHRNG cluster, rs1881492 (p = 0.00069) is uncorrelated with rs12466358 (r2 = 0.086), suggesting that there is a second distinct risk allele in this gene cluster, although this second signal does not meet our multiple test correction criterion.
For the SNPs significant after multiple test correction, the odds ratios (ORs) for the primary genetic model are modest, ranging up to 1.4 (for SNPs in CHRNB3) for the effect of a single copy of the risk allele. Among the SNPs covering the CHRNA5-CHRNA3-CHRNB4 region, the strongest alternative odds ratio is found for rs16969968 when modeled recessively, with an OR of 1.83 for the effect of having 2 copies of the risk allele (Table S1). This OR is similar to but larger than the OR for two risk alleles under the primary model: (1.31)² = 1.7. For rs16969968, both the recessive-model and primary model ORs are higher than the corresponding ORs for the eight additional correlated SNPs in bin 4, further suggesting the importance of this non-synonymous variant relative to these alternative loci.
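The odds ratio arithmetic used here and in the recoding of rs578776 noted earlier can be sketched in a few lines. The helper names are illustrative only; the inputs are the per-allele odds ratios reported in the text.

```python
def odds_ratio_for_copies(per_allele_or, copies):
    """OR for carrying `copies` risk alleles under a log-additive model.

    Log-additive means each copy multiplies the odds by the same factor.
    """
    return per_allele_or ** copies


def odds_ratio_other_reference(odds_ratio):
    """OR after switching which allele is treated as the reference."""
    return 1.0 / odds_ratio


# Two copies of the rs16969968 risk allele under the primary model.
print(round(odds_ratio_for_copies(1.31, 2), 2))
# rs578776 per-allele OR of 1.34 expressed for the other (protective) allele.
print(round(odds_ratio_other_reference(1.34), 2))
```

Comparing the first value with the recessive-model OR of 1.83 is what indicates that the recessive coding captures a somewhat larger effect in homozygotes than the log-additive model predicts.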
In a few additional genes, we observed nominally significant SNPs that did not meet our multiple test significance criterion. These were in CHRNA4 and CHRNB1. There is borderline nominal significance at a nonsynonymous SNP, rs12914008, in CHRNB4 (p = 0.078); this suggestive signal is distinct from the experiment-wide significant findings at rs16969968 and rs578776. Another SNP in CHRNA5, rs2229961, is a nonsynonymous change according to NCBI build 35.1 and has a borderline p-value of 0.053 but is rarer than the other nonsynonymous SNPs in the cluster, rs16969968 and rs12914008.
The CHRNA5-CHRNA3-CHRNB4 gene cluster is of particular interest after our report of association with nicotine dependence (Saccone et al., 2007a) and subsequent evidence of replication in independent samples (Berrettini et al., 2008; Bierut et al., In press; Thorgeirsson et al., 2008). We carried out joint analyses of the two distinct, uncorrelated findings in the CHRNA5-CHRNA3-CHRNB4 cluster, represented by rs16969968 coded recessively and rs578776 coded log-additively to match their best models.
In the joint logistic regression analysis, CHRNA5 SNP rs16969968 remains significant (p = 0.0000645) with a similar order of magnitude as its single SNP best model result of p = 0.000028. When controlling for rs16969968 genotype, rs578776 is reduced by an order of magnitude compared to its single SNP result, but it still remains significant in the joint model (p=0.00543). This gives further evidence that there are two distinct susceptibility loci in this gene cluster.
Table III presents the joint genotype analysis. From the allele frequencies in Table II (column 7), we see that rs16969968 represents a locus whose minor allele (A) confers risk for addiction, while rs578776 represents a locus whose minor allele (A) is protective. Thus, as expected, the GG/AA genotype combination for rs16969968/rs578776 confers the greatest protection from nicotine dependence, while the AA/GG is associated with the highest risk (Table III). The odds of nicotine dependence relative to non-dependence among those with the highest risk joint genotype are 2.5 times higher than in those with the lowest risk. It is interesting to note the zero cells for the AG/AA, AA/AG and AA/AA combinations, which clarify that the high risk genotype AA at rs16969968 occurs only in combination with the homozygous “risk” GG genotype at rs578776. We thus cannot observe the effect of the AA genotype at rs16969968 on other background genotypes at rs578776, and similarly the A-A haplotype is not observed (double heterozygotes being extremely likely to carry A-G and G-A). However, the effect on the observed background is clear from the cells of the first row and first column of Table III, where GG/GG is set as the reference genotype. We see that on the background of the low risk (protective) genotype at rs16969968 (GG), there is significantly increased protection for one and two copies of the A allele at rs578776 along row 1 (OR = 0.68 and 0.60 respectively). On the background of the risk genotype at rs578776 (GG), there is significantly increased risk for two copies of the A allele at rs16969968 along column 1 (OR = 1.49).
This is the most comprehensive survey of the nicotinic receptor subunit genes for involvement in nicotine dependence to date. We have further explored the relationship between nicotine dependence and this gene family using 226 SNPs genotyped in the NICSNP sample of nicotine dependent cases and non-dependent smoking controls. Four distinct findings, two in the CHRNA5-CHRNA3-CHRNB4 cluster, one in CHRNB3-CHRNA6, and one in CHRND-CHRNG are significant after multiple test correction across the CHRN gene family. Additional genes CHRNA4 and CHRNB1 harbor nominally significant SNPs. Those CHRN genes that were not nominally associated in our initial reports (Bierut et al., 2007; Saccone et al., 2007a) still lack significant association with nicotine dependence with the denser SNP coverage.
This study improved our coverage of an important gene family using a sample previously genotyped for a joint GWAS and large-scale candidate gene study (Bierut et al., 2007; Saccone et al., 2007a). Given the current popularity and importance of the GWAS design, the question of how best to interpret new analyses of GWAS datasets or samples in the context of the original large-scale GWAS is a challenge facing not only this study but the field as a whole. In the present case, although the NICSNP sample was used for a large-scale GWAS and candidate gene study, the CHRN gene family was in fact given the highest priority in our original design and would have been targeted even if our resources had been limited to only a few hundred SNPs. Therefore we feel it is appropriate and useful to report significant results based on multiple test correction for the 226 SNPs (111 r2 bins) tested here. However, these findings are not significant after formal multiple test correction for all the genotyping performed on this sample to date. In general, clearly citing any previous, overlapping GWAS or large-scale candidate gene studies is important to allow fully informed interpretation of results.
Our results continue to support an important role for the nonsynonymous CHRNA5 SNP rs16969968 in determining vulnerability to nicotine dependence, as first reported in (Saccone et al., 2007a), and separate in vitro evidence indicates that rs16969968 may indeed be the functional variant explaining the association across the other correlated members of this LD bin (Bierut et al., In press). Due to the extensive LD encompassing not only the CHRNA5-CHRNA3-CHRNB4 cluster but also the neighboring genes LOC123688, PSMA4 and IREB2 (Figures 2A, 3A and S2), further work to definitively identify a causal variant from among all SNPs correlated with rs16969968 will be important.
A second variant in the CHRNA5-CHRNA3-CHRNB4 cluster, rs578776, constitutes a distinct, significantly associated locus that also warrants further study. This SNP was previously noted to have a false discovery rate of 0.09 (Saccone et al., 2007a); with our denser coverage it remains the most significant representative of this second locus. Joint analysis of the uncorrelated SNPs rs578776 and rs16969968 indicates that these two variants each exert independent influence on nicotine dependence vulnerability. Though |D’| between these SNPs is 1, the low r2 between them means that the disease association at one does not statistically explain the disease association at the other. These two variants demonstrate an interesting evolutionary history, with the risk allele for rs16969968 occurring on the background of the higher risk variant for rs578776 (Table III). Although not all genotype combinations occur because of this history, the three genotypes at one locus on a fixed background for the other demonstrate a clear pattern of altered risk (Table III, first row and first column). It is unclear from our genetic data whether the functional variants underlying this susceptibility reside within the same CHRN gene or reflect variation in two different genes.
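To illustrate how |D’| can equal 1 while r2 remains low, the sketch below computes both measures from haplotype frequencies. The allele frequencies used (0.35 and 0.28) are hypothetical round numbers chosen for illustration and are not the frequencies in Table II; the only feature taken from our data is that the A-A haplotype is not observed. The function name is likewise an assumption, and both loci are assumed to be polymorphic.

```python
def ld_stats(p_a, p_b, p_ab):
    """Pairwise LD from allele and haplotype frequencies.

    p_a: frequency of allele A at the first locus.
    p_b: frequency of allele B at the second locus.
    p_ab: frequency of the A-B haplotype.
    Returns (D, signed D', r2). Assumes 0 < p_a < 1 and 0 < p_b < 1.
    """
    d = p_ab - p_a * p_b
    # D' scales D by its maximum possible magnitude given the allele
    # frequencies; the bound differs for positive and negative D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical minor allele frequencies with the A-A haplotype absent.
d, d_prime, r2 = ld_stats(p_a=0.35, p_b=0.28, p_ab=0.0)
print(f"D = {d:.3f}, D' = {d_prime:.1f}, r2 = {r2:.2f}")
```

When one of the four haplotypes is absent, D’ reaches −1 or +1 regardless of allele frequency, whereas r2 also depends on how closely the allele frequencies at the two loci match. This is why genotype at one SNP can remain a poor statistical proxy for the other.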
Our third significant single-SNP finding is at the 5′ end of CHRNB3-CHRNA6, tagged by rs13277254, and here again the functional source of this signal is unclear. Interestingly, a study in non-small cell cancer tumors found higher expression of CHRNA6 and CHRNB3 in non-smokers compared to smokers (Lam et al., 2007), suggesting potential mechanisms of action related to exposure to smoking and nicotine dependence.
CHRND on chromosome 2 harbors a fourth significant locus, with an additional interesting, though only nominally significant, uncorrelated locus in the neighboring gene CHRNG. Efforts to replicate these findings in independent samples would be of great interest. The CHRND-CHRNG cluster lies at the end of chromosome 2q in a region of linkage that has been persistently reported in the literature for nicotine dependence (Straub et al., 1999) and other addiction phenotypes (Agrawal et al., 2008; Gelernter et al., 2005; Gelernter et al., 2006). The γ subunit is known only to be fetally expressed and replaced by ε in late fetal development (Mishina et al., 1986), suggesting an unexpected mechanism for influencing addiction risk if indeed CHRNG is involved.
Our CIDR genotyping included the CHRNA3 SNPs rs1317286 and rs6495308 recently reported by Berrettini et al. to be associated with cigarettes-per-day with p-values of 0.0000026 and 0.000069 respectively (Berrettini et al., 2008). With our comprehensive coverage we are thus able to make a direct comparison and confirm that their results are in line with our two distinct findings represented by rs16969968 and rs578776. In our sample, rs1317286 (primary p = 0.00034, OR = 1.28) is in the same LD bin and highly correlated (r2 = 0.975) with the slightly more significant rs16969968 (p = 0.00013, OR = 1.30), confirming that these findings coincide. We believe that rs16969968 is the best functional candidate among the correlated SNPs representing this locus, and further work to test this is underway. The second SNP rs6495308 is correlated (r2 = 0.76) with our separate significant finding at rs578776. This suggests that the signal at rs6495308 (p = 0.0019, OR = 1.29) may be an attenuation of the stronger association at rs578776 (p = 0.00011, OR = 1.34). The correlation between rs16969968 and rs6495308 is low (r2 = 0.15), again indicating that these represent two different signals.
The CHRNA5-CHRNA3-CHRNB4 cluster is of further interest because of recent reports of significant association with lung cancer (Amos et al., 2008; Hung et al., 2008; Thorgeirsson et al., 2008), a disease for which cigarette smoking is known to be the major risk factor. The associations with lung cancer were either at the non-synonymous CHRNA5 SNP rs16969968 (p-value 1 × 10⁻²⁰ (Hung et al., 2008)), or at SNPs highly correlated with it (p = 1.5 × 10⁻⁸ at rs1051730 (Thorgeirsson et al., 2008); p = 7 × 10⁻¹⁸ at rs1051730 and p = 3 × 10⁻¹⁸ at rs8034191 (Amos et al., 2008)). The lung cancer risk allele matches the risk allele for nicotine dependence.
Those reports differed in their interpretation of this association with lung cancer: that is, whether it is evidence of a direct effect on lung cancer vulnerability, or whether it can be explained entirely through the indirect effect of increased risk for smoking. What is clear is that this locus is a risk factor for nicotine dependence and smoking quantity. However, it is interesting to note that for lung cancer, the odds ratios for the effect of one copy of the risk allele were 1.30 (95% confidence interval 1.23-1.38) for rs16969968 (Hung et al., 2008), 1.32 (1.24-1.41) for rs8034191 and 1.32 (1.23-1.39) for rs1051730 (Amos et al., 2008), and 1.31 (1.19-1.44) for rs1051730 (Thorgeirsson et al., 2008). These therefore match the odds ratio we see for nicotine dependence: 1.31 (1.14-1.5) for rs16969968. The lung cancer studies highlighted this single risk locus in this region, and appear to indicate weaker association between lung cancer and rs578776, which tags our second nicotine dependence locus in the CHRNA5-CHRNA3-CHRNB4 cluster. In our nicotine dependence study, rs16969968 (p = 1.3 × 10⁻⁴, OR 1.31 (1.14-1.5)) and rs578776 (p = 1.1 × 10⁻⁴, OR 1.34 (1.16-1.56)) have comparable evidence and effect size for association. In contrast, (Hung et al., 2008) typed both SNPs in the discovery subset of their sample and in that subsample found lung cancer association p-values of 5 × 10⁻⁹ (OR 1.32 (1.2-1.44)) for rs16969968 versus 2.5 × 10⁻⁴ (OR 1.2 (1.08-1.33)) for rs578776. Future work to clarify the direct versus indirect effects of these variants on lung cancer will be of great interest.
An important, unique feature of our sample is the definition of controls: smokers (> 100 cigarettes lifetime) who have never exhibited symptoms of dependence. Genetic associations from this study thus provide insight into the transition from smoking to nicotine dependence. This design also circumvents noise that may occur if the control group includes non-smoking (unexposed) but genetically vulnerable individuals. We expect our results to overlap with those from studies of alternative phenotypes such as cigarettes-per-day, but differences in genetic findings may reflect important differences in the phenotypes.
Recently, Zeiger et al. (2008) demonstrated that the CHRNB3-CHRNA6 cluster is associated with early subjective responses to tobacco in adolescents. One of their most significant results is tagged by rs4950 in CHRNB3; this SNP is significantly associated with nicotine dependence in our sample (p = 0.00010) and is in the LD bin tagged by rs13277254 (r² = 0.995). This concordance of genetic findings for these two phenotypes suggests that subjective effects may mediate the association between this locus and addiction, and raises the possibility that early interventions may be effective in disrupting this particular pathway leading to increased addiction risk.
Other groups have reported significant associations between measures of cigarette use or nicotine dependence and some CHRN genes. Nicotine dependence was associated with two CHRNA4 SNPs (rs1044396 and rs1044397) in a family-based study of Chinese male smokers (Feng et al., 2004). We tested both of these SNPs and did not replicate the univariate findings, perhaps because of ethnic differences across our samples.
A study of six CHRNA4 SNPs in European Americans (EA) and African Americans (AA) reported various nominally significant findings for specific phenotypes or samples, with rs2236196 highlighted in AA women after correction for multiple testing (Li et al., 2005). Our current study provides some support for rs2236196 (p = 0.0048), although we see a stronger association with nicotine dependence at rs2273504 (p = 0.0023). Importantly, these two SNPs are essentially uncorrelated (r² = 0.07), indicating that we have modest evidence for two potentially distinct loci affecting nicotine dependence risk, one of which is novel and the other of which replicates the earlier finding of Li et al. (2005).
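For readers less familiar with the r² statistic used here to judge whether two SNPs represent the same signal, the following minimal sketch computes the standard pairwise LD measure r² = D² / (pA(1 − pA) pB(1 − pB)), where D = pAB − pA·pB. The haplotype and allele frequencies in the example are hypothetical values chosen only for illustration; they are not estimates from our sample, and `ld_r_squared` is an illustrative helper name.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))

# Hypothetical frequencies (assumed for illustration, not from the study).
r2 = ld_r_squared(p_ab=0.10, p_a=0.30, p_b=0.20)
print(f"r^2 = {r2:.3f}")
```

A low r² such as this means that the genotype at one SNP carries little information about the genotype at the other, which is why association signals at two such SNPs are treated as potentially distinct loci.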
A follow-up to the above-cited study in the same sample reported association between rs2302763 in CHRNB1 and smoking quantity in EA (Lou et al., 2006). We did not test this specific SNP, but genotyped other SNPs that are highly correlated with it (e.g. rs2302765, rs4796418, rs3855924) according to HapMap Centre d'Etude du Polymorphisme Humain (CEPH) from Utah (CEU) data (The International HapMap Consortium, 2005). None were associated with nicotine dependence in our sample. We do observe a separate, nominally significant signal in CHRNB1 tagged by rs17732878; this LD bin is not correlated with rs2302763 (HapMap CEU r² < 0.06).
A study of 39 SNPs in 11 CHRN genes in Israeli women reported main effects of rs2072660 (CHRNB2) with history of daily smoking, and of rs1909884 (CHRNA7), rs4861065 (CHRNA9), and rs9298629 (CHRNB3) with nicotine dependence (Greenbaum et al., 2006). Our sample provides weak support for rs9298629 (p = 0.019). Another study reported associations between subjective reactions to cigarettes and rs2072660 and rs2072658 in CHRNB2, in Caucasian and Hispanic young adults (Ehringer et al., 2007). We do not see association between these SNPs and nicotine dependence in our sample.
In summary, we have strong evidence that at least four distinct variants in CHRNA5-CHRNA3-CHRNB4, CHRNB3-CHRNA6 and CHRND-CHRNG influence nicotine dependence risk. These four significant nicotinic receptor findings, though important, together account for less than 10% of the phenotypic variance in our sample, indicating that additional risk factors are yet to be discovered and illustrating the challenge of complex disease genetics. Our ongoing work will continue the search for other genes and epistatic effects; we will also extend our analyses to diverse population samples with contrasting LD structure, as such studies can help discriminate among correlated SNPs and localize the most likely functional source of an association signal. Ultimately, determining the mechanism of action for these variants via functional studies can help improve prevention and cessation therapies.
We thank Dennis Ballinger for advice on genotyping and Louis Fox for database and analytic support. The NICSNP project is a collaborative research group and part of the NIDA Genetics Consortium. Subject collection was supported by NIH grants CA89392 (COGEND, PI - L Bierut) from the National Cancer Institute and DA012854 (NAG, PI - P Madden) from the National Institute on Drug Abuse. Genotyping work at Perlegen Sciences was performed under NIDA Contract HHSN271200477471C. Additional genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268200782096C. Phenotypic and genotypic data are stored in the NIDA Center for Genetic Studies (NCGS) at http://zork.wustl.edu/. We are grateful to Gary Chase for his many contributions to this project and remember with great appreciation his important and diverse scientific accomplishments. This work is in his memory, and in memory of Theodore Reich, founding Principal Investigator of COGEND.
Grant Support: P01 CA89392 (LJB) from the National Cancer Institute, R01 DA012854 (PAFM), R01 DA014369 (JAS), K08 DA019951 (MLP), K01 DA015129 (NLS) from the National Institute on Drug Abuse, K01 AA015572 (ALH) from the National Institute on Alcohol Abuse and Alcoholism, and IRG-58-010-50 from the American Cancer Society (SFS).