

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Hum Mol Genet. Author manuscript; available in PMC 2008 March 20.
Published in final edited form as:
PMCID: PMC2270437

Cholinergic nicotinic receptor genes implicated in a nicotine dependence association study targeting 348 candidate genes with 3713 SNPs


Nicotine dependence is one of the world’s leading causes of preventable death. To discover genetic variants that influence risk for nicotine dependence, we targeted over 300 candidate genes and analyzed 3713 single nucleotide polymorphisms (SNPs) in 1050 cases and 879 controls. The Fagerström test for nicotine dependence (FTND) was used to assess dependence, in which cases were required to have an FTND of 4 or more. The control criterion was strict: control subjects must have smoked at least 100 cigarettes in their lifetimes and had an FTND of 0 during the heaviest period of smoking. After correcting for multiple testing by controlling the false discovery rate, several cholinergic nicotinic receptor genes dominated the top signals. The strongest association was from an SNP representing CHRNB3, the β3 nicotinic receptor subunit gene (P = 9.4 × 10−5). Biologically, the most compelling evidence for a risk variant came from a non-synonymous SNP in the α5 nicotinic receptor subunit gene CHRNA5 (P = 6.4 × 10−4). This SNP exhibited evidence of a recessive mode of inheritance, resulting in individuals having a 2-fold increase in risk of developing nicotine dependence once exposed to cigarette smoking. Other genes among the top signals were KCNJ6 and GABRA4. This study represents one of the most powerful and extensive studies of nicotine dependence to date and has found novel risk loci that require confirmation by replication studies.


The World Health Organization estimates that if current trends continue, the annual number of deaths from tobacco-related diseases will double from five million in the year 2000 to 10 million in 2020 (1,2). Nicotine, a naturally occurring alkaloid found in tobacco, mimics acetylcholine, and nicotine’s ability to bind to nicotinic cholinergic receptors (nAChRs) underlies the molecular basis of nicotine dependence [susceptibility to tobacco addiction (MIM 188890)]. Chronic nicotine exposure produces long-lasting behavioral and physiological changes that include increased synaptic strength, altered gene expression and nAChR up-regulation (3). Although nAChRs are expressed throughout the central nervous system, the addictive effects of nicotine are thought to be mediated through mesocorticolimbic dopamine (DA) pathways (4). It is believed that the interplay among glutamate, dopamine and gamma-aminobutyric acid (GABA) systems is critical for the reinforcing effects of nicotine (3,5). Cigarettes are the predominant form of tobacco used worldwide (6), and genetic factors are important to the etiology of nicotine dependence, with estimates of the heritability ranging from 44 to 60% (7).

Efforts to identify susceptibility loci influencing cigarette smoking behavior through association studies have used a candidate gene approach with both case–control and family-based designs. Several candidate genes that may influence smoking have been studied, including nicotinic receptors (8-10), nicotine metabolizing genes (11-13), dopamine system receptors (14-17), GABA receptors (18) and other neurotransmitters and receptors (19-21). There appears to be very little concordance among linkage findings and association findings in candidate genes (reviewed in 22). The only genome-wide association study (GWAS) to date is the companion paper by Bierut et al. (23), which was conducted in parallel with our study and used the same case–control sample.

Our approach was to target an extensive set of candidate genes for single nucleotide polymorphism (SNP) genotyping to detect variants associated with nicotine dependence using a case–control design. We targeted over 300 genes for genotyping, with a design that allowed for approximately 4000 SNPs. These included the gene families encoding nicotinic receptors, dopaminergic receptors and GABA receptors, which are known to be part of the biological pathways involved in dependence. This was done in conjunction with a GWAS conducted in the companion paper by Bierut et al. (23). Both studies used a large sample of cases and controls of European descent. The 1050 nicotine dependent cases were contrasted with a unique control sample of 879 individuals who are non-dependent smokers. The size of the sample and strict control criteria should provide ample power to detect variants influencing nicotine dependence, but the depth of the coverage of known candidate genes is ambitious and requires delicate handling to deal with the complex issue of multiple testing. We used the false discovery rate (FDR) to limit the effects of multiple testing (24,25) and to report on the top FDR-controlled list of associations.


Our list of candidate genes initially numbered 448 and was divided into categories ‘A’ and ‘B’. All 55 category A genes were targeted for SNP genotyping, but because it was beyond our resources to target all of the remaining 393 category B genes, these were prioritized for SNP genotyping according to the results of the pooled genotyping in the parallel GWAS (23). Table 1 shows a summary of the results of the pooled genotyping in the candidate genes. Out of the 393 category B genes considered for SNP selection, 296 were targeted for individual genotyping in our candidate gene study. These were chosen using the lowest corrected minimum P-values, as defined in Eq. (1), where the cutoff was approximately P ≤ 0.95. We individually genotyped 4309 SNPs in these candidate genes, and after quality control filtering, 3713 SNPs were tested for association. There were 515 SNPs tested for 52 category A genes and 3198 SNPs tested for 296 category B genes.

Table 1
Results of the pooled genotyping in the candidate genes from the parallel GWAS

In the individual genotyping for the candidate genes, the 10 smallest P-values from our primary association analysis ranged from 9.36 × 10−5 to 1.22 × 10−3. There were 39 SNPs with an FDR <40%, indicating the presence of about 24 true signals (Tables 2 and 3; Fig. 1). These top 39 signals were dominated by nicotinic receptor genes (Figs 2 and 3). The top five FDR values corresponded to the genes CHRNB3, CHRNA3 and CHRNA5 and ranged from 0.056 to 0.166. Our best evidence was that four of these five signals were from genuine associations and were not due to random effects. The permutation FDR estimates were roughly the same as the FDR, differing by not more than 0.02, with a minimum permutation FDR of 0.07 at the SNP rs6474413. After selecting a single SNP from each linkage disequilibrium (LD) bin, three of these 39 SNPs showed significant evidence of a non-multiplicative model (Table 4) and several SNPs were found to have a significant gender by genotype interaction (Table 5; also see Supplementary Material, Table S1 for a list of all SNPs from Table 2 showing gender by genotype P-values and gender-specific odds ratios).

Figure 1
Results of the candidate gene association analysis. The P-values from the primary analysis are plotted for each chromosome below an ideogram using the −log10(P) transformation. The bottom axis is P = 1 and the top axis is P = 10−3. Category ...
Figure 2
Detailed results for the top association signals. (A) The top two signals are near the CHRNB3 nicotinic receptor gene on chromosome 8. (B) The Non-synonymous SNP rs16969968 and the CHRNA5-CHRNA3-CHRNB4 cluster of nicotinic receptor genes on chromosome ...
Figure 3
LD between markers in (A) the CHRNB3-CHRNA6 and (B) CHRNA5-CHRNA3-CHRNB4 clusters of nicotinic receptor genes.
Table 2
Top associations with nicotine dependence where the weighted FDR is <40%
Table 3
Details of all category A genes and any category B genes with SNPs among our top signals (i.e. SNPs that appear in Table 2)
Table 4
SNPs exhibiting significant deviation from a multiplicative genetic model
Table 5
Gender-specific odds ratios and 95% confidence intervals for SNPs in Table 2

The β3 nicotinic receptor subunit gene CHRNB3, located on chromosome 8, accounted for the two strongest signals from our analysis: rs6474413 and rs10958726 (Fig. 2A). These two SNPs effectively contributed to a single signal because they were in very high LD with an r2 correlation ≥ 0.99. They are in the putative 5′ promoter region: the SNP rs6474413 is within 2 kb of the first 5′ promoter and the SNP rs10958726 is an additional 15 kb upstream. Two other SNPs in CHRNB3, rs4953 and rs4952, were also among the top signals. These are synonymous SNPs in exon 5 and are the only known coding SNPs for CHRNB3 (dbSNP build 125). Again, these represent a single signal as their genotypes were completely correlated.

The next group of SNPs among our top signals is in the CHRNA5–CHRNA3–CHRNB4 cluster of nicotinic receptor genes on chromosome 15 (Fig. 2B). The third most significant signal was the SNP rs578776 in the 3′-untranslated region (UTR) of CHRNA3, the α3 nicotinic receptor subunit gene (Fig. 2B). Approximately 5 kb downstream from CHRNA3 is our fifth strongest signal rs16969968, a non-synonymous coding SNP in exon 5 of CHRNA5, the α5 nicotinic receptor subunit gene. This SNP was in very strong LD with rs1051730, a synonymous coding SNP in CHRNA3, with an r2 correlation ≥0.99.

The most interesting signal appears to be the non-synonymous SNP rs16969968 in CHRNA5. As discussed earlier, it is completely correlated with an SNP in the CHRNA3 gene (Fig. 2B). Allele A of rs16969968 has a frequency of 38% in cases and 32% in controls. There is convincing evidence for a recessive mode of inheritance for this SNP (Table 4). Compared to having no copies, the odds ratios for having one copy and two copies of the A allele were 1.1 (95% CI 0.9–1.4) and 1.9 (95% CI 1.4–2.6), respectively. That is, compared with individuals with other genotypes, individuals with the AA genotype were nearly twice as likely to have symptoms of nicotine dependence.


Nicotine addiction from tobacco smoking is responsible for over three million deaths annually, making it the leading cause of preventable mortality in the world (1). In the USA in 2003, 21.6% of adults were smokers, where 24% of men and 19% of women were smokers (26). Previous association studies have been limited to narrowly focussed candidate gene studies. Our candidate gene study was more extensive, genotyping 3713 SNPs for 348 candidates in 1050 nicotine-dependent cases and 879 non-dependent smokers, where our control group definition was particularly strict.

Our top FDR-controlled findings were dominated by nicotinic receptor genes. Our positive association findings for the α5 and β3 nicotinic receptor subunits are novel. To date, most human genetic and biological studies of the nicotinic receptors and nicotine dependence have focussed on the α4 and β2 subunits because they co-occur in high-affinity receptors and are widely expressed in the brain (27). However, mouse studies have demonstrated that of the α4β2 containing receptors that mediate dopamine release, a substantial proportion contain α5 as well (28). This is consistent with our evidence for an important role of α5 in nicotine dependence susceptibility. Furthermore, in a brain α4β2 receptor, an α5 or β3 subunit can take the fifth position in the pentamer, corresponding to β1 of muscle. Although neither α5 nor β3 is thought to participate in forming binding sites, they are able to affect channel properties and influence agonist potency because they participate in the conformational changes associated with activation and desensitization (27).

The most compelling biological evidence of a risk factor for nicotine dependence is from the non-synonymous SNP rs16969968 in CHRNA5. This SNP causes a change in amino acid 398 from asparagine (encoded by the G allele) to aspartic acid (encoded by A, the risk allele), which results in a change in the charge of the amino acid in the second intracellular loop of the α5 subunit (29). The risk allele appeared to act in a recessive mode, in which individuals who were homozygous for the A allele had a 2-fold increase in the risk of developing nicotine dependence. Although the α5 subunit has not been studied extensively and there are no reports of known functional effects of this polymorphism, it is striking that a non-synonymous charge-altering polymorphism in the corresponding intracellular loop of the α4 nAChR subunit has been shown to alter nAChR function in mice in response to nicotine exposure (30-33). This variant is common in populations of European descent (allele frequency of A allele ~42%), but uncommon in populations of Asian or African descent (<5%; data from the International HapMap project).

Also among the top 39 FDR-controlled signals were the genes KCNJ6 (also known as GIRK2) and GABRA4. These were the only other genes besides nicotinic receptors with SNPs that had P-values less than 0.001. KCNJ6 belongs to the inwardly rectifying potassium channel (GIRK) family of genes. GIRK provides a common link between numerous neurotransmitter receptors and the regulation of synaptic transmission (34). GABA is the major inhibitory neurotransmitter in the mammalian central nervous system and is critical for the reinforcing effects of nicotine (3,5). We found significant evidence that the risk due to genotype is much stronger in men than in women (Table 5), where the male odds ratio was 2.2 (95% CI 1.4–3.3).

Previously reported findings in other nicotinic receptors were not among our most significant findings. In prior studies of CHRNA4, nominal association with nicotine dependence measures was reported for the SNPs rs2236196 and rs3787137 in African-American families and rs2273504 and rs1044396 in European-Americans, but only rs2236196 in African-Americans remained after multiple testing correction (9). Also in CHRNA4, rs1044396 and rs1044397 were associated with both Fagerström test for nicotine dependence (FTND) score and qualitative nicotine dependence in a family-based sample of Asian male smokers (8). In our sample of European descent, we tested 11 SNPs for CHRNA4 including the above-mentioned SNPs except rs2273504, which did not pass our stringent quality control standards. The lowest primary P-value across all 11 SNPs was 0.026 for rs2236196 (study-wide rank = 132); this particular result may be considered a single test given the specific prior finding for this SNP, thus providing modest evidence for replication. The remaining four previously reported SNPs that we analyzed showed P-values greater than 0.8. Contrasts in these results are possibly due in part to the different ethnicities of the respective samples.

A recent study of smoking initiation and severity of nicotine dependence in Israeli women (10) analyzed 39 SNPs in 11 nicotinic receptor subunit genes. Their single SNP analyses also did not detect association with SNPs in α4, including rs2236196, rs1044396 and rs1044397, although finding nominal significance in the α7, α9, β2 and β3 subunits. Their study did not include the same SNPs in the β3 subunit and α5–α3–β4 cluster comprising our four strongest associations in nicotinic receptor genes; they did analyze our fifth ranking nicotinic receptor SNP, rs1051730, and found a suggestive P-value of 0.08 when comparing ‘high’ nicotine-dependent subjects with ‘low’ nicotine-dependent subjects in a much smaller sample than ours.

Our study was unable to corroborate reported association findings of Beuten et al. (18) for the β2 subunit of the GABAB receptor GABBR2 (also known as GABABR2, GABAB2 and GPR51). We genotyped 32 SNPs in GABBR2 including five SNPs reported by Beuten et al. (18), three of which were the most significant in European-Americans by at least one test in that study. The primary P-value in our study was greater than 0.07 for all 32 SNPs and greater than 0.3 for the five previously reported SNPs.

Similarly, we do not find evidence for nominal association in our primary test of the 31 SNPs we genotyped for the DDC gene, which includes an SNP previously reported significant in European-Americans (35). And of the 11 SNPs covering the gene BDNF, three (rs6265, rs2030324 and rs7934165) were previously reported as associated in European-American males (21); these three were not significant in our sample (primary P = 0.86, 0.088 and 0.12, respectively), and the lowest primary P-value among the remaining eight SNPs was 0.02, which does not survive correction for the six LD bins covering the gene. Note that our primary test uses a log-additive model, whereas previous reports sometimes found their strongest results under other models (e.g. recessive and dominant); however, for these previously reported associations, our tests for departure from the log-additive model did not find evidence for improvement under alternative modes of inheritance.

Our primary association analysis was a two-degree-of-freedom test of the significance of adding genotype and genotype by gender interaction terms to the base predictors sex and site. This approach helps to ensure that we detect associations that are significantly influenced by gender. The disadvantage is that the extra degree of freedom makes associations with an insignificant gender interaction appear less significant overall.

Because our controls were highly selected and could even be considered ‘protected’ against susceptibility to nicotine dependence, interpretation of our results must consider the possibility that an association signal from our study may actually represent protective rather than risk effects. We used the allele more frequent in cases for reporting these data as a convention to facilitate comparison of the odds ratios among SNPs; this should not be viewed as a conclusion of how a particular variant influences the risk for nicotine dependence. The precise determination of the mechanism by which a variant alters risk can only come from functional studies.

We performed additional tests for association using only the individuals from the US sample to determine whether our primary conclusions still hold in this subset of 797 cases and 813 controls (the Australian sample alone is too small to test for association, with only 253 cases and 66 controls). We used the same logistic regression method as for the entire sample except for the omission of the term ‘site’. The Spearman rank-order correlation of the P-values between the two tests for association was 0.87. Supplementary Material, Table S2 shows the results of the US-only analysis for the 39 SNPs from our list of top associations (Table 2), with the original ordering and FDR filtering, side by side with results from the US sample. Supplementary Material, Table S3 describes the result of completely starting over and using only the US sample to order by P-value, filter by FDR <40% and compute LD bins. In this case, 30 of 39 (77%) SNPs in our original set of top signals (Table 2) appeared in the list of top signals in the US-only analysis (Supplementary Material, Table S3), which includes the genes CHRNA5 and CHRNB3, the top genes from our initial analysis. Hence, although there were some changes in the order of the results, the primary conclusion of association with the nicotinic receptors CHRNB3 and CHRNA5 remains valid when the analysis is performed on the US subsample.

As a companion to the candidate gene study, a GWAS was carried out in parallel (23). Approximately 2.4 million SNPs were genotyped across the human genome in a two-stage design that began with pooled genotyping in a portion of the sample and followed with individual genotyping of the entire sample for the top 40 000 signals. The 21st strongest signal from the GWAS was due to an SNP 3 kb upstream of the first 5′ promoter of CHRNB3, the gene with the strongest signal from our candidate gene study. This signal came from the SNP rs13277254 (genotyped only for the GWAS and not for our candidate gene study) and had a P-value of 6.52 × 10−5. This convergence from two different study designs provides further support that the signals in this gene are not random effects.

In conclusion, we have identified several genetic variants in candidate genes that are associated with nicotine dependence, the majority of which are in nicotinic receptor genes. One of the SNPs implicated has a number of biologically relevant consequences, making it a particularly plausible candidate for influencing smoking behavior. These variants should be considered potential sources of genetic risk. Additional research is required to replicate these findings and to explore the possible role of these variants in the pharmacogenetics of response to nicotine dosing as well as to treatments for nicotine dependence.



All subjects (Table 6) were selected from two ongoing studies. The Collaborative Genetic Study of Nicotine Dependence (US) recruited subjects from three urban areas in the USA and the Nicotine Addiction Genetics (Australian) study collected subjects of European ancestry from Australia. Both studies used community-based recruitment and equivalent assessments were performed. Subjects who were identified as being smokers, using the criteria that they had smoked 100 or more cigarettes in their lifetimes, were queried in more detail using the FTND questionnaire. The US samples were enrolled at sites in St Louis, Detroit and Minneapolis, where a telephone screening of community-based subjects was used to determine whether subjects met criteria for case (current FTND ≥4) or control status. The study participants for the Australian sample were enrolled at the Queensland Institute of Medical Research in Australia, where families were identified from two cohorts of the Australian twin panel, which included spouses of the older of these two cohorts, for a total of approximately 12 500 families with information about smoking. The ancestry of the Australian samples is predominantly Anglo-Celtic and Northern European. The Institutional Review Boards approved both studies and all subjects provided informed consent to participate. Blood samples were collected from each subject for DNA analysis and submitted, together with electronic phenotypic and genetic data for both studies, to the National Institute on Drug Abuse (NIDA) Center for Genetic Studies, which manages the sharing of research data according to the guidelines of the National Institutes of Health.

Table 6
A summary of covariates and FTND scores in our sample: by definition, all control subjects scored 0 on the FTND (34)

Case subjects were required to score 4 or more on the FTND (36) during the heaviest period of cigarette smoking (the largest possible score is 10). This is a common criterion for defining nicotine dependence. Control subjects must have smoked 100 or more cigarettes in their lifetimes, yet never exhibited symptoms of nicotine dependence: they were smokers who scored 0 on the FTND during the heaviest period of smoking. By selecting controls that had a significant history of smoking, the genetic effects that are specific to nicotine dependence can be examined. Additional data from the Australian twin panel support this designation of a control status (23). In the US study, using the sample of 15 086 subjects who were determined to be smokers (smoked 100 or more cigarettes in their lifetimes) during the screening process, the prevalence of ‘nicotine dependence’ (FTND ≥ 4) was 46.4% and the prevalence of ‘smoking without nicotine dependence’ (FTND = 0) was 20.1%.
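The case and control definitions above reduce to a simple rule on two measurements. The following is a minimal sketch in Python, assuming each subject is described by a lifetime cigarette count and the FTND score from the heaviest period of smoking; the function and argument names are illustrative and are not taken from the study's own software.

```python
def classify_subject(lifetime_cigarettes, ftnd_heaviest):
    """Return 'case', 'control' or None (meets neither definition)."""
    if not 0 <= ftnd_heaviest <= 10:
        raise ValueError("FTND scores range from 0 to 10")
    # Only smokers (100 or more lifetime cigarettes) were assessed with the FTND.
    if lifetime_cigarettes < 100:
        return None
    if ftnd_heaviest >= 4:
        return "case"
    if ftnd_heaviest == 0:
        return "control"
    # Smokers scoring 1-3 are neither cases nor controls.
    return None


if __name__ == "__main__":
    for cigs, score in [(5000, 6), (150, 0), (150, 2), (50, 0)]:
        print(cigs, score, classify_subject(cigs, score))
```

Smokers with intermediate scores (1 to 3) fall outside both groups, which is what makes the contrast between cases and controls sharp.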

Candidate gene selection

The criteria for the selection of the candidate genes were based on known biology, correlations between nicotine dependence and other phenotypes and previous reports on the genetics of nicotine dependence and related traits. Genes were nominated by an expert committee of investigators from the NIDA Genetics Consortium with expertise in the study of nicotine and other substance dependence. These included classic genes that respond to nicotine, such as the nicotinic receptors, and other genes involved in the addictive process.

In total, 448 genes were considered for SNP genotyping. The genes were divided into two categories: A and B. Category A genes, which included the nicotinic and dopaminergic receptors, were considered to have a higher prior probability of association and were guaranteed to be targeted for genotyping. As our study design allowed for individual genotyping of approximately 4000 SNPs, the category B genes were too numerous to receive adequate SNP coverage once the A genes had been sufficiently covered. We therefore prioritized the category B genes using the results of the pooled genotyping from the companion GWAS (23). Genes exhibiting the most evidence for association with nicotine dependence were prioritized for coverage. Some genes are larger than others and, therefore, may receive more SNPs. These genes may therefore appear more significant because of the increased number of tests performed. Hence, we corrected for multiple testing as follows. For a given candidate gene on the B list, if Pmin is the minimum P-value found in the pooled genotyping of stage I of the GWAS for all the SNPs genotyped in the gene and N is the number of SNPs tested, then we computed the corrected minimum P-value Pcorr using the formula

$$P_{\mathrm{corr}} = 1 - (1 - P_{\min})^{(N+1)/2} \qquad (1)$$

As roughly 50% of the SNPs in any chromosomal region are in high LD (37), we used (N + 1)/2 as the exponent. The category B genes were then ranked by these corrected minimum P-values and SNPs were selected from the top of the ranked list until our resources were exhausted.
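The gene-ranking step can be sketched in a few lines of Python. The sketch assumes the correction has the Šidák-style form Pcorr = 1 − (1 − Pmin)^((N + 1)/2), which is consistent with the exponent stated above; the function names and the example genes are hypothetical.

```python
def corrected_min_p(p_min, n_snps):
    """Corrected minimum P-value for a gene with n_snps pooled SNPs."""
    if not 0.0 <= p_min <= 1.0:
        raise ValueError("p_min must lie in [0, 1]")
    if n_snps < 1:
        raise ValueError("n_snps must be at least 1")
    # (N + 1) / 2 approximates the number of independent tests,
    # given that roughly half of the SNPs are in high LD.
    exponent = (n_snps + 1) / 2
    return 1.0 - (1.0 - p_min) ** exponent


def rank_genes(pooled):
    """Order genes by corrected minimum P-value, smallest first.

    pooled maps a gene name to (p_min, n_snps).
    """
    return sorted(pooled, key=lambda gene: corrected_min_p(*pooled[gene]))


if __name__ == "__main__":
    # Hypothetical genes: a large gene with a small raw minimum P-value
    # and a single-SNP gene with a slightly larger one.
    example = {"GENE_A": (0.01, 21), "GENE_B": (0.02, 1)}
    print(rank_genes(example))
```

In this toy input the larger gene has the smaller raw minimum P-value, yet it ranks second after correction, which is the size bias the correction is meant to remove.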

SNP selection

We chose all SNPs within exons, regardless of the allele frequency, and all SNPs within ±2 kb of annotated gene promoters where the European-American minor allele frequency was at least 4%. We then chose tag SNPs for all European-American LD bins (38) crossing the exons of the candidate genes, with two SNPs for each bin with three or more SNPs. SNPs meeting these criteria were chosen first from those selected for individual genotyping in the companion pooled study (23) and then to cover the physical regions as uniformly as possible if there was choice available for the other SNPs. In addition, we included specific SNPs that have been reported in the literature as being associated with nicotine dependence (8,9,18,34).

Pooled genotyping

See the companion paper by Bierut et al. (23) for a description of the pooled genotyping.

Individual genotyping

For individual genotyping, we designed custom high-density oligonucleotide arrays to interrogate SNPs selected from candidate genes, as well as quality control SNPs. Each SNP was interrogated by 24 25mer oligonucleotide probes synthesized on a glass substrate. The 24 features comprise four sets of six features interrogating the neighborhoods of SNP reference and alternate alleles on forward and reverse strands. Each allele and strand is represented by five offsets: −2, −1, 0, 1 and 2, indicating the position of the SNP within the 25mer, with 0 being at the 13th base. At offset 0, a quartet was tiled, which includes the perfect match to reference and alternate SNP alleles and the two remaining nucleotides as mismatch probes. When possible, the mismatch features were selected as a purine nucleotide substitution for a purine perfect match nucleotide and as a pyrimidine nucleotide substitution for a pyrimidine perfect match nucleotide. Thus, each strand and allele tiling consisted of six features comprising five perfect match probes and one mismatch.
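The feature count can be checked by enumerating the tiling. This is an illustrative Python sketch only: it reads the two strands as forward and reverse (an assumption), uses hypothetical labels for the features, and does not model the choice of mismatch nucleotide.

```python
from itertools import product

ALLELES = ("reference", "alternate")
STRANDS = ("forward", "reverse")  # assumed names for the two tiled strands
OFFSETS = (-2, -1, 0, 1, 2)       # SNP position relative to the 13th base


def tile_features():
    """List the (allele, strand, offset, kind) features tiled for one SNP."""
    features = []
    for allele, strand in product(ALLELES, STRANDS):
        # Five perfect-match probes, one per offset.
        for offset in OFFSETS:
            features.append((allele, strand, offset, "perfect_match"))
        # One mismatch probe, tiled at offset 0 only.
        features.append((allele, strand, 0, "mismatch"))
    return features


if __name__ == "__main__":
    feats = tile_features()
    print(len(feats))  # number of features per SNP
```

Two alleles times two strands give four sets, and each set holds five perfect-match probes plus one mismatch, for 24 features per SNP.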

Individual genotype cleaning

Individual genotypes were cleaned using a supervised prediction algorithm for the genotyping quality, compiled from 15 input metrics that describe the quality of the SNP and the genotype. The genotyping quality metric correlates with a probability of having a discordant call between the Perlegen platform and outside genotyping platforms (i.e. non-Perlegen HapMap project genotypes). A system of 10 bootstrap aggregated regression trees was trained using an independent data set of concordance data between Perlegen genotypes and HapMap project genotypes. The trained predictor was then used to predict the genotyping quality for each of the genotypes in this data set (see Supplementary Material for more information regarding cleaning).

Population stratification analysis

In order to avoid false positives due to population stratification, we performed an analysis using the STRUCTURE software (39). This program identifies subpopulations of individuals who are genetically similar through a Markov chain Monte Carlo sampling procedure using markers selected across the genome. Genotype data for 289 high performance SNPs were analyzed across all 1929 samples. This analysis revealed no evidence for population admixture.

Genetic association analysis

An ANOVA testing the predictive power of various phenotypes indicated that gender and site (USA or Australia) were the most informative and that age and other demographic variables did not account for significant additional trait variance (Table 7). Our primary method of analysis was based on a logistic regression: if P is the probability of being a case, then our linear logistic model has the form

$$\log\left(\frac{P}{1-P}\right) = \alpha + \beta_1 g + \beta_2 s + \beta_3 G + \beta_4\, gG \qquad (2)$$

Table 7
ANOVA analysis of covariates

where α is the intercept, g the gender coded 0 or 1 for males or females, respectively, and s the site coded as 0 or 1 for USA or Australia, respectively. The variable G represents genotype and is coded as the number of copies of the risk allele, defined as the allele more common in cases than in controls. It follows from Eq. (2) that the risk due to genotype is being modeled using a log-linear (i.e. multiplicative) scale rather than an additive scale. Maximum likelihood estimates for the coefficients and confidence intervals for odds ratios were computed using the SAS software package (40).

The predictors of our base model were gender and site. We then tested whether the addition of genotype and gender by genotype interaction to the base model significantly increased the predictive power and used the resulting two-degree-of-freedom χ2 statistic to rank the SNPs by the corresponding P-values. Table 8 shows the formulas for the odds ratios in terms of the coefficients.
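The two-degree-of-freedom test and the gender-specific odds ratios can be illustrated with a short Python sketch. It assumes the interaction enters the model as the product of g and G, so that the per-allele log odds ratio is the genotype coefficient in males (g = 0) and the sum of the genotype and interaction coefficients in females (g = 1). The coefficient names and function names are hypothetical, and the log-likelihoods would come from whatever software fits the two models.

```python
import math


def lrt_pvalue_2df(loglik_base, loglik_full):
    """Likelihood-ratio statistic and P-value for two added parameters.

    For a chi-square distribution with 2 degrees of freedom the upper
    tail probability has the closed form exp(-x / 2).
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_base))
    return stat, math.exp(-stat / 2.0)


def per_allele_odds_ratios(beta_genotype, beta_interaction):
    """Per-allele odds ratios by gender (g = 0 male, g = 1 female)."""
    return {
        "male": math.exp(beta_genotype),
        "female": math.exp(beta_genotype + beta_interaction),
    }


if __name__ == "__main__":
    stat, p = lrt_pvalue_2df(-1200.0, -1190.0)
    print(round(stat, 2), p)
    print(per_allele_odds_ratios(math.log(1.5), 0.0))
```

With no interaction (a zero interaction coefficient) the two gender-specific odds ratios coincide, and the second degree of freedom is spent without gain, which is the loss of power noted in the discussion of the primary test.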

Table 8
Coding of the gender term g and the genotype term G used in the primary logistic regression model

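Following these primary analyses, we further analyzed the top ranked SNPs for significant evidence of dominant or recessive mode of inheritance. This was done using a logistic regression of the form

$$\log\left(\frac{P}{1-P}\right) = \alpha + \beta_1 g + \beta_2 s + \beta_3 G + \beta_4 H \qquad (3)$$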

where H is 1 for heterozygotes and 0 otherwise. When H is significant, the interpretation is that the genetic effect deviates significantly from the log-linear model. We then compute odds ratios for dominant and recessive models, as described in Table 9.
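To make the role of H concrete, the sketch below codes a genotype as the pair (G, H) and derives genotype-specific odds ratios relative to carrying no copies of the risk allele. It assumes the secondary model adds a single coefficient for H to the genotype term; the coefficient names are hypothetical.

```python
import math


def code_genotype(risk_allele_copies):
    """Return (G, H): allele count and heterozygote indicator."""
    if risk_allele_copies not in (0, 1, 2):
        raise ValueError("a genotype carries 0, 1 or 2 copies")
    G = risk_allele_copies
    H = 1 if risk_allele_copies == 1 else 0
    return G, H


def genotype_odds_ratios(beta_G, beta_H):
    """Odds ratios versus zero copies for one and two risk alleles."""
    return {
        "one_copy": math.exp(beta_G + beta_H),  # G = 1, H = 1
        "two_copies": math.exp(2.0 * beta_G),   # G = 2, H = 0
    }


if __name__ == "__main__":
    b = math.log(1.4)
    print(genotype_odds_ratios(b, 0.0))  # multiplicative: H has no effect
    print(genotype_odds_ratios(b, -b))   # recessive: one copy carries no risk
    print(genotype_odds_ratios(b, b))    # dominant: one copy equals two
```

Under this coding a heterozygote coefficient of zero recovers the multiplicative model, a coefficient equal to minus the genotype coefficient gives a purely recessive pattern, and one equal to the genotype coefficient gives a purely dominant pattern.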

Table 9
Codings used for the secondary logistic regression model

Linkage disequilibrium

We estimated r2 correlation separately in cases and controls for all pairs of SNPs within 1 Mb windows using an EM algorithm as implemented in the computer program Haploview (version 3.2) (41). Our final measure of LD is the minimum r2 from the two samples. Following the algorithm in Hinds et al. (38) and Carlson et al. (42), SNPs were grouped into bins, where every bin contains at least one ‘tag SNP’ satisfying min(r2) ≥ 0.8 with every SNP in the bin. The group of association signals from such an LD bin can be viewed essentially as a single signal.
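The binning step can be sketched as a greedy procedure in Python. This is a simplified illustration in the spirit of the cited algorithms, not a reimplementation of them: it assumes pairwise r2 values have already been estimated, stores them in dictionaries keyed by unordered SNP pairs, and uses hypothetical function names.

```python
def min_r2(r2_cases, r2_controls):
    """Combine case and control r2 estimates by taking the minimum."""
    return {
        pair: min(r2_cases[pair], r2_controls[pair])
        for pair in r2_cases
        if pair in r2_controls
    }


def greedy_ld_bins(snps, r2, threshold=0.8):
    """Group SNPs into bins, each with a tag SNP linked to all members.

    r2 maps frozenset({snp_a, snp_b}) to an r2 value; missing pairs
    are treated as unlinked.
    """
    def linked(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # Pick the SNP that tags the largest number of unbinned SNPs.
        tag = max(remaining, key=lambda s: sum(linked(s, t) for t in remaining))
        members = [t for t in remaining if linked(tag, t)]
        bins.append({"tag": tag, "members": members})
        remaining = [t for t in remaining if t not in members]
    return bins


if __name__ == "__main__":
    pairs = {frozenset(("a", "b")): 0.99, frozenset(("b", "c")): 0.85,
             frozenset(("a", "c")): 0.50}
    print(greedy_ld_bins(["a", "b", "c", "d"], pairs))
```

Every member of a bin meets the threshold with the bin's tag SNP, although two non-tag members (here "a" and "c") need not meet it with each other.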

Correcting for multiple testing

To account for multiple testing, we estimated the FDR (24,25) to control the proportion of false positives among our reported signals. As category A genes were considered to have a higher prior probability of association, we followed the recommendations of Roeder et al. (43) and weighted category A gene SNPs a moderate 10-fold more heavily. Therefore, the category B genes must have stronger association signals for inclusion in our list of FDR-filtered top signals. For each P-value, we computed a weighted P-value Pw using the formula

$$
P_w =
\begin{cases}
wP & \text{category A genes} \\
10wP & \text{category B genes}
\end{cases}
$$

where w was defined so that the average of the weights is 1 (this depends on the number of SNPs selected for A and B genes). For every weighted P-value Pw0, we computed a q-value qw0 that has the property that the FDR is no greater than qw0 among all SNPs with qw < qw0 (25,44). This was done using the computer program QVALUE (version 1.1) (45). Our estimates of the FDR are based on the q-values.
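
The weighting and q-value steps can be sketched as follows, under two loudly stated assumptions. First, "weights" is read in the sense of Roeder et al. (43): each P-value is divided by a weight u, category A SNPs get a weight 10 times that of category B SNPs, and the weights average 1 over all SNPs, so that w in the formula above equals 1/u_A. Second, the q-values here use the Benjamini-Hochberg step-up adjustment (44), which fixes the proportion of true nulls at 1; the QVALUE program estimates that proportion from the data, so its q-values can be smaller. The function names, and the capping of weighted P-values at 1, are illustrative choices and not taken from the paper.

```python
def category_weights(n_a, n_b, ratio=10.0):
    """Weights (u_A, u_B) with u_A = ratio * u_B and mean weight 1 over all SNPs."""
    u_b = (n_a + n_b) / (ratio * n_a + n_b)
    return ratio * u_b, u_b

def weighted_pvalue(p, category, u_a, u_b):
    """Weighted P-value P / u, capped at 1 (the cap is an assumption)."""
    u = u_a if category == "A" else u_b
    return min(1.0, p / u)

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted P-values, used as conservative q-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        q[i] = running
    return q
```

With, say, 100 category A SNPs and 900 category B SNPs, `category_weights(100, 900)` gives u_A of about 5.26 and u_B of about 0.53, so a category B SNP needs a raw P-value 10 times smaller than a category A SNP to reach the same weighted P-value.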

This method of estimating the FDR does not take into account LD. Therefore, as an additional measure to correct for multiple testing and to assess statistical significance, we estimated the FDR using permutations and P-values weighted for A and B genes, which preserves the LD structure. This was done by performing 1000 random permutations of the case–control status and testing the permuted data for association. The significance of a P-value from the original data was assessed by counting the number of times a more significant weighted P-value occurs in the random permutations, where the weights were the same as those used for the FDR estimates.
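
A permutation-based estimate of this kind might look like the sketch below. It is one plausible reading of the counting procedure, not the authors' code: the case–control labels are shuffled once per permutation and applied to all SNPs together (which is what preserves the LD structure), and the FDR at an observed weighted P-value is estimated as the average number of permuted weighted P-values at least as significant, divided by the number of observed ones. A simple allelic chi-square test stands in for the logistic regression, and `multipliers` holds the per-SNP factor applied to each P-value (w or 10w in the formula above); all names are hypothetical.

```python
import math
import random

def allelic_p(genotypes, status):
    """1-df chi-square P-value comparing allele counts in cases and controls."""
    a = sum(g for g, s in zip(genotypes, status) if s == 1)      # case, allele 1
    b = sum(2 - g for g, s in zip(genotypes, status) if s == 1)  # case, allele 2
    c = sum(g for g, s in zip(genotypes, status) if s == 0)      # control, allele 1
    d = sum(2 - g for g, s in zip(genotypes, status) if s == 0)  # control, allele 2
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0  # monomorphic SNP or an empty group: no evidence
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / margins
    return math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df

def permutation_fdr(genos, status, multipliers, n_perm=1000, seed=0):
    """Return (observed weighted P-values, permutation FDR estimates)."""
    rng = random.Random(seed)
    observed = [m * allelic_p(g, status) for g, m in zip(genos, multipliers)]
    labels = list(status)
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # one shuffle shared by every SNP
        null.extend(m * allelic_p(g, labels) for g, m in zip(genos, multipliers))
    fdr = []
    for p0 in observed:
        expected_false = sum(1 for p in null if p <= p0) / n_perm
        discoveries = sum(1 for p in observed if p <= p0)
        fdr.append(min(1.0, expected_false / discoveries))
    return observed, fdr
```

Here `genos` is a list with one genotype vector (allele counts 0, 1 or 2) per SNP, all in the same subject order as `status`.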


The authors wish to acknowledge the contributions of advisors to this project. The NIDA Genetics Consortium and two NCG committees were vital to the success of the research. The Candidate Gene Committee helped review and finalize the list of candidate genes to be genotyped with individual SNP genotyping. In addition to the authors, committee members included Andrew Bergen, Joseph Cubells, Ken Krauter, Mary Jeanne Kreek, Sharon Murphy, Huijin Ring, Ming Tsuang and Kirk Wilhelmsen. The Data Analysis Committee helped oversee analyses for the candidate gene and genome-wide association studies and investigated methodological issues in association analyses. Further, the committee assisted in data management and data sharing functions. In addition to the authors, committee members included Andrew Bergen, Gerald Dunn, Mary Jeanne Kreek, Huijun Ring, Lei Yu and Hongyu Zhao. At Perlegen Sciences, we would like to acknowledge the work of Laura Stuve, Curtis Kautzer, the genotyping laboratory, Laura Kamigaki, the sample group and John Blanchard, Geoff Nilsen and the bioinformatics and data quality groups for excellent technical and infrastructural support for this work performed under NIDA Contract HHSN271200477471C. Figures 1 and 2 were generated with the Generic Genome Browser (version 1.64) (46). In memory of Theodore Reich, founding Principal Investigator of COGEND, we are indebted to his leadership in the establishment and nurturing of COGEND and acknowledge with great admiration his seminal scientific contributions to the field. This work was supported by the NIH grants CA89392 from the National Cancer Institute, DA12854 and DA015129 from the National Institute on Drug Abuse and the contract N01DA-0-7079 from NIDA.


SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG Online.

Data access: Phenotypes and genotypes are available through the NIDA Genetics Consortium to the scientific community at time of publication.

Conflict of Interest statement. D.G.B. and K.K. are employed by Perlegen Sciences, Inc. With the exception of D.G.B. and K.K., none of the authors or their immediate families are currently involved with, or have been involved with, any companies, trade associations, unions, litigants or other groups with a direct financial interest in the subject matter or materials discussed in this manuscript in the past 5 years.


1. World Health Organization. World Health Statistics 2006. WHO Press; 2006. [accessed 14 December, 2006].
2. Warren CW, Jones NR, Eriksen MP, Asma S. Global Tobacco Surveillance System (GTSS) collaborative group. Patterns of global tobacco use in young people and implications for future chronic disease burden in adults. Lancet. 2006;367:749–753.
3. Tapper AR, Nashmi R, Lester HA. Neuronal nicotinic acetylcholine receptors and nicotine dependence. In: Madras BK, Colvis CM, Pollock JD, Rutter JL, Shurtleff D, von Zastrow M, editors. Cell Biology of Addiction. Cold Spring Harbor Laboratory Press; Cold Spring Harbor, NY: 2006.
4. Laviolette SR, Van de Kooy D. The neurobiology of nicotine addiction: bridging the gap from molecules to behavior. Nat Rev Neurosci. 2004;5:55–65.
5. Corrigall WA, Coen KM, Adamson KL. Self-administered nicotine activates the mesolimbic dopamine system through the ventral tegmental area. Brain Res. 1994;653:278–284.
6. World Health Organization. The Tobacco Atlas. Types of Tobacco Use. 2006. [accessed 19 June, 2006].
7. Lessov CN, Martin NG, Statham DJ, Todorov AA, Slutske WS, Bucholz KK, Heath AC, Madden PA. Defining nicotine dependence for genetic research: evidence from Australian twins. Psychol Med. 2004;34:865–879.
8. Feng Y, Niu T, Xing H, Xu X, Chen C, Peng S, Wang L, Laird N, Xu X. A common haplotype of the nicotine acetylcholine receptor alpha 4 subunit gene is associated with vulnerability to nicotine addiction in men. Am J Hum Genet. 2004;75:112–121.
9. Li MD, Beuten J, Ma JZ, Payne TJ, Lou XY, Garcia V, Duenes AS, Crews KM, Elston RC. Ethnic- and gender-specific association of the nicotinic acetylcholine receptor alpha4 subunit gene (CHRNA4) with nicotine dependence. Hum Mol Genet. 2005;14:1211–1219.
10. Greenbaum L, Kanyas K, Karni O, Merbl Y, Olender T, Horowitz A, Yakir A, Lancet D, Ben-Asher E, Lerer B. Why do young women smoke? I. Direct and interactive effects of environment, psychological characteristics and nicotinic cholinergic receptor genes. Mol Psychiatry. 2006;11:312–322.
11. Boustead C, Taber H, Idle JR, Cholerton S. CYP2D6 genotype and smoking behaviour in cigarette smokers. Pharmacogenetics. 1997;7:411–414.
12. Pianezza ML, Sellers EM, Tyndale RF. Nicotine metabolism defect reduces smoking. Nature. 1998;393:750.
13. Cholerton S, Boustead C, Taber H, Arpanahi A, Idle JR. CYP2D6 genotypes in cigarette smokers and non-tobacco users. Pharmacogenetics. 1996;6:261–263.
14. Comings DE, Ferry L, Bradshaw-Robinson S, Burchette R, Chiu C, Muhleman D. The dopamine D2 receptor (DRD2) gene: a genetic risk factor in smoking. Pharmacogenetics. 1996;6:73–79.
15. Shields PG, Lerman C, Audrain J, Bowman ED, Main D, Boyd NR, Caporaso NE. Dopamine D4 receptors and the risk of cigarette smoking in African-Americans and Caucasians. Cancer Epidemiol Biomarkers Prev. 1998;7:453–458.
16. Lerman C, Caporaso NE, Audrain J, Main D, Bowman ED, Lockshin B, Boyd NR, Shields PG. Evidence suggesting the role of specific genetic factors in cigarette smoking. Health Psychol. 1999;18:14–20.
17. Spitz MR, Shi H, Yang F, Hudmon KS, Jiang H, Chamberlain RM, Amos CI, Wan Y, Cinciripini P, Hong WK, Wu X. Case–control study of the D2 dopamine receptor gene and smoking status in lung cancer patients. J Natl Cancer Inst. 1998;90:358–363.
18. Beuten J, Ma JZ, Payne TJ, Dupont RT, Crews KM, Somes G, Williams NJ, Elston RC, Li MD. Single- and multilocus allelic variants within the GABA(B) receptor subunit 2 (GABAB2) gene are significantly associated with nicotine dependence. Am J Hum Genet. 2005;76:859–864.
19. Hu S, Brody CL, Fisher C, Gunzerath L, Nelson ML, Sabol SZ, Sirota LA, Marcus SE, Greenberg BD, Murphy DL, Hamer DH. Interaction between the serotonin transporter gene and neuroticism in cigarette smoking behavior. Mol Psychiatry. 2000;5:181–188.
20. Lerman C, Caporaso NE, Audrain J, Main D, Boyd NR, Shields PG. Interacting effects of the serotonin transporter gene and neuroticism in smoking practices and nicotine dependence. Mol Psychiatry. 2000;5:189–192.
21. Beuten J, Ma JZ, Payne TJ, Dupont RT, Quezada P, Huang W, Crews KM, Li MD. Significant association of BDNF haplotypes in European-American male smokers but not in European-American female or African-American smokers. Am J Med Genet B Neuropsychiatr Genet. 2005;139B:73–80.
22. Li MD. The genetics of nicotine dependence. Curr Psychiatry Rep. 2006;8:158–164.
23. Bierut LJ, Madden PAF, Breslau N, Johnson EO, Hatsukami D, Pomerleau OF, Swan GE, Rutter J, Bertelsen S, Fox L, et al. Novel genes identified in a high-density genome wide association study for nicotine dependence. Hum Mol Genet. 2006;16:24–35.
24. Hochberg Y, Benjamini Y. More powerful procedures for multiple significance testing. Stat Med. 1990;9:811–818.
25. Storey JD. A direct approach to false discovery rates. J R Stat Soc B. 2002;64:479–498.
26. CDC. Annual smoking-attributable mortality, years of potential life lost, and productivity losses-United States. Morb Mortal Wkly Rep. 2005;54:625–628.
27. Lindstrom JM. Nicotinic acetylcholine receptors of muscles and nerves: comparison of their structures, functional roles, and vulnerability to pathology. Ann N Y Acad Sci. 2003;998:41–52.
28. Salminen O, Murphy KL, McIntosh JM, Drago J, Marks MJ, Collins AC, Grady SR. Subunit composition and pharmacology of two classes of striatal presynaptic nicotinic acetylcholine receptors mediating dopamine release in mice. Mol Pharmacol. 2004;65:1526–1535.
29. Cserzo M, Wallin E, Simon I, von Heijne G, Elofsson A. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng. 1997;10:673–676.
30. Stitzel JA, Dobelis P, Jimenez M, Collins AC. Long sleep and short sleep mice differ in nicotine-stimulated 86Rb+ efflux and alpha4 nicotinic receptor subunit cDNA sequence. Pharmacogenetics. 2001;4:331–339.
31. Dobelis P, Marks MJ, Whiteaker P, Balogh SA, Collins AC, Stitzel JA. A polymorphism in the mouse neuronal alpha4 nicotinic receptor subunit results in an alteration in receptor function. Mol Pharmacol. 2002;62:334–342.
32. Butt CM, Hutton SR, Stitzel JA, Balogh SA, Owens JC, Collins AC. A polymorphism in the alpha4 nicotinic receptor gene (Chrna4) modulates enhancement of nicotinic receptor function by ethanol. Alcohol Clin Exp Res. 2003;27:733–742.
33. Butt CM, King NM, Hutton SR, Collins AC, Stitzel JA. Modulation of nicotine but not ethanol preference by the mouse Chrna4 A529T polymorphism. Behav Neurosci. 2005;119:26–37.
34. Lewohl JM, Wilson WR, Mayfield RD, Brozowski SJ, Morrisett RA, Harris RA. G-protein-coupled inwardly rectifying potassium channels are targets of alcohol action. Nat Neurosci. 1999;12:1084–1090.
35. Ma JZ, Beuten J, Payne TJ, Dupont RT, Elston RC, Li MD. Haplotype analysis indicates an association between the DOPA decarboxylase (DDC) gene and nicotine dependence. Hum Mol Genet. 2005;14:1691–1698.
36. Heatherton TF, Kozlowski LT, Frecker RC, Fagerström KO. The Fagerström test for nicotine dependence: a revision of the Fagerström tolerance questionnaire. Br J Addict. 1991;86:1119–1127.
37. Saccone SF, Rice JP, Saccone NL. Power-based, phase-informed selection of single nucleotide polymorphisms for disease association screens. Genet Epidemiol. 2006;30:459–470.
38. Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR. Whole-genome patterns of common DNA variation in three human populations. Science. 2005;18:1072–1079.
39. Pritchard JK, Stephens M, Donnelly PJ. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–959.
40. SAS Institute Inc. SAS Release 9.1.3. Cary, NC: 2004.
41. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;15:263–265.
42. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet. 2004;74:106–120.
43. Roeder K, Bacanu S-A, Wasserman L, Devlin B. Using linkage genome scans to improve power of association genome scans. Am J Hum Genet. 2006;78:243–252.
44. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57:289–300.
45. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci. 2003;100:9440–9445.
46. Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A, et al. The generic genome browser: a building block for a model organism system database. Genome Res. 2002;12:1599–1610.