We performed genome-wide association analyses for seven related smoking behaviors in two datasets totaling 4,611 individuals and 2,617 ever smokers. We selected smoking behaviors with established hereditary components and public health relevance. To the best of our knowledge, this study represents the first genome-wide association study of duration of smoking, pack years, and age at initiation of smoking. The sample size is also larger than that of most published candidate gene association studies of smoking behavior and of two previous genome-wide association studies of smoking behaviors.
Although we did not discover novel genome-wide significant (p<10⁻⁷) associations, we did find additional evidence for an association between genetic variants in the chr15q25.1 region and number of cigarettes smoked per day. Candidate gene analyses also provided suggestive evidence for association between variants in the MAOA gene region and the smoking behavior cigarettes per day.
The lack of genome-wide significant results suggests that common variants have at most a modest influence on smoking behavior. We had adequate power to detect a variant that explained even 2.5% of the variation in cigarettes per day. We had 61% power in the NHS sample and 71% power in the PLCO sample to detect such a variant at the 10⁻⁷ level; the power of the combined analysis was greater than 99%. Conversely, the lack of genome-wide significant findings does not rule out the existence of (many) common variants with small individual effects on smoking behavior, since our power to detect any one of them is small. Even with our relatively large sample size, our power to detect a variant similar to the 15q25.1 SNP rs1051730 (which was estimated to explain about 0.7% of the trait variance) at the genome-wide significance level was only 8.5% for the combined analysis (and less than 1% for either study alone).
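The dependence of power on variance explained and sample size can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom test whose statistic follows a non-central chi-square distribution with non-centrality n·R²/(1−R²); the function name and the sample sizes are illustrative assumptions (2,617 is the count of ever smokers, used here as a stand-in for the analyzed sample), and the sketch is not expected to reproduce the percentages reported above, which depend on the exact per-trait sample sizes and model used.

```python
from statistics import NormalDist


def power_quantitative_trait(n, r2, alpha=1e-7):
    """Approximate power of a two-sided 1-df association test.

    n     : number of subjects (hypothetical input)
    r2    : fraction of trait variance explained by the variant
    alpha : significance threshold
    """
    std_normal = NormalDist()
    # Non-centrality parameter under an additive linear model (assumption).
    ncp = n * r2 / (1.0 - r2)
    # Two-sided critical value on the z scale; the lower tail is inverted
    # and negated to avoid rounding error close to 1.
    z_crit = -std_normal.inv_cdf(alpha / 2.0)
    shift = ncp ** 0.5
    # A 1-df non-central chi-square is the square of a shifted normal,
    # so the rejection probability is the sum of both tails.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Illustrative sample sizes only; not the per-study sizes of the paper.
    for n in (1000, 2000, 2617):
        for r2 in (0.025, 0.007):
            print(n, r2, round(power_quantitative_trait(n, r2), 3))
```

Under these assumptions, power rises steeply with both sample size and variance explained, which is the qualitative pattern described in the paragraph above.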
SNPs at the nicotinic receptor candidate genes CHRNA3 (chr15q25.1) and CHRNA1 (chr2q31.1) are associated in the CGEMS sample with three smoking behaviors: CPD, PKYRS and SMKAGE. Another candidate gene association study, which investigated 348 of the 359 candidate genes included in this study, evaluated association with a dichotomized nicotine dependence phenotype and identified nicotinic receptor SNPs associated with FTND, including rs578776 and rs1051730 within CHRNA3, and rs16969968 within CHRNA5. In the candidate gene group analysis, nicotinic receptors were the gene group most significantly associated with CPD and were also associated with the phenotype SMKDU.
Finally, we combined our chr15q25.1 results with data from three other published reports (Table S3). The SNP rs1051730, found within CHRNA3 (Ex5+268), was highly significantly associated with CPD (p ); the SNP rs8034191 (LOC123688 IVS2+256) was also highly significantly associated with CPD (p ). These SNPs were evaluated using a total of 26,789 (rs1051730) or 24,891 (rs8034191) smokers from this study and two other reports. The CHRNA5 SNP rs16969968 (Ex5-54, D398N) was significantly associated (p<0.01) with CPD in this study but not in an earlier, smaller study; combined evidence for association in 3,464 smokers remained significant (p ). Comparative judgments of the relative importance of the individual SNPs are not possible because of the different sample sizes, the strong LD among the SNPs, and the inability to adjust for the effects of the other SNPs in this meta-analysis.
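One common way to combine per-study estimates for a single SNP is an inverse-variance weighted fixed-effect meta-analysis. The sketch below illustrates that general technique only; the text does not state which combination method was used, and the function name and the input values in the usage example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates : list of (beta, standard_error) pairs, one per study,
                all for the same SNP and the same effect allele.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical per-study effects (cigarettes per day per allele); not real data.
    hypothetical = [(1.0, 0.40), (0.8, 0.25), (1.1, 0.30)]
    print(fixed_effect_meta(hypothetical))
```

Because each SNP is pooled separately, a calculation of this kind cannot adjust one SNP for another in LD with it, which is why the relative importance of the individual SNPs cannot be judged from such results.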
Our candidate gene analyses identified an association (rs3027409, p<5.4×10⁻⁵) between genetic variation in MAOA and a dichotomized measure of smoking intensity (10 or fewer cigarettes smoked per day versus more than 10). This was the only gene-level result that remained significant after Bonferroni correction for the number of genes tested, which we regard as a conservative multiple-testing correction. This association is notable because of the role of the monoamine oxidases in the regulation of catecholamines and the inhibition of monoamine oxidases A and B by tobacco smoke. There is substantial evidence that smoking results in reduced levels of the monoamine oxidase enzymes, and the subsequent reduced catabolism of dopamine likely contributes to the reinforcing and motivating effects of smoking. Investigations of MAO-related polymorphisms in relation to alcoholism, Parkinson disease and smoking have yielded mixed results; our results suggest that further investigation of this X-chromosome locus is warranted.
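The Bonferroni correction mentioned above can be sketched as follows. The number of tests (the 359 candidate genes mentioned earlier) and the family-wise level of 0.05 are assumptions made for illustration, as is treating the reported bound of 5.4×10⁻⁵ as the gene-level p-value; none of these is confirmed by this section.

```python
def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return family_wise_alpha / n_tests


def passes_bonferroni(p_value, family_wise_alpha, n_tests):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(family_wise_alpha, n_tests)


if __name__ == "__main__":
    N_GENES = 359            # assumption: one test per candidate gene
    FAMILY_WISE_ALPHA = 0.05  # assumption: conventional family-wise level
    REPORTED_BOUND = 5.4e-5   # bound reported for the MAOA association
    print(bonferroni_threshold(FAMILY_WISE_ALPHA, N_GENES))
    print(passes_bonferroni(REPORTED_BOUND, FAMILY_WISE_ALPHA, N_GENES))
```

The correction is conservative because it controls the family-wise error rate as if all tests were independent, whereas SNPs and genes in LD are correlated.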
The gene group analysis that we performed provides one way to summarize the statistical evidence for association between a trait and multiple genetic variants across groups of genes that share sequence similarity and function. Nicotinic cholinergic receptors and voltage-dependent calcium-activated potassium channels were significantly associated with CPD (gene group P<0.01). The nicotinic receptor findings are discussed above. The association of rs7050529 (IVS3+286 of TRPC5) with CPD is notable because a closely related family member, TRPC7, was previously significantly associated with nicotine dependence. The transient receptor potential cation family is a superfamily of 28 genes coding for cationic ion channels that respond to temperature, endogenous and exogenous organic compounds, Ca²⁺ flux, and mechanical stimuli, and that are expressed in nearly every tissue. This study, the NICSNP study and Feng et al., 2006 have identified significant associations between five Transient Receptor Potential (TRP) subfamily members and nicotine-related behaviors in the canonical (this study and two other reports) and vanilloid subfamilies (this study and Saccone et al., 2007, Supplementary Material). Recently, Gu et al., 2005 have shown that vanilloid subfamily members are expressed in the lung and are responsible for the pulmonary chemoreflex response, suggesting that further study of these TRP subfamilies and their potential role in smoking behavior and downstream consequences may be fruitful.
The cytochrome P450, cell cycle control, and alcohol dehydrogenase candidate gene groups also exhibited nominally significant (0.01<P_permuted<0.05) associations with smoking behaviors. The cytochrome P450 results may have been driven by associations of SNPs at CYP2B6 with EVNV and at CYP2A6 with SMKAGE. These results are consistent with evidence for the relationship between CYP2A6 genetic variation and both nicotine metabolism and smoking behavior.
In our study, the observed association between cell cycle control genes and quit status may be driven by association of SNPs at FBXL17 (gene-level P = 0.021, rs1433050) and NFKB1 (rs10489113, gene-level P ). FBXL17 is one of 68 members of the human F-box protein superfamily, a large group of ubiquitin ligases. Ubiquitin ligases function in the ubiquitin-proteasome complex, which regulates protein assembly, trafficking and degradation, a cellular activity itself regulated by nicotine. FBXL17 was also identified in the NICSNP GWAS as significantly associated with FTND, via another SNP (rs10793832). None of the SNPs in the same high linkage-disequilibrium bin as rs10793832 (according to the Perlegen genome browser) were in high linkage disequilibrium with rs1433050, the FBXL17 SNP identified in this study. One SNP genotyped in this study (rs885624) was in the same LD block as rs10793832 but was not significantly associated with quit status in either this study alone or in the combined analysis (p ).
The finding that the alcohol dehydrogenase genes were significantly associated with the smoking behavior EVNV in this analysis (e.g., ADH4, gene-level P = 0.048 (rs3828541), and ADH6, gene-level P = 0.034 (rs3857224)) suggests that genetic variation at these ADH loci may influence the establishment of smoking behavior. However, this analysis did not control for alcohol consumption, so this finding should be considered preliminary.
Because of the large number of male and female smokers, we were able to conduct genome-wide association scans stratified by gender (study), and conduct a genome-wide association scan for differences in genetic effect between men and women. Such analyses are important, because the effect for some loci may differ between men and women or be restricted to one gender, e.g., due to differences in the environment. However, no SNPs achieved genome-wide significance for association with any smoking behavior in either study, and no SNP achieved genome-wide significance for heterogeneity in effect between men and women (between studies).
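A test for a difference in genetic effect between men and women can be sketched as a z-test on the difference of the two stratum-specific estimates. This is a generic illustration that assumes the male and female estimates come from independent samples; the text does not specify the heterogeneity test actually used, and the function name and input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sex_heterogeneity_test(beta_men, se_men, beta_women, se_women):
    """Two-sided z-test for a difference between two independent effect estimates.

    Returns (z, p). Independence of the two strata is assumed, so the
    variance of the difference is the sum of the two sampling variances.
    """
    diff = beta_men - beta_women
    se_diff = sqrt(se_men ** 2 + se_women ** 2)
    z = diff / se_diff
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


if __name__ == "__main__":
    # Hypothetical per-allele effects on cigarettes per day; not real data.
    print(sex_heterogeneity_test(1.2, 0.35, 0.6, 0.30))
```

In a genome-wide scan, the resulting p-value for each SNP would still need to be judged against a genome-wide threshold such as the 10⁻⁷ level used elsewhere in this study.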
This study has several strengths. We performed a GWAS and candidate gene study investigating a variety of smoking behaviors with public health importance for the first time in a sample unselected for smoking behaviors and/or smoking-attributable disease. We confirm important findings from recent GWAS and candidate gene studies of nicotine dependence and CPD. Our sample size is relatively large, yet still not large enough to reliably detect variants with modest effects on smoking behaviors. The absence of selection bias in the cohort bases for the samples enhances generalizability to U.S. non-Hispanic whites, although a modest limitation is that the education level in both cohorts is above average. By limiting analyses to subjects of European ancestry and adjusting for principal components of population structure, we minimized the risk of false positives due to population stratification, but we are not able to detect SNP alleles associated with smoking behavior that are common in non-Europeans but rare among European-Americans. The smoking behavior characteristics for the two studies are quite similar after taking into account expected differences by gender, and the correlations among smoking behaviors are similar within NHS and PLCO (see Table S1). The combined sample has the advantage of increased power and generalizability.
The diverse smoking behaviors we investigated represent the spectrum of key events in an individual's smoking history, from initiation (age at initiation, ever versus never smoking) through establishment of dependency (smoking duration, smoking intensity, and pack years) to outcome (current versus former cigarette smoking status), with potential genetic influence at each stage. The finding that selected genes are associated with multiple phenotypes may reflect both correlations among the phenotypes and pleiotropic effects of the genes, and is a strength of the integrative approach. Although we did not identify specific candidate regions that achieved the genome-wide threshold of statistical significance, our study provides candidate genes for follow-up evaluation. Future GWAS with additional smoking behavioral measures, including nicotine dependence measures, the planned sharing of data across large consortia with increased sample size, and the functional analysis of individual SNPs will be required to achieve the necessary power and specificity to understand SNPs with small effects (OR<1.3) and effects in subgroups, explore effect modification by demographic variables, and dissect pleiotropy.