Although previous investigations have indicated a role for genetic factors in smoking initiation, the underlying genetic mechanisms are still unknown. In 2,339 adolescents from a Chinese Han population in the Wuhan Smoking Prevention Trial (Wuhan, China, 1998–1999), the authors explored the association of 57 genes in the dopamine pathway with smoking initiation. Using a conservative approach for declaring significance, positive findings were further examined in an independent sample of 603 Caucasian adolescents followed for up to 10 years as part of the Children's Health Study (Southern California, 1993–2009). The authors identified 1 single nucleotide polymorphism (rs2298122) in the calcyon neuron-specific vesicular protein gene (CALY) that was positively associated with smoking initiation in females (odds ratio = 2.21, 95% confidence interval: 1.49, 3.27; P = 8.4 × 10⁻⁵) in the Wuhan Smoking Prevention Trial cohort, and they replicated the association in females from the Children's Health Study cohort (hazard rate ratio = 2.05, 95% confidence interval: 1.27, 3.31; P = 0.003). These results suggest that the CALY gene may influence smoking initiation in adolescents, although the potential roles of underlying psychological characteristics that may be components of the smoking-initiation phenotype, such as impulsivity or novelty-seeking, remain to be explored.
As the leading cause of preventable morbidity and mortality in the world, smoking results in 5 million premature deaths worldwide each year (1, 2), and this number will increase to more than 10 million by the year 2020 (3, 4). Previous research has indicated that many adult smokers started smoking in early adolescence (5), and early initiation of smoking greatly increases the likelihood of adult smoking dependence (6), failure to quit (7), and smoking-related health problems (8). Numerous studies have documented demographic, psychological, social, and cultural risk and protective factors for smoking initiation. Demographic factors include age, ethnicity, immigrant status, and acculturation (9, 10). Psychological factors include depression (8, 11), impulsivity (12), and other personality traits. Social factors include the smoking behavior of friends, siblings, and parents (13, 14), bonding with one's school or other prosocial institutions (15), family cohesion and conflict (16), and social norms. Cultural factors include cultural values, discrimination, and westernization (17). While it is clear that these factors influence smoking initiation, there is evidence supporting an important role of genetic factors as well (18, 19).
Although many previous investigations have focused on the genetic basis of nicotine dependence and smoking cessation (20–23), there has been less research on smoking initiation (24–27). Most candidate-gene-based association studies on smoking initiation have focused on the effects of a limited number of genes, with a limited number of variants within each gene (28–32). Furthermore, 2 recent genome-wide association studies failed to identify single nucleotide polymorphisms (SNPs) associated with smoking initiation on the genome-wide significance level in adults (33, 34). In contrast to these 2 distinct study designs, a pathway-based study (35, 36) inspects the association of multiple variants in multiple genes in 1 or more plausible pathways based on a priori biologic information. By including multiple variants and multiple genes and focusing on hypothesis-driven rather than purely exploratory methods, a well-designed pathway-based investigation could offer some advantages for identifying variants and providing clues to the mechanisms underlying this complex behavior (37, 38).
There is ample evidence indicating the role of the dopamine pathway in smoking-related behaviors. The dopamine pathway plays a central role in the brain's reward system (39), potentially motivating drug-seeking and other behaviors related to addiction (40). Furthermore, the dopamine pathway is involved in many psychological traits that often co-occur with smoking, such as depression and anxiety (41–43). Moreover, previous investigations demonstrated that rats with high extracellular dopamine levels have persistently high exploratory activity (44) and that overexpression of the dopamine transporters can lead to impulsive behaviors (45). Both of these phenotypes can affect a person's reaction to smoking cues, with a subsequent impact on smoking initiation (46, 47).
With a cross-sectional sample of more than 2,300 adolescents from a genetically homogeneous Han population in Wuhan, China, we explored the association between smoking initiation and SNPs in 57 dopamine pathway genes. Significant findings were then examined in the Children's Health Study (CHS), an independent cohort of adolescents in Southern California, whose smoking behavior has been followed for up to 10 years.
The Wuhan Smoking Prevention Trial (WSPT) is a longitudinal smoking prevention trial initiated by the China Center for Disease Control and Prevention and the Department of Preventive Medicine at the University of Southern California (Los Angeles, California) in 1998 (48). The cohort includes 2,661 adolescents from 14 middle schools in 7 urban districts in Wuhan, China, who were enrolled between December 1998 and January 1999. The participating adolescents completed a 200-item survey for assessment of their baseline smoking behavior and social and economic status, with a response rate of 97%. Buccal cells were collected for future DNA extraction (49). The questionnaires, study protocol, data collection, and DNA sample collection were approved by the institutional review boards at the University of Southern California, the China Center for Disease Control and Prevention, and the US National Institute of Environmental Health Sciences.
Following the methods of large-scale investigations such as the Global Youth Tobacco Survey (50), smoking initiation was determined from participants’ responses to a question on lifetime tobacco exposure (“Have you ever tried smoking, even a few puffs?”). Never smokers were defined as subjects who reported that they had never smoked a cigarette, not even a puff. Ever smokers were defined as students who reported that they had ever tried smoking. This measure was used instead of alternative measures such as past-month smoking or daily smoking because it is more sensitive to low-frequency smoking behavior in samples with a low prevalence of smoking, such as young adolescents.
Details on the selection of candidate genes in the dopamine pathway are given elsewhere (51). Briefly, 57 candidate genes were selected on the basis of a systematic review of the pathway-based databases (52), as well as comprehensive bioinformatics literature mining. The selected candidate genes mainly consisted of genes encoding proteins involved in the dopamine-mediated reward pathway, including dopamine receptors, the dopamine transporter, and enzymes involved in biosynthesis and metabolism of dopamine. Genes hypothesized to play a role in modulating the dopamine pathway (e.g., genes in the serotonin and opioid pathways) were also included. Nine neuronal nicotinic acetylcholine receptor subunit genes which have been reported to regulate dopamine release in the brain (53) were also included. The complete list of selected candidate genes is shown in Web Table 1, which is posted on the Journal’s Web site (http://aje.oxfordjournals.org/).
We selected 1,295 SNPs within our candidate genes. Several candidate genes are positioned adjacent to each other on a chromosome, reducing the regions of interest for tag SNP selection to 53 (51 gene regions on the autosomes and 2 on the X chromosome). SNP selection within each gene region (±10 kilobases) was carried out with a beta version of the Snagger software (54), using HapMap data from the Han Chinese (CHB) and Caucasian (CEU) populations (Genome Build 35) (55). Among the 1,295 selected SNPs, 118 have an a priori putative function, and conditional on their inclusion, an additional 1,177 tag SNPs were chosen to capture the underlying genetic structure. Overall, within the 53 gene regions, 93% of all SNPs in HapMap have substantial linkage disequilibrium (r² ≥ 0.80) with at least 1 SNP genotyped in this study in the HapMap Chinese population, indicating that the selected SNPs capture the genetic structure in these gene regions adequately.
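The r² ≥ 0.80 tagging criterion can be illustrated with a minimal Python sketch of the standard pairwise linkage disequilibrium statistic for 2 biallelic loci. This is not the Snagger implementation; the function names and the example frequencies are hypothetical and chosen only for illustration.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise r^2 between two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)  # product of allele-frequency variances
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


def is_tagged(r2: float, threshold: float = 0.80) -> bool:
    """True when a SNP is captured by a genotyped SNP at the given r^2 threshold."""
    return r2 >= threshold


# Hypothetical frequencies: the two minor alleles always travel together,
# so the SNPs are in perfect linkage disequilibrium.
r2_perfect = ld_r_squared(p_ab=0.07, p_a=0.07, p_b=0.07)

# Hypothetical frequencies: the haplotype frequency equals the product of the
# allele frequencies, so the loci are independent.
r2_none = ld_r_squared(p_ab=0.2 * 0.3, p_a=0.2, p_b=0.3)
```

Under this criterion, a HapMap SNP counts as captured when at least 1 genotyped SNP in the region reaches the threshold with it.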
SNP genotyping was performed in the Genomics Core Facility at the University of Southern California's Norris Comprehensive Cancer Center using the Illumina GoldenGate Genotyping Assay (Illumina, San Diego, California). Quality control procedures included automated protocols utilizing robotics and bar coding for the entire genotyping process and the inclusion of replicates and HapMap CEPH trios to identify genotyping errors.
Among the 2,661 subjects, genotype data were available for 2,443 persons. Of the 1,295 SNPs genotyped, 47 with a call rate of zero were excluded. Thereafter, 95 persons with call rates less than 95% were excluded. After removal of these persons, 25 SNPs with call rates less than 95% were excluded. An additional 9 persons with no phenotype information were removed, leaving 2,339 subjects available for analysis. Of the remaining 1,223 SNPs, 92% had a SNP call rate of 99% or more.
Untyped SNPs were imputed on the basis of the haplotype structure of the HapMap East Asian population (58) for the 51 autosomal chromosome gene regions in the WSPT. Genotypes at 2,260 SNPs were imputed using MACH software (59). Of the imputed genotypes, 189 SNPs with imputation quality scores lower than 0.80 were excluded. After combining the genotyped and imputed SNPs, 165 SNPs with P values smaller than 0.005 from the exact Hardy-Weinberg equilibrium test and 429 SNPs with minor allele frequencies less than 0.01 were removed. Thus, 2,700 SNPs were included in the final analysis.
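The quality control counts reported above can be checked with a short Python tally. All numbers are taken from the text; the variable names are hypothetical, and the sketch assumes that the Hardy-Weinberg and minor-allele-frequency exclusions are disjoint sets, as the separate counts suggest.

```python
# Sample-level filtering
samples = 2443        # subjects with genotype data (of 2,661 enrolled)
samples -= 95         # persons with call rates below 95%
samples -= 9          # persons with no phenotype information

# SNP-level filtering of genotyped markers
genotyped_snps = 1295
genotyped_snps -= 47  # call rate of zero
genotyped_snps -= 25  # call rate below 95% after sample removal

# Imputed markers
imputed_snps = 2260
imputed_snps -= 189   # imputation quality score below 0.80

# Combined set, then Hardy-Weinberg and minor-allele-frequency filters
combined_snps = genotyped_snps + imputed_snps
final_snps = combined_snps - 165 - 429

print(samples, genotyped_snps, imputed_snps, combined_snps, final_snps)
```

The tally reproduces the 2,339 subjects, the 1,223 genotyped SNPs, and the 2,700 SNPs carried into the final analysis.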
The STRUCTURE program (60) was used to estimate the coefficient of ancestry for each person in the WSPT sample. No large-scale mixture was observed for any of the subjects (96.8% of the participants had an estimated Asian coefficient of ancestry greater than 0.90). Furthermore, the results were similar from analyses with and without adjustment for the coefficient of ancestry.
A generalized linear mixed model (with logistic link) was employed to evaluate the effect of each SNP on smoking initiation with school-level variation modeled through a school-specific intercept. Considering the previously documented gender difference in smoking behavior among Chinese adolescents (61), the effect of a given SNP was first analyzed with all subjects and then with data stratified by gender. Age and gender (when applicable) were included in the model as covariates.
To determine the heredity model, additive, dominant, and recessive models were tested, and the model with the smallest P value was selected. To obtain valid P values, we adjusted the final reported P value using the PACT (P value adjusted for correlated tests) approach (62). We first performed this adjustment to account for the determination of the heredity model and then performed the adjustment separately in each gene region to account for the multiple correlated tests due to linkage disequilibrium between markers in a single gene region. This procedure yielded a gene-level adjusted P value for each SNP. In addition, a Bonferroni correction for the number of gene regions investigated was applied to determine the α level for pathway-level significance (i.e., adjustment across all candidate gene regions; 0.05/53 = 9.43 × 10⁻⁴ for adjusted P value). All analyses were performed using the statistical package R 2.8.1 (63).
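The pathway-level threshold and the minimum-P model selection can be sketched in Python as follows. This sketch does not implement the PACT adjustment itself; it only shows the selection step and the Bonferroni comparison. The function names are hypothetical, and the dominant and recessive P values in the example are invented placeholders (only the additive value, 8.39 × 10⁻⁵ for rs2298122 in females, comes from the text).

```python
ALPHA = 0.05
N_GENE_REGIONS = 53  # 51 autosomal regions + 2 X-chromosome regions


def pathway_threshold(alpha: float = ALPHA, n_regions: int = N_GENE_REGIONS) -> float:
    """Bonferroni-corrected alpha level across all candidate gene regions."""
    return alpha / n_regions


def best_model(p_by_model: dict) -> tuple:
    """Return the (model, P value) pair with the smallest crude P value."""
    return min(p_by_model.items(), key=lambda item: item[1])


def is_pathway_significant(adjusted_p: float) -> bool:
    """Compare a gene-level adjusted P value with the pathway-level threshold."""
    return adjusted_p < pathway_threshold()


# Crude P values by heredity model; dominant and recessive are placeholders.
crude_p = {"additive": 8.39e-5, "dominant": 2.0e-4, "recessive": 3.0e-2}
model, p_min = best_model(crude_p)
threshold = pathway_threshold()

print(model, p_min, threshold)
```

Because the smallest of 3 correlated P values is selected, the crude minimum is anticonservative; this is why the gene-level adjusted P value, rather than `p_min`, is the quantity compared with the threshold.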
The CHS is a longitudinal study designed to assess the long-term effects of air pollution on children's health in the Los Angeles, California, area (64). The study consists of 5 cohorts totaling more than 11,000 children enrolled in 1993, 1996, and 2003, followed every year for up to 10 years. Genome-wide genotyping using the Illumina 550 K chip was recently performed on over 4,000 of the CHS subjects, with a primary focus of finding genes related to asthma and lung function development.
For this paper, we focused on the available data on the first wave of 1,942 children genotyped. We applied a stringent data-cleaning procedure that involves a quality control process of 30 or more steps, including identification and comparison of inter- and intraplate controls, stratified SNP and sample call rate investigations by cohort, confirmation of strand, concordance comparisons with additional genotyping performed via the Illumina GoldenGate Genotyping Assay, intensity checks, identification of chromosome X and Y anomalies, stratification by stripe, and tests of Hardy-Weinberg equilibrium. The quality control procedures that we followed have become the general standard for genome-wide association studies. Specifically for the SNPs replicated in the CHS, call rates were greater than 94.7%, and all were in Hardy-Weinberg equilibrium (P > 0.20).
Of the 1,942 genotyped subjects, information on smoking behavior was available for 926 (the other 1,016 subjects belong to the recently enrolled (2003) cohort and consist mostly of children under 10 years of age). To ensure genetic and environmental homogeneity within the CHS sample and to reduce the impact of potentially different linkage disequilibrium patterns across ethnic groups, we excluded 322 children who identified themselves as Hispanic or whose estimated European coefficient of ancestry, from a population structure analysis using the program STRUCTURE (60), was less than 90%. We further excluded 1 child who initiated smoking before enrollment in the CHS. Thus, 603 Caucasian children were included in the final analysis. All CHS subjects and their parents gave informed consent, and the study was approved by the University of Southern California Institutional Review Board.
During the investigation, CHS subjects were interviewed annually to estimate the number of cigarettes they had smoked in the year preceding each interview. Smoking initiation was defined as the first time a participant reported having smoked during the past year.
To validate the positive findings from the WSPT, we performed a Cox regression analysis to estimate the hazard rate ratio for smoking initiation over the follow-up period. Gender, the coefficient of European ancestry, and an indicator of cohort (the subjects in the first wave of genotyping came from different cohorts) were included in the Cox regression model as covariates.
The demographic characteristics of the WSPT sample are shown in Table 1. Of the 2,339 persons included in the final analysis, 706 (30.2%) had initiated smoking and 1,633 (69.8%) were never smokers at the time of interview. The sample contained slightly more males (1,218; 52.1%) than females (1,121; 47.9%), and the prevalence of smoking initiation was much higher in males (40.9%) than in females (18.6%), corresponding to a gender odds ratio of 3.08 (95% confidence interval (CI): 2.55, 3.72). The average age of the subjects was 12.6 years (standard deviation, 0.7).
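As a rough check on the gender odds ratio, the following Python sketch computes a crude odds ratio with a Wald 95% confidence interval from a 2 × 2 table. The cell counts (498 male and 208 female ever smokers) are not reported in the text; they are back-calculated here from the stated percentages so that they sum to the 706 ever smokers, and should be read as assumptions. The function name is hypothetical.

```python
import math


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """Crude odds ratio and Wald confidence interval from a 2 x 2 table.

    a -- exposed cases        b -- exposed non-cases
    c -- unexposed cases      d -- unexposed non-cases
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Assumed counts, back-calculated from 40.9% of 1,218 males and 18.6% of 1,121 females.
male_ever, male_total = 498, 1218
female_ever, female_total = 208, 1121

crude_or, ci_lower, ci_upper = odds_ratio_wald(
    male_ever, male_total - male_ever,
    female_ever, female_total - female_ever,
)
print(round(crude_or, 2), round(ci_lower, 2), round(ci_upper, 2))
```

The crude estimate is about 3.04, slightly below the reported 3.08; the reported value presumably comes from a fitted model rather than the raw table.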
The CHS sample contained 285 males (47.3%) and 318 females (52.7%) (Table 1). The children were enrolled in the CHS at a mean age of 10.2 years (standard deviation, 0.96), and the average follow-up time was 7.41 years (standard deviation, 1.07). During follow-up, 59 (20.7%) of the males and 70 (22.0%) of the females initiated smoking. No gender difference in the rate of smoking initiation (P = 0.74) was observed.
Several SNPs in the corticotropin-releasing hormone receptor 1 gene (CRHR1) (rs4076452, rs12953076, and rs4074461) were associated with smoking initiation with marginal statistical significance in the WSPT. Details on the association between the 3 SNPs and smoking initiation are shown in Table 2. The 3 SNPs were in strong linkage disequilibrium with each other (r² > 0.95) and were therefore considered 1 signal. The most significant SNP, rs4076452 (minor allele frequency = 0.11; recessive heredity model; odds ratio = 4.47, 95% CI: 2.15, 9.31), was an imputed SNP with a very high imputation quality score (0.99).
The association between rs4076452 and smoking initiation was further examined in the CHS. No association was observed (hazard rate ratio = 1.49, 95% CI: 0.61, 3.68; P value = 0.38). Additionally, we did not find any statistically significant association between the other 2 SNPs (rs12953076 and rs4074461) and smoking initiation in the CHS.
Considering the heterogeneity of smoking prevalence in the Chinese sample, we stratified the sample by gender and repeated the analyses described above. The gender-specific adjusted P values are shown in Web Figure 2, and the complete results can be found in Web Table 3.
In the WSPT, 1 SNP in the calcyon neuron-specific vesicular protein gene (CALY) (rs2298122, minor allele frequency = 0.07) and several other SNPs in linkage disequilibrium (rs11101694 and rs4838721, both with r² values of approximately 0.88 with rs2298122) were found to be associated with smoking initiation only in females. These SNPs were also considered to be 1 signal. The prevalence of smoking initiation in females was 16.6% with the TT genotype of rs2298122, 34.9% with the GT genotype, and 50.0% with the GG genotype. Assuming an additive heredity model, the corresponding odds ratio was 2.21 (95% CI: 1.49, 3.27), with a crude P value of 8.39 × 10⁻⁵ (Table 3). After adjustment for all of the SNPs within CALY and determination of the heredity model, the adjusted P value was 4.10 × 10⁻⁴, which was significant after adjustment for all genes tested. No association between rs2298122 and smoking initiation was observed in males (odds ratio = 0.88, 95% CI: 0.63, 1.21; adjusted P = 0.68).
We further examined the association of rs2298122 with smoking initiation in the CHS. As shown in Figure 1, the smoking initiation rate among females with the TT genotype was consistently lower than that among those with the GG/GT genotypes. Assuming an additive heredity model, the estimated hazard rate ratio for rs2298122 in females was 2.05 (95% CI: 1.27, 3.31; P = 0.003). Consistent with the WSPT, this association was not observed in males (hazard rate ratio = 1.32, 95% CI: 0.78, 2.23; P = 0.304).
In the current investigation, we first explored the effect of genetic variants in the dopamine pathway on smoking initiation in a large, cross-sectional sample of Chinese Han adolescents. After an extensive pathway-based candidate gene selection, a comprehensive SNP selection, genotyping, and imputation for untyped SNPs, 2,700 SNPs in 57 genes were examined for association with smoking initiation. Thereafter, SNPs surviving stringent multiple-testing correction in the Chinese population were examined in an independent Caucasian adolescent cohort sample in Los Angeles, California, a dramatically different sociocultural setting. When screening for the effects of the candidate genes in the WSPT, 1 SNP in CALY (rs2298122) was associated with smoking initiation only in females. In the CHS, highly consistent patterns were observed for rs2298122. This replication in a completely independent adolescent sample from a different population with a distinct design appears promising in implicating CALY in smoking initiation.
Previous investigations have demonstrated that CALY might be important in brain function and psychiatric disorders. CALY codes for calcyon neuron-specific vesicular protein (65), a novel brain-specific protein. Recent investigations revealed that the calcyon C terminal can stimulate the self-assembly of clathrin in a dose-dependent fashion and is therefore involved in clathrin-mediated endocytosis, which is crucial for efficient synaptic transmission and optimizing the levels of releasable pools of neurotransmitters (66) and is thereby involved in dopamine-related signaling and dopamine activity. Both dopamine-related signaling and dopamine activity have been implicated in various brain functions, such as motor control and cognitive processing (67, 68). Specifically, it has been shown that dopaminergic neurotransmission plays an important role in impulsive decision-making (45, 69), sensation-seeking (70), novelty-seeking (71), and extraversion (72), all of which have been identified as risk factors for reaction to smoking cues and smoking initiation (46, 73, 74).
The top-hit SNP, rs2298122, is located in intron 5 of CALY, and the role of this variant in human cognition and behavior is not well understood. Within HapMap 3, 2 SNPs in the CEU population (rs7085530 and rs2105341) and 4 SNPs in the CHB population (rs11101694, rs2275723, rs2105341, and rs7085530) are in notable linkage disequilibrium with rs2298122 (r² > 0.5). However, none of these SNPs are nonsynonymous SNPs. Among the few association studies on CALY, Laurin et al. (75) performed a nuclear-family-based association study of attention deficit/hyperactivity disorder in Caucasian adolescents. Although their results are only suggestive, the association between rs2298122 and the impulsive dimension of attention deficit/hyperactivity disorder was marginally significant (P = 0.08) (75). The molecular mechanism for the association between CALY and smoking initiation might reflect a potential effect via impulsivity, which has been shown to play a complex role in smoking behavior and the reaction to smoking cues (45, 69).
The association between CALY and smoking initiation was observed only in females. It is well known that smoking behavior differs between the genders, especially in some populations, such as the Chinese (76). Furthermore, a recent investigation in 32,359 pairs of native California twins indicated a gender difference in the pattern of genetic and environmental determinants of smoking initiation (77). Moreover, a gender difference in the association between some psychological characteristics and smoking-related behaviors has been repeatedly reported (78, 79). These studies indicate that there may be a subtle difference in the mechanisms or underlying genetic factors for smoking initiation between the genders. Our observation of an association only in females may partly reflect this.
In the WSPT, we also observed a strong effect of CRHR1, although this association was not replicated in the CHS. CRHR1 codes for the type I receptor of corticotropin-releasing hormone (80), and it has been associated with depression (81, 82), suicide (83), and alcohol dependence (84, 85) in previous investigations. However, the role of CRHR1 in smoking-related behaviors has not been demonstrated. There are 2 possible explanations for the lack of replication of the positive findings in the CHS. First, in the WSPT, this association was only marginally significant at the pathway-wide significance level. Considering the fact that the best-fitting heredity model for this SNP was recessive, sparse data bias (86) might have arisen and led to unstable estimates or false positives. Furthermore, even if the findings for CRHR1 in the WSPT are true, statistical power to detect the recessive genetic effects would have been low in the CHS because of the limited sample size. Second, none of the CRHR1 SNPs associated with smoking initiation in the WSPT are located in the coding region, indicating that those SNPs might be markers for another rare SNP in linkage disequilibrium with them in a Chinese population. Therefore, the different linkage disequilibrium structure between the Chinese and Caucasian populations might have led to inconsistent association results in the 2 populations. In either scenario, further functional or population-based data may help to clarify the role of CRHR1 in smoking initiation.
The primary phenotype of the current investigation, smoking initiation in young adolescents, is different from lifetime smoking initiation. Specifically, given the high prevalence of smoking in adult Chinese males (>60%) (75), some persons who will initiate smoking later in adolescence or adulthood might have been treated as controls in the WSPT. This may have led to decreased statistical power if we believe that lifetime smoking initiation is a homogeneous phenotype. However, since smoking is integral to social interactions among adults in the Chinese population, there may exist different mechanisms or competing risk factors for smoking initiation at later ages, as compared with early smoking initiation, in which personality factors such as impulsivity, behavior control, decision-making, and novelty-seeking might play more important roles, especially in adolescent females. In addition, we limited our analysis in the WSPT to smoking status at the time of enrollment and did not include longitudinal follow-up data. This was primarily done to avoid any heterogeneity in inference that may have existed postintervention, since the WSPT was specifically designed to test the effectiveness of a school-based intervention aimed at altering adolescent smoking trajectories. In contrast, because the CHS is a cohort representing more natural smoking trajectories, our replication analysis utilized the entire longitudinal history. If we restrict our analysis to comparable ages (12 years and 13 years), we see that the pattern of association remains (odds ratio = 2.25, 95% CI: 0.90, 5.61), although it is not statistically significant because of decreased information in the small sample.
Subtle differences between early and lifetime initiation highlight the difficulty of research in this area, especially when considering the numerous nongenetic (host, social, and environmental) factors that are known to influence smoking initiation. The impact and temporal dynamics of these factors may lead to many gene-environment interactions, making discovery and replication across cohorts more difficult. As an example, in 2 recent genome-wide association studies on smoking behavior (Cancer Genetic Markers of Susceptibility and Genetic Association Information Network), no significant association (defined as P < 0.00001) between adult smoking initiation and variants in the CALY region (33, 34) was identified in an analysis combining both males and females. These differences in findings may be partly attributed to differences in the definition of smoking initiation, gender interaction, or differences in exposure to nongenetic factors. In any case, further replication and characterization via stratified analysis is required in this area of genetic research.
While a long-term goal of research in this area is to elucidate potential interactions, we focused the current analyses on main effects of genes and heterogeneity due to gender. Although the positive findings of our exploratory investigation in the WSPT are supported by an independent sample, there are no functional data to further illustrate the putative association. Furthermore, when the WSPT cohort study was initiated in 1998, information on related psychological traits associated with smoking initiation, such as impulsivity and novelty-seeking, was not collected, and measures of depression and hostility were not derived from standardized, established items. This has greatly limited our ability to explore the role of personality characteristics as mediators of or intermediates in the association between the dopamine pathway and smoking initiation. Despite these caveats, we believe the evidence is sufficient to encourage further investigation to verify the potential role of CALY variants in smoking initiation.
In summary, we performed a pathway-based association study in a homogeneous Chinese Han adolescent population to explore the role of the dopamine pathway in smoking initiation. We identified 1 SNP (rs2298122) in the CALY gene that was positively associated with smoking initiation only in females. Supportive evidence for this association was subsequently observed in an independent sample of Caucasian adolescents. While these findings will ultimately need to be replicated in additional population-based samples and explored in functional studies, the results of our investigation provide a foundation for future research in understanding the genetic mechanism of smoking initiation in adolescents.
Author affiliations: Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California (Dalin Li, Jinghua Liu, Wonho Lee, Xuejuan Jiang, David Van Den Berg, Jim Gauderman, Frank Gilliland, Chih-Ping Chou, David V. Conti); National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina (Stephanie J. London); Center for Health Sciences, SRI International, Menlo Park, California (Andrew W. Bergen, Denise Nishita, Gary E. Swan, Nahid Waleh); School of Community and Global Health, Claremont Graduate University, Claremont, California (Peggy Gallaher, Jennifer B. Unger, C. Anderson Johnson); and Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, Los Angeles, California (Jean C. Shih).
This work was supported by the US National Institute on Drug Abuse (grants DA020830 and CA084735 to Drs. Dalin Li and David V. Conti) and the US National Institute of Environmental Health Sciences (grants ES015090 and GM069890 to Dr. David V. Conti). This work was also supported in part by the Intramural Research Program of the US National Institutes of Health, National Institute of Environmental Health Sciences. Genotyping for the example was performed as part of the Pharmacogenetics of Nicotine Addiction Treatment Program (SRI International and University of California, San Francisco), which received funding from the National Institute on Drug Abuse (grant DA020830).
The authors acknowledge Drs. Yungang He, Ruhong Jiang, Li Su, Rich Flores, Suman Prasad, and Huijun Ring for their contributions to DNA analysis in the Wuhan Smoking Prevention Trial.
Conflict of interest: none declared.