We have identified six CHRNA5 SNPs (rs1979905, rs1979906, rs1979907, rs880395, rs905740, and rs7164030) in complete LD, located in a region more than 13 kb upstream of the transcription start site, that are robustly associated with mRNA expression. Quantitative analysis in prefrontal cortex tissues revealed a fourfold increase in CHRNA5 mRNA expression for homozygous carriers of the minor alleles of the enhancer SNPs, supporting the hypothesis that the six upstream SNPs are located in a highly penetrant enhancer region. Given the level of precision attainable for measuring allelic mRNA ratios of transcripts with low expression, we cannot address here whether additional regulatory polymorphisms exist that have lesser effects on expression.
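As an illustration of the allelic mRNA ratio referred to above, the following minimal Python sketch normalizes a cDNA allelic ratio to the corresponding genomic DNA ratio. The function names, the normalization convention, and the example peak values are assumptions made for illustration; they are not taken from the measurements reported here.

```python
import math


def allelic_ratio(cdna_a, cdna_b, gdna_a, gdna_b):
    """Return the cDNA allele A/B ratio normalized to the gDNA A/B ratio.

    Assumed convention: in a heterozygote the genomic ratio reflects equal
    copy number, so dividing by it corrects for assay bias between alleles.
    """
    signals = (cdna_a, cdna_b, gdna_a, gdna_b)
    if any(s <= 0 for s in signals):
        raise ValueError("all allele signals must be positive")
    return (cdna_a / cdna_b) / (gdna_a / gdna_b)


def log2_allelic_ratio(cdna_a, cdna_b, gdna_a, gdna_b):
    """Log2 of the normalized ratio; 0 means balanced allelic expression."""
    return math.log2(allelic_ratio(cdna_a, cdna_b, gdna_a, gdna_b))


# Hypothetical peak areas, not data from this study.
ratio = allelic_ratio(cdna_a=400.0, cdna_b=100.0, gdna_a=100.0, gdna_b=100.0)
```

On a log2 scale, balanced expression maps to 0 and equal-sized imbalances in either direction are symmetric, which is why the sketch also exposes a log-transformed ratio.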
Evolutionary changes in gene expression across species commonly result from polymorphisms in cis-regulatory elements that create or abolish cis-acting transcription/enhancer/suppressor factor binding sites.27 The accumulation of the proposed enhancer SNPs to very high allele frequencies (37% in populations of European descent), resulting in gain of function compared with ancestral alleles, together with their position within a large haplotype block, suggests that these enhancer SNPs have undergone positive selection in the human lineage. Analysis of transcription factor binding sites created or abolished by the enhancer SNPs in this region using JASPAR (http://jaspar.cgb.ki.se/)28, 29 suggests that rs905740 is a leading functional candidate: the variant 'T' allele increases the likelihood of binding by Ets-1, which has been shown to interact with the transcriptional coactivator E1A binding protein p300 to regulate transcription at distal enhancer sites.30, 31 Significant upregulation of both Ets-1 and CHRNA5 in lung cancer32, 33, 34, 35 provides a disease model in which the relationship between enhancer SNP rs905740 genotype and Ets-1 expression can be tested.
Approximately 50% of tissue-specific distal enhancers are found within 20 kb of the nearest transcription start site, but enhancers can also occur >90 kb up- or downstream of the genes they regulate.31 The pattern of multiple transcription factor binding sites around a given gene is proposed to determine tissue-selective enhancers.36 Therefore, we tested the effect of rs905740 genotype on CHRNA5 mRNA expression in human lymphocytes, in which CHRNA5 is also expressed. Lack of association between the rs905740 genotype and expression in lymphocytes is consistent with this region serving as a tissue-specific distal enhancer. The activity of the enhancer in other tissues remains to be determined, but cortical and subcortical structures of the forebrain seem to show variable allelic expression (unpublished observations). It is important to determine whether the proposed enhancer region regulates CHRNA5 expression in brain regions crucial for addiction, or whether CHRNA5 expression in lung tissues associates with lung cancer independently of nicotine addiction.
Association of the enhancer region with clinical traits
Previous clinical association studies on nicotine dependence that used marker SNP rs880395 have failed to reveal an effect of the enhancer region alone on phenotype.2, 4, 5 However, the results of this study strongly implicate rs880395, and the adjacent SNPs in complete LD with it, as drivers of CHRNA5 expression. This prompted a reevaluation of rs880395 in nicotine dependence in conjunction with other functional SNPs. A joint analysis of rs880395 and rs16969968 revealed a modest but significant association of rs880395 with nicotine dependence. This is similar to a previous analysis that examined rs16969968 together with the promoter variant rs3841324 (22 bp deletion).10 Both studies confirm that the minor allele of the nonsynonymous variant rs16969968 (A/A genotype) is a risk factor when occurring on the low-expressing background, that is, the ancestral allele for the proposed enhancer SNPs, whereas rs880395 and the SNPs in high LD with it in this proposed upstream enhancer region now appear to emerge as another independent risk factor.
It may, at first glance, be surprising that the joint rs880395/rs16969968 analysis did not yield stronger associations with nicotine addiction than the rs3841324/rs16969968 analysis,10 given that rs3841324 does not account for allelic expression differences. However, rs880395 and rs3841324 genotypes are highly correlated in populations of European descent. Examining ethnic groups in which the correlation between these two variants is weaker might magnify the difference in risk potential for nicotine dependence. Given the large effect of the enhancer region on mRNA expression, future studies should consider the haplotype on which the risk polymorphism resides as an important factor in penetrance.
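The degree of correlation between two variants such as rs880395 and rs3841324 is often summarized by the LD statistic r². The Python sketch below computes r² for two biallelic variants from allele and haplotype frequencies. The use of r² as the measure, the function name, and all input frequencies are illustrative assumptions; only the 37% allele frequency is borrowed from the text, and no value below is an estimate for these two variants.

```python
def ld_r_squared(p_a, p_b, p_ab):
    """Return r^2 between two biallelic variants.

    p_a, p_b: frequencies of the chosen allele at each variant.
    p_ab: frequency of the haplotype carrying both chosen alleles.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two variants were independent.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical inputs: two alleles that always travel together (complete LD)
# and two alleles that are statistically independent.
complete = ld_r_squared(p_a=0.37, p_b=0.37, p_ab=0.37)
independent = ld_r_squared(p_a=0.5, p_b=0.5, p_ab=0.25)
```

When r² is close to 1, one variant is an almost perfect proxy for the other, so a joint analysis has little power to separate their effects. Populations with lower r² between the two variants would be more informative.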
Our data cannot rule out the possibility that alternative splicing contributes to allelic differences in specific transcript expression, as indicated by differences in AEI ratios between the two marker SNPs, even when the ratios correlate (r=0.62). Human mRNA clones and expressed sequence tags display evidence of alternative splicing of CHRNA5 in exon 5, in which rs16969968 resides. Presumably, any genetic variant driving a decrease in inclusion of rs16969968 in the mature mRNA could also decrease risk for dependence.
Conclusions
CHRNA5 exists in three similarly abundant main haplotypes (Supplementary Table S3) that have distinct biological functions, defined by the distal enhancer SNPs determining expression and the nonsynonymous rs16969968 SNP determining ligand-mediated signaling: the ancestral haplotype with low expression, a high-expressing haplotype, and a low-expressing haplotype with altered protein and channel activity. The effects of high versus low CHRNA5 expression on channel activity in vivo remain to be determined. Because LD at the CHRNA5 locus varies by ethnicity, the proposed enhancer variants should be considered when evaluating the penetrance of CHRNA5 in nicotinic receptor-related disorders. The presence of additional functional variants cannot be excluded, possibly resulting in a more complex haplotype repertoire with distinct functions.
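The three main haplotypes described above can be expressed as a simple lookup on two binary features: whether a chromosome carries the minor allele of the enhancer SNPs, and whether it carries the minor allele of rs16969968. The Python sketch below is illustrative only; the class and function names are hypothetical, and the fourth allele combination is labeled as falling outside the three main haplotypes rather than assigned a function.

```python
from enum import Enum


class Haplotype(Enum):
    ANCESTRAL_LOW = "ancestral haplotype, low expression"
    HIGH_EXPRESSING = "high-expressing haplotype"
    LOW_ALTERED_PROTEIN = (
        "low-expressing haplotype with altered protein and channel activity"
    )
    OTHER = "not one of the three main haplotypes"


def classify_haplotype(enhancer_minor, rs16969968_minor):
    """Map the allele state at the two defining sites to a haplotype label.

    enhancer_minor: True if the chromosome carries the minor (high-expressing)
        allele of the enhancer SNPs, e.g. as tagged by rs880395.
    rs16969968_minor: True if it carries the minor allele of rs16969968.
    """
    if not enhancer_minor and not rs16969968_minor:
        return Haplotype.ANCESTRAL_LOW
    if enhancer_minor and not rs16969968_minor:
        return Haplotype.HIGH_EXPRESSING
    if not enhancer_minor and rs16969968_minor:
        return Haplotype.LOW_ALTERED_PROTEIN
    # Both minor alleles on one chromosome: not among the three main haplotypes.
    return Haplotype.OTHER


label = classify_haplotype(enhancer_minor=True, rs16969968_minor=False)
```

Such a lookup operates on phased chromosomes; unphased genotypes would first need haplotype inference, which this sketch does not attempt.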