In this fine mapping analysis of genetic variants in the 5p15.33 region, we identified four novel SNPs associated with lung cancer risk, one of which was specific to smokers and one to non-smokers. None of the SNPs was in a protein-coding region, a promoter or a splice site; rs4975538 is an intronic SNP in TERT, rs451360 and rs370348 are intronic SNPs in CLPTM1L, and rs4975615 is in the intergenic region between the two genes. Although none of these SNPs is in a putative functional region, our findings confirm that the TERT–CLPTM1L region is related to lung cancer risk. Interestingly, there are very few common variants in the exonic protein-coding regions of TERT or CLPTM1L: other than rs2736098 (corresponding to A305A), all are rare variants with a minor allele frequency of <5%.
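As a rough illustration of the <5% cut-off mentioned above, minor allele frequency can be computed directly from genotype counts. The sketch below is not part of the study's analysis; the function names and the genotype counts are hypothetical and serve only to show the arithmetic.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency from diploid genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each heterozygote carries one B allele, each BB homozygote carries two.
    freq_b = (n_ab + 2 * n_bb) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq_b, 1.0 - freq_b)


def is_rare(maf, threshold=0.05):
    """Classify a variant as rare using a <5% minor allele frequency cut-off."""
    return maf < threshold


# Hypothetical genotype counts (AA, AB, BB), for illustration only.
maf = minor_allele_frequency(920, 78, 2)
print(f"MAF = {maf:.3f}, rare = {is_rare(maf)}")
```

The result does not depend on which allele is labelled A or B, because the function always reports the frequency of the less common allele.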
The GWAS by McKay et al. first suggested two genes, TERT and CLPTM1L, in the 5p15.33 region that could have a role in lung cancer susceptibility. Their GWAS identified four SNPs in this region, rs402710, rs2736100, rs401681 and rs31489, of which the first two were replicated in an independent sample of cases and controls. The studies that followed confirmed the association of these and other SNPs in the TERT–CLPTM1L region (Supplementary Table 1 is available at Carcinogenesis Online). Furthermore, lung cancer risk for SNPs in the TERT–CLPTM1L region was also reported by other authors in specific subgroups, including never-smokers (8), people with adenocarcinoma (4), women (11), African Americans (21) and Asians (7). SNPs in the TERT–CLPTM1L region that reached genome-wide significance in these studies are listed in Supplementary Table 1 (available at Carcinogenesis Online). In particular, rs2736100 was found to be significant across different studies and subgroups, which suggests that this SNP or another in LD with it is likely to be causally related to lung cancer risk. In comparison, although all eight previously reported SNPs were nominally significant at P < 0.05 in our study ( and ), rs2736100 was not one of the most significant SNPs (P
There is evidence that the 5p15.33 region may be important in susceptibility to other cancers as well. Rafnar et al. examined two SNPs in the 5p15.33 region, rs401681 and rs2736098, for association with the risk of many different types of cancer and found a significant association for rs401681 with several cancers, including basal cell carcinoma and lung, bladder, prostate and cervical cancers (6). Another study confirmed the association of variants in this region with bladder cancer (23), and associations were also found for rs401681 with squamous cell carcinoma of the head and neck (24). Furthermore, in a GWAS for pancreatic cancer, rs401681 was identified as one of the susceptibility loci (25). Interestingly, rs401681 was the second most significant SNP in our study (P = 1.1 × 10⁻⁵). Rafnar et al. also examined the association of rs401681 and rs2736098 with telomere length in DNA from whole blood, as telomere shortening is a possible mechanism of carcinogenesis related to the 5p15.33 region. Their results suggested that the variants may lead to gradual telomere shortening over time, although this effect was only apparent in older women (6). However, these results were not confirmed by Pooley et al., who found that rs401681, which is located in intron 13 of CLPTM1L, was not associated with mean telomere length (26).
TERT and CLPTM1L at the 5p15.33 susceptibility locus are attractive candidate genes for lung cancer, as both have been plausibly linked with carcinogenesis. TERT encodes the catalytic subunit of telomerase, an enzyme that maintains telomere ends by adding the telomere repeat TTAGGG. Telomerase expression has been shown to be high in progenitor and cancer cells and absent or low in normal somatic cells (27). Telomere length is linked to aging and anti-apoptosis, and mouse studies have shown that dysregulation of telomerase expression may be involved in oncogenesis (28). Similarly, CLPTM1L encodes an enzyme, cleft lip and palate transmembrane 1-like, that is upregulated in cisplatin-resistant cell lines and may be associated with apoptosis (10). Furthermore, the risk allele of rs402710 within the CLPTM1L gene has also been found to be associated with a higher accumulation of DNA damage, measured as bulky aromatic/hydrophobic DNA adducts, which may be an early step in lung carcinogenesis (13).
One of the lingering questions about the GWAS hits identified for lung cancer is whether the genes identified are associated with smoking rather than with lung cancer itself. We tested for gene–smoking interaction to see whether smoking modified the SNP–lung cancer association. Other than for rs6889886, our results did not show such modification for any of the SNPs. Even for rs6889886, an intergenic SNP between the CLPTM1L and SLC6A3 genes, the evidence for an interaction with smoking could reflect type I error, given the number of tests performed. We also examined smoking (as pack-years smoked) as a confounder of the SNP–lung cancer association and did not find a significant change in the effect sizes when we compared the unadjusted and adjusted analyses. Finally, we examined the SNP–smoking association in the unaffected controls and found that none of the SNPs was associated with smoking. Our findings suggest that the SNPs in the 5p15.33 region are associated with lung cancer itself rather than with smoking.
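The interaction test and the caveat about type I error can be illustrated with a simplified calculation. The sketch below assumes a binary carrier coding of the SNP and a binary smoking status, which is simpler than the models used in the study; all counts, the number of tests (50) and the function names are hypothetical. It compares the SNP odds ratio between smokers and non-smokers with a Wald test, then shows how likely at least one nominally significant result becomes when many independent tests are run.

```python
import math


def log_odds_ratio(case_carrier, case_non, ctrl_carrier, ctrl_non):
    """Log odds ratio and its Woolf variance for one 2x2 table."""
    log_or = math.log((case_carrier * ctrl_non) / (case_non * ctrl_carrier))
    var = 1 / case_carrier + 1 / case_non + 1 / ctrl_carrier + 1 / ctrl_non
    return log_or, var


def interaction_test(smokers, non_smokers):
    """Wald test that the SNP odds ratio differs between the two strata."""
    lor_s, var_s = log_odds_ratio(*smokers)
    lor_n, var_n = log_odds_ratio(*non_smokers)
    diff = lor_s - lor_n  # log of the ratio of odds ratios
    z = diff / math.sqrt(var_s + var_n)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return math.exp(diff), z, p


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical counts: (case carriers, case non-carriers,
#                       control carriers, control non-carriers).
smokers = (300, 700, 200, 800)
non_smokers = (60, 140, 120, 280)

ratio, z, p = interaction_test(smokers, non_smokers)
n_tests = 50  # hypothetical number of SNPs tested for interaction
bonferroni_alpha = 0.05 / n_tests

print(f"ratio of odds ratios = {ratio:.2f}, z = {z:.2f}, p = {p:.4f}")
print(f"passes Bonferroni threshold {bonferroni_alpha:.4f}: {p < bonferroni_alpha}")
print(f"family-wise error at alpha = 0.05: {familywise_error(0.05, n_tests):.2f}")
```

With these made-up counts the interaction is nominally significant but does not survive a Bonferroni correction, which is the kind of situation in which a single interaction signal may plausibly be a false positive.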
In summary, we used a fine mapping approach to evaluate additional, possibly causal SNPs in the 5p15.33 GWAS-identified lung cancer susceptibility locus. We used multiple logistic regression by haplotype block to identify independent variants associated with lung cancer risk. We identified rs370348 and rs4975538 as novel SNPs associated with lung cancer risk, and two additional SNPs that may be susceptibility markers for lung cancer risk in smokers (rs4975615) and non-smokers (rs451360). Our results show that, after fine mapping, the 5p15.33 locus, which has repeatedly been identified as a strong susceptibility locus for lung cancer, appears to contain several distinct loci influencing disease risk. None of the SNPs we identified was an obvious functional SNP, that is, one located in an exonic, splice site or promoter region. A limitation of this study is that only a limited number of the SNPs identified in the SNP selection process could be included on the SNP array. Future analyses using sequencing approaches may help to identify all causal variants in this region, and animal and cell models may be needed to establish the mechanisms of cancer risk.
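The idea of using multiple logistic regression to separate independent signals from those explained by LD can be sketched with a toy example. The code below is a minimal illustration, not the model fitted in the study: it assumes two SNPs in the same block coded as binary carrier status, includes no covariates, and uses hypothetical counts in which only SNP A affects risk while SNP B is correlated with A. The helper names `solve` and `fit_logistic` are introduced here for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logistic(rows, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; rows include the intercept."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1 - p) * x[i] * x[j]
        # Newton step: beta <- beta + (X'WX)^-1 X'(y - p)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


# Hypothetical counts: (carrier of SNP A, carrier of SNP B) -> (cases, controls).
counts = {(0, 0): (100, 200), (0, 1): (20, 40), (1, 0): (30, 30), (1, 1): (100, 100)}
rows, y = [], []
for (a, b), (n_cases, n_controls) in counts.items():
    rows += [[1.0, a, b]] * (n_cases + n_controls)
    y += [1] * n_cases + [0] * n_controls

marginal_b = fit_logistic([[r[0], r[2]] for r in rows], y)  # SNP B alone
joint = fit_logistic(rows, y)                               # SNP A and SNP B together

print(f"SNP B alone:           log OR = {marginal_b[1]:.3f}")
print(f"SNP B adjusted for A:  log OR = {joint[2]:.3f}")
print(f"SNP A adjusted for B:  log OR = {joint[1]:.3f}")
```

In this constructed data set SNP B looks associated with disease when tested alone, but its coefficient shrinks to zero once SNP A is in the model, whereas SNP A keeps its effect. A variant that remains associated after such adjustment is what is meant by an independent signal.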