Recent evidence from several relatively small nested case-control studies in prospective cohorts shows an association between longer telomere length measured phenotypically in peripheral white blood cell (WBC) DNA and increased lung cancer risk. We sought to further explore this relationship by examining a panel of 7 telomere-length associated genetic variants in a large study of 5,457 never-smoking female Asian lung cancer cases and 4,493 never-smoking female Asian controls using data from a previously reported genome-wide association study. Using a group of 1,536 individuals with phenotypically measured telomere length in WBCs in the prospective Shanghai Women’s Health Study, we demonstrated the utility of a genetic risk score (GRS) of 7 telomere-length associated variants to predict telomere length in an Asian population. We then found that GRSs used as instrumental variables to predict longer telomere length were associated with increased lung cancer risk (OR = 1.51 (95% CI=1.34–1.69) for upper vs. lower quartile of the weighted GRS, P-value=4.54×10⁻¹⁴) even after removing rs2736100 (P-value=4.81×10⁻³), a SNP in the TERT locus robustly associated with lung cancer risk in prior association studies. Stratified analyses suggested the effect of the telomere-associated GRS is strongest among younger individuals. We found no difference in GRS effect between adenocarcinoma and squamous cell subtypes. Our results indicate that a genetic background that favors longer telomere length may increase lung cancer risk, which is consistent with earlier prospective studies relating longer telomere length with increased lung cancer risk.
Telomeres are specialized chromatin structures that shorten during each round of cellular division in mammalian cells. Prolonged erosion of telomere length can lead to genetic instability, cellular senescence, and apoptosis1. Earlier studies, mainly retrospective, on peripheral WBCs have suggested increased cancer risk associated with shorter telomere length2–6. These studies may suffer from disease bias, in which telomere shortening is a consequence of tumor growth and progression rather than a risk factor for tumorigenesis. Recent, primarily prospective studies indicate that, contrary to expectation, longer telomere length may be associated with increased cancer risk7–14, particularly for lung cancer15–18.
Telomere length has historically been measured in peripheral WBC by multiplex quantitative polymerase chain reaction19. A recent genome-wide association study (GWAS) on telomere length has identified 7 loci robustly associated with WBC telomere length20. While genetic variants at these loci explain a small proportion of the total biological variation in telomere length, the age-related shortening per variant risk allele was equivalent to 1.9–3.9 years of attrition in telomere/single copy gene (T/S) ratio, equating to approximately 57 to 117 base pairs in telomere length per risk allele. Furthermore, the authors demonstrated the utility of genetic risk scores (GRS) of these variants to replicate a well-established association between shorter mean peripheral WBC telomere length and coronary artery disease. This suggests that by using telomere-length associated GRS as an instrument to approximate telomere shortening or lengthening, causal relationships with telomere length can be investigated in etiologically complex diseases that include environmental risk factors associated with both disease risk and telomere length.
We herein report an investigation of the 7 identified telomere-length associated variants in a sample of lung cancer cases and controls from a population of never-smoking Asian females. Our investigation uses data generated as part of a previously reported genome-wide association study (GWAS) conducted by the Female Lung Cancer Consortium in Asia21. Our objectives are to (1) validate the utility of these 7 telomere-length associated variants discovered in a primarily European population to predict measured telomere length in an Asian population; (2) characterize overall and individual associations of telomere-length associated variants with lung cancer risk; (3) investigate the ability of GRSs of these variants to predict lung cancer risk; and (4) describe the direction of the associations observed between telomere-length associated variants and lung cancer risk.
Study subjects were from a published GWAS investigating lung cancer susceptibility risk in female Asian non-smokers drawn from 14 studies from mainland China, South Korea, Japan, Singapore, Taiwan, and Hong Kong21. Cases had histologically confirmed lung cancer. Each study was approved by the Institutional Review Board of the investigator’s institution, and all participants provided written informed consent.
Genotyping was performed in the Cancer Genomics Research Laboratory of the National Cancer Institute’s Division of Cancer Epidemiology and Genetics (Gaithersburg, MD); Gene-Square Biotech, Inc. (Beijing, China); GeneTech Biotech Co. (Taiwan); deCODE Genetics (Iceland); Memorial Sloan-Kettering Cancer Center (New York, NY); and Genome Institute of Singapore (Singapore). Genotyping was carried out on commercially available Illumina Infinium BeadArray human assays (Illumina 370k, Illumina 610Q, and Illumina 660W SNP microarrays) following standard procedures. The methods and quality control metrics applied to genotyping with SNP microarrays have been previously published21. Briefly, samples with low completion rates, extreme heterozygosity values, gender discordance, or low Asian ancestry (less than 86%) were excluded, and first-degree relatives were removed. After quality control filtering, a total of 5,510 cases and 4,544 controls had genetic data available for analysis.
To address potential population substructure, principal components were calculated with the GLU struct.pca module (http://code.google.com/p/glu-genetics/) using 33,165 SNPs with low pairwise correlation (R² < 0.01).
Genotype imputation was performed to ensure complete data existed for all 7 telomere-length associated variants. The IMPUTE2 program (http://mathgen.stats.ox.ac.uk/impute/impute_v2.html) was used with the March 2012 release of the 1,000 Genomes Project data22 and the DCEG Imputation Reference Set 23 as merged references for imputation. The DCEG reference set serves as a supplement to the 1,000 Genomes reference and includes 2.8 million autosomal polymorphic SNPs for 1,249 individuals, of which 162 individuals are of Asian ancestry. Since the genotyping data was on NCBI Build 36, all genotyped variant coordinates were converted to NCBI Build 37 using UCSC’s liftOver utility (http://hgdownload.cse.ucsc.edu/downloads.html) before performing genotype imputation. Recommended IMPUTE2 default settings were used and all imputed SNPs (rs7675998, rs8105767, rs755017, rs11125529) achieved INFO scores >0.99. There was no evidence for significant departures from Hardy-Weinberg proportions (P-value>0.05).
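As an illustration of the Hardy-Weinberg check mentioned above, the following is a minimal sketch of a 1-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The source does not state which Hardy-Weinberg test was used, so the chi-square form is an assumption, and the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-df Hardy-Weinberg goodness-of-fit test.

    Assumes a biallelic variant with both alleles observed in the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (not taken from the study data).
chi2_ok, p_ok = hwe_chi_square(250, 500, 250)    # counts exactly at HWE
chi2_bad, p_bad = hwe_chi_square(300, 400, 300)  # heterozygote deficit
```

Under this sketch, a variant would be flagged only when its P-value falls below the chosen threshold (0.05 in the text above).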
A group of subjects included in previous nested case-control studies of various cancers in the prospective Shanghai Women’s Health Study (N=1,536) had both genotyping data and experimentally measured peripheral WBC telomere length that we used to validate the telomere-length associated variants in an Asian population. Multiplex quantitative polymerase chain reactions were used to quantify telomere length. T/S values were extracted for the analysis and log transformed to improve normality.
All plotting and statistical analyses were performed on a 64-bit Windows build of R version 3.0.1 “Good Sport”24. Only subjects with complete genotyping, histology, and covariate information were included in the analysis (5,457 cases and 4,493 controls). Models investigating lung cancer risk were adjusted for study indicator variable, 10-year age group indicator variables (<40, 40–49, 50–59, 60–69, and 70+), and significant principal components (EV1, EV2, and EV4), unless otherwise noted. Likelihood-ratio and SNP-set kernel association test (SKAT) linear kernel tests25, 26 were used to assess statistical significance of aggregations of telomere-length associated variants on lung cancer risk by comparing null models to fitted models containing combinations of the 7 telomere-length associated variants. The SKAT linear kernel test aggregates a set of SNP score test statistics and efficiently computes an overall p-value26.
Both unweighted and weighted genetic risk scores (GRS) were calculated for telomere-length associated variants. To calculate the GRS for the i-th subject from the 7 telomere-length associated variants, the following formula was used:
$$\mathrm{GRS}_i = \sum_{j=1}^{7} w_j x_{ij}$$
Here, $x_{ij}$ is the number of risk alleles for the j-th SNP in the i-th subject ($x_{ij}$ = 0, 1 or 2) and $w_j$ is the weight or coefficient for the j-th SNP. Unweighted genetic risk scores simply counted the number of alleles associated with longer telomere length an individual carried across all 7 telomere-length associated variants, thus giving an equal weight to all risk alleles ($w_j$ = 1). Weighted genetic risk scores were calculated likewise, with the addition of assigning previously published telomere-length associated beta estimates20 as $w_j$ for each telomere-length associated SNP allele count. Weighting normally results in more specificity of the GRS by assigning more weight to variants with stronger effects.
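The score definition above can be sketched in a few lines of code. This is an illustrative example only: the function name `genetic_risk_score`, the allele counts, and the weights are hypothetical placeholders, not the published beta estimates or study genotypes.

```python
def genetic_risk_score(allele_counts, weights=None):
    """Sum of weight * risk-allele count over SNPs for one subject.

    allele_counts: risk-allele counts per SNP, each 0, 1 or 2.
    weights: per-SNP weights; None gives the unweighted score (all weights 1).
    """
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("need exactly one weight per SNP")
    if any(x not in (0, 1, 2) for x in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    return sum(w * x for w, x in zip(weights, allele_counts))


# Hypothetical subject genotyped at 7 variants.
counts = [1, 2, 0, 1, 1, 2, 0]
# Placeholder weights for illustration; NOT the published beta estimates.
placeholder_weights = [0.5, 0.25, 0.25, 0.5, 0.25, 0.5, 0.25]

unweighted = genetic_risk_score(counts)                     # 7 risk alleles
weighted = genetic_risk_score(counts, placeholder_weights)  # 2.75
```

In the unweighted case the score is simply the count of longer-telomere alleles carried; the weighted case scales each count by its per-allele effect estimate.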
Our dataset consisted of a sample of 5,457 lung cancer cases and 4,493 controls from a population of never-smoking Asian females (Table 1). The participants were drawn from 14 contributing studies with collection areas in mainland China, South Korea, Japan, Singapore, Taiwan, and Hong Kong. Age, a major factor associated with telomere attrition, was available in 10-year age-groups for all participants. Most participants were between 50 and 70 years of age (63%) with 6% of subjects younger than 40 years of age.
Measured and imputed genotypes were available for the 7 telomere-length associated variants (Table 2). Alleles associated with longer telomere length were denoted the risk allele and risk allele frequencies from our dataset were compared to those previously reported by Codd et al20. Risk allele frequency differences between our Asian lung cancer study and the Codd et al. study of a population of primarily European descent likely reflect differences in ancestral allele frequencies.
To ensure the telomere-length associated variants, discovered in a population of primarily European ancestry, were a valid surrogate for telomere length in our Asian population, we carried out an analysis on a set of 1,536 Asian females with both measured telomere length and genotype data from the prospective Shanghai Women’s Health Study. When testing for an association of each of the 7 telomere-length associated variants with measured telomere length, only the TERT variant (rs2736100) had a significant association with measured telomere length (P-value=0.03); however, our sample size was substantially smaller than that of the Codd et al. analysis (N=48,423), and although the remaining associations were not statistically significant, 6 of the 7 variants had beta estimates in the expected direction. A weighted GRS with all 7 telomere-length associated variants was calculated and its association with telomere length was also investigated. In the overall sample, the telomere-length associated GRS was significantly associated with measured telomere length (P-value=0.001, Figure 1A), the estimated effect was in the positive direction (beta=0.15), and the GRS explained the same percent of total telomere length variance as in Codd et al. (R²=0.01)20. For the cancer cases in this sample, the mean time between blood sample collection and cancer diagnosis was 5.34 years, with 75 percent of cases having blood collected more than 3 years prior to cancer diagnosis. When restricting the analysis to controls (N=533), the association remained significant (P-value=0.04) with similar effect size and variance explained (Figure 1B). Together, these results provide evidence that the weighted GRS of telomere-length associated variants has utility in predicting measured telomere length in Asian populations.
Overall association tests were conducted to investigate if, in aggregate, all 7 telomere-length associated variants were associated with lung cancer risk. A likelihood ratio test comparing a null model adjusting for 10-year age group, contributing study, and significant principal components to the same model plus all 7 telomere-length associated variants indicated that in aggregate the telomere-length associated variants were significantly associated with lung cancer risk (P-value=9.64×10⁻²⁵). Furthermore, a linear SKAT found a highly significant association between the 7 telomere-length associated variants and lung cancer (P-value=3.19×10⁻²⁷).
Each telomere-length associated variant from Codd et al.20 was tested for an individual association with lung cancer risk. All 7 telomere-length associated variants were included in the same logistic regression model and covariates were included to adjust for 10-year age-group, contributing study, and significant principal components. Two of the 7 telomere-length associated variants (rs2736100 and rs10936599) exhibited association p-values less than 0.05, significantly more than the 0.4 variants expected by chance (P-value=0.04) (Table 2). The rs2736100 variant, located in the first intron of the TERT gene, has previously been associated by GWAS with lung cancer risk 27. Interestingly, 5 of the 7 telomere-length associated variants show effects in the same direction for both the Codd et al. telomere-length association20 and lung cancer association suggesting enrichment for variants that are associated with both longer telomere length and increased lung cancer risk (Table 2).
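The comparison with chance in the paragraph above can be reproduced approximately with a binomial calculation. The sketch below assumes the 7 tests are independent and each has a 0.05 false-positive rate; the source does not state how its P-value was derived, so the binomial upper tail is an assumption, and `binomial_upper_tail` is a hypothetical helper.

```python
import math


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


n_variants = 7
alpha = 0.05

# Expected number of variants with p < 0.05 under the null: 7 * 0.05 = 0.35,
# which the text rounds to 0.4.
expected_hits = n_variants * alpha

# Probability of seeing 2 or more nominally significant variants by chance.
p_two_or_more = binomial_upper_tail(n_variants, 2, alpha)
```

Under these assumptions the tail probability is about 0.044, in line with the reported P-value of 0.04.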
Both unweighted and weighted GRSs were calculated as measures of predicted telomere length for each study participant and association with lung cancer risk was tested by logistic regression models that adjusted for 10-year age group, contributing study, and significant principal components. The unweighted telomere-length associated GRS was significantly associated with lung cancer risk (P-value=1.90×10⁻¹²), indicating scores associated with longer telomere length were also associated with increased lung cancer risk. The odds ratio comparing individuals in the upper quartile of GRS to those in the lower quartile of GRS was 1.47 (95% CI=1.31–1.65). The beta-weighted telomere-length associated GRS demonstrated greater specificity for the lung cancer association with greater evidence for association between longer telomere length and lung cancer risk (P-value=4.54×10⁻¹⁴). A higher odds ratio of 1.51 (95% CI=1.34–1.69) was observed for individuals in the upper quartile of the weighted GRS compared to those in the lower quartile. The association of the weighted GRS across contributing studies was homogeneous (homogeneity P-value=0.34) and produced an overall meta-analysis odds ratio of 1.51 (95% CI=1.34–1.71, P-value=1.53×10⁻¹¹) comparing individuals in the upper quartile of weighted GRS to those in the lower quartile of GRS (Figure 2). When investigating deciles of the weighted GRS, the effect of weighted GRS on lung cancer risk appeared to be monotonic with no threshold indicating a substantial change in risk (Figure 3). Furthermore, to assess if rs2736100 was the only SNP accounting for the association between the weighted GRS and lung cancer risk, the weighted GRS was recomputed with the exclusion of rs2736100, and rs2736100 was used as a separate covariate in the regression model. The weighted GRS minus rs2736100 remained significantly associated with increased lung cancer risk, although the association was greatly attenuated (P-value=4.81×10⁻³).
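To make the across-study pooling concrete, here is a minimal sketch of an inverse-variance fixed-effect meta-analysis of per-study log odds ratios, together with Cochran's Q as a homogeneity statistic. The source does not say whether a fixed-effect or random-effects model was used, so the fixed-effect form is an assumption; the function name and the two per-study estimates are hypothetical.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Inverse-variance pooling of per-study log odds ratios.

    Returns (pooled_log_or, pooled_se, cochran_q, degrees_of_freedom).
    """
    weights = [1.0 / se**2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    return pooled, pooled_se, q, len(log_ors) - 1


# Two hypothetical studies (values are illustrative, not from the paper).
pooled, pooled_se, q, df = fixed_effect_meta([0.3, 0.5], [0.1, 0.1])

pooled_or = math.exp(pooled)
ci_low = math.exp(pooled - 1.959964 * pooled_se)
ci_high = math.exp(pooled + 1.959964 * pooled_se)
```

Q is compared against a chi-square distribution with `df` degrees of freedom; a large homogeneity P-value, such as the 0.34 reported above, indicates no evidence that the effect differs between studies.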
Additional age-stratified analyses were conducted to investigate potential differences in the weighted GRS lung cancer association with age. Results indicate women in the younger than 60 years age group had an odds ratio of 1.72 (95% CI=1.46–2.02, P-value=9.35×10⁻¹¹) comparing women in the fourth and first quartiles of weighted GRS, whereas women in the 60 years or older age group had an odds ratio of 1.33 (95% CI=1.12–1.57, P-value=0.001). A significant difference was observed between the two effect estimates (P-value=0.03) indicating the association between weighted telomere-associated GRS and lung cancer risk may be stronger in younger women. Analyses were also stratified based on the two primary histological subtypes of lung cancer: adenocarcinoma and squamous cell carcinoma. The weighted GRS odds ratio comparing the fourth to first quartile for adenocarcinoma cases was 1.51 (95% CI=1.33–1.72, P-value=2.82×10⁻¹⁰). The squamous cell carcinoma odds ratio estimate was slightly lower at 1.42 (95% CI=1.10–1.81, P-value=0.006). A case-only analysis of the two histological subtypes found no significant difference in weighted GRS effect (P-value=0.80).
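The difference between the two age-stratum estimates can be checked approximately from the reported odds ratios and confidence intervals alone. The sketch below recovers each standard error from the width of the 95% CI on the log scale and applies a two-sided z-test to the difference in log odds ratios. This is an approximation under the assumption that the two strata are independent; the source does not state which test was used (an interaction term in the regression model is equally plausible), and the helper names are hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_or_and_se(odds_ratio, ci_lower, ci_upper):
    """Recover (log OR, standard error) from an OR and its 95% CI."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_975)
    return math.log(odds_ratio), se


def difference_z_test(est_a, est_b):
    """Two-sided z-test for the difference of two independent estimates."""
    (b_a, se_a), (b_b, se_b) = est_a, est_b
    z = (b_a - b_b) / math.sqrt(se_a**2 + se_b**2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Reported stratum estimates: OR (95% CI) from the text above.
younger = log_or_and_se(1.72, 1.46, 2.02)  # women younger than 60 years
older = log_or_and_se(1.33, 1.12, 1.57)    # women 60 years or older

z, p_value = difference_z_test(younger, older)
```

With these inputs the test gives z of roughly 2.15 and a P-value of roughly 0.03, consistent with the reported difference between the two effect estimates.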
Our study investigated the relationship between 7 telomere-length associated variants and lung cancer risk. Aggregations of the 7 variants were highly associated with lung cancer risk with the direction of the associations indicating that longer telomere length, as predicted by higher telomere length associated GRS, is a risk factor for lung cancer. Although the telomere-length associated variants explained only a fraction of the variation in telomere length, the associations suggest genetic effects tagged by these variants are important for lung cancer risk.
Previous studies have demonstrated an association between the TERT locus (rs2736100)21, 27 and lung cancer risk; however, our study is the first to provide evidence for associations with other telomere-length associated variants. In particular, the nominal significance of the TERC locus (rs10936599) suggests this locus may play a role in telomere-related maintenance important for lung cancer risk, although further studies are needed to verify this association. The 7 telomere-length associated variants explain a limited amount of the total variation in telomere length, suggesting that additional variation in telomere length may be attributable to other genetic variants which remain to be discovered. Additionally, the lower association P-values of the aggregate association tests relative to the telomere-length specific GRS tests suggest that, in addition to telomere length, other aspects of telomeres, such as maintenance of genome stability or chromosomal repair, or distinct biological processes tagged by these telomere-length associated variants, especially rs2736100 in TERT, may be important contributors to lung cancer risk.
Using telomere-length associated genetic variants as an instrument for measuring telomere length provides several advantages. First, reverse causation biases that may influence case-control studies of telomere length and disease can be eliminated since telomere-length associated variants are unrelated to time of blood draw and disease diagnosis. Also, by using a correlated genetic proxy for telomere length, it may be possible to partition genetic versus other risk factors (e.g., aging, oxidative damage) that are reflected in the telomere length phenotype. One potential confounder in our analysis is correlated population specific differences between lung cancer frequency and telomere risk allele frequencies. However, this potential population stratification bias was mitigated by adjusting for all principal components that were significantly associated with lung cancer risk.
The biological mechanism linking longer telomere length to lung carcinogenesis is unclear. While telomere attrition leads to replicative senescence and apoptosis, telomere elongation may result in immortalized cells with unregulated telomerase activity and unlimited potential for cellular and tumor growth28–31. Shorter telomeres may act as tumor suppressors, whereas longer telomeres may not. In addition, recent evidence suggests excessively long telomeres may be as important for chromosomal instability as critically short telomeres32.
Our results, as well as an example from coronary artery disease20, suggest the 7 telomere-associated variants are useful proxies for investigating telomere length in a variety of diseases. Although the 7 variants explain a small portion of measured peripheral WBC telomere length, the age-related shortening per variant risk allele (1.9–3.9 years) and equivalent changes in telomere base pair length (57 to 117 bases) appear to be biologically meaningful for disease risk20. Evidence from our analysis suggests that these 7 telomere-associated variants, discovered in a European population, also have application to Asian populations. Additionally, the effect of the weighted GRS appears stronger in younger individuals suggesting telomere-length associated GRSs may be more useful in younger populations with fewer accumulated environmental exposures affecting telomere length than in older populations.
Results from our study indicate the variation tagged by 7 telomere-length associated variants is important for lung cancer risk. Our genetic-based proxy for telomere length suggests longer telomere length is associated with increased lung cancer risk in non-smoking Asian females which is consistent with evidence from a number of relatively small prospective studies of measured telomere length and lung cancer risk with non-smoking cases in Asia and mostly ever-smoking cases of European descent17. Further studies investigating the biological mechanisms related to the variation in telomere length captured by these genetic variants will improve understanding of the molecular pathways linking telomere length to lung cancer risk and may elucidate important preventative and therapeutic targets.
Using 7 telomere-length associated SNPs to estimate telomere length, we observe a positive association between longer telomere length and increased lung cancer risk in female Asian never smokers. Our genetic proxy is not affected by reverse-causation bias or environmental exposures and supports prior evidence from smaller prospective studies. The variation in telomere length captured by these variants may aid in linking specific biological mechanisms related to telomere length with lung cancer risk.
FLCS (J.C.W., D.R., L.J.) - Ministry of Health (201002007). Ministry of Science and Technology (2011BAI09B00). National S&T Major Special Project (2011ZX09102-010-01). China National High-Tech Research and Development Program (2012AA02A517, 2012AA02A518). National Science Foundation of China (30890034). National Basic Research Program (2012CB944600). Scientific and Technological Support Plans from Jiangsu Province (BE2010715).
GDS (Y.L.W.) - Foundation of Guangdong Science and Technology Department (2006B60101010, 2007A032000002, 2011A030400010). Guangzhou Science and Information Technology Bureau (2011Y2-00014). Chinese Lung Cancer Research Foundation, National Natural Science Foundation of China (81101549). Natural Science Foundation of Guangdong Province (S2011010000792).
GELAC (C.A.H.) - National Research Program on Genomic Medicine in Taiwan (DOH98-TD-G-111-015). National Research Program for Biopharmaceuticals in Taiwan (DOH 100-TD-PB-111-TM013, MOST 103-2325-B-400-011). National Science Council, Taiwan (NSC 100-2319-B-400-001).
GEL-S (A.S.) - National Medical Research Council Singapore grant (NMRC/0897/2004, NMRC/1075/2006). (J.Liu) - Agency for Science, Technology and Research (A*STAR) of Singapore.
HKS (J.W.) - General Research Fund of Research Grant Council, Hong Kong (781511M, 17121414M) and National Science Foundation, China (91229105).
JLCS (K.M, T.K.) - Grants-in-Aid from the Ministry of Health, Labor, and Welfare for Research on Applying Health Technology and for the 3rd-term Comprehensive 10-year Strategy for Cancer Control; by the National Cancer Center Research and Development Fund; by Grant-in-Aid for Scientific Research on Priority Areas and on Innovative Area from the Ministry of Education, Science, Sports, Culture and Technology of Japan. (W.P.) - NCI R01-CA121210.
NLCS (H.S.) - China National High-Tech Research and Development Program Grant (2009AA022705). Priority Academic Program Development of Jiangsu Higher Education Institution. National Key Basic Research Program Grant (2011CB503805).
SKLCS (Y.T.K.) - National Research Foundation of Korea (NRF) grant (NRF-2014R1A2A2A05003665). (J.C.) - This work was supported by a grant from the National R&D Program for Cancer Control, Ministry of Health & Welfare, Republic of Korea (grant no. 0720550-2). (J.S.S) - Grant number A010250.
WLCS (T.W.) - National Key Basic Research and Development Program (2011CB503800).
SLCS (B.Z.) - National Nature Science Foundation of China (81102194). Liaoning Provincial Department of Education (LS2010168). China Medical Board (00726).
SWHS (W.Z., W-H.C., N.R.) - The work was supported by a grant from the National Institutes of Health (R37 CA70867) and the National Cancer Institute intramural research program, including NCI Intramural Research Program contract (N02 CP1101066).
TLCS (K.C., B.Q) - Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT), China (IRT1076). Tianjin Cancer Institute and Hospital. National Foundation for Cancer Research US.
YLCS (Q.L.) - Supported by the intramural program of U.S. National Institutes of Health, National Cancer Institute.
Conflicts of interest: Richard Cawthon holds a patent for the polymerase chain reaction method of measuring telomere length that is used in this study, and licensed that method for commercial use. However, throughout this study his laboratory has been blind as to the age, genotype, and outcomes of the subjects in the study.
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Institutes of Health. The authors report no conflicts of interest.