TERT-locus single nucleotide polymorphisms (SNPs) and leucocyte telomere measures are reportedly associated with risks of multiple cancers. Using the iCOGs chip, we analysed ~480 TERT-locus SNPs in breast (n=103,991), ovarian (n=39,774) and BRCA1 mutation carrier (11,705) cancer cases and controls. 53,724 participants have leucocyte telomere measures. Most associations cluster into three independent peaks. Peak 1 SNP rs2736108 minor allele associates with longer telomeres (P=5.8×10−7), reduced estrogen receptor negative (ER-negative) (P=1.0×10−8) and BRCA1 mutation carrier (P=1.1×10−5) breast cancer risks, and altered promoter-assay signal. Peak 2 SNP rs7705526 minor allele associates with longer telomeres (P=2.3×10−14), increased low malignant potential ovarian cancer risk (P=1.3×10−15) and increased promoter activity. Peak 3 SNPs rs10069690 and rs2242652 minor alleles increase ER-negative (P=1.2×10−12) and BRCA1 mutation carrier (P=1.6×10−14) breast and invasive ovarian (P=1.3×10−11) cancer risks, but not via altered telomere length. The cancer-risk alleles of rs2242652 and rs10069690 respectively increase silencing and generate a truncated TERT splice-variant.
Chromosome ends are capped by telomeres, which protect them from inappropriate DNA repair and maintain genomic integrity1. Telomeres consist of structural proteins2 combined with many hundreds of hexanucleotide DNA repeats3,4, which are progressively shortened by normal cell division5–7. Shortening restricts proliferation of normal somatic cells, but not of cancer cells, which can maintain long telomeres, usually via telomerase8–10, and may divide indefinitely. The TERT gene at 5p15.33 (see URLs) encodes the catalytic subunit of telomerase reverse transcriptase, an important component of telomerase. Germline mutations in TERT cause dyskeratosis congenita, a cancer susceptibility disorder characterized by exceedingly short telomeres11. Although up to 80% of the variation of telomere length is estimated to be due to heritable factors12,13, association studies on TERT SNPs and differences in leucocyte telomere length have, to date, been inconclusive14–17. Furthermore, it is unclear whether telomere length, measured in leucocyte DNA, is predictive of cancer risk: retrospective studies report that cancer patients, after diagnosis, have shorter telomeres than unaffected controls18–21, but prospective studies, with DNA taken prior to diagnosis, have been inconclusive19,22,23. SNPs at 5p15.33 are reported to be associated with risks of several human cancers14–16,24–32, including certain subtypes of both ovarian33 and breast cancers34.
Due to a common interest, SNPs surrounding the TERT locus were nominated by members of each of the constituent COGS consortia. Consequently, the iCOGS chip design included a combination of individual TERT gene candidate SNPs, as well as a more comprehensive set to fine-scale map the entire locus, for shared use by all consortia. This study had three aims: to assess SNPs across the TERT-locus for all detectable associations with mean telomere length and breast and ovarian cancer subtypes; to fine-scale map this locus to identify potentially-causal variants for the observed associations; and to evaluate the functional effects of the strongest candidate causative variants.
One hundred and ten SNPs at the 5p15.33 locus (Build 37 positions 1,227,693 – 1,361,969) passed quality control tests (QC) in 103,991 breast cancer cases and controls from 52 Breast Cancer Association Consortium (BCAC) studies, of which 41 studies (89,050 individuals) were of European, nine were of Asian (12,893 individuals) and two were of African-American ancestry (2,048 individuals). The same 110 SNPs passed QC in 11,705 BRCA1 mutation carriers of European ancestry, recruited by 45 studies from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA), while 108 SNPs passed QC in 44,308 ovarian cancer cases and controls from 43 Ovarian Cancer Association Consortium (OCAC) studies. For OCAC, analysis was confined to the 39,774 European ancestry participants, of whom 8,371 cases had invasive epithelial ovarian- and 986 had serous low malignant potential (LMP) neoplasia. For all study participants, genotype-imputation, using the 110 genotyped SNPs together with the January 2012 release of the 1000 Genome Project (1000GP)35–38 was used to increase coverage to ~480 SNPs (imputation r2>0.3, minor allele frequency (MAF)>0.02) for each phenotype. Telomere length was initially measured in control subjects from two BCAC studies (SEARCH and CCHS, combined n= 15,567) (see Supplementary Information).
Figure 1 shows Manhattan plots of the genotyped and well-imputed SNPs for the seven phenotypes analyzed: mean telomere length (a), overall breast cancer (b), breast cancer in BRCA1 carriers (c), estrogen receptor negative (ER-negative) breast cancer (d), estrogen receptor positive (ER-positive) breast cancer (e), serous LMP ovarian cancer (f) and serous invasive ovarian cancer (g). Conditional analyses within each of these phenotypes revealed multiple independent SNP associations each for telomere length, overall breast cancer, ER-negative breast cancer and overall breast cancer risk in BRCA1 mutation carriers, but only one peak each for ER-positive breast cancer, serous LMP and invasive ovarian cancer (Table 1). Full results of all these SNP analyses are given in Supplementary Tables 1–3. All associations are consistent with a log-additive model.
SNPs in two distinct regions (hereafter denoted Peaks 1 and 2) are strongly associated with telomere length (Tables 1 and 2; Fig. 1, panel a; Supplementary Fig. 1, panel a). Imputed SNP rs7705526 (Peak 2, position 1285974, TERT intron 2) has the largest effect with a change in relative telomere length of 1.026-fold per-allele (95%CI 1.019–1.033, P=2.3×10−14, conditional P=2.5×10−11). We confirmed this finding in an additional 20,512 women and 17,645 men from a third study (CGPS) genotyped for rs7726159 (the best directly-genotyped SNP; r2=0.83 with rs7705526). From a joint analysis of all 53,724 individuals, the change in relative telomere length is 1.020-fold per-allele (95%CI 1.016–1.023, P=7.5×10−28). A second, independent association was observed with rs2736108 (Peak 1, position 1297488, TERT promoter) with a per-allele change in relative telomere length of 1.017-fold (95%CI 1.010–1.024, P=5.8×10−7, conditional P=4.0×10−4) (Fig. 1, panel a; Supplementary Fig. 1, panel a; Tables 1 and 2). SNPs rs7705526 and rs2736108 are only weakly correlated (r2=0.04 in Europeans). Weak associations between Peak 3 SNPs and telomere length became non-significant after adjustment for Peak 2 SNP rs7705526 (data not shown).
We identified SNPs associated with breast cancer risk (P<10−4) in three distinct regions in BCAC studies and two in CIMBA BRCA1 mutation carriers. No significant (P<10−4) evidence for heterogeneity among odds ratios (OR) or hazard ratios (HR) between studies for any of the top SNPs was observed (Supplementary Fig. 6). The strongest association with overall breast cancer risk in BCAC is with Peak 1 SNP rs3215401 (Fig. 1, panel b; Supplementary Fig. 1, panel b; Tables 1 and 2). There is also good evidence for an association with SNPs in Peak 2 and weaker evidence for an additional SNP, outside the three main association peaks, to be independently associated with breast cancer risk (Supplementary Table 1; Table 1). The most strongly-associated SNPs in BRCA1 mutation carriers are located in introns 2–4 (hereafter denoted Peak 3) including rs10069690 (Fig. 1, panel c; Supplementary Fig. 2, panel c; Tables 1 and 2) and rs2242652 (correlation with rs10069690, r2=0.70). The latter SNP also exhibits the strongest association with ER-negative breast cancer in BCAC (Fig. 1, panel d; Supplementary Fig. 1, panel d; Tables 1 and 2), but shows little evidence of association with ER-positive breast cancer (Table 2). Stepwise regression analysis in CIMBA studies indicated two independent associations with breast cancer risk in BRCA1 mutation carriers (conditional P=5×10−5 for rs2736108 in Peak 1; P=4.8×10−13 for rs10069690 in Peak 3). A very similar pattern was observed for ER-negative breast cancer in BCAC (conditional P=6×10−6 for rs3215401 in Peak 1 and P=4.3×10−9 for rs2242652 in Peak 3; Table 1). The most strongly associated SNP with ER-positive breast cancer is rs2736107 in Peak 1 (Fig. 1, panel e; Supplementary Fig. 2, panel e; Tables 1 and 2). Weak associations between the key SNPs and risk for BRCA2 mutation carriers are also observed, but the sample size is too small to draw definitive conclusions (data not shown).
The strongest association observed for risk of LMP ovarian cancer is with Peak 2 SNP rs7705526, and this is the only SNP retained in the stepwise regression analysis (Fig. 1, panel f; Supplementary Fig. 1, panel f; Tables 1 and 2). The strongest observed association for serous invasive ovarian cancer is with Peak 3 SNP rs10069690 (Fig. 1, panel g; Supplementary Fig. 1, panel g; Tables 1 and 2). No other independent associations were observed for serous invasive ovarian cancer (Table 1). We also analysed SNP associations with endometrioid, mucinous, clear cell invasive and mucinous LMP ovarian cancers but found no associations at P<10−4 (Supplementary Table 4). We attempted analysis of invasive serous ovarian cancer stratified by grade, but again statistical power was low (Supplementary Fig. 2).
The above results indicate that the majority of observed associations with all seven tested phenotypes fall into association Peaks 1–3. Correlated SNPs in the TERT promoter (Peak 1) are associated with telomere length, ER-positive breast cancer, ER-negative breast cancer and breast cancer in BRCA1 mutation carriers. SNPs in Peak 2, spanning TERT introns 2–4, are independently associated with telomere length, overall breast cancer and serous LMP ovarian cancer. Finally, SNPs in Peak 3, also spanning TERT introns 2–4, display strong associations with ER-negative breast cancer, breast cancer risk for BRCA1 mutation carriers and serous invasive ovarian cancer, but not with telomere length (Tables 1 and 2). Although Peaks 2 and 3 overlap physically, they define distinct sets of SNPs that are only partially correlated (e.g. correlation between rs10069690 and rs7705526; r2=0.33; Fig. 2). Some SNP-phenotype associations in Peak 2 are clearly weaker than those in Peak 3 (e.g. with ER-negative breast cancer) and become non-significant after adjustment for SNP rs2242652 in Peak 3. Conversely, the associations with telomere length and serous LMP ovarian cancer are stronger for SNPs in Peak 2, indicating that the associations in Peaks 2 and 3 are not being driven by the same causal variants.
The strongest candidates for causation within each peak were identified by computing likelihood ratios; the SNPs listed in Tables 1 and 2 are those that cannot be excluded at a likelihood ratio of >1:100 fold compared to the top hit in the peak. The power to exclude SNPs differs between phenotypes; in Peak 1, all but seven SNPs can be excluded from being causal for relative telomere length, breast cancer risk in BRCA1 mutation carriers and ER-negative risk, but an additional SNP can be excluded for ER-positive breast cancer risk (Table 2). In Peak 2, the greatest power is for the telomere length phenotype, where all but three SNPs can be excluded, whilst five or six remain for cancer risk. For Peak 3, three putative causal SNPs remain for ER-negative breast cancer risk, two for serous invasive ovarian cancer risk and just one for breast cancer risk in BRCA1 carriers. Results in each peak are compatible with a single causative variant being responsible for the multiple phenotype associations (n.b. in Peak 3, SNPs rs2242652 and rs10069690 are equally compatible with being the single causal variant). However, the possibilities of different causal variants being responsible for different phenotypes, or of the associations being due to haplotype effects, cannot be ruled out.
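The exclusion rule described above can be illustrated with a minimal Python sketch. It assumes that a maximised log-likelihood is available for the model fitted with each SNP in a peak; the function name, the SNP labels and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def retained_candidates(loglik_by_snp, ratio=100.0):
    """Return SNPs that cannot be excluded relative to the top hit.

    loglik_by_snp: dict mapping a SNP label to the maximised log-likelihood
                   of the model including that SNP (hypothetical input).
    ratio:         exclusion threshold; a SNP is dropped when its likelihood
                   is more than `ratio`-fold lower than that of the top hit.
    """
    best = max(loglik_by_snp.values())
    # A likelihood ratio of 1:100 is a log-likelihood difference of ln(100).
    cutoff = best - math.log(ratio)
    return sorted(snp for snp, ll in loglik_by_snp.items() if ll >= cutoff)


# Illustrative values only: "snp_far" lies more than ln(100) ~ 4.6
# log-likelihood units below the top hit and is therefore excluded.
example = {"snp_top": -1000.0, "snp_near": -1002.0, "snp_far": -1010.0}
print(retained_candidates(example))
```

Working on the log scale avoids numerical underflow when likelihoods are very small, which is why the threshold is expressed as a difference of ln(100) rather than as a ratio of raw likelihoods.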
We tested all SNPs (n=341) with MAF >0.02 and imputation r2>0.3 for association with breast cancer in the nine BCAC Asian studies (comprising 6,269 cases and 6,624 controls) but none reached formal levels of significance. Furthermore, none of the European top SNPs displayed more than borderline levels of significance in Asians (Supplementary Table 5). Peak 3 SNP rs10069690 was directly genotyped in two BCAC African-American studies (1,116 cases and 932 controls), as well as in the above-mentioned Asian studies, and has estimated effects on ER-negative breast cancer similar to those in European populations: per-allele OR=1.19, 95%CI 1.06–1.31, P=0.009 in African-Americans and OR=1.09, 95%CI 1.00–1.19, P=0.07 in Asian women. Within OCAC there were too few women of Asian and African ethnicity to draw meaningful conclusions (Supplementary Table 6).
Analysis of the ENCyclopedia Of DNA Elements (ENCODE) data39 revealed no evidence of regulatory elements or open chromatin coinciding with any risk-associated SNPs in normal breast epithelial cells or the other represented tissues (Supplementary Fig.3) (Data for ovarian tissues are not included in ENCODE). We therefore performed site-specific formaldehyde-assisted isolation of regulatory elements (FAIRE40) in ovarian cancer precursor tissues to identify regulatory elements in a 1Mb region centred on Peak 3. In fallopian tube secretory- and ovarian surface- epithelial cells, we detected FAIRE peaks coinciding with the CLPTM1L promoter but not the TERT promoter (Supplementary Fig.3). In silico analyses additionally indicated that TERT introns 4 and 5 (within and beyond Peak 3) contain regions showing regulatory potential and vertebrate sequence conservation41. We used site-specific FAIRE analyses of a ~1 kb region centered on the Peak 3 SNP rs10069690 in normal tissue samples from breast reduction mammoplasty (n=4), ovarian cancer precursor tissues (n=4) and ovarian cancer cell-lines (n=4). Breast cells from each woman were sorted into four enriched fractions based on differential expression of cell surface markers42 (myoepithelial/stem, luminal progenitor, mature luminal and stromal) and assays were performed on each fraction (Fig.3). Chromatin was closed in all the ovarian, breast luminal progenitor and mature luminal fractions. However, in 2/4 stromal cell fractions, we detected ~600bp open chromatin of varying amplitude, covering the position of SNP rs10069690, but not of rs2242652, and in 3/4 myoepithelial/stem cell fractions, we detected ~800bp open chromatin, covering the positions of both SNPs rs10069690 and rs2242652.
The regulatory capabilities of the DNA in each of the three peaks, and the effects of most of the strongest candidate causative variants in each one, were examined in luciferase reporter assays, using a construct containing 3915bp of the TERT promoter sequence43. Effects of Peak 1 TERT promoter variants were examined via five haplotype constructs differing at rs2736107, rs2736108 and rs2736109 (ref. 25) (Fig. 4a): one with all three major alleles (TERT wt), another with all three minor alleles (rs2736107, rs2736108, and rs2736109), and three with minor alleles of each SNP individually. Relative promoter activity was determined in ER-positive (MCF7), ER-negative (MDA-MB-468) breast and ovarian (A2780) cancer cell-lines. The construct containing all three minor alleles consistently generated the lowest luciferase signals, close to baseline. To determine whether the risk-associated variants in Peaks 2 and 3 fall within putative cis-acting regulatory elements (PREs), we cloned ~3 kb of sequence surrounding each SNP. Constructs of PRE-A (Peak 2) had no significant effect on the activity of either promoter construct (Fig. 4b). However, inclusion of the minor allele of rs7705526 increased TERT promoter activity by ~30% in all three cell-lines, suggesting that it can act as a transcriptional enhancer. This increase in promoter activity was also observed with the construct in A2780 ovarian cells, but not in the two breast cancer cell-lines. Constructs of PRE-B (Peak 3) consistently act as strong transcriptional silencers, leading to a 40–50% decrease in activity, specifically in constructs containing the TERT wt promoter. Notably, inclusion of the minor allele of rs2242652 in PRE-B constructs decreased relative TERT wt promoter activity by a further ~20% compared to the silencer containing the major alleles, but highly-correlated SNP rs10069690 did not generate this effect (Fig. 4b).
Several TERT alternative splice variants have been found to impact on telomerase activity44,45. To determine the role of PRE-B (Peak 3) SNPs in TERT alternative splicing, we inserted intron 4 sequence into a full length TERT cDNA mini-gene construct, and confirmed accurate splicing. Cancer risk-associated alleles for rs10069690 and rs2242652 were generated individually and in combination within the mini-gene. RT-PCR, using primers spanning intron 4, revealed that all SNP permutations in all cell-lines produced comparable levels of both wild-type and an INS1 alternative splice variant, which includes the first 38bp of TERT intron 4 (refs 46,47) (Supplementary Fig. 5a). We also identified a novel TERT splice variant, specifically associated with the minor allele of rs10069690 (termed INS1b; Supplementary Fig. 5a). Sequence analysis confirmed that INS1b includes the first 480bp of intron 4 and results from an alternative splice donor created by the minor allele of rs10069690 (ref. 48). INS1b has a premature stop codon 16 amino acids into intron 4 and is predicted to generate a severely truncated protein product, likely to impact on telomerase activity (Supplementary Fig. 5b).
We used The Cancer Genome Atlas (TCGA)49 data to examine gene expression of the 11 protein-coding genes and one microRNA (MIR4457) located within 1Mb of Peak 3 SNP rs10069690. Most genes showed higher expression in ovarian tumors compared with normal tissues (Supplementary Fig.3; Supplementary Table 7). We observed no association between rs10069690 and expression levels of any of the genes in any of the cells tested (Supplementary Table 7; Supplementary Table 8; Supplementary Fig.5). There is some evidence of association between rs10069690 and tumor methylation with probes cg23827991 (TERT CpG island, P=1.3×10−6) and cg06550200 (CLPTM1L, P=6.9×10−4) among 935 probes tested. Both showed reduced methylation with the minor, cancer-risk allele (Supplementary Table 9) but this did not correlate with changes in expression.
Our comprehensive examination of the TERT locus has answered some long-standing questions and raised several new ones. We have identified two independent regions associated with telomere length in leucocyte DNA; these provide definitive evidence for genetic control of telomere length by common TERT variants. For rs2736108, the most significant SNP in promoter Peak 1, the minor allele is associated with a 1.7% increase in telomere length (1.017-fold per allele). This equates to a telomere length change of ~60bp and, since telomere length decreases by approximately 19bp per year of age50, this is equivalent in magnitude to an age difference of 3.1 years. We estimate that rs2736108 explains 0.08% of the variance in telomere length in men and 0.06% in women. SNPs in Peak 2 have a stronger effect on telomere length, with each additional A (minor) allele of rs7705526 associated with a 2.6% increase (1.026-fold per allele). This equates to a ~90bp change in telomere length and, correspondingly, to 4.7 years of age. We estimate that rs7705526 explains 0.31% of the variance in telomere length in men and 0.16% in women. The only other reported associations with telomere length, reaching genome-wide significance, involve TERC-locus SNP rs12696304 (ref. 51) and OBFC1-locus SNP rs4387287 (ref. 52), which have similar effects on telomere length (75bp and 115bp per-allele, respectively).
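The age-equivalence arithmetic above can be reproduced with a short Python sketch. It uses only the rounded figures quoted in the text (~60bp and ~90bp per allele, and an attrition rate of approximately 19bp per year); the function name is a hypothetical label chosen for illustration.

```python
# Approximate telomere attrition per year of age, as quoted in the text.
BP_PER_YEAR = 19.0


def age_equivalent_years(bp_change, bp_per_year=BP_PER_YEAR):
    """Convert a per-allele telomere length change (bp) into years of ageing."""
    return bp_change / bp_per_year


for snp, bp in [("rs2736108", 60.0), ("rs7705526", 90.0)]:
    print(f"{snp}: ~{bp:.0f}bp is about {age_equivalent_years(bp):.1f} years")
```

With these rounded inputs the sketch gives roughly 3.2 and 4.7 years. The small difference from the reported 3.1 years for rs2736108 presumably reflects rounding of the ~60bp figure, since the unrounded per-allele change is not given in the text.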
Our only findings consistent with the hypothesis that shorter telomeres predispose to increased cancer risk53 (equivalent to longer telomeres being protective) are those from the Peak 1 SNPs. However, a regulatory-element construct containing the longer telomere associated alleles of three highly correlated SNPs, rs2736108, rs2736107 and rs2736109 (reconstructing a haplotype with 25% frequency in Europeans35) virtually abolished promoter activity in a reporter assay. This finding leaves an apparently paradoxical association between decreased enhancer activity and increased telomere length (Figure 4). Control of telomerase activity is currently poorly understood and this clearly merits further investigation.
SNPs within Peak 3 (TERT introns 2–4) exhibit strong associations with hormone-related cancers: Peak 3 SNP rs10069690 is associated with risk of ER-negative breast cancer34 and breast cancer in BRCA1 mutation carriers, consistent with the observation that the majority of breast cancers arising in BRCA1 mutation carriers are ER-negative. This variant has been reported to be associated with prostate cancer26,54 and we find it associated with serous invasive ovarian cancer. Although SNPs in Peaks 2 and 3 overlap on a physical map, the SNPs most strongly associated with cancer risk or telomere length were not highly correlated with each other [r2 between rs10069690 and rs7705526 = 0.33 (Fig. 2, panel b)]. These observations suggest that either the associations observed with multiple cancers and SNPs in Peak 3 are mediated via a mechanism distinct from control of telomere length, or that telomere length in breast, prostate and ovarian cells is under the control of a different set of SNPs to those controlling telomere length in leucocytes. Luciferase reporter assays show that Peak 3 contains a silencer of the TERT promoter and that the minor allele of Peak 3 SNP rs2242652 further reduces expression. Consistent with this finding, Kote-Jarai et al.54 report that the minor, risk allele of this SNP is associated with reduced TERT expression in benign prostate tissue. However, our search for comparable associations in ovarian or breast tumor tissue has been negative, possibly because TERT expression is severely dysregulated in most tumors. Taken together, our luciferase assays indicate that either reduced signal from regulatory elements in Peaks 1 and 3, or increased signal from Peak 2, increases risk of specific cancer types.
We have additionally shown that the minor allele of rs10069690 affects splicing and is associated with transcription of a novel, truncated isoform resulting from a premature stop codon (Supplementary Fig.4). We do not yet know whether this isoform affects canonical telomerase activity, or how it changes activity. We further identified novel, open chromatin signatures overlapping rs10069690 in breast stromal and myoepithelial/stem cell fractions but not in progenitor or differentiated luminal epithelial cell fractions. Senescent stromal cells can stimulate malignant transformation of epithelial cells in in vitro and in vivo models55,56, and these SNP mechanisms merit investigation in future studies.
The SNPs originally reported to be associated with risk of lung (rs402710)57 and breast cancer (rs3816659)58 (Supplementary Table 10) were not associated with any cancer in this study. Moreover SNP rs2736100, in Peak 2, has been reported to be associated with glioma, lung and testicular cancer27,28,31,57,59–62 while nearby SNP rs2853677 was reported to be associated with glioma in the Chinese Han population63. Despite their physical proximity, these are not highly correlated with rs7705526 (r2=0.52 and 0.18 respectively), nor do they display independent associations with telomere length after adjustment for rs7705526. Thus, variants underlying susceptibility to different cancer types are different from the set of variants in the TERT region mediating changes in telomere length.
One limitation of this study is the incomplete representation of all SNPs at 5p15.33 on the iCOGS chip, which was designed in March 2010 using SNPs catalogued in HapMap3, together with those from the pilot study of the 1000GP35. To help fill known gaps on the iCOGS chip, additional SNPs were genotyped from the October 2010 1000GP data release, and imputation was based on the most recent, January 2012 release. However, several gaps remain across the TERT locus and this, coupled with the strikingly low linkage disequilibrium across the region (Fig. 2), raises the possibility that there could be more independent associations that we have not yet detected. Furthermore, the incomplete SNP catalogue at the time of study design means that we cannot assume with certainty that the true causal variants, directly responsible for the observed association peaks, were captured in our analysis. It is also possible that rarer variants, not specifically investigated in this study, could have additional functional effects within this locus. Further re-sequencing of this region is needed to uncover the full spectrum of variation and phenotype associations. Another limitation is that telomere length was measured in DNA from leucocytes rather than from breast or ovarian tissue. Whilst we obtained suitable blood DNA for measurements in >53,000 subjects (a necessary sample size for adequate statistical power), obtaining comparable qualities and quantities of DNA from normal breast or ovary cells would be almost impossible. Telomere lengths measured in different tissues within one individual have been shown to be highly correlated64–66, meaning that leucocyte telomere lengths are likely to be good surrogates for other tissues. Furthermore, one of our aims was to investigate whether the previously reported associations between mean telomere length and cancer risk might be mediated through TERT variants, and such studies have been based on telomere length measured in blood cell DNA.
Another limitation was that we were unable to stratify OCAC ovarian cancer cases by BRCA1 and BRCA2 mutation status because this information was not available; nor was there sufficient power to evaluate ovarian cancer risk in mutation carriers in CIMBA.
Our findings provide evidence relevant to the hypothesis that shorter telomeres increase cancer risks: associations in the TERT promoter (Peak 1) fit this hypothesis best, while those in Peaks 2 and 3 (introns 2–4) and other reported 5p15.33 SNP-cancer associations (Supplementary Table 10) do not. Thus, it would appear that the majority of cancer associations within the TERT locus are mediated via alternative mechanisms involving the TERT gene. The protein product of TERT has functions beyond the telomerase-mediated extension of telomeres67. These non-canonical functions of TERT strongly resemble those mediated by MYC and WNT68, which are upstream regulators of proliferation, differentiation and migration. TERT also modulates WNT/β-catenin signaling69, and ectopic TERT expression induces increased cell division and decreased apoptosis in primary mammary cells, independently of telomere elongation70.
In conclusion, this study provides definitive evidence for genetic control of telomere length by common genetic variants in the TERT locus. Additionally, we report multiple, independent TERT SNP associations with breast cancer risk, confirming previously-reported associations and identifying new associations in both the general population and in BRCA1 mutation carriers. We also provide, for the first time, highly significant evidence for the association of distinct TERT SNPs with serous LMP and invasive ovarian cancer risk. Our results demonstrate that the relationships between TERT genotype, telomere length and cancer risk are complex, and that the TERT locus may influence cancer risk through multiple mechanisms.
Most SNPs were genotyped on the iCOGS custom array36,37,71. SNPs at 5p15.33 (Build 36 positions 1280000–1415000; Build 37 positions 1227693–1361969) were selected, based on published cancer associations, from the March 2010 release of the 1000 Genomes Project (1000GP)35. These included all known SNPs with minor allele frequency (MAF)>0.02 in Europeans and r2>0.1 with the then-known cancer associated SNPs (rs402710 (ref. 57) and/or rs3816659 (ref. 58)), plus a tagging set for all known SNPs in the linkage disequilibrium blocks encompassing the genes in the region (SLC6A18, TERT and CLPTM1L). An additional 30 SNPs in TERT were selected through a telomere length candidate gene approach. In total, 134 SNPs were selected, 121 of which were successfully manufactured; 110 of those passed quality control (QC)36 in BCAC and CIMBA, and 108 in OCAC (Supplementary Tables 1–3). After genotyping, these SNPs were complemented with 22 SNPs, selected from the October 2010 release of 1000GP, to improve coverage. These were genotyped in two BCAC studies, SEARCH72 and CCHS73, using a Fluidigm™ array according to the manufacturer's instructions. To improve SNP density further, comprehensive genotype data for the locus were imputed for all subjects based on the January 2012 1000GP release. The genotype imputation process is described elsewhere36–38.
Study characteristics, iCOGS methodology and quality control for cancer-risk analyses are detailed elsewhere36–38. We measured telomere length in 6,766 control samples from the SEARCH study; 1,569 of these were accrued by SEARCH itself36, 793 were collected as part of the Sisters in Breast Screening (SIBS) study15 and 4,404 were cancer-free participants in the EPIC-Norfolk study19. We also measured telomere length in 8,841 participants in the Copenhagen City Heart Study (CCHS)73,74 and in 38,145 participants in the Copenhagen General Population Study (CGPS)75,76. Genotype clusters were visually inspected for the most strongly associated SNPs (Supplementary Fig.6). For all studies, ethnicity was assigned using HapMap (release 22) genotype data for European, African and Asian populations as reference (for BCAC and CIMBA using multidimensional scaling, for OCAC using LAMP77). All CIMBA analyses were restricted to individuals of European ancestry. For BCAC, separate estimates for individuals of East Asian and African-American ancestry were also derived. For OCAC, limited analyses of non-European ancestry groups were also performed. A subset of BCAC and OCAC were utilised in previous breast and ovarian cancer association studies of individual SNPs78. However, the associations with the key SNPs (rs10069690, rs2736108 and rs7705526) remained significant after excluding these studies, demonstrating similar ORs.
Telomere length was measured in SEARCH using a modified version of the protocol as described elsewhere19,79. Twelve percent of samples were run in duplicate. Failed PCR reactions were not repeated. Telomere length was measured in CCHS and CGPS with a modified version of the protocol as described elsewhere50,80. Each individual was measured in quadruplicate. After exclusion of outliers, average cycle threshold (Ct) values of the remaining samples were calculated. Failed measurements were repeated up to twice. For meta-analysis, telomere length measurements from SEARCH were converted to the same scale as the CCHS and CGPS measurements, based on parameters from the linear regression between corresponding CCHS and SEARCH 10-year-age and 5-percentile groups in women only (Supplementary Fig.7). This measure of telomere length was used for all the analyses and then converted into fold changes (RTL) to aid interpretation (Supplementary Fig.7).
SNP associations with telomere length were evaluated using linear regression to model the fold-change in telomere length per-minor-allele, adjusted for age, 384-well plate, sex, seven principal components and study. The SNP was coded as number of minor alleles (0, 1, 2 for genotyped and the inferred genotype for imputed SNPs). The test of association was based on the 1 degree of freedom (1df) trend-test statistic. We also performed separate analyses (SEARCH, CCHS females, CCHS males, CGPS females, CGPS males) and combined the parameter estimates in a fixed-effect meta-analysis81. Associations with breast and ovarian cancer risk in BCAC and OCAC were evaluated by comparing genotype frequencies in cases and controls using unconditional logistic regression. Analyses were adjusted for study and seven principal components in BCAC36 and five in OCAC37. Nine OCAC studies with case-only genotype data were paired with case-control studies from similar geographic regions, resulting in 34 analysis study-strata. The principal analysis fitted the SNP as an allelic dose and tested for association using a 1df trend-test, but genotype-specific risks were also obtained. Associations between genotypes and breast cancer risk in CIMBA studies (BRCA1 carriers) were evaluated using a 1df per-allele trend-score test, based on modeling the retrospective likelihood of the observed genotypes conditional on breast cancer phenotypes82. To allow for non-independence among related individuals, an adjusted version of the score-test was used in which the variance of the score was derived, taking into account the correlation between the genotypes by estimating the kinship coefficient for each pair of individuals using the available genotype data83. Per-allele Hazard Ratio (HR) estimates were obtained by maximizing the retrospective likelihood. All analyses were stratified by country of residence. USA and Canada strata were further stratified on the basis of reported Ashkenazi Jewish ancestry.
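The step of combining per-study parameter estimates can be sketched as follows. This is a minimal illustration that assumes the standard inverse-variance weighting for a fixed-effect meta-analysis, which the text does not spell out, and a normal approximation for the two-sided P-value; the function name and the input numbers are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect meta-analysis.

    estimates: list of (beta, se) pairs, one per study or stratum, where
               beta is the per-allele effect estimate and se its standard error.
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Illustrative inputs only (not study data): two strata with equal precision.
beta, se, p = fixed_effect_meta([(0.02, 0.01), (0.04, 0.01)])
print(f"pooled beta={beta:.3f}, se={se:.4f}, P={p:.2g}")
```

Each stratum contributes in proportion to the inverse of its variance, so with equal standard errors the pooled estimate is the simple mean and the pooled standard error shrinks by a factor of the square root of the number of strata.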
Conditional analyses were performed to identify SNPs independently associated with each phenotype. To identify the most parsimonious model, all SNPs with marginal P-value<0.001 were included in forward-selection regression analyses with a threshold for inclusion of P-value<10−4, and including terms for age (for telomere length only), principal components and study. Similarly, forward-selection Cox-regression analysis was performed for BRCA1 carriers, stratified by country of residence, using the same P-value thresholds. This approach provides valid association tests, although the estimates can be biased82,84. Parameter estimates for the most parsimonious model were obtained using the retrospective likelihood approach.
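The forward-selection procedure can be sketched as follows for a quantitative trait. This is a simplified illustration rather than the analysis code: it assumes ordinary linear regression with a normal-approximation Wald test, uses simulated genotypes, and all names (forward_select, snp_a, snp_b, snp_c, pc1) are hypothetical. Only the two thresholds (marginal P<0.001 for candidates, P<10−4 for inclusion) come from the text.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _residualise(v, basis):
    """Remove from v its projection onto each orthonormal basis vector."""
    r = list(v)
    for q in basis:
        c = _dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return r


def _add_to_basis(v, basis):
    """Orthonormalise v against the basis and append it (Gram-Schmidt)."""
    r = _residualise(v, basis)
    norm = math.sqrt(_dot(r, r))
    basis.append([x / norm for x in r])


def forward_select(y, snps, covariates, p_marginal=1e-3, p_enter=1e-4):
    """Return SNP names in order of entry into the model."""
    n = len(y)
    basis = []
    _add_to_basis([1.0] * n, basis)  # intercept
    for c in covariates:
        _add_to_basis(c, basis)

    def wald_p(name):
        # Test of the SNP conditional on everything already in the basis.
        rx = _residualise(snps[name], basis)
        ry = _residualise(y, basis)
        sxx = _dot(rx, rx)
        if sxx < 1e-9:  # SNP is collinear with terms already in the model
            return 1.0
        beta = _dot(rx, ry) / sxx
        rss = _dot(ry, ry) - beta * beta * sxx
        se = math.sqrt(rss / (n - len(basis) - 1) / sxx)
        return math.erfc(abs(beta / se) / math.sqrt(2.0))

    candidates = [s for s in snps if wald_p(s) < p_marginal]
    selected = []
    while candidates:
        pvals = {s: wald_p(s) for s in candidates}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= p_enter:
            break
        selected.append(best)
        _add_to_basis(snps[best], basis)
        candidates.remove(best)
    return selected


# Simulated data: snp_a is causal, snp_b is correlated with it, snp_c is null.
rng = random.Random(1)
n = 2000


def draw_genotype():
    return (rng.random() < 0.3) + (rng.random() < 0.3)


snp_a = [draw_genotype() for _ in range(n)]
snp_b = [g if rng.random() < 0.7 else draw_genotype() for g in snp_a]
snp_c = [draw_genotype() for _ in range(n)]
pc1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
trait = [0.5 * g + 0.1 * pc + rng.gauss(0.0, 1.0) for g, pc in zip(snp_a, pc1)]

selected = forward_select(
    trait, {"snp_a": snp_a, "snp_b": snp_b, "snp_c": snp_c}, [pc1]
)
print(selected)
```

In this simulation a correlated SNP such as snp_b is associated marginally, but once the causal SNP has entered the model it typically no longer meets the inclusion threshold, which is how the conditional analysis separates independent signals from signals explained by linkage disequilibrium.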
Normal breast tissue was donated by women undergoing reduction mammoplasty surgery. Patients provided written consent and all work was performed with full local institutional human ethics approval. Tissue was dissociated as described previously85. Cells were prepared for flow cytometry as described previously42 by staining with a cocktail of Lin+ markers (CD31-PE, CD45-PE and CD235a-PE), EpCAM-FITC, CD49f-PE-Cy5 and Sytox Blue. Cells were then processed by a BD FACSAria II Cell Sorter and live cells immuno-negative for Lin+ markers were sorted into four subpopulations on the basis of their EpCAM-FITC and CD49f-PE-Cy5 fluorescence.
Cell pellets derived from FACS-sorting of breast tissue samples were cross-linked in 1% formaldehyde, then lysed in 200μl Tris-buffered 1% SDS lysis buffer containing protease inhibitors. Lysates were sonicated using a QSONICA Model Q125 Ultra Sonic Processor to shear chromatin to 200bp-1kb fragments. Insoluble cell material was removed through centrifugation, and supernatants equally divided into 100μl INPUT and FAIRE samples. INPUTs were incubated overnight at 65°C to reverse cross-linking. All samples were purified through two rounds of phenol-chloroform extraction and DNA recovered through ethanol precipitation and re-suspended in water for use as PCR templates.
TERT promoter variants were introduced into pGL3-TERT-3915 (ref. 43) by site-directed mutagenesis (Agilent Technologies). TERT putative regulatory elements PRE-A (hg19; chr5:1,284,900–1,287,087) and PRE-B (chr5:1,279,401–1,282,763) were PCR amplified using KAPAHiFi DNA polymerase (Geneworks) and cloned into pGL3-TERT-3915 or rs2736107/8/9 vectors. Individual SNPs were incorporated using overlap extension PCR. Cells were transfected with equimolar amounts of luciferase reporter plasmids and 50ng of pRLTK using siPORT NeoFX Transfection Agent (Ambion), according to the manufacturer's instructions, and harvested after 48h. Luminescence was measured with a Wallac Victor3 1420 multilabel counter and data (n=3) were analysed by one-way ANOVA with post-hoc Dunnett's tests.
TERT intron 4 was synthesised by GenScript and subcloned into pIRES-TERT (ref. 44). The minor alleles at rs10069690 and rs2242652 were introduced by site-directed mutagenesis (Agilent Technologies). The resultant plasmids, designated pIRES-TERTint4-WT (wild-type intron 4), pIRES-TERTint4-rs10069690, pIRES-TERTint4-rs2242652 and pIRES-TERTint4-DM (minor alleles at both sites), were transfected using siPORT NeoFX Transfection Agent (Ambion) and cells were harvested after 24h. Total RNA was extracted using the RNeasy Mini Kit (Qiagen) and digested with DNaseI (Invitrogen). cDNA was synthesised from 1μg RNA by random priming using SuperScript III reverse transcriptase (Invitrogen). Samples were screened for the presence of TERT splice variants by RT-PCR.
For each gene within 1Mb we performed the following assays: (1) gene expression analysis in ovarian cancer cell lines (N=50) compared to ovarian surface epithelial and fallopian tube secretory cell lines (N=73) and tissues from high-grade serous ovarian cancers; (2) methylation analysis in high-grade serous ovarian cancers compared to normal tissues, and methylation quantitative trait locus (mQTL) analysis; (3) expression quantitative trait locus (eQTL) analysis to evaluate genotype-gene expression associations in normal precursor tissues of high-grade serous ovarian cancer. We also evaluated these genes in silico using the somatic data from The Cancer Genome Atlas (TCGA49), and profiled the spectrum of non-coding regulatory elements in ovarian surface epithelial and fallopian tube secretory cell lines using a combination of formaldehyde-assisted isolation of regulatory elements sequencing (FAIRE-seq40) and RNA sequencing (RNA-seq).
COGS is funded through a European Commission's Seventh Framework Programme grant (agreement number 223175 - HEALTH-F2-2009-223175). BCAC is funded by CR-UK (C1287/A10118 and C1287/A12014). BCAC meetings have been funded by the European Union COST programme (BM0606). Telomere length measurement and analysis was funded by CR-UK project grant C1287/A9540 and Chief Physician Johan Boserup and Lise Boserup's Fund. CIMBA data management and analysis were supported by Cancer Research – UK grants C12292/A11174 and C1287/A10118. OCAC is supported by a grant from the Ovarian Cancer Research Fund thanks to the family and friends of Kathryn Sladek Smith (PPD/RPCI.07). Genotyping of the iCOGS array was funded by the European Union (HEALTH-F2-2009-223175), Cancer Research UK (C1287/A10710), the Canadian Institutes of Health Research for the “CIHR Team in Familial Risks of Breast Cancer” program (J.S. & D.E.), and the Ministry of Economic Development, Innovation and Export Trade of Quebec – grant # PSR-SIIRI-701 (J.S. & D.E, P.Hall). Scientific development and funding of the OCAC portion of this project were supported by the Genetic Associations and Mechanisms in Oncology (GAME-ON; U19-CA148112). CIMBA genotyping was supported by NIH grant CA128978, an NCI Specialized Program of Research Excellence (SPORE) in Breast Cancer (CA116201), a U.S. Department of Defense Ovarian Cancer Idea award (W81XWH-10-1-0341), grants from the Breast Cancer Research Foundation and the Komen Foundation for the Cure. This study made use of data generated by: The Wellcome Trust Case Control Consortium (A full list of WTCCC investigators is available from http://www.wtccc.org.uk/; funding was provided by Wellcome Trust award 076113) and The Cancer Genome Atlas (TCGA) Pilot Project established by the National Cancer Institute and National Human Genome Research Institute (The investigators and institutions constituting the TCGA research network, can be found at http://cancergenome.nih.gov/.)
We thank all the individuals who took part in these studies and all the researchers, clinicians, technicians and administrative staff who have enabled this work to be carried out.
URLs. TERT gene, http://www.ncbi.nlm.nih.gov/gene/7015.
HapMap3 catalogue, http://www.sanger.ac.uk/resources/downloads/human/hapmap3.html.