G.M.P., L.A., C.S.F., P.K., R.Z.S-S., K.B.J., S.M.L., J.B.M., G.S.T., R.N.H., P.H. and S.J.C. organized and designed the study.
L.A., A.H., K.B.J., G.T. and S.J.C. supervised genotyping of samples.
L.A., P.K., R.Z.S-S., C.S.F., K.B.J., C.K., H.P., Z.W., K.Y., R.N.H., P.H. and S.J.C. contributed to the design and execution of statistical analysis.
L.A., G.M.P., P.K., R.Z.S-S., R.N.H., P.H. and S.J.C. wrote the first draft of the manuscript.
G.M.P., C.S.F., R.Z.S-S., A.A.A., H.B.B., S.G., M.G., K.H., E.A.H., E.J.J., A.P.K., A.L., D.L., M.T.M., S.H.O., H.A.R., W.Z., D.A., W.R.B., C.D.B., M.B., J.E.B., P.M.B., F.C., S.C., M.C., M.A., E.J.D., J.M.G., E.L.G., M.G., G.H., S.E.H., M.H., B.H., D.J.H., M.J., R.K., V.K., R.C.K., R.R.M., D.S.M., A.V.P., P.H.M.P., A.R., E.R., L.R., X.S., A.T., D.T., S.K.V.D.E., J.V., J.W., B.M.W., H.Y. and A.Z-J. conducted the epidemiologic studies and contributed samples to the PanScan GWAS and/or replication. All authors contributed to the writing of the manuscript.
We conducted a genome-wide association study (GWAS) of pancreatic cancer in 3,851 cases and 3,934 controls drawn from twelve prospective cohort studies and eight case-control studies. Based on a logistic regression model for genotype trend effect that was adjusted for study, age, sex, self-described ancestry and five principal components, we identified eight SNPs that map to three loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Two correlated SNPs, rs9543325 (P=3.27×10−11; per allele odds ratio, OR 1.26, 95% CI=1.18-1.35) and rs9564966 (P=5.86×10−8; per allele OR 1.21, 95% CI=1.13-1.30) map to a non-genic region on chromosome 13q22.1. Five SNPs on 1q32.1 map to NR5A2; the strongest signal was rs3790844 (P=2.45×10−10; per allele OR 0.77, 95% CI=0.71-0.84). A single SNP, rs401681 (P=3.66×10−7; per allele OR 1.19, 95% CI=1.11-1.27) maps to the CLPTM1L-TERT locus on 5p15.33, associated with multiple cancers. Our study has identified common susceptibility loci for pancreatic cancer that warrant follow-up studies.
Pancreatic cancer is one of the most lethal cancers, with mortality rates approaching incidence rates1. Established risk factors for pancreatic cancer include diabetes, an elevated body-mass index, current or recent smoking, and family history2. However, only a small fraction of familial aggregation can be explained by highly penetrant mutations previously identified in BRCA2, p16/CDKN2A, STK11/LKB1, APC, BRCA1, PRSS1 and SPINK1 (refs. 2,3). Truncating mutations and deletions in PALB2 have recently been shown to be involved in familial pancreatic cancer4,5.
We recently reported common risk variants for pancreatic cancer that map to the first intron of the ABO gene on chromosome 9q34.2, based on a genome-wide association study of 1,896 individuals diagnosed with pancreatic cancer and 1,939 controls6. Individuals were drawn from 12 prospective cohort studies (the Pancreatic Cancer Cohort Consortium) and one hospital-based case-control study, the Mayo Clinic Molecular Epidemiology of Pancreatic Cancer Study (see Online Methods)6. In the first scan, we genotyped approximately 550,000 SNPs and followed up the most significant SNPs in eight case-control studies (see Online Methods)6.
To identify additional loci, we conducted a second GWAS in which we genotyped approximately 620,000 single nucleotide polymorphisms (SNPs) in an additional 1,955 cases and 1,995 controls drawn from the same eight case-control studies used to replicate the initial GWAS finding on chromosome 9q34.2. After quality control analysis of genotypes, we combined the data sets, resulting in 551,766 SNPs available for analysis (Illumina HumanHap550 and Human 610-Quad chips) in 3,851 pancreatic cancer cases and 3,934 controls (Online Methods). A logistic regression model was fit for genotype trend effects (1 d.f.), adjusted for study, age, sex, self-described ancestry and five principal components of population stratification. The quantile-quantile (Q-Q) plot showed little evidence for inflation of the test statistics as compared to the expected distribution (lambda=1.013), which argues against substantial hidden population substructure or differential genotype calling between cases and controls (Supplemental Figure 1). A Manhattan plot displays the results of the combined GWAS (Supplemental Figure 2A) and the results from the case-control studies including the full Mayo data set (Supplemental Figure 2B). Our combined analysis identified three novel genomic regions on chromosomes 13q22.1, 1q32.1 and 5p15.33 associated with pancreatic cancer risk, with P values below the threshold for genome-wide significance (P<5×10−7; ref. 7), as shown in Table 1 and Figure 1. Two different haplotype analyses were conducted for each of the three regions, a regularized regression approach8 and a sequential haplotype scan method9, which employ different test statistics (see Online Methods). Haplotype analysis across each of the three regions did not identify new or independent markers, indicating that the current tag SNPs probably point to a single locus in each region (Supplemental Figure 3).
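The inflation factor (lambda) quoted above is conventionally computed as the ratio of the median observed 1-d.f. chi-square statistic to its expected median under the null hypothesis (about 0.455). The following Python sketch illustrates that calculation, assuming only a list of per-SNP P values from the trend test is available. The function names are illustrative and are not taken from the study's analysis pipeline.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 d.f.: the squared 75th
# percentile of the standard normal distribution (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2_1df(p):
    """Convert a two-sided P value (0 < p < 1) to a 1-d.f. chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Return lambda: median observed statistic over the expected null median."""
    stats = [p_to_chi2_1df(p) for p in p_values]
    return median(stats) / CHI2_1DF_MEDIAN


# Hypothetical input: P values that are exactly uniform, as expected
# when there is no inflation at all.
n = 1001
uniform_p = [(i - 0.5) / n for i in range(1, n + 1)]
print(f"lambda = {genomic_inflation(uniform_p):.3f}")
```

A value close to 1, such as the 1.013 reported here, indicates that the bulk of the test statistics follow the null distribution; values well above 1 would suggest stratification or technical artifacts.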
For the locus on 13q22.1, we observed two highly significant SNPs that ranked number 1 and 6 in the combined analysis: rs9543325 (P=3.27×10−11; per allele OR 1.26, 95% CI=1.18-1.35; unconstrained ORHet 1.23, 95% CI=1.11-1.36 and ORHom 1.61, 95% CI=1.40-1.86) and rs9564966 (P=5.86×10−8; per allele OR 1.21, 95% CI=1.13-1.30; unconstrained ORHet 1.21, 95% CI=1.09-1.34 and ORHom 1.48, 95% CI=1.27-1.72). These SNPs, 20 kb apart, are highly correlated (r2=0.82 in 3,650 study controls of European ancestry and r2=0.85 in HapMap CEU). SNP rs9564966 was no longer nominally significant after adjusting for rs9543325 (P=0.47), suggesting that the two SNPs mark a single signal in the non-genic region of approximately 600 kb between two genes in the family of Krüppel-like transcription factors, KLF5 and KLF12, which regulate cell growth and transformation10,11. This segment of chromosome 13 is frequently deleted in a spectrum of cancers, including pancreatic cancer12,13, and may harbor a breast cancer susceptibility locus based on linkage analysis in breast cancer families negative for mutations in the BRCA1 and BRCA2 genes14.
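As a rough consistency check on results of this kind, the standard error of the log odds ratio can be recovered approximately from a published 95% CI, and a Wald P value derived from it. The Python sketch below assumes a CI that is symmetric on the log scale and uses the rounded values reported for rs9543325, so it can be expected to match the reported P value only to within about an order of magnitude. It also shows that, under a log-additive (trend) model, the expected homozygote odds ratio is the square of the per allele odds ratio. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper):
    """Approximate SE of log(OR) from a 95% CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def wald_p(odds_ratio, lower, upper):
    """Two-sided Wald P value implied by an odds ratio and its 95% CI."""
    z = abs(math.log(odds_ratio)) / se_from_ci(lower, upper)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) and stays accurate for large z.
    return math.erfc(z / math.sqrt(2.0))


def expected_hom_or(per_allele_or):
    """Homozygote OR predicted by a log-additive model (two risk alleles)."""
    return per_allele_or ** 2


# Rounded values reported in the text for rs9543325.
p = wald_p(1.26, 1.18, 1.35)
hom = expected_hom_or(1.26)
print(f"approximate Wald P = {p:.2e}; expected ORHom = {hom:.2f}")
```

For rs9543325 the log-additive expectation for the homozygote odds ratio is about 1.59, close to the unconstrained estimate of 1.61, which is what one would expect if the trend model describes the data well.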
Five highly significant SNPs (ranked 2, 3, 4, 7 and 9 in the combined analysis; P≤5×10−7) map to a region of chromosome 1q32.1 that harbors the nuclear receptor subfamily 5, group A, member 2 (NR5A2) gene. The SNPs are distributed across a 105 kb genomic region, which includes the 5′ end of NR5A2 and extends to 91 kb upstream of the gene. The two most significant SNPs in this region map to the first intron of NR5A2 (rs3790844, P=2.45×10−10; per allele OR 0.77, 95% CI=0.71-0.84; unconstrained ORHet 0.75, 95% CI=0.68-0.83 and unconstrained ORHom 0.64, 95% CI=0.52-0.79) and approximately 32 kb upstream of the gene (rs10919791, P=6.37×10−10; per allele OR 0.77, 95% CI=0.71-0.84; unconstrained ORHet 0.76, 95% CI=0.68-0.84 and unconstrained ORHom 0.63, 95% CI=0.50-0.79). The LD between these two SNPs is high (r2=0.81 in study controls and r2=0.71 in HapMap CEU). In this region, three additional SNPs, rs3790843, rs12029406 and rs4465241, were highly significant (P<5×10−7). Of these three SNPs, the telomeric one, rs3790843, is highly correlated with rs3790844 and rs10919791 (r2 of 0.59 and 0.72 in PanScan European controls). The two SNPs centromeric to rs3790844 and rs10919791 are not as strongly correlated (r2=0.05-0.38 in PanScan European controls). In an analysis adjusted for the most highly associated SNP, rs3790844, three of the other four SNPs, namely rs10919791, rs3790843 and rs12029406, were no longer nominally significant (P>0.05), whereas the significance of the association with rs4465241 (which had the lowest LD) decreased by several orders of magnitude after adjustment (P=0.004). Together, these findings suggest that the five SNPs mark a single common allele, but further fine-mapping is needed.
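The pairwise r2 values quoted above measure how well the allele at one SNP predicts the allele at another. The text does not state how they were estimated; as a minimal illustration, assuming phased haplotype frequencies are available, r2 can be computed in Python from the frequency of each allele and of the haplotype carrying both. The frequencies in the usage line are hypothetical and are not study data.

```python
def ld_r2(p_a, p_b, p_ab):
    """Squared allelic correlation between two biallelic SNPs.

    p_a  : frequency of allele A at the first SNP
    p_b  : frequency of allele B at the second SNP
    p_ab : frequency of the haplotype carrying both A and B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D is the excess haplotype frequency over what independence predicts.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies for two SNPs in strong but incomplete LD.
print(round(ld_r2(0.40, 0.38, 0.35), 2))
```

An r2 of 1 means the two SNPs are perfect proxies for each other, and an r2 of 0 means they are statistically independent. This is why conditioning on the top SNP removes the signal at strongly correlated neighbors but leaves some residual signal at a weakly correlated one.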
NR5A2 encodes a nuclear receptor of the fushi tarazu (Ftz-F1) subfamily that is predominantly expressed in exocrine pancreas, liver, intestine and ovaries in adults. The widespread expression of NR5A2 in early embryos and the early lethality of knockout mice imply a critical role in development15. NR5A2 plays a role in cholesterol and bile acid homeostasis, steroidogenesis and cell proliferation (for review, see ref. 16). Evidence for its involvement in transformation stems from the fact that NR5A2 interacts with β-catenin to activate expression of cell cycle genes, while haploinsufficiency of NR5A2 attenuates intestinal tumor formation in the ApcMin/+ tumor model17.
The third locus identified is marked by rs401681 (P=3.66×10−7; per allele OR 1.19, 95% CI=1.11-1.27; unconstrained ORHet 1.20, 95% CI=1.07-1.34 and unconstrained ORHom 1.41, 95% CI=1.23-1.61), which maps to chromosome 5p15.33. It resides in intron 13 of the cleft lip and palate transmembrane 1-like gene (CLPTM1L), part of the CLPTM1L-TERT locus that includes the telomerase reverse transcriptase gene (TERT), only 23 kb away. Both genes have been implicated in carcinogenesis: the CLPTM1L gene is up-regulated in cisplatin-resistant cell lines and may play a role in apoptosis18, whereas the TERT gene encodes the catalytic subunit of telomerase, essential for maintaining telomere ends. When over-expressed in normal cells, TERT can lead to prolonged cell lifespan and transformation19,20. While telomerase activity cannot be detected in most normal tissues, it is seen in approximately 90% of human cancers21. This region of chromosome 5p15.33 has been identified in genome-wide association studies of a number of different cancers, including brain tumors, lung cancer, basal cell carcinoma, melanoma and now pancreatic cancer22-26. In a recent analysis of lung cancer in smokers, the signal on chromosome 5p15.33 has been shown to be strongly associated with the adenocarcinoma histology subtype27. Moreover, variants in this region, in LD with our strongest signal, rs402710, have been suggested to be associated with levels of smoking-related bulky aromatic DNA adducts, a relevant mechanism for pancreatic cancer, which is also tobacco related28. Germ-line mutations have been shown to contribute to the development of acute myelogenous leukemia, whereas mutations in TERT account for a proportion of individuals with an inherited bone marrow failure syndrome that is prone to hematologic malignancies29-31. SNPs in the CLPTM1L-TERT region, including rs401681, have also shown possible associations in additional cancers, namely bladder and prostate cancer22-24.
Of note, the C allele of rs401681 is associated with an increased risk of lung, prostate and bladder cancers as well as basal cell carcinoma22-25 whereas the T allele is associated with increased risk of pancreatic cancer (this study) and melanoma25. Lastly, a highly suggestive SNP in this region that did not meet genome-wide significance, rs4635969 (ranked 12th in combined analysis, P=1.05×10−6) is located between the CLPTM1L and TERT genes (r2=0.26 in 3,650 study controls and r2=0.36 in HapMap CEU).
It is notable that the estimated odds ratios for the variants meeting genome-wide significance on chromosomes 13q22.1, 1q32.1 and 5p15.33 were consistent when the analysis was restricted to data from either the case-control studies or the cohort studies6. This similarity of estimated effect size between the two study designs was also observed for rs505922 in the ABO locus in our previous report6. The consistency of effect supports a role for loci at 13q22.1, 1q32.1, 5p15.33 and ABO, and the divergent results for SHH (reported earlier6) on chromosome 7q36 indicate the need for further investigation of the potential influence of study sampling design on the detection of regions using the GWAS strategy.
GWAS have emerged as a powerful, hypothesis-independent approach to identify common alleles that influence disease risk. Our results show that pancreatic cancer is similar to other complex diseases, in that multiple common disease alleles with small effects influence disease risk. Our study has good power to detect common alleles with large effects (over 90% power to detect a per allele relative risk of 1.4 or greater for an allele with 10% frequency at the alpha=5×10−7 level) but less power to detect smaller effect sizes. Thus, although it is unlikely that there are common alleles with large effects on the majority of sporadic pancreatic cancer risk, it is likely that additional susceptibility alleles with moderate to small effects exist. The list of susceptibility alleles should increase as further GWAS are performed for pancreatic cancer to catalogue the variants with estimated risks below 1.3. Additional studies are needed to assess the clinical utility of risk stratification that combines genetic markers with epidemiologic risk factors already established for pancreatic cancer, namely adiposity, smoking, diabetes and family history.
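The power statement above can be approximated with a simple allele-count calculation. The Python sketch below is not the authors' power method. It assumes the per allele relative risk can be treated as an allelic odds ratio, that alleles are independent (Hardy-Weinberg equilibrium), that the 10% frequency applies to controls, and that no covariate adjustment is made. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha):
    """Approximate power of a two-sided Wald test on the allelic log odds ratio."""
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    odds_case = odds_ratio * control_freq / (1.0 - control_freq)
    case_freq = odds_case / (1.0 + odds_case)
    # Woolf variance of log(OR) from the 2x2 table of expected allele
    # counts (two alleles per individual).
    var_log_or = (
        1.0 / (2 * n_cases * case_freq)
        + 1.0 / (2 * n_cases * (1.0 - case_freq))
        + 1.0 / (2 * n_controls * control_freq)
        + 1.0 / (2 * n_controls * (1.0 - control_freq))
    )
    z_effect = abs(math.log(odds_ratio)) / math.sqrt(var_log_or)
    z_crit = -NormalDist().inv_cdf(alpha / 2.0)
    # The negligible chance of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(z_effect - z_crit)


# Sample sizes and threshold from the text: 3,851 cases, 3,934 controls,
# 10% allele frequency, per allele effect of 1.4, alpha = 5e-7.
power = allelic_power(3851, 3934, 0.10, 1.4, 5e-7)
print(f"approximate power: {power:.2f}")
```

Under these simplifying assumptions the approximation gives a power above 0.9 for an effect of 1.4, in line with the figure quoted in the text, and the power falls off quickly for effects below about 1.3.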
Our combined analysis of 3,851 individuals with pancreatic cancer and 3,934 controls has yielded three new genomic regions associated with the risk of pancreatic cancer. Two regions harbor candidate genes, while the third locus on chromosome 13q22.1 maps to a large non-genic region analogous to the 8q24 region; however, although the latter is associated with risk of multiple cancers, including prostate, breast, colorectal and bladder cancers, the locus on chromosome 13q22.1 appears to be specific for pancreatic cancer. The CLPTM1L-TERT region on chromosome 5p15.33 has been implicated in a disease spectrum that also includes lung cancer, brain tumors, acute myelogenous leukemia, bone marrow failure syndromes and pulmonary fibrosis. The fine-mapping of signals in the three regions identified by our GWAS should guide selection of the optimal variants for functional studies into the biological mechanism underpinning pancreatic carcinogenesis. These results, in turn, should help to inform new preventive, diagnostic and/or therapeutic approaches designed to lessen the burden of this highly fatal disease.
The authors gratefully acknowledge the energy and contribution of our late colleague, Sheila Bingham. Additional acknowledgements are in the Supplemental Note.