We conducted a two-stage genome-wide association study of renal cell carcinoma (RCC) in 3,772 cases and 8,505 controls of European background from 11 studies, and followed up 6 SNPs in three replication studies of 2,198 cases and 4,918 controls. Two loci, on 2p21 and 11q13.3, were associated with RCC susceptibility at genome-wide significance. Two correlated variants (r2 = 0.99 in controls), rs11894252 (P = 1.8×10−8) and rs7579899 (P = 2.3×10−9), map to EPAS1 on 2p21, which encodes hypoxia-inducible factor-2 alpha, a transcription factor previously implicated in RCC. The second locus, rs7105934, on 11q13.3, contains no characterized genes (P = 7.8×10−14). In addition, we observed a promising association on 12q24.31 for rs4765623, which maps to the scavenger receptor class B, member 1 (SCARB1) gene (P = 2.6×10−8). Our study reports novel genomic regions associated with RCC risk that may lead to new etiological insights.
Kidney cancer accounts for approximately 2% of new cancer diagnoses worldwide1 and is the deadliest urologic malignancy with an estimated 5-year survival rate between 50% and 60%2. Approximately 80–90% of kidney cancers develop in the renal parenchyma, and are known as renal cell carcinoma (RCC). Epidemiological studies have conclusively identified three risk factors, all modifiable: hypertension, obesity and smoking2, 3. Furthermore, there is evidence that genetic factors influence susceptibility to RCC; for instance, the lifetime risk increases approximately twofold for those with a first-degree relative with RCC4–7. The tumor is also commonly observed in pedigrees with von Hippel-Lindau (VHL) syndrome as well as other genetic disorders, such as hereditary papillary renal cell carcinoma, Birt-Hogg-Dubé syndrome, and hereditary leiomyomatosis and renal cell cancer (HLRCC)2, 8. However, familial RCC cases represent less than 5% of RCC overall9. To date, candidate gene studies have not yielded genetic variants that conclusively replicate. In search of common genetic variants with moderate effect sizes, we have therefore conducted a genome-wide association study (GWAS) of RCC.
We report the findings of a two-stage GWAS of RCC, based on two parallel scans followed by replication of six notable SNPs in three studies. The two scans were (i) the scan coordinated by the International Agency for Research on Cancer (IARC) and the Centre National de Génotypage (CNG), based on 2,639 RCC cases and 5,392 controls of European background drawn from 7 studies conducted in Europe with the Illumina Infinium HumanHap 300 and 610 Bead Chips; and (ii) the U.S. National Cancer Institute (NCI) scan, based on 1,453 RCC cases and 3,531 controls of European background from 4 studies with the Illumina Infinium HumanHap 500 and 610 chips (Supplementary Table 1, Online Methods and Supplementary Note). All subjects from the IARC/CNG study were genotyped at the CNG, with the exception of 305 cases and 323 controls from Russia that were genotyped at the Center “Bioengineering” and at the “Kurchatov Institute” in Moscow. All subjects from the NCI study were scanned at the NCI Core Genotyping Facility. In addition, 1,438 controls from the Wellcome Trust Case-Control Consortium were genotyped at the Sanger Institute, UK10. All RCC cases were defined on the basis of the International Classification of Diseases for Oncology, Second Edition (ICD-O-2), and included all cancers that were coded as C64.
Comparable quality control metrics were applied to the two scanned data sets and following sample and SNP exclusions, genotype data for up to 577,547 SNPs were available for 2,461 cases and 5,081 controls in the IARC/CNG scan, while data for 585,576 SNPs were available for 1,311 cases and 3,424 controls in the NCI scan (Online Methods). Primary analyses were conducted using unconditional logistic regression models for genotype trend effects (1 degree of freedom) and adjusted for sex, country, eigenvectors, and study for the USA (Online Methods). In order to compute summary findings across both scans, a meta-analysis was performed using a fixed effects model with inverse variance weighting followed by a pooled analysis with individual level data. Quantile-quantile plots of the combined results showed little evidence for inflation of the test statistics compared to the expected distribution (λ = 1.018, overall, Supplementary Fig. 1). Genomic control was subsequently applied, and all reported p-values and confidence intervals were corrected for the observed inflation. A Manhattan plot summarizing the combined results of 586,069 SNPs is shown in Supplementary Figure 2.
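The combination step described above can be illustrated with a minimal Python sketch. It pools per-scan log odds ratios with inverse-variance weights under a fixed-effects model, then applies genomic control by inflating the standard error by the square root of λ (equivalent to dividing the 1-degree-of-freedom chi-square statistic by λ). The per-scan estimates below are hypothetical placeholders, not values from this study, and the function names are illustrative only; the exact way the correction was implemented in the analysis is assumed here.

```python
import math
from statistics import NormalDist


def fixed_effects_meta(estimates):
    """Pool (log_or, se) pairs with inverse-variance weights."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def genomic_control(log_or, se, lam):
    """Inflate the SE by sqrt(lambda); return z, two-sided P and the 95% CI of the OR."""
    se_gc = se * math.sqrt(lam)
    z = log_or / se_gc
    p = 2.0 * NormalDist().cdf(-abs(z))
    ci = (math.exp(log_or - 1.96 * se_gc), math.exp(log_or + 1.96 * se_gc))
    return z, p, ci


# Hypothetical per-scan estimates for a single SNP: (log OR, standard error).
scans = [(math.log(1.16), 0.035), (math.log(1.13), 0.048)]

pooled, pooled_se = fixed_effects_meta(scans)
z, p, ci = genomic_control(pooled, pooled_se, lam=1.018)  # lambda as reported in the text
print(f"pooled OR = {math.exp(pooled):.3f}, P = {p:.2e}, "
      f"95% CI = {ci[0]:.3f}-{ci[1]:.3f}")
```

With λ this close to 1, the correction widens the confidence interval only slightly, which is consistent with the small inflation reported.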
Based on the meta-analysis using SNPs genotyped in both centers, six SNPs associated with RCC at a significance level approaching or surpassing genome-wide statistical significance (P < 5×10−7 in two-tailed tests)10 were selected for replication in three additional case-control series from Europe and the US (2,198 RCC cases, 4,918 controls) (Supplementary Table 1). Performing genomic control showed that hidden population substructures or differential genotype calling between cases and controls did not substantively influence these results (Online Methods). Three SNPs on 2p21 (rs11894252, rs7579899 and rs6758592) were selected as well as single SNPs on 3q26.31 (rs9839909), 11q13.3 (rs7105934), and 12q24.31 (rs4765623). For the replication study, the assay for rs11894252 could not be optimized; thus a highly correlated SNP, rs1867785 (r2 = 1.0 in HapMap CEU11), was genotyped (Online Methods). For the other five SNPs, there was a high concordance between genotype calls on the Illumina bead chip and optimized TaqMan assays in both centers (100% for IARC/CNG and 98.9–100% for NCI)12. Because rs9839909 (3q26.31) and rs7105934 (11q13.3) were not included on the Illumina HumanHap 300 bead chip, subjects genotyped with this chip in the GWAS (908 cases and 2,415 controls) were also genotyped by TaqMan and included in the replication phase. In a meta-analysis of the pooled GWAS and replication results, SNPs in three of the four regions achieved genome-wide significance and mapped to 2p21, 11q13.3 and 12q24.31 (Table 1 and Fig. 1). Imputing SNPs in the implicated regions 2p21, 11q13.3 and 12q24.31, using the 1000 Genomes data13 as scaffold, did not reveal additional SNPs with stronger, independent associations than those genotyped directly (Supplementary Table 2).
In the combined analysis14, two SNPs on 2p21 achieved genome-wide significance, rs7579899 (P = 2.3×10−9; per allele odds ratio (OR) = 1.15, 95% confidence interval (CI): 1.10–1.21) and rs11894252 (P = 1.8×10−8; OR = 1.14, 95% CI: 1.09–1.20). Further, rs7579899 was significant in the independent replication analysis (P = 0.008; OR = 1.11, 95% CI: 1.03–1.20) whereas rs1867785, a highly correlated surrogate for rs11894252, suggested a comparable effect that did not achieve independent significance (P = 0.06; OR = 1.08, 95% CI: 1.00–1.16) (Table 1). When stratified by either SNP marker, the signal of the second was extinguished (data not shown). Together with the high correlation between the two markers (r2 = 0.99 in controls), these results point towards a single common susceptibility locus. An additional SNP, rs4952818, achieved genome-wide significance in the combined scan (P = 1×10−7, Fig. 1), but its association was accounted for by rs11894252 and rs7579899 (Padjusted = 0.45 and Padjusted = 0.36, respectively) and was therefore not selected for replication. The third SNP selected for replication, rs6758592, was minimally correlated with the previous two (r2 = 0.12 and 0.11 with rs11894252 and rs7579899, respectively), and only showed an association in the NCI data (PNCI = 1.8×10−7, PIARC = 0.16, Pheterogeneity = 0.0004, Supplementary Table 3) not accounted for by rs11894252 and rs7579899 (Padjusted = 1×10−5 for both). While rs6758592 did not replicate, the combined analysis yielded P = 4.0×10−5, suggesting that in the NCI scan data there could be evidence for a more complex genomic architecture underlying the association of this locus with RCC.
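As a reading aid for the odds ratios and confidence intervals quoted above, the following Python sketch shows how a z statistic and two-sided P-value can be approximately recovered from a published OR and its 95% CI, assuming a Wald-type interval that is symmetric on the log scale. The function name is illustrative. Because the published OR and interval are rounded to two decimals, the result only approximates the reported P-value and should not be expected to match it exactly.

```python
import math
from statistics import NormalDist


def p_from_or_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Approximate z and two-sided P from an OR and its confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% CI
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return z, 2.0 * NormalDist().cdf(-abs(z))


# Combined-analysis values for rs7579899 as quoted in the text.
z, p = p_from_or_ci(1.15, 1.10, 1.21)
print(f"z = {z:.2f}, P = {p:.1e}")
```

The same function can be applied to the replication estimates, where the wider intervals translate into the larger P-values reported.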
Our finding on 2p21 is notable because the candidate gene, EPAS1, has already been implicated in RCC15–19. The two SNPs, rs11894252 and rs7579899, are distributed across a 4.2 kb region of intron 1 in the EPAS1 gene, which encodes the hypoxia-inducible factor 2α (HIF-2α), a key gene in the VHL-HIF pathway. The VHL complex targets HIF subunits for ubiquitin-mediated degradation20. Accumulation of HIF-2α leads to up-regulation of vascular endothelial growth factor (VEGF) and epidermal growth factor receptor (EGFR). The inactivation of VHL in renal carcinoma cell lines leads to unchecked HIF-2α mediated expression of HIF-responsive tumorigenic factors, most notably VEGF16, 17. Further, tumor formation in VHL-deficient renal carcinoma cells has been found to be suppressed by inhibition of HIF-2α18,19. The findings from our GWAS provide further evidence that EPAS1 is a key gene in RCC development, but additional studies are needed to identify the functionally relevant common variants associated with increased risk.
A variant, rs7105934, on 11q13.3 was associated with RCC in the combined analysis (P = 7.8×10−14, OR = 0.69, 95% CI: 0.62–0.76). The SNP was independently replicated with a risk estimate comparable to the initial GWAS results (P = 6.8×10−7; OR = 0.71, 95% CI: 0.62–0.81). Overall, the magnitude of the association with this relatively uncommon SNP (minor allele frequency = 0.08 in controls) is large compared to risk markers previously identified in the GWAS of other cancers21. This SNP maps to a 350 kb region of 11q13.3 containing no characterized genes; flanking genes are Homo sapiens myeloma over-expressed (in a subset of t(11;14) positive multiple myelomas) (MYEOV) and cyclin D1 (CCND1), situated approximately 140 kb centromeric and 220 kb telomeric, respectively, from rs7105934. In the control samples, there is little evidence for linkage disequilibrium (LD) with markers in these genes (r2 < 0.01 in scanned controls). Similarly, we did not observe LD with a complex susceptibility locus for prostate cancer also identified within 11q13 (refs. 22, 23), nor with a SNP marker, rs614367, 89 kb telomeric to rs7105934, recently associated with breast cancer risk24.
A third locus, marked by rs4765623 on 12q24, also achieved genome-wide significance overall (P = 2.6×10−8; OR = 1.15, 95% CI: 1.09–1.20), although it did not independently replicate using a two-tailed significance test (P = 0.09; OR = 1.07, 95% CI: 0.99–1.16). The SNP maps to intron 1 of the scavenger receptor class B, member 1 (SCARB1) gene, a cell surface receptor that binds to high-density lipoprotein cholesterol (HDL-C) and mediates HDL-C uptake25–27. Its role in cancer biology is not as well established, and the signal was stronger in the European studies (scan and replication studies) than in the US studies (Supplementary Table 3 and Fig. 2). While this SNP marks a promising association, further confirmatory work is required to establish its association with RCC risk.
For each of the three regions associated with RCC risk, we conducted further pooled analyses stratified by study, age, gender and established modifiable risk factors: body mass index, smoking status and history of diagnosed hypertension. The associations with rs11894252 and rs7579899 were notable in former and current smokers but not in never-smokers, suggesting an interaction with smoking (P heterogeneity = 0.003) (Fig. 2). This observation raises the possibility that the effect of EPAS1 could be dependent on tobacco smoking, but further studies are needed to explore this promising finding. The associations with the two 2p21 (EPAS1) SNPs were stronger among men than women, possibly a result of the different risks by smoking status. The stratified analyses suggested no other evidence of interaction.
This study was well powered to detect common alleles with large effect sizes (greater than 90% power to detect a per-allele OR of 1.5 for a variant of allele frequency of 20% at an alpha of 5×10−7), but the statistical power was limited for detecting effects of weaker size or those due to uncommon SNPs. Additional studies are needed to identify susceptibility markers of weaker effects or lower allele frequency.
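The contrast between high power for large effects and limited power for weaker ones can be illustrated with a rough Python sketch. It uses a normal approximation for the per-allele log odds ratio, with the variance taken from allele counts (two alleles per subject) and the same allele frequency assumed in cases and controls. This simplification is an assumption made here for illustration and is not necessarily the method used for the power figure quoted above. The sample sizes are the post-exclusion scan totals given in the text (3,772 cases and 8,505 controls).

```python
import math
from statistics import NormalDist


def approx_power(odds_ratio, maf, n_cases, n_controls, alpha=5e-7):
    """Approximate power of a two-sided per-allele test at significance level alpha."""
    # Approximate variance of the log OR from allele counts (assumption:
    # one shared allele frequency and Hardy-Weinberg proportions).
    var = (1.0 / (2 * n_cases) + 1.0 / (2 * n_controls)) / (maf * (1.0 - maf))
    ncp = abs(math.log(odds_ratio)) / math.sqrt(var)  # expected |z| under the alternative
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # The contribution of the opposite tail is negligible and is ignored.
    return NormalDist().cdf(ncp - z_alpha)


for odds_ratio in (1.5, 1.15):
    power = approx_power(odds_ratio, maf=0.20, n_cases=3772, n_controls=8505)
    print(f"OR = {odds_ratio}: approximate power = {power:.2f}")
```

Under these assumptions the power for a per-allele OR of 1.5 is well above 90%, whereas for an OR near 1.15 it falls well below 50%, in line with the limitation noted above for weaker effects.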
Our study has identified novel regions of the genome associated with risk of RCC. Two regions on 2p21 and 12q24 map to candidate genes EPAS1 and SCARB1, respectively, while one maps to a region of 11q13 with no characterized genes. Further fine-mapping of these regions is required prior to investigating the optimal variants for studies into the biological underpinnings of the observed associations. Moreover, these loci should be pursued in follow-up studies in distinct populations, such as African Americans, who have an increased risk of RCC2, 3. Similarly, it will be important to evaluate these regions in studies that address clinical endpoints such as response to therapy and survival. The discovery of additional susceptibility loci should lead to further advances in understanding the etiology of RCC as well as its risk prediction and early detection.
The authors thank all of the participants who took part in this research, and the funders and support staff who made this study possible. Funding for the genome-wide genotyping was provided by the Institut National du Cancer (INCa), France, for those studies coordinated by IARC/CNG, and by the intramural research program of the National Cancer Institute (NCI), National Institutes of Health (NIH), USA, for those studies coordinated by the NCI. Additional acknowledgments can be found in the Supplementary Note.
Author contributions: M.P.P., M.J., J.R.T., G.S., L.E.M., V.G., W.H.C., J.D.M., N.R., S.J.C., and P. Brennan contributed to the design and execution of the overall study. M.P.P., M.J., J.R.T., G.S., L.E.M., L.A.K., X.W., V.G., K.B.J., J.D.M., N.R., S.J.C., and P. Brennan contributed to the statistical analysis. M.P.P., M.J., S.J.C. and P. Brennan wrote the first draft of the manuscript. D. Zeleniak, E.P., L.A.K., X.W., K.B.J., S.H.V., S.L.M., Y.Y., A.M.M., E.S.B., N.N.C., M.F., D.L., I.G., S.H., H. Blanche, A.H., G.T., Z.W., M.Y., K.G.S., S.J.C., and M.L. supervised or conducted the genotyping. The remaining authors conducted the epidemiologic studies and contributed samples to the GWAS and/or replication. All authors contributed to the writing of the manuscript.