Genome-wide association studies (GWAS) have identified ~30 single-nucleotide polymorphisms (SNPs) consistently associated with prostate cancer (PCa) risk. To test the hypothesis that other sequence variants in the genome may interact with the 32 known PCa risk-associated SNPs identified from GWAS to affect PCa risk, we performed a systematic evaluation among three existing PCa GWAS populations: the CAncer of the Prostate in Sweden population, a Johns Hopkins Hospital population and the Cancer Genetic Markers of Susceptibility population, with a total sample size of 4723 PCa cases and 4792 control subjects. Meta-analysis of the interaction term between each of those 32 SNPs and SNPs in the genome was performed in the three PCa GWAS populations. The most significant interaction detected was between rs12418451 in MYEOV and rs784411 in CEP152, with a Pinteraction of 1.15 × 10−7 in the meta-analysis. In addition, we emphasized two pairs of interactions with potential biological implication: an interaction between rs7127900 near insulin-like growth factor-2 (IGF2)/IGF2AS and rs12628051 in TNRC6B, with a Pinteraction of 3.39 × 10−6, and an interaction between rs7679673 near TET2 and rs290258 in SYK, with a Pinteraction of 1.49 × 10−6. Those results show statistical evidence for novel loci interacting with known risk-associated SNPs to modify PCa risk. The interacting loci identified provide hints on the underlying molecular mechanism of the associations with PCa risk for the known risk-associated SNPs. Additional studies are warranted to further confirm the interaction effects detected in this study.
Prostate cancer (PCa) is the most common non-skin cancer affecting men in western countries. Inherited genetic variants play an important role in contributing to familial aggregation of PCa. Since 2007, genome-wide association studies (GWAS) successfully identified at least 33 PCa risk-associated single-nucleotide polymorphisms (SNPs) (1–15).
Although those risk-associated SNPs are well replicated in multiple studies (16–20), very few studies have assessed the potential epistasis, or gene–gene interaction, between those SNPs and the remaining SNPs in the genome. In fact, epistatic effects are the norm rather than the exception for complex diseases such as PCa. Inference from tumorigenesis and results from genetic modeling studies suggest that multiple susceptibility genes, acting either additively or multiplicatively, determine an individual's risk of PCa. The importance of epistasis is also supported by empirical evidence from model organisms and human studies (21–23).
The evidence of epistatic effects from the empirical data suggests that gene–gene interactions need to be examined in GWAS when searching for PCa risk variants. Indeed, assessing gene–gene interactions may reveal additional PCa risk variants, especially now that multiple risk-associated variants have been identified. It is computationally feasible to use a logistic regression model to search the genome for additional variants that interact with these known risk variants to modify the risk of developing PCa.
Recently, Ciampa et al. (24) reported a two-stage GWAS of epistasis between 13 known PCa risk-associated SNPs and SNPs across the genome in the National Cancer Institute Cancer Genetic Markers of Susceptibility (CGEMS) Stage I population with 523841 SNPs and Stage II population with 27383 SNPs, which were selected based on the main effects in Stage I. No SNP–SNP interaction reached a genome-wide significance level in the Stage I or Stage II data, and a list of top interactions was suggested for replication in other studies. The lack of replication data in Ciampa’s study emphasizes the importance of evaluating gene–gene interaction in multiple GWAS populations. More importantly, combining individual-level data from multiple GWAS can improve the power to identify SNPs that interact with the known risk-associated SNPs to affect PCa risk. To this end, we performed a combined genome-wide search for SNPs that interact with 32 PCa risk-associated variants identified from GWAS in three case–control populations of European descent, including 1583 PCa cases and 519 control subjects from the CAncer of the Prostate in Sweden (CAPS) study, 1964 PCa cases and 3172 control subjects from a Johns Hopkins Hospital (JHH) PCa and iControl database and 1176 PCa cases and 1101 control subjects from the National Cancer Institute CGEMS study. We also evaluated the list of SNP–SNP interactions suggested by Ciampa’s study in the two independent GWAS populations (CAPS and JHH).
The first GWAS population included 1583 PCa patients and 519 control subjects, matched to the age distribution of the case subjects, from CAPS, a population-based PCa case–control study from Sweden (6). Briefly, the CAPS case subjects were recruited from four regional cancer registries in Sweden and were diagnosed between July 2001 and October 2003. The clinical characteristics of these patients are presented in Supplementary Table 1, available at Carcinogenesis Online.
The second population was from a JHH PCa GWAS, which included 1964 PCa cases and 3172 control subjects. The cases are Caucasian PCa patients who underwent radical prostatectomy for the treatment of PCa at JHH from 1 January 1999 through 31 December 2008 (25). The clinical characteristics of these patients are presented in Supplementary Table 2, available at Carcinogenesis Online. The control subjects for this population were an independent group of Caucasian individuals from the Illumina iControlDB (iControls) dataset (https://www.illumina.com/science/icontrodb.ilmn).
The third population was obtained from Stage I of the National Cancer Institute CGEMS study. It included 1176 PCa cases and 1101 control subjects, selected from the Prostate, Lung, Colon and Ovarian Cancer Screening Trial (6,9). The genotype and phenotype data of the study are publicly available and our use of the data was approved by CGEMS.
GWAS of the CAPS population was performed using Affymetrix 5.0 chip. GWAS of the JHH case population was performed using the Illumina 610K chip (24). GWAS of the iControls population (25) was performed using Illumina Hap300 and Hap550 chips. GWAS of the CGEMS population was performed using HumanHap300 and HumanHap240 assays from Illumina Corp.
For each GWAS population, we imputed all the known SNPs that are catalogued in HapMap Phase II (www.hapmap.org) using the IMPUTE computer program (26), with a posterior probability of 0.9 as the threshold to call genotypes. Individuals with a call rate < 0.95 were removed from the GWAS analysis. SNPs were excluded if they had a minor allele frequency < 0.01, a Hardy–Weinberg equilibrium test P-value < 0.001 or a call rate < 0.95.
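The SNP-level filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with `None` for an uncalled genotype, and a 1-degree-of-freedom chi-square test is assumed for Hardy–Weinberg equilibrium (the study does not state which test was used).

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts.

    Assumption: a chi-square goodness-of-fit test; an exact test could be used instead.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_snp_qc(genotypes, maf_min=0.01, hwe_min=0.001, call_rate_min=0.95):
    """Return True if a SNP passes the three filters (thresholds taken from the text)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    freq_b = (n_ab + 2 * n_bb) / (2 * len(called))
    maf = min(freq_b, 1.0 - freq_b)
    return (call_rate >= call_rate_min
            and maf >= maf_min
            and hwe_p_value(n_aa, n_ab, n_bb) >= hwe_min)
```

For example, a SNP with genotype counts 25/50/25 and no missing calls passes all three filters, whereas a SNP for which every subject is heterozygous fails the Hardy–Weinberg filter.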
The 33 known PCa risk-associated SNPs were discovered by GWAS and subsequent fine-mapping studies, with P-values ≤ 10 × 10−7 (1–15). Detailed information for the 33 risk SNPs is presented in Table I. The SNP rs16901979 was not evaluated in the interaction analysis because it could not be imputed: it was not catalogued in the HapMap database.
Multiplicative interactions between each of the 32 known PCa risk variants and each SNP in the genome were systematically tested by including both SNPs and an interaction term (the product of the two SNPs) in a logistic regression model, as implemented in the computer program PLINK (27). Ancestral proportions estimated with the EIGENSOFT software (28) were included as covariates to minimize the impact of potential population stratification in the JHH population. An additive genetic model was used, in which genotypes were coded as 0, 1 and 2 and each SNP was treated as a continuous variable. The interaction term was tested using a Wald test with 1 degree of freedom. A meta-analysis of the interaction term for the three study populations was performed using the method developed by Manning et al. (29). Briefly, the meta-odds ratio (ORM) of the interaction term across the three populations was estimated using an inverse variance weighted meta-analysis, where $\mathrm{OR}_M = \exp(\beta_M)$, $\beta_M = \sum_{i=1}^{3} w_i \beta_i / \sum_{i=1}^{3} w_i$ and $w_i = 1/\mathrm{SE}_i^2$, with $\beta_i$ and $\mathrm{SE}_i$ denoting the estimated interaction coefficient and its standard error in population $i$ (29).
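To make the interaction test concrete, the following is a minimal, self-contained Python sketch of a logistic regression with two additively coded SNPs and their product term, followed by a 1-degree-of-freedom Wald test of the product term. It is an illustration only, not PLINK's implementation: the model is fitted by plain Newton-Raphson iterations, the ancestry covariates are omitted, and all function names as well as the simulated data (allele frequency 0.3, main-effect log odds ratios of 0.1) are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def score_and_information(rows, y, beta):
    """Score vector and Fisher information of the logistic log-likelihood."""
    k = len(beta)
    score = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi in zip(rows, y):
        mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
        for i in range(k):
            score[i] += (yi - mu) * x[i]
            for j in range(k):
                info[i][j] += mu * (1.0 - mu) * x[i] * x[j]
    return score, info

def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson fit; returns coefficients and the information matrix."""
    beta = [0.0] * len(rows[0])
    for _ in range(n_iter):
        score, info = score_and_information(rows, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta, score_and_information(rows, y, beta)[1]

def interaction_wald_test(snp1, snp2, status):
    """Return (interaction odds ratio, two-sided Wald P-value with 1 df)."""
    rows = [[1.0, g1, g2, g1 * g2] for g1, g2 in zip(snp1, snp2)]
    beta, info = fit_logistic(rows, status)
    # Variance of the interaction coefficient: last diagonal entry of info^-1
    variance = solve(info, [0.0, 0.0, 0.0, 1.0])[3]
    z = beta[3] / math.sqrt(variance)
    return math.exp(beta[3]), math.erfc(abs(z) / math.sqrt(2.0))

def simulate(n, interaction_log_or, seed=1):
    """Hypothetical data: two independent SNPs and a case-control status."""
    rng = random.Random(seed)
    draw = lambda: (rng.random() < 0.3) + (rng.random() < 0.3)  # 0, 1 or 2
    snp1 = [draw() for _ in range(n)]
    snp2 = [draw() for _ in range(n)]
    status = []
    for g1, g2 in zip(snp1, snp2):
        eta = -1.0 + 0.1 * g1 + 0.1 * g2 + interaction_log_or * g1 * g2
        status.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    return snp1, snp2, status
```

Calling `interaction_wald_test(*simulate(3000, math.log(2.0)))` returns the estimated interaction odds ratio and its P-value for one simulated data set. In a genome-wide scan, the same test would be repeated for every pairing of a known risk SNP with another SNP.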
After imputation and application of the quality control criteria, 1314700, 1646196 and 1757946 SNPs remained for the CAPS, JHH and CGEMS studies, respectively. A total of 1117531 SNPs shared by all three populations were used in the interaction analysis.
We examined the inflation factor and the quantile–quantile plots for interaction tests in the combined analysis of three populations. No systematic bias was observed as the inflation factors for the 32 GWAS scans for SNP–SNP interactions ranged from 0.98 to 1.03 (Supplementary Table 1 is available at Carcinogenesis Online).
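As an illustration of how such an inflation factor is commonly computed, the sketch below converts each P-value to a 1-degree-of-freedom chi-square statistic and divides the median statistic by the median expected under the null hypothesis (about 0.455). The function name is hypothetical, and the study does not state exactly how its inflation factors were calculated.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df (about 0.455)
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def inflation_factor(p_values):
    """Genomic inflation factor from two-sided P-values, each in (0, 1]."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to the statistic z^2 with z = inv_cdf(p / 2)
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Uniformly distributed P-values give a value close to 1, and values well above 1 indicate systematic inflation of the test statistics.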
The results for the top-ranked SNPs that interacted with each of the 32 known PCa risk SNPs (Pinteraction < 1.0 × 10−5 in the meta-analysis) are presented in Supplementary Table 2, available at Carcinogenesis Online. For SNPs in linkage disequilibrium (defined as r2 > 0.5), only the one with the smallest P-value in the meta-analysis was included in Supplementary Table 2, available at Carcinogenesis Online. We then further examined the interaction effects for the top-ranked SNPs (Pinteraction < 1 × 10−5) in each of the three populations. SNPs that significantly interacted with the 32 SNPs in all three populations at a nominal Pinteraction of 0.05 are presented in Table II. No SNP–SNP interaction reached the genome-wide significance level of 1.5 × 10−9 [0.05/(1 million × 32)]. The most significant interaction was observed between rs12418451 in the MYEOV gene region and rs784411 in the intron of CEP152, with a Pinteraction of 1.15 × 10−7 [ORinteraction = 1.42; 95% confidence interval (CI): 1.25–1.61] in the meta-analysis. This interaction pair was significant in all three populations and the effects of the interaction were in the same direction [Pinteraction = 0.008, ORinteraction = 1.55 (95% CI: 1.12–2.16) for CAPS; Pinteraction = 0.005, ORinteraction = 1.34 (95% CI: 1.14–1.58) for JHH and Pinteraction = 0.001, ORinteraction = 1.53 (95% CI: 1.18–1.99) for CGEMS] (Table II).
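The pooled estimate for the rs12418451–rs784411 pair can be approximately checked from the per-population values reported above using an inverse variance weighted meta-analysis. The Python sketch below is a rough consistency check, not the analysis code of this study: the function names are hypothetical, and the standard errors are back-calculated from the rounded 95% CIs under the assumption that the intervals are symmetric on the log scale.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log odds ratio and its standard error from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    return beta, se

def inverse_variance_meta(estimates, z_crit=1.96):
    """Pool (beta, se) pairs; returns (OR, CI lower, CI upper, two-sided P-value)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta_m = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se_m = math.sqrt(1.0 / sum(weights))
    p_value = math.erfc(abs(beta_m / se_m) / math.sqrt(2.0))
    return (math.exp(beta_m),
            math.exp(beta_m - z_crit * se_m),
            math.exp(beta_m + z_crit * se_m),
            p_value)

# OR and 95% CI of the interaction term, as reported in the text for each population
reported = {
    "CAPS": (1.55, 1.12, 2.16),
    "JHH": (1.34, 1.14, 1.58),
    "CGEMS": (1.53, 1.18, 1.99),
}
or_m, ci_low, ci_high, p_m = inverse_variance_meta(
    [log_or_and_se(*values) for values in reported.values()]
)
print(f"OR = {or_m:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f}), P = {p_m:.2e}")
```

Because the inputs are rounded to two decimal places, the result should land close to, but not exactly on, the reported meta-analysis values.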
Among the other 34 pairs of interactions that were significant at a Pinteraction cutoff of 1 × 10−5 in the meta-analysis, two pairs are worth emphasizing because of their possible biological function. The first pair involved an interaction between rs7127900 at the insulin-like growth factor-2 (IGF2)/IGF2AS region and rs12628051 in the intron of TNRC6B, with a Pinteraction of 3.39 × 10−6 (ORinteraction = 1.30; 95% CI = 1.17–1.46) (Table II). The interaction was significant in all three populations and the effects of the interaction were in the same direction (Pinteraction = 0.002, ORinteraction = 1.50, 95% CI = 1.16–1.93 in CAPS; Pinteraction = 0.006, ORinteraction = 1.24, 95% CI = 1.06–1.44 in JHH and Pinteraction = 0.014, ORinteraction = 1.32, 95% CI = 1.06–1.65 in CGEMS). The second pair was an interaction between rs7679673 in the TET2 gene region and rs290258 in the promoter region of SYK, with a Pinteraction of 1.49 × 10−6 (ORinteraction = 0.75; 95% CI = 0.67–0.84) (Table II). Similarly, the interaction effect was consistently observed in all three populations with the same direction of interaction effect (Pinteraction = 0.002, ORinteraction = 0.66, 95% CI = 0.51–0.86 in CAPS; Pinteraction = 0.003, ORinteraction = 0.78, 95% CI = 0.67–0.92 in JHH and Pinteraction = 0.014, ORinteraction = 0.75, 95% CI = 0.59–0.94 in CGEMS).
We then examined, in the CAPS and JHH populations, the significant pairs of SNP–SNP interactions reported by Ciampa et al. (24). Among the 25 pairs reported in the previous study, 16 pairs could be evaluated in our data. Three pairs of SNP–SNP interactions reached a nominal Pinteraction of 0.05 in the CAPS population (Table III). The most significant interaction replicated in CAPS was between rs6983267 and rs4953347 (Pinteraction = 0.001, ORinteraction = 1.42). However, this interaction was not significant in the JHH population (P = 0.69). The other two pairs that reached a nominal Pinteraction of 0.05 in CAPS were the interaction between rs2735839 and rs12196677 (Pinteraction = 0.017, same direction of interaction effect) and the interaction between rs10934853 and rs10458466 (Pinteraction = 0.02, but with the opposite direction of interaction effect). The two pairs of SNP–SNP interactions that were significant in the JHH population at a Pinteraction cutoff of 0.05 had interaction effects in the opposite direction compared with the previous study (24) (Table III).
To our knowledge, our study represents one of the first comprehensive gene–gene interaction scans in three PCa GWAS populations. Specifically, we performed a genome-wide gene–gene interaction scan for each of the 32 known PCa risk-associated variants identified from GWAS in three case–control populations of European descent, comprising a total of 4723 PCa cases and 4792 controls. In the meta-analysis, we found 35 pairs of SNP–SNP interactions that were associated with PCa risk at Pinteraction < 1 × 10−5. In addition, the interactions for those 35 pairs were significant in all three populations (all Pinteraction < 0.05). Among those 35 pairs of interactions, we emphasized three pairs with potential biological implication: an interaction between rs12418451 in MYEOV and rs784411 in CEP152, with a Pinteraction of 1.15 × 10−7 (OR = 1.42, 95% CI = 1.25–1.61); an interaction between rs7127900 at the IGF2/IGF2AS region and rs12628051 in the intron of TNRC6B, with a Pinteraction of 3.39 × 10−6 (OR = 1.30, 95% CI = 1.17–1.46); and an interaction between rs7679673 in the TET2 gene region and rs290258 in the promoter region of SYK, with a Pinteraction of 1.49 × 10−6 (OR = 0.75, 95% CI = 0.67–0.84).
The discovery of approximately three dozen PCa risk variants using single SNP analysis suggests that it is possible to detect individual risk variants. However, when the underlying genetic model involves interaction of multiple genes, a single gene approach is less effective and may not be able to explain the complex etiology of the disease. Therefore, evaluation of the joint effect (epistasis) of multiple genetic variants is critical to understand the underlying causes of complex diseases (30), especially in the situation where several individual risk variants have been identified. The next question is to explore whether other SNPs interact with those SNPs to modify risk to PCa. The identified loci that interact with the known PCa risk-associated SNPs may help to elucidate the underlying molecular mechanisms of the associations of those risk SNPs.
The most significant interaction was seen between the PCa risk-associated SNP rs12418451 and rs784411. The SNP rs12418451 is located at 11q13.2, ~77 kb upstream of TPCN2, a putative cation-selective ion channel gene, and ~126 kb upstream of MYEOV, an oncogene that has been implicated in multiple cancers (31–35). The SNP rs784411 resides in the intron of CEP152, which encodes a centrosomal protein that was recently shown to function as a regulator of genomic integrity (36) and of the cellular response to DNA damage (37). Given the limited information, we speculate that the observed interaction may reflect the close collaboration of MYEOV (or, less likely, TPCN2) and CEP152 in the same or different oncogenic pathways that drive the tumorigenesis of prostatic epithelial cells.
Of the two SNPs that were shown to consistently interact with the PCa risk-associated SNP rs7127900 at 11p15.5, one (rs12628051) is located within TNRC6B, which encodes a component of the RNA interference (RNAi) machinery that is crucial for the microRNA/small interfering RNA-dependent translational repression or degradation of target messenger RNAs. It is worth mentioning that this gene also contains a GWAS-identified PCa risk-associated SNP (rs9623117). Several mechanisms may potentially explain these interactions. Firstly, we noticed that the PCa-implicated IGF2 gene and IGF2AS, which encodes its antisense transcript, reside ~70 kb telomeric to rs7127900. IGF2 encodes a member of the insulin family of polypeptide growth factors that promotes cell proliferation during fetal development but becomes less active in healthy adults due to genomic imprinting. Dysregulated overexpression of IGF2 caused by loss of imprinting has been associated with a variety of human cancers, including PCa (38–41). IGF2AS encodes a predicted non-coding RNA that is antisense to IGF2 and thus may potentially regulate IGF2 expression through RNAi in a similar manner to some other natural antisense transcripts. Thus, one plausible scenario is that TNRC6B may affect the RNAi-mediated transcriptional regulation of IGF2 by IGF2AS, which may underlie the observed interaction between genetic variants within these two loci. Secondly, there are two microRNA genes located at 11p15.5, miR-4686 (~40 kb from the PCa-risk SNP rs7127900) and miR-483 (~80 kb from rs7127900). Although the role of miR-4686 remains to be determined, miR-483 has been demonstrated to act as an oncogene that suppresses proapoptotic BBC3 (PUMA) or tumor suppressive DPC4 (Smad4) in a variety of human cancers (42,43). Thus, an alternative mechanism for the observed interaction between the 11p15.5 locus and the TNRC6B locus is that genetic variants in TNRC6B may affect the miR-483 (or miR-4686)-mediated RNAi toward its/their target tumor suppressor genes.
Another pair of interacting SNPs was found between rs7679673 (~6 kb upstream of TET2) and rs290258 (~8 kb upstream of SYK). TET2 encodes an enzyme hydroxylating methylcytosine and is implicated in epigenetic programming that involves DNA methylation and demethylation (reviewed in ref. 44). The critical role of TET2 in cancer is suggested by the observation that loss-of-function mutations of TET2 are frequently identified in various hematological malignancies (45,46). As a non-receptor tyrosine protein kinase that mediates cellular proliferation and differentiation, SYK is believed to function as a potential tumor suppressive gene (reviewed in ref. 47). It is noteworthy that hypermethylation of SYK gene promoter has been frequently found in and widely associated with lung, gastric and breast cancer (48,49). Thus, although it remains to be determined whether SYK promoter in prostatic tumors also undergoes silencing via DNA methylation, the observed interaction between TET2 and SYK suggests that it is a plausible hypothesis.
Two SNPs (rs731174 and rs10812303) were found to interact with the GWAS-identified PCa risk-associated SNP rs4430796, which resides within HNF1B, a gene encoding a homeodomain-containing transcription factor whose altered expression has been widely implicated in various human cancers, including PCa. The SNP rs731174 is located within the intron of EPHA10, a member of the EPH subfamily of receptor tyrosine kinases. This family of receptor tyrosine kinases plays an important role in cell–cell communication, regulating cell attachment, shape and mobility in epithelial cells, and is believed to be implicated in carcinogenesis (reviewed in ref. 50). It is possible that HNF1B and EPHA10 collaborate in a signaling network that is crucial for the well-being of prostatic cells, and that the genetic variants located within these two genes synergistically contribute to the oncogenesis of PCa. The other SNP, rs10812303, is ~40 kb upstream of TUSC1, an intronless gene that has been suggested to serve as a tumor suppressor in lung tumorigenesis (51). Thus, the interaction between genetic variants near TUSC1 and in HNF1B may also suggest a plausible collaboration of these two genes.
Besides our novel findings, we also replicated the most notable finding reported by Ciampa et al. (24) in our CAPS population. The interaction involves 8q24 region 3 (rs6983267) and EPAS1 (rs4953347). We also observed the interaction between KLK2–KLK3 (rs2735839) and PNPLA1 in the CAPS population. However, neither of those two pairs of interactions was significant in the JHH dataset. Therefore, further statistical evidence from additional replication studies is needed to reach a more robust conclusion.
Our results need to be interpreted with caution. The most significant SNP–SNP interaction pair detected in the meta-analysis had a Pinteraction of 1.15 × 10−7. It did not reach the genome-wide significance level of 1.5 × 10−9 required by the number of tests performed in this study (1 million × 32 = 32 million tests). One possible reason for not achieving genome-wide significance is limited statistical power to detect small to modest interaction effects. In the meta-analysis of 4723 PCa cases and 4792 controls, we had 80% power to detect relatively large interaction effects (OR > 1.7), using the stringent Bonferroni-corrected P-value cutoff of 1.5 × 10−9 to claim genome-wide significance. Therefore, additional samples are needed to detect modest interaction effects, such as those observed in our study (ORs ranging from approximately 1.2 to 1.5), at a genome-wide significance level. However, the interaction effects detected in our study were consistently implicated in all three populations, with the same direction of interaction effect. In addition, similar patterns of interaction (quantitative interactions) were observed among the three populations. These findings may represent statistically meaningful SNP–SNP interactions with a modest magnitude of interaction effect. Nevertheless, the pairs of SNP–SNP interactions identified by our study still warrant follow-up in other populations to further exclude the possibility of false-positive findings. More importantly, functional studies are needed to better understand the underlying molecular mechanisms of the interactions implicated.
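The Bonferroni arithmetic behind this threshold is simple enough to write out. The short Python sketch below uses the rounded figure of 1 million SNPs per scan quoted in the text; the variable and function names are illustrative only.

```python
ALPHA = 0.05
N_SNPS_PER_SCAN = 1_000_000   # rounded number of SNPs tested per known risk SNP
N_KNOWN_RISK_SNPS = 32        # number of genome-wide interaction scans

n_tests = N_SNPS_PER_SCAN * N_KNOWN_RISK_SNPS   # 32 million tests
bonferroni_threshold = ALPHA / n_tests          # 1.5625e-09, quoted as 1.5 x 10^-9

def is_genome_wide_significant(p_value, threshold=bonferroni_threshold):
    """True if a P-value passes the Bonferroni-corrected threshold."""
    return p_value < threshold

# The strongest interaction in the meta-analysis (P = 1.15e-7) does not pass
top_hit_passes = is_genome_wide_significant(1.15e-7)   # False
```

Using the 1117531 SNPs actually shared by the three populations instead of the rounded 1 million would give a slightly stricter threshold.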
The limitations of our study include the use of a pairwise gene–gene interaction approach and the restriction of the search to interactions involving SNPs that confer main effects. Firstly, a search for gene–gene interactions based on a pairwise approach has reduced or no power when high-order interactions are present. However, the search for high-order interactions on a GWAS scale still represents a computational challenge and calls for novel statistical approaches. Secondly, we did not evaluate gene–gene interactions among SNPs that did not confer a main effect. We cannot exclude the possibility that epistasis between other pairs of SNPs without strong main effects may also affect the risk of PCa. However, an exhaustive search for gene–gene interactions is beyond the scope of the current study, as we focused on identifying genes that interact with known risk-associated SNPs. We intend to pursue this question in future studies. In addition, the current analysis of the imputed data was based on the most likely genotype with a posterior probability of ≥0.90. Using the imputed dosage data instead would be more accurate and might improve the power of this study. Epistasis analysis based on the imputed dosage data will be explored once the statistical and computational methods are available.
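The difference between the two ways of using imputed genotypes can be shown with a short Python sketch. It assumes that the imputation output for each subject and SNP is a triple of posterior probabilities for carrying 0, 1 or 2 copies of an allele; the function names are hypothetical.

```python
def hard_call(probs, threshold=0.9):
    """Most likely genotype (0, 1 or 2), or None if its posterior is below the threshold."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None

def dosage(probs):
    """Expected allele count, which retains the imputation uncertainty."""
    return probs[1] + 2.0 * probs[2]

confident = (0.02, 0.95, 0.03)
uncertain = (0.10, 0.60, 0.30)

hard_call(confident)   # 1
hard_call(uncertain)   # None: this genotype is treated as missing
dosage(uncertain)      # about 1.2: the subject still contributes to the analysis
```

With hard calls, uncertain genotypes are dropped, which lowers the call rate and the effective sample size, whereas the dosage keeps every subject in the model at the cost of a non-integer predictor.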
In summary, our systematic evaluation of gene–gene interactions in three GWAS populations suggested a list of loci interacting with known PCa risk-associated SNPs that may warrant follow-up in other study populations. Three pairs of interactions are worth emphasizing: an interaction between rs12418451 in the MYEOV gene region and rs784411 in the intron of CEP152, an interaction between rs7127900 in the IGF2/IGF2AS gene region and rs12628051 in the intron of TNRC6B and an interaction between rs7679673 in the TET2 gene region and rs290258 in the promoter region of SYK. Those results show statistical evidence for genes interacting with known risk-associated SNPs to modify PCa risk. The interacting loci identified also provide hints on the underlying molecular mechanism of the associations with PCa risk for the known risk-associated SNPs. Additional studies are warranted to further confirm the gene–gene interaction effects detected in this study.
This work was supported by the Department of Defense (W81XWH-09-1-0488 to J.S.); intramural funding from the Van Andel Research Institute to J.X. and the National Cancer Institute (CA129684 to J.X.).
We thank all of the study subjects who participated in the CAncer of the Prostate in Sweden study and the urologists who provided their patients to the CAncer of the Prostate in Sweden and Johns Hopkins Hospital studies. We acknowledge the contribution of multiple physicians and researchers in designing and recruiting study subjects, including Dr H.-O. Adami. We also acknowledge the National Cancer Institute Cancer Genetic Markers of Susceptibility Initiative for making the data publicly available.
Conflict of Interest Statement: None declared.