Ovarian cancer (OC) accounts for more deaths than all other gynecological cancers combined. To identify common low-penetrance OC susceptibility genes, we conducted a genome-wide association study (GWAS) of 507,094 SNPs in 1,768 cases and 2,354 controls, with follow-up of 21,955 SNPs in 4,162 cases and 4,810 controls, leading to the identification of a confirmed susceptibility locus at 9p22 (BNC2)¹. Here, we report on nine additional candidate loci (p≤10⁻⁴), identified after stratifying cases by histology, genotyped in an additional 4,353 cases and 6,021 controls. Two novel susceptibility loci with p≤5×10⁻⁸ were confirmed (8q24, p=8.0×10⁻¹⁵ and 2q31, p=3.8×10⁻¹⁴); two additional loci were also identified that approached genome-wide significance (3q25, p=7.1×10⁻⁸ and 17q21, p=1.4×10⁻⁷). The associations with serous OC were generally stronger than other subtypes. Analysis of HOXD1, MYC, TiPARP, and SKAP1 at these loci, and BNC2 at 9p22, supports a functional role for these genes in OC development.
Invasive epithelial OC is a rare but often lethal disease. Individuals with an OC family history have approximately a two-fold increased risk, even after accounting for mutations in known highly-penetrant susceptibility genes, suggesting that other risk alleles await identification. GWAS have identified common genetic variants influencing risks for a range of cancers, including our recent identification of low-risk OC alleles at 9p22.2¹. The most significant association conferred a 20% reduction in risk with each copy of the minor allele; this association was stronger for serous subtype, suggesting that disease heterogeneity may have reduced the power of our GWAS¹. To identify further susceptibility alleles, we have followed up additional candidate loci from that study after stratifying cases by histological subtype.
Our GWAS was performed for women of European ancestry diagnosed with the major histological subtypes of invasive epithelial OC: serous, mucinous, endometrioid, and clear cell (Table 1, Supplementary Table 1). Phase I included 1,768 cases and 2,354 controls from the UK genotyped using the Illumina Infinium 610K and 550K arrays, respectively. Data were available for 507,094 genotyped SNPs and 1,549,784 imputed SNPs from HapMap. The top-ranked 21,955 SNPs, selected based on analysis of all subtypes combined, were genotyped in an additional 4,162 cases and 4,810 controls in Phase II, leading to the identification of a compelling association near the basonuclin 2 (BNC2) gene at 9p22.2 (p=5.4×10⁻²², rs3814113 for serous subtype)¹. Quantile-quantile (Q-Q) plots of the test statistics comparing genotype frequencies in cases versus controls in the Phase I and Phase II data showed little evidence of any general inflation of the test statistics (estimated inflation factor λ₁₀₀₀ = 1.026 and 1.0086, respectively, based on the bottom 90% of the distribution)¹.
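The inflation factor quoted above is standardized to a study of 1,000 cases and 1,000 controls. The text does not give the formula; the sketch below assumes the commonly used rescaling, in which the excess of the observed inflation factor over 1 is scaled by the ratio of (1/cases + 1/controls) in the actual study to that in a 1,000/1,000 study. The function name and the example inflation value are hypothetical and are not taken from the study.

```python
def lambda_1000(lam: float, n_cases: int, n_controls: int) -> float:
    """Rescale an observed inflation factor to 1,000 cases and 1,000 controls.

    Assumed formula (not stated in the text):
    lambda_1000 = 1 + (lambda - 1) * (1/n_cases + 1/n_controls) / (1/1000 + 1/1000)
    """
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale


# Hypothetical observed inflation of 1.05 with the Phase I sample sizes
# (1,768 cases, 2,354 controls); the 1.05 is illustrative only.
example = lambda_1000(1.05, n_cases=1768, n_controls=2354)
print(round(example, 4))
```

Because the rescaling depends only on sample sizes, it allows inflation to be compared between phases with different numbers of cases and controls.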
In the current study, combined Phase I and Phase II data from this GWAS were re-analyzed overall and restricting to ‘serous only’. This revealed nine loci with a p-value ≤10⁻⁴ for all subtypes (1p31, 1p36, 2q31, 11p14, and 17q21) and/or for serous subtype (2p22, 3q25, 7p21, and 8q24). Thirty SNPs from these loci were genotyped in an additional 4,353 cases and 6,021 controls in Phase III; thus, data were available for combined analysis of 10,283 cases (including 5,841 serous) and 13,185 controls. In addition, results on an independent 194 cases and 40,933 controls were incorporated using meta-analysis. There was little additional evidence of association for five loci in either the Phase III or combined data, although trends in association were in the same direction for some SNPs (Supplementary Table 2; Supplementary Table 3). For SNPs at two loci (2q31, 8q24), there was strong support for associations in the Phase III data alone (p<0.001) and in the combined analyses of all cases or serous cases only (p≤5×10⁻⁸). At two other loci, 3q25 and 17q21, there was good evidence for associations in the combined analysis of serous cases, but these did not quite reach genome-wide significance (p=7.1×10⁻⁸ and p=1.4×10⁻⁷, respectively). No heterogeneity across studies was observed (p>0.05). These data are summarized in Table 2, Figure 1, and Supplementary Figure 1.
At the 8q24 locus, the minor allele of rs10088218 was associated with a decreased risk (Table 2), with the association being more significant for serous subtype (odds ratio, OR, 0.76; 95% confidence interval, 95%CI, 0.70-0.81; p=8.0×10⁻¹⁵); the additional 8q24 SNPs rs1516982 (r²=0.46) and rs10098821 (r²=0.80) were also highly significant (Supplementary Table 2). The minor allele of rs2072590 at 2q31 was associated with an increased risk which was primarily significant for serous subtype (OR 1.20, 95%CI 1.14-1.25, p=3.8×10⁻¹⁴); more significant results were also seen for serous subtype for minor alleles of rs2665390 at 3q25 (OR 1.24, 95%CI 1.15-1.34, p=7.1×10⁻⁸) and rs9303542 at 17q21 (OR 1.14, 95%CI 1.09-1.20, p=1.4×10⁻⁷).
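For readers who want to relate a reported per-allele OR and its 95%CI to a test statistic, the following is a minimal sketch that assumes a Wald-type interval that is symmetric on the log-odds scale. It is not the analysis performed in the study, and the function name is hypothetical. Because the published OR and interval are rounded to two decimal places, the p-value recovered this way is only approximate and will not reproduce the reported value exactly.

```python
import math


def wald_from_ci(or_est: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Back-calculate SE, z statistic and two-sided p from an OR and its CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    log_or = math.log(or_est)
    # Width of the interval on the log scale equals 2 * z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal reference distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Rounded values reported for rs10088218 in serous cases
se, z, p = wald_from_ci(0.76, 0.70, 0.81)
print(f"SE={se:.4f}, z={z:.2f}, p={p:.1e}")
```

The negative z statistic reflects the protective direction of the minor allele, and the recovered p-value sits well below the 5×10⁻⁸ genome-wide threshold used in the text.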
We observed significant heterogeneity for associations at the 8q24 and 2q31 loci when cases were stratified into four histological subtypes (p=2.9×10⁻⁴ for rs2072590, p=1.1×10⁻⁷ for rs10088218) (Table 2). For both loci, the trends in association for endometrioid and mucinous OC were in the same direction as serous cases, but there was no evidence of association at either locus in clear cell OC. To a lesser degree, a difference in risk by subtype was also observed at 3q25 (p=0.02 for rs2665390). We also examined SNP associations by age and family history of OC in first-degree relatives. No differences in risk were observed by age (Supplementary Table 4) and, among the four novel loci, no interactions by family history were observed (Supplementary Table 5).
We also tested the most significant risk-associated SNPs at 8q24, 2q31, 3q25 and 17q21 for association with overall survival after a diagnosis of OC, in ‘all’ case and ‘serous only’ case subgroups. For the latter group, we also performed an analysis adjusting for tumor stage and grade. We found borderline evidence for an association with survival at 17q21 in all cases (HR=1.06, 95% CI 1.00-1.13, P=0.04) and in serous only cases (HR=1.08, 95% CI 1.00-1.16, P=0.04). This effect was not greatly attenuated when adjusting for stage and grade amongst serous cases (HR=1.06, 95% CI 0.97-1.15, P=0.15). There was no evidence of association with survival at 8q24, 2q31 and 3q25.
We used Pupasuite (http://pupasuite.bioinfo.cipf.es/) for in silico analyses of risk-associated and strongly-correlated SNPs (r²>0.80) in the four loci but failed to find compelling evidence for any functional role (Supplementary Table 6). Genotyping of additional SNPs identified from HapMap and the 1000 Genomes Project (http://www.1000genomes.org/page.php) is required to fine-map these loci to identify both causal variants and target genes. Even so, known genomic architecture may provide some insights into functional mechanisms underlying OC susceptibility. For example, common variants at 8q24 have previously been shown to confer susceptibility to multiple cancer phenotypes including prostate, colorectal, breast, and bladder cancers²⁻⁷, and previous functional studies suggest that common variants may be associated with transcriptional regulation of MYC⁸,⁹. Most risk associations at 8q24 are located 5′ of MYC, but the three most significant SNPs for OC lie >700 kb 3′ of MYC in an apparent gene desert, suggesting either that MYC is not the target gene for OC, or possibly that variants in this region are also capable of distant regulation of MYC (Figure 1a).
rs2665390 at 3q25 is intronic to the TiPARP gene; there are no other candidate genes within 200 kb of this SNP (Figure 1b). TiPARP is a poly(ADP-ribose) polymerase (PARP)¹⁰ and is a particularly intriguing candidate gene for OC for two reasons. First, recent reports show that BRCA1/2-deficient cells survive by using PARP1 as an alternative DNA repair mechanism¹¹. This has led to the development of a novel therapy based on synthetic inhibition of PARP1 for breast and ovarian cancer patients carrying BRCA1 or BRCA2 mutations¹². Second, TiPARP is inducible by dioxin¹³, raising the hypothesis that this environmental contaminant may influence OC risk among susceptible women.
The 2q31 locus contains a family of homeobox (HOX) genes involved in regulating embryogenesis and organogenesis (Figure 1c). Altered expression of HOX genes has been reported in many cancers¹⁴,¹⁵. The OC risk-associated SNP rs2072590 lies in non-coding DNA downstream of HOXD3 and upstream of HOXD1, and it tags SNPs in the HOXD3 3′ UTR. Both genes have been implicated in neoplastic development¹⁶,¹⁷.
Finally, associated and correlated SNPs at 17q21 are intronic to SKAP1, which has strong homology to the SRC oncogene at the C-terminal end (Figure 1d). SKAP1 regulates mitotic progression, specifically at the transition of metaphase to anaphase¹⁸. In T-cells, constitutive expression of SKAP1 suppresses activation of RAS and RAF1, both of which have been implicated in the early stage development of OC¹⁹.
We evaluated risk-associated SNPs and candidate genes from these four novel loci, as well as from BNC2¹, for evidence of a functional role in OC development by examining genotype-associated gene expression for BNC2 (9p22), MYC (8q24), TiPARP (3q25), HOXD1 and HOXD3 (both at 2q31), and SKAP1 (17q21). We found no evidence of genotype-associated gene expression in an analysis of 48 normal primary human ovarian surface epithelial (POE) cell cultures (Supplementary Table 7), although power was limited by the relatively small number of samples. We also compared the expression of each of the five candidate genes between 48 POE and 24 OC cell lines and found highly-significant differences in gene expression between normal and cancer cells for BNC2, TiPARP, HOXD1, and SKAP1 (Figure 2; Supplementary Figure 2). These data suggest that BNC2 and TiPARP have a loss-of-function role, and that HOXD1 and SKAP1 have a gain-of-function role in OC development.
Gene expression was also examined in an in vitro model of OC initiation and progression established through oncogenic expression of MYC and mutant KRAS G12V in POE cells (Figure 2; Supplementary Table 9). We found that BNC2 and TiPARP expression decreases, and SKAP1 expression increases, with neoplastic development, consistent with the expression of these genes in POE versus OC cells. We also investigated gene expression for 216 primary serous OC samples analysed by The Cancer Genome Atlas Project (http://cancergenome.nih.gov). These data support frequent loss of BNC2 and TiPARP and gain of HOXD1 expression in OC development (Supplementary Figure 3).
In summary, we report two confirmed novel common low-penetrance OC susceptibility loci, and a further two candidate susceptibility loci, adding to a growing list that includes BNC2 and the 19p13 locus presented in an accompanying report by Bolton and colleagues. All six susceptibility loci suggest possible functional relevance of candidate genes that could plausibly be involved in OC development and aid in our understanding of disease aetiology. Strikingly, these data also suggest that common germline variation influences the clinico-pathological development of disease, as previously reported for the highly-penetrant (BRCA1 and BRCA2) germline variants²⁰.
As described previously¹, Phase I self-reported white European participants were from four collections of invasive epithelial OC cases and two collections of controls²²,²³ (Supplementary Table 1). Logistic regression and linear trend tests examined SNP associations (including imputed genotypes) using imputed genotype weights and ethnicity-related principal components. Phase II SNPs were selected based on Phase I ranked test statistics (all, serous) weighted by imputation status and accuracy (r²)²⁴ and design score, and participants were of European ancestry from 12 studies. Logistic regression adjusted for a HapMap-based ancestry score²⁵ and an ancestry-informative-marker-based principal component.
Sixteen studies contributed invasive epithelial OC cases and controls of European ancestry (Supplementary Table 1). Nine loci (30 SNPs, Supplementary Table 9) were selected based on test statistic rankings from combined Phase I and Phase II analyses (p<10⁻⁵ in all or serous cases). Eleven studies used Taqman (two SNPs) and Sequenom MassARRAY (remaining SNPs). Data for four studies were available from a genome-wide scan using the same Illumina Infinium 610K array that was used in Phase I, excluding samples with call rate <95%, >1% discordance, <80% European ancestry, or ambiguous gender. For one study, Illumina Infinium 317K data were used with imputation based on HapMap CEU data following 100 iterations in MACH version 1.0.16, excluding SNPs with r²<0.30. For one study, cases and ~2,900 controls were genotyped on a Centaurus (Nanogen) platform (excluding SNPs with >1.5% HapMap mismatch), and additional control data used the Illumina Infinium 317K and HumanCNV370-duo Bead Arrays; per-SNP call rate was >97%, and concordance was >98.5%. Logistic regression modeled the number of observed or imputed minor alleles; no confounding by age was observed. Combined analyses adjusted for study and tested heterogeneity with Cochran's Q statistic and I² values. Effect modification and differences in risk by subtype were tested with interaction terms and polytomous regression, respectively. Summary-level data were available for the ICE study; thus meta-analytic techniques fitted fixed and random effects²⁶,²⁷.
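As an illustration of the heterogeneity statistics named above, the sketch below pools per-study log odds ratios with inverse-variance (fixed-effect) weights and computes Cochran's Q and I². It is a generic textbook formulation under stated assumptions, not the study's analysis code; the function name and the example inputs are hypothetical.

```python
import math


def fixed_effect_meta(log_ors: list[float], ses: list[float]):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2 (%).

    log_ors: per-study log odds ratios; ses: their standard errors.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2: share of variation attributed to heterogeneity, floored at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical per-study estimates, for illustration only
pooled, pooled_se, q, i2 = fixed_effect_meta([0.10, 0.30], [0.10, 0.10])
print(f"pooled OR={math.exp(pooled):.2f}, Q={q:.2f}, I2={i2:.0f}%")
```

Under the null hypothesis of homogeneity, Q is referred to a chi-square distribution with (number of studies − 1) degrees of freedom; a random-effects model would add a between-study variance term to each weight.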
The Ovarian Cancer Association Consortium (OCAC) has established robust genotyping quality control (QC) guidelines to ensure accurate genotyping, particularly across multiple studies. Data included must pass the following QC criteria: (1) > 3% duplicate samples included in each study; (2) samples from cases and controls mixed on 384-well plates; (3) samples that consistently fail (e.g. for >20% of all SNPs) are removed; (4) genotype data for a SNP are removed for any individual 384-well plate with a call rate < 90%, and if > 25% of plates from any site are excluded for this reason then all the data from that site are excluded; (5) the overall call rate for a SNP for each study is > 95%, calculated after ineligible samples and plates are excluded; (6) genotyping concordance rates for the duplicates are > 98%; (7) the Hardy-Weinberg equilibrium (HWE) P value in White European controls is ≥ 0.005. If the HWE P value is between 0.005 and 0.05, the genotype clustering quality is reviewed by members of the OCAC genotyping committee (PDP, GC-T, SJR, HS) before inclusion. In addition, as part of overall QC, genotyping consistency across labs is evaluated by genotyping a panel of CEPH-Utah trios including 90 individual DNA samples, 5 duplicate samples and 1 negative control (http://ccr.coriell.org/Sections/Search/Panel_Detail.aspx?PgId=202&Ref=HAPMAPPT01). Genotyping concordance between centres has to be > 98% in order for the genotype data to be included. Genotyping QC for Phase I and Phase II of the GWAS has been described previously¹.
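The per-SNP criteria above can be expressed as a simple decision rule. The sketch below is an illustration only: the text does not specify which HWE test was used, so a one-degree-of-freedom Pearson chi-square test on control genotype counts is assumed, and the function names and the three-level status are hypothetical conveniences rather than OCAC software.

```python
import math


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Pearson chi-square (1 df) test of Hardy-Weinberg equilibrium.

    Assumed test; the text does not say which HWE test was applied.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def qc_status(call_rate: float, dup_concordance: float, hwe_p: float) -> str:
    """Apply thresholds (5)-(7) from the text to one SNP in one study."""
    if call_rate <= 0.95 or dup_concordance <= 0.98 or hwe_p < 0.005:
        return "fail"
    if hwe_p < 0.05:
        return "review"  # cluster plots inspected before inclusion
    return "pass"


# Hypothetical control genotype counts, for illustration only
print(qc_status(0.99, 0.999, hwe_p_value(250, 500, 250)))
```

Plate-level and site-level exclusions (criteria 3 and 4) operate on sets of samples rather than single SNPs and are outside the scope of this sketch.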
In Phase I, 507,094 out of 540,573 SNPs (94%) passed QC. In both Phase I and Phase II, duplicate concordance rates were 99.9%. For Phase III, the data from the five studies using GWAS data met the following overall QC criteria: (1) Caucasian and greater than 80% European ancestry; (2) sample call rate ≥95%; (3) SNP call rate ≥95%; (4) 100% concordance of duplicates (81 replicate pairs); (5) HWE for each SNP P ≥ 10⁻⁴. The Sequenom and Taqman data met the OCAC criteria described above. The results from Phase III genotype QC for each of the 30 SNPs are summarized in Supplementary Table 10.
Among the study participants, 7,222 cases had survival time information available and 2,791 died within five years after diagnosis. The effect of the 8q24, 2q31, 3q25 and 17q21 loci on time to all-cause mortality after epithelial OC (EOC) diagnosis was assessed using Cox regression stratified by study and modeling the per-allele effect as log-additive. Because the EOC cases showed a variable time from diagnosis to study entry, we allowed for left truncation with time at risk starting on date of diagnosis and time under observation beginning at the time of study entry. The analysis was right-censored at 5 years after diagnosis in order to reduce the number of non-EOC-related deaths.
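The handling of left truncation and censoring described above determines who counts as at risk at each event time. The following sketch illustrates that bookkeeping only; it does not fit a Cox model. All names are hypothetical, times are measured in years since diagnosis, and dropping cases who entered the study more than 5 years after diagnosis is an assumption about how the 5-year censoring interacts with delayed entry.

```python
from typing import Optional

Record = tuple[float, float, bool]  # (entry, exit, died) in years since diagnosis


def prepare(entry_years: float, exit_years: float, died: bool,
            horizon: float = 5.0) -> Optional[Record]:
    """Apply delayed entry and right-censoring at `horizon` to one case."""
    if entry_years >= horizon or exit_years <= entry_years:
        return None  # no time under observation within the window (assumption)
    if exit_years > horizon:
        return (entry_years, horizon, False)  # censored at 5 years
    return (entry_years, exit_years, died)


def risk_set(records: list[Record], t: float) -> list[int]:
    """Indices of cases under observation just before time t."""
    return [i for i, (entry, exit_, _) in enumerate(records)
            if entry < t <= exit_]


# Hypothetical cases, for illustration only
cases = [prepare(0.5, 3.0, True), prepare(2.0, 7.0, True), prepare(0.0, 1.0, True)]
records = [r for r in cases if r is not None]
print(risk_set(records, 1.0), risk_set(records, 3.0))
```

A case who entered the study two years after diagnosis contributes nothing to risk sets before year two, which avoids counting survival time that was guaranteed by the act of enrolment.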
Normal POE cell lines were established from brushings of histologically-confirmed disease-free ovaries from total hysterectomies at University College London Hospital, UK; short-term cultures were established as previously described²⁸. The non-neoplastic status and epithelial (non-fibroblastic) nature of cells were confirmed by staining for CA125, CK7, FVIII, and FSP. RNA was extracted from POE and OC cell lines (Supplementary Table 11) using RNeasy Mini Kits (Qiagen). Reverse transcribed (RT) RNA was analysed for expression by semi-quantitative real-time PCR using the Applied Biosystems 7900HT genetic analyzer. Gene expression was normalized against the endogenous controls glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and β-actin. Real-time expression data were analysed using the comparative delta-delta Ct (ΔΔCt) method. Expression values of all cell lines were generated relative to either the lowest or highest expression of a POE cell line, normalized against GAPDH and β-actin. Differences in the relative expression of each candidate gene between OC and POE cell lines were assessed using a nonparametric Wilcoxon rank-sum test. For allele-specific expression analysis, expression was calculated relative to the average expression of common homozygotes for each SNP normalized against the expression of the endogenous control genes; linear regression and Wilcoxon rank-sum tests assessed differences in expression across genotypes.
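The comparative Ct calculation referred to above can be sketched as follows. This is the standard form of the method and assumes roughly 100% amplification efficiency; averaging the Ct values of the two endogenous controls is an assumption, since the text does not say how GAPDH and β-actin were combined. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_refs: list[float],
                        cal_ct_target: float, cal_ct_refs: list[float]) -> float:
    """Relative expression by the comparative Ct method: 2 ** (-ddCt).

    ct_target / ct_refs: target and reference-gene Ct values for the sample.
    cal_*: the same quantities for the calibrator (e.g. a chosen POE line).
    """
    # dCt: target Ct minus the mean Ct of the endogenous controls (assumption)
    d_sample = ct_target - sum(ct_refs) / len(ct_refs)
    d_cal = cal_ct_target - sum(cal_ct_refs) / len(cal_ct_refs)
    dd_ct = d_sample - d_cal
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the sample crosses threshold two cycles later than
# the calibrator after normalization, i.e. about one quarter the expression.
print(relative_expression(27.0, [19.0, 19.0], 25.0, [19.0, 19.0]))
```

Choosing the POE line with the lowest (or highest) expression as calibrator, as described in the text, simply fixes which sample takes the value 1.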
Methods used for the immortalization of ovarian epithelial cells and over-expression of MYC (IOEcmyc) are described elsewhere²⁹. All cell lines were grown in NOSE-CM²⁸. Using FuGene6 (Roche), IOEcmyc cell lines were transfected with pcDNA3.1.neo.KRASG12V (Addgene) to create cell lines stably expressing a mutant form of KRAS (KRASG12V). Mutant allele expression was confirmed by RFLP-PCR. Anchorage-independent growth assays were performed as previously described³⁰. To test invasiveness, 0.125×10⁶ cells were resuspended in serum-free medium and added to rehydrated invasion membranes (Millipore) for 24 hours. 10% serum (Invitrogen) was added to the lower chamber as a chemo-attractant. Invaded cells were lysed, stained with a fluorimetric dye, and analysed on a Varioskan plate reader (Thermo). To culture cells in 3D, tissue culture plastics were coated twice with 1.5% polyHEMA dissolved in 95% ethanol, and cells were cultured for 14 days. For immunohistochemistry, spheroids were fixed in neutral buffered formalin (VWR), processed into paraffin and stained for MIB1 or WT1 using standard techniques. For microarray analyses, RNA was extracted from spheroids using the RNeasy kit (Qiagen), and experiments used the Illumina HT-12 BeadChip platform (Illumina).
We thank all the individuals who took part in this study. We thank all the researchers, clinicians and administrative staff who have enabled the many studies contributing to this work. In particular we thank A. Ryan and J. Ford (UKO); J. Morrison, P. Harrington and the SEARCH team (SEA); U. Eilber and T. Koehler (GER); D. Bowtell, A. deFazio, D. Gertig, A. Green, P. Parsons, N. Hayward, D. Whiteman (AUS); L. Gacucova (HMO); S. Haubold, P. Schürmann, F. Kramer, W. Zheng, T.-W. Park-Simon, K. Beer-Grondke and D. Schmidt (HJO); L. Brinton, M. Sherman, A. Hutchinson, N. Szeszenia-Dabrowska, B. Peplonska, W. Zatonski, A. Soni, P. Chao, M. Stagner (POL1). The genotyping and data analysis for this study were supported by a project grant from Cancer Research UK. We acknowledge the computational resources provided by the University of Cambridge (CamGrid). This study makes use of data generated by the Wellcome Trust Case-Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113. The Ovarian Cancer Association Consortium is supported by a grant from the Ovarian Cancer Research Fund thanks to donations by the family and friends of Kathryn Sladek Smith.
The MAL study is supported by grants from Mermaid 1, The Danish Cancer Society and National Cancer Institute, Bethesda, USA (R01-CA-61107). The MAY study and Phase III/combined analyses were supported by the National Institutes of Health, National Cancer Institute grants R01-CA-122443 and funding from the Mayo Foundation. The PBCS was funded by Intramural Research Funds of the National Cancer Institute, Department of Health and Human Services, USA. The Fox Chase Cancer Center ovarian cancer study, part of the NCI collaboration, is supported by an Ovarian Cancer SPORE (P50 CA083638). The NCO study is supported by the National Institutes of Health, National Cancer Institute grant R01-CA-76016. The SEA study is funded by a programme grant from Cancer Research UK; we thank the SEARCH team and the Eastern Cancer Registration and Information Centre for patient recruitment. The TBO study was supported by the National Institutes of Health (R01-CA106414); American Cancer Society (CRTG-00-196-01-CCE); Advanced Cancer Detection Center Grant, Department of Defense (DAMD17-98-1-8659). The TOR study was supported by grants from the Canadian Institutes for Health Research, the National Cancer Institute of Canada with funds provided by the Canadian Cancer Society, and the National Institutes of Health (R01-CA-63682 and R01-CA-63678). Additional support for the TOR, NCO, MAY, TBO, and NCI studies was provided by R01-CA-114343. The UCI study is supported by the National Institutes of Health, National Cancer Institute grants CA-58860, CA-92044 and the Lon V Smith Foundation grant LVS-39420. The UKO study is supported by funding from Cancer Research UK, the Eve Appeal, and the OAK Foundation; some of this work was undertaken at UCLH/UCL who received a proportion of funding from the Department of Health's NIHR Biomedical Research Centre funding scheme; and we particularly thank Ian Jacobs, Eva Wozniak, Andy Ryan, Jeremy Ford and Nayala Balogun for their contribution to the study. 
The AUS study is supported by the National Health and Medical Research Council of Australia (199600), U.S. Army Medical Research and Materiel Command under DAMD17-01-1-0729 (Award no. W81XWH-06-1-0220), and the Cancer Council Tasmania and Cancer Foundation of Western Australia; GCT and PW are Research Fellows of the National Health and Medical Research Council; the Australian Ovarian Cancer Study (AOCS) Management Group (D Bowtell, G Chenevix-Trench, A deFazio, D Gertig, A Green, and PM Webb) gratefully acknowledges the contribution of all the clinical and scientific collaborators (see http://www.aocstudy.org/); the Australian Cancer Study (ACS) Management Group comprises A Green, P Parsons, N Hayward, PM Webb, and D Whiteman. The BAV study is supported by the ELAN Foundation and Erlangen University Hospital. The BEL study is supported by the National Cancer Plan - Action 29 for the support of Translational Research. The DOV study (Seattle Diseases of the Ovary) was supported by US NIH grants R01-CA-112523 and R01-CA-87538. The GER study was supported by the German Federal Ministry of Education and Research, Programme of Clinical Biomedical Research (01 GB 9401); genotyping in part by the state of Baden-Württemberg through the Medical Faculty, University of Ulm (P.685); and data management by the German Cancer Research Center. The HAW study was supported by US Public Health Service grant R01-CA-58598 and contracts N01-CN-55424, N01-PC-67001, and N01-PC-35137 from the National Cancer Institute, NIH, Department of Health and Human Services. Funding for the USC study was received from the California Cancer Research Program grants 00-01389V-20170 and 2110200, U.S. Public Health Service grants CA14089, CA17054, CA61132, CA63464, N01-PC-67010 and R03-CA113148, and California Department of Health Services sub-contract 050-E8709 as part of its statewide cancer reporting program (University of Southern California).
The HJO study gratefully acknowledges the contribution of Drs. Frauke Kramer and Wen Zheng to the recruitment of patients at Hannover Medical School. The HMO study gratefully acknowledges the help of Lena Gacucova in sample preparation. The HOC study has been financially supported by the Helsinki University Central Hospital Research Fund, Academy of Finland and the Finnish Cancer Society.
Author Contributions: P.D.P.P., S.A.G., D.F.E. and A.B. designed the overall study and obtained financial support. P.D.P.P., S.A.G., S.J.R., and H.S. coordinated the studies used in Phase I and Phase II. H.S., G.C.-T., and E.L.G. coordinated Phase III. J.T. and H.S. conducted primary Phase I and Phase II analysis and Phase III SNP selection. H.S., J.B., and J.M.C. conducted Phase III genotyping, R.A.V. and M.C.L. conducted Phase III and combined data statistical analyses, and S.A.G., M.N., and K.L. designed and performed ‘functional’ analysis of candidate genes. E.L.G. and S.A.G. drafted the manuscript with substantial input from G.C-T., H.S., S.J.R., T.A.S., and P.D.P.P. The remaining authors coordinated contributing studies, and all authors contributed to the final draft.