Epithelial ovarian cancer (EOC) is the leading cause of death from gynecological malignancy in the developed world, accounting for 4 percent of deaths from cancer in women1. We performed a three-phase genome-wide association study of EOC survival in 8,951 EOC cases with available survival time data, and a parallel association analysis of EOC susceptibility. Two SNPs at 19p13.11, rs8170 and rs2363956, showed evidence of association with survival (overall P=5×10−4 and 6×10−4), but did not replicate in phase 3. However, the same two SNPs demonstrated genome-wide significance for risk of serous EOC (P=3×10−9 and 4×10−11 respectively). Expression analysis of candidate genes at this locus in ovarian tumors supported a role for the BRCA1 interacting gene C19orf62, also known as MERIT40, which contains rs8170, in EOC development.
Factors related to tumor aggressiveness, response to therapy, and underlying patient health are major predictors of survival in EOC. Germline genetic variation could impact every step in the process from the likelihood of secondary mutational events to host tissue tolerance of a metastatic lesion and treatment response. Evidence for the role of germline genetics comes from the observations that rare EOC predisposition-alleles of BRCA1 and BRCA2 are associated with improved overall survival following a diagnosis of EOC2, 3. Many studies have investigated the association between common genetic variation in candidate genes and EOC survival, but no positive findings have been convincingly replicated. GWAS have successfully identified common genetic variants influencing a spectrum of phenotypes4; but, to date, there are no published reports of GWAS for cancer survival outcomes.
We conducted a three-phase GWAS to identify SNPs associated with variation in the time from invasive EOC diagnosis to death (Supplementary tables 1 and 2). Genotyping was carried out in parallel with a multi-phase GWAS of EOC susceptibility5. Phase 1 comprised 1,768 cases with invasive EOC from four UK studies. Survival time data, obtained predominantly through routine notification of deaths by the Office for National Statistics, were available for 86 percent of cases. Controls were taken from two studies previously used as part of a GWAS for other phenotypes, the UK 1958 Birth Cohort and the UK Colorectal Control Cohort. Cases were genotyped using the Illumina Infinium 610K array and controls were genotyped using the similar 550k Illumina array5–7.
Association between SNP genotypes and survival was evaluated using a 1-degree-of-freedom trend test based on the Cox model (see methods). The 4,649 SNPs showing the strongest evidence for association with EOC survival were selected for genotyping in phase 2 together with 22,790 SNPs selected for the susceptibility study and 800 SNPs that reported on ancestry. Phase 2 comprised 4,238 cases and 4,810 controls from ten different studies across the USA, Europe and Australia; SNPs were genotyped using a custom Illumina iSelect array. The majority of cases (80 percent) had survival time data available through a variety of sources including death certificate flagging and medical records. Finally, we genotyped the three SNPs most strongly associated with survival (rs1125436, rs8170 and rs2363956) in a phase 3 analysis that included 4,501 cases (of which 4,076 had survival time data) and 6,021 controls from twenty-two additional studies that are part of the Ovarian Cancer Association Consortium (OCAC). The SNPs rs10426843 and rs8100241, which correlate perfectly with rs8170 and rs2363956, respectively, were included as proxies in the event of assay failure. We also genotyped thirty SNPs from the top nine loci from the analysis of susceptibility8. Genotyping of rs2363956 was poor for phase 3 studies genotyped by iPlex (see Methods and Supplementary note), so genotype data for the surrogate marker (rs8100241) were used in analyses.
Characteristics of the cases by study phase are shown in Supplementary table 1. Cases from all three phases provided 21,127 person-years of follow-up; 3,358 deaths occurred within five years following diagnosis of EOC in the combined dataset. There was little evidence of any general inflation of the survival test statistics in either phase 1 or phase 2 (estimated inflation factor phase 1 λ1000 =0.99, phase 2 λ1000 =0.99) (Supplementary figure 1). In the analysis of the combined phase 1 and 2 data the SNP most strongly associated with risk of death was rs1125436 at 13q32 (HR=1.22, 95% CI 1.12–1.32, P=3×10−6). There was no association of this SNP with EOC susceptibility (P=0.57). The next most strongly associated locus with survival was at 19p13, containing rs8170 (risk allele t) and rs2363956 (risk allele t) (HR = 1.18, 95% CI 1.09–1.27, P=2×10−5, and HR = 1.13, 95% CI 1.06–1.21, P=2×10−4 respectively). Neither SNP reached the threshold of significance in phase 1 to be selected for phase 2 of the EOC susceptibility GWAS, but in the combined phase 1 and 2 data both showed some evidence for susceptibility to EOC (OR=1.15, 95% CI 1.08–1.23, P=7×10−6, and OR=1.08, 95% CI 1.03–1.14, P=2×10−3 respectively). This association was stronger among ovarian cancers with serous histology (OR=1.22, 95% CI 1.13–1.31, P=1×10−7, and OR=1.14 95% CI 1.07–1.21, P=2×10−5 respectively). These effects were similar in analyses unadjusted for population stratification by principal components (data not shown). Risk allele frequencies of these SNPs in cases and controls by study are shown in Supplementary table 3.
In the phase 3 data there was no evidence for the association of rs1125436, rs8170 or rs2363956 with survival time (P=0.12, 0.85 and 0.25 respectively) with the effect of rs1125436 in the opposite direction to phases 1 and 2 (data not shown). The direction of the survival effect was the same for rs8170 and rs2363956, with the effect size being larger in phase 1 compared to phase 2 and 3 (Supplementary figure 2b). In the combined analysis of all three phases, rs8170 and rs2363956 showed similar levels of association with survival (HR 1.11, 95% CI 1.04–1.17, P=5×10−4 and HR 1.09, 95% CI 1.04–1.14, P=6×10−4; Table 1). The association with survival was not attenuated after adjusting for tumor grade, tumor stage, age at diagnosis and histology.
The phase 3 data, however, provided strong support for the association of rs8170 and rs2363956 with EOC susceptibility (Table 1). The association was considerably stronger when the analysis was restricted to serous cases, and the association for both SNPs reached genome-wide significance in the combined data analysis of serous-only cases (P=3×10−9 and 4×10−11 respectively). These remained highly significant (P<10−9) after a conservative Bonferroni correction for three tests (all cases, serous cases, non-serous cases). There was little evidence of association with other histological subtypes (Table 2). No heterogeneity was seen in the OR of serous EOC risk or HR estimates for rs2363956 (Supplementary Figure 2a–b) or rs8170 (forest plots not shown) among studies for any phase. rs8170 and rs2363956 are separated by 4 kb and are weakly correlated (r2 = 0.23). In multivariate models, the associations with susceptibility to serous cancer and survival could not be fully explained by either SNP alone.
The SNP rs8170 localizes to C19orf62, also known as MERIT40, a gene with 5 distinct transcripts described to date. Depending on the alternative splice form, it is either synonymous (K279K) or non-synonymous (S281R). It may also act as an exonic splice enhancer (http://pupasuite.bioinfo.cipf.es/). rs2363956 is a non-synonymous SNP (W184L) in ANKLE1. Both amino acids are neutral and nonpolar, suggesting this is a conservative change. Three recent reports have described interactions between MERIT40 and a complex including BRCA1, RAP80, BRCC45 and CCDC98 (refs 9–11). MERIT40 appears to regulate the retention of BRCA1 at double strand DNA breaks and maintain stability of this complex at the sites of DNA damage. Our data suggesting that common genetic variants in MERIT40 may predispose women mainly to serous ovarian cancer are also consistent with a similar subtype specificity associated with inactivating germline BRCA1 mutations12.
Common genetic variants can influence the expression of target genes through cis- and trans-regulation13. Because rs8170 and rs2363956 in MERIT40 and ANKLE1 respectively are located in the coding regions of these genes, we were able to evaluate cis-regulating expression by looking at both genotype associated expression and differential allelic expression, in 48 normal primary ovarian epithelial (POE) cell lines. We found no evidence of cis-regulated expression using either approach, although the power of these analyses was limited by the small sample size (Supplementary table 4 and Supplementary figure 3).
Array comparative genomic hybridization (aCGH) analysis was used to evaluate genomic alterations at the 19p13.11 locus in 105 high-grade serous ovarian cancers. Forty-six percent of tumors exhibited copy number gain/amplification of the p-arm of chromosome 19, with a peak of amplification in the region containing MERIT40 and ANKLE1 (Figure 1b and Figure 1c). This suggests that target genes in this region are functionally activated during tumor development. We compared the expression of MERIT40 and ANKLE1 between 48 POE cell lines and 23 ovarian cancer (OC) cell lines. Consistent with aCGH data, MERIT40 was significantly over-expressed in OC cell lines compared to POE cell lines (P=5×10−9, Figure 1d), but there was no difference in expression of ANKLE1 (P=0.54, Figure 1e). The data from The Cancer Genome Atlas (TCGA) Pilot Project analysis of 216 serous ovarian tumors also suggest that the expression of MERIT40 (but not ANKLE1) is elevated in the majority of EOCs compared to normal tissues (Figure 1f).
The data suggesting a role for MERIT40 in EOC development need to be treated with caution. The risk-associated SNPs within MERIT40 and ANKLE1 may represent markers in linkage disequilibrium with the true functional variant(s) and target genes at this locus. Based on resequencing data from the 1000 genomes project (http://www.1000genomes.org/page.php) there are fifteen SNPs perfectly correlated with rs8170 and nine SNPs correlated with rs2363956. Thus, genotyping of additional SNPs will be required to fine map this region in order to nominate optimal variants to investigate function. The peak of DNA copy number gain identified by aCGH analysis in primary EOCs spans approximately 3.5Mb (nucl. 16390797–19830868; build v37) and contains 119 genes. Within this, a 330kb region defined by the haplotype block harboring rs8170 and rs2363956 contains 14 known genes (Supplementary table 5). Gene expression data from TCGA suggest other candidate genes that could be the targets of amplification at this locus, some of which are plausible cancer-associated genes. These include NR2F6 (or EAR-2)14, which may be involved in regulation of disease progression in breast cancer, and TMEM16H, one of a family of trans-membrane proteins that may be over-expressed in several cancers15.
We can only speculate on the possible functional role of MERIT40 in the initiation and development of serous subtype EOCs, if it is the target susceptibility gene at the 19p13 locus. Any hypotheses would need to consider the apparent paradox suggested by our data that MERIT40 is over-expressed in EOCs, while BRCA1 is expected to show loss of function in its role in the double strand break (DSB) repair pathway. MERIT40 appears to act downstream of poly-ubiquitination of DNA (which occurs at all DSBs), and upstream of BRCA1 (ref. 10). MERIT40 is necessary for BRCA1 assembly at γH2AX foci although BRCA1 is not usually a stable member of this complex9–11. Over-expression of MERIT40 may ectopically stabilize mutant BRCA1 protein into the assembled complex. Since MERIT40 knockdown makes cells more sensitive to ionizing radiation10, 11, MERIT40 over-expression could have the opposite effect, protecting cells with dysfunctional BRCA1 and DSB repair activity and enabling them to tolerate more DNA damage.
The association with survival was only apparent in phases 1 and 2, and did not reach genome-wide significance overall. The clear evidence of association with serous EOC risk suggests that the survival association could still be of interest, but further study will be required to clarify the magnitude of the association. We would not have detected the association at 19p13 with risk of EOC if the SNPs had not been selected for phase 2 as a result of their association with survival time. The failure to detect an association with susceptibility in phase 1 may simply be the play of chance: the power in phase 1 to detect an odds ratio of 1.12 (combined data estimate) at the P-value threshold required for a SNP to be taken into phase 2 was 50 percent. It may also have been the result of other factors such as disease heterogeneity: the association was stronger for serous EOC, and our initial analysis of phase 1 data (for selection of SNPs for phase 2) was based on cases of all histological types. Furthermore, the majority of the phase 1 cases were prevalent and, if the association of this locus with survival time is real (but small), this would bias the susceptibility association towards the null.
These data add to a growing list of genetic loci with common susceptibility alleles for EOC. Our data suggesting that the BRCA1 interacting gene MERIT40 may be the gene underlying the genetic associations add weight to the significance of the 19p13 locus for susceptibility in EOC. This is further emphasized by the finding of Antoniou et al. in the accompanying article16 that genetic variants in this region appear to modify the risks of breast cancer in individuals carrying germline BRCA1 mutations.
The ovarian cancer case-control studies that participated in phases 1, 2 and 3 are summarized in Supplementary table 2. Phase 1 comprised invasive epithelial ovarian cancer cases from the UK and genotype data from UK controls drawn from GWAS of other phenotypes. Phase 2 comprised ten case-control studies from the Ovarian Cancer Association Consortium. Phase 3 comprised 16 case-control studies from the OCAC and five case-only studies. All studies provided data on age at diagnosis and date of blood draw, self-reported ethnic group and histological subtype. Tumor histology was collected for all cases based on pathology reports or central pathological review and was categorized according to the World Health Organization classification system for ovarian cancer17.
Genotyping for phase 1 cases was conducted using the Illumina Infinium 610K array at Illumina Corporation. Existing data from two sets of controls, genotyped on the Infinium 550k array, were used in phase 1 analyses: the Wellcome Trust Case-Control Consortium 1958 birth cohort and a national colorectal control study. All cases were from the UK and confirmed as invasive epithelial ovarian cancer. Genotyping of the phase 2 studies was conducted using a custom Illumina iSelect array at Illumina Corporation.
For four phase 3 studies (TOR, NCO, MAY, MOF) genotype data were available from an independent, ongoing GWAS study that also used the Illumina Infinium 610K platform. Genotyping and QC were performed at the Mayo Clinic genotyping shared resource. deCODE ovarian cancer cases were assayed by single SNP genotyping on the Centaurus (Nanogen) platform and controls were from a GWAS using the Human Hap300 and HumanCNV370-duo Bead Arrays. The SNP rs2363956 was genotyped using ABI Taqman for five of the phase 3 case-only studies (LAX, PVD, SCO, YAL and additional cases from HOP). The remaining phase 3 studies were genotyped using Sequenom iPlex. Quality control procedures for all study phases are described in the supplementary materials.
We used the program LAMP (ref. 18) to assign intercontinental ancestry to phase 1 samples based on the HapMap genotype frequency data for European, African and Asian populations (release no. 22). LAMP was also used to assign ancestry to the phase 2 samples using the HapMap data on European (CEU), African (ASW), East Asian (JPT-CHB-CHD), Mexican (MEX) and Indian (GIH) populations. Subjects with less than 90 percent European ancestry were excluded. For both the phase 1 and 2 samples, we used ancestry informative markers (AIMs) to calculate principal components for the subjects of European ancestry. The first principal component explained 0.42 percent of the variability and was included as a covariate in subsequent association analyses. Subsequent principal components were not included as they explained less variability and there was little difference in their eigenvalues. In the phase 3 dataset, we excluded samples if their self-reported ethnicity was other than non-Hispanic white.
We imputed missing genotype data for all the common variants in the HapMap for phase 1 samples in order to increase genome coverage. We used an in-house method that combines the features of fastPHASE (ref. 19) and IMPUTE (ref. 20) to impute the ungenotyped or missing SNPs, using the phase 2 HapMap data (CEU), which contains phased haplotypes for 60 individuals on 2.5 million SNPs. For each imputed genotype the expected number of minor alleles carried was estimated (weights). Genotyped SNPs were assigned weights of 0, 1 or 2 (actual number of minor alleles carried). We estimated the accuracy of imputation by calculating the estimated r2 between the imputed and actual SNP. SNPs with r2 < 0.64 were excluded (n = 152,401), leaving a total of 2,563,972 SNPs for phase 1 analysis.
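The weighting scheme described above can be illustrated with a minimal Python sketch. It assumes that the imputation step yields posterior probabilities for the heterozygous and minor-homozygous genotypes; the function names and this input layout are hypothetical and are not taken from the in-house method.

```python
def expected_minor_allele_count(p_het, p_hom_minor):
    """Expected number of minor alleles (the 'weight') for one genotype.

    p_het and p_hom_minor are assumed posterior probabilities of carrying
    one and two minor alleles; the major-homozygote probability is implied.
    """
    if p_het < 0 or p_hom_minor < 0 or p_het + p_hom_minor > 1 + 1e-9:
        raise ValueError("genotype probabilities must be valid")
    return 1.0 * p_het + 2.0 * p_hom_minor


def keep_snp(estimated_r2, threshold=0.64):
    """Apply the stated filter: SNPs with r2 below 0.64 are excluded."""
    return estimated_r2 >= threshold
```

A directly genotyped SNP is the special case in which one genotype has probability 1, giving a weight of exactly 0, 1 or 2.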
In the analysis of the phase 1 and phase 2 data the effect of each SNP on time to all-cause mortality after EOC diagnosis was assessed using Cox regression stratified by study and modeling the per-allele effect as log-additive. The Cox proportional hazards assumption was evaluated by inspection of standard log-log plots. Individual level data for the deCODE study were not available and so for the analysis of the phase 3 data and for the combined analyses, each study was analyzed separately and the results pooled by estimating an average of the study-specific log_e hazard ratios with each weighted by the inverse of its variance. Because the EOC cases showed a variable time from diagnosis to study entry, we allowed for left truncation with time at risk starting on date of diagnosis and time under observation beginning at the time of study entry. This generates an unbiased estimate of the hazard ratio provided the Cox proportional hazards assumption is correct21. The analysis of phase 1 data was right-censored at 10 years after EOC diagnosis. In subsequent analyses, we right-censored at 5 years after diagnosis in order to reduce the number of non-EOC related deaths. We used logistic regression to test for association between genotype and case-control status. For phase 1 and 2 data we adjusted for study phase and study by including phase- and study-specific indicators in the model. For phase 3 data we analyzed each study separately and then pooled the results using an inverse-variance weighted average of the study-specific log_e odds ratios.
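The inverse-variance pooling of study-specific estimates described above can be sketched as follows. This is a generic fixed-effect calculation written for illustration, assuming each study contributes a natural-log hazard ratio (or odds ratio) and its standard error; the function names are hypothetical and this is not the analysis code used in the study.

```python
import math


def pool_inverse_variance(log_estimates, standard_errors):
    """Fixed-effect pooled log estimate and its standard error.

    Each study is weighted by the inverse of its variance (1 / SE^2).
    """
    if len(log_estimates) != len(standard_errors) or not log_estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def ratio_with_ci(pooled_log, pooled_se, z=1.96):
    """Back-transform to the ratio scale with an approximate 95% CI."""
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )
```

Studies with smaller standard errors pull the pooled estimate towards their own value, which is why the larger studies dominate the combined result.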
aCGH analysis was performed using a whole genome tiling path microarray (http://www.instituteforwomenshealth.ucl.ac.uk/academic_research/gynaecologicalcancer/trl/arrayfacility) consisting of 32,450 BAC clones22. Regions containing >80 percent neoplastic cells were micro-dissected from formalin-fixed paraffin-embedded tumor tissue sections, and DNA was extracted by proteinase K digestion. Tumor DNA and matching peripheral blood DNA were amplified using the GenomePlex whole genome amplification kit (Sigma) and fluorescently labelled using the BioPrime Total Kit (Invitrogen). Microarrays were co-hybridised with the labelled DNA as described previously23, scanned using a ScanArray Express laser scanner (Perkin Elmer), and spot signal intensities extracted using BlueFuse (BlueGnome). Raw data were analysed using R and the Bioconductor packages MANOR, LIMMA, DNAcopy and CGHcall as described elsewhere. BAC clone locations were derived from NCBI Human Genome build 36 (HG18).
Normal, primary ovarian epithelial (POE) cell lines were established from brushings of normal ovaries of patients undergoing total hysterectomies at University College London Hospital (UCLH), UK. All ovaries were histologically confirmed as free of disease. UCLH ethical committee approval was given for the collection and analysis of all patient samples. Short-term cultures of POE cells were established as previously described24. The non-neoplastic status and epithelial (non-fibroblastic) nature of cells was confirmed by staining for the markers CA125, CK18, FVIII and FSP. RNA was extracted from POE and OC cell lines (Supplementary table 4) using RNeasy Mini Kits (Qiagen). Reverse transcribed (RT) RNA was analyzed for candidate gene expression by semi-quantitative real-time PCR using the Applied Biosystems 7900HT genetic analyzer. Gene expression was normalized against two endogenous controls, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and β-actin. Real-time expression data were analyzed using the comparative Delta-Delta Ct method. The expression values given for genes in all cell lines are relative to either the lowest or highest expression of a POE cell line, normalized against GAPDH and β-actin. Differences in the relative expression of each candidate gene between EOC and POE cell lines were assessed using the nonparametric two-sided Wilcoxon rank sum test in R. For allele-specific expression analysis, gene expression was calculated relative to the average expression of the common homozygotes for each candidate SNP, normalized against the expression of the endogenous control genes. Wilcoxon rank sum tests were used to assess the difference in expression between common homozygotes, heterozygotes and rare homozygotes.
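The comparative Delta-Delta Ct calculation can be sketched in Python as below. The sketch assumes roughly 100 percent amplification efficiency (a doubling per cycle) and averages the Ct values of the two endogenous controls; both choices, and the function names, are illustrative assumptions rather than details reported here.

```python
def delta_ct(ct_target, ct_references):
    """Target Ct minus the mean Ct of the endogenous control genes."""
    if not ct_references:
        raise ValueError("at least one reference Ct is required")
    return ct_target - sum(ct_references) / len(ct_references)


def relative_expression(sample, calibrator):
    """Fold change of a sample relative to a calibrator cell line.

    Each argument is a (ct_target, [ct_reference, ...]) pair. The fold
    change is 2 ** -(delta_ct(sample) - delta_ct(calibrator)).
    """
    ddct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-ddct)


# Illustrative Ct values (made up): target gene plus GAPDH and beta-actin.
oc_line = (24.0, [18.0, 20.0])
poe_calibrator = (26.0, [18.0, 20.0])
fold_change = relative_expression(oc_line, poe_calibrator)
```

In this made-up example the target crosses threshold two cycles earlier in the sample than in the calibrator at equal reference levels, which corresponds to a four-fold higher relative expression.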
For each SNP, 8 ng of cDNA from the heterozygous POE cell lines (10 for rs8170 and 15 for rs2363956) was analyzed by real-time RT-PCR using Taqman custom genotyping assays (Applied Biosystems). Genomic DNA extracted from lymphocytes from two heterozygous individuals was used for a standard curve to adjust for dye bias, as there would be equal copies of each allele. All samples were analyzed in triplicate. Differential allelic expression was determined from the log2 ratio of the VIC allele / FAM allele with a cut-off of log2(1.20)=0.263 as described previously13.
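The allelic ratio and its cut-off can be sketched as follows. This simplified version corrects for dye bias by subtracting the log2 ratio observed in heterozygous genomic DNA, where both alleles are present in equal copies, instead of fitting a full standard curve, and it applies the cut-off symmetrically to the absolute log2 ratio. Both simplifications and the function names are assumptions made for illustration.

```python
import math

# Cut-off stated in the text: log2(1.20), approximately 0.263.
CUTOFF = math.log2(1.20)


def log2_allelic_ratio(vic_cdna, fam_cdna, vic_gdna, fam_gdna):
    """log2(VIC/FAM) in cDNA, corrected by the same ratio in genomic DNA.

    Heterozygous genomic DNA carries equal copies of each allele, so any
    departure from a ratio of 1 there is attributed to dye bias.
    """
    return math.log2(vic_cdna / fam_cdna) - math.log2(vic_gdna / fam_gdna)


def shows_differential_allelic_expression(log2_ratio, cutoff=CUTOFF):
    """True when the corrected ratio exceeds the cut-off in either direction."""
    return abs(log2_ratio) > cutoff
```

With this correction, a cDNA sample whose VIC/FAM signal ratio matches that of genomic DNA yields a log2 ratio of zero and is not flagged.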
We thank all the individuals who took part in this study and all the researchers, clinicians and administrative staff who have enabled the many studies contributing to this work. In particular we thank A. Ryan and J. Ford (UKOPS); J. Morrison, P. Harrington and the SEARCH team (SEA); U. Eilber and T. Koehler (GER); D. Bowtell, A. deFazio, D. Gertig, A. Green (AOCS); A. Green, P. Parsons, N. Hayward, D. Whiteman (ACS); L. Gacucova (HMOCS); S. Haubold, P. Schürmann, F. Kramer, W. Zheng, T.-W. Park-Simon, K. Beer-Grondke and D. Schmidt (HJOCS); L. Brinton, M. Sherman, A. Hutchinson, N. Szeszenia-Dabrowska, B. Peplonska, W. Zatonski, A. Soni, P. Chao, M. Stagner (POL2).
The genotyping and data analysis for this study were supported by a project grant from Cancer Research UK. We acknowledge the computational resources provided by the University of Cambridge (CamGrid). This study makes use of data generated by the Wellcome Trust Case-Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113. The Ovarian Cancer Association Consortium is supported by a grant from the Ovarian Cancer Research Fund thanks to donations by the family and friends of Kathryn Sladek Smith. The results published here are in part based upon data generated by The Cancer Genome Atlas Pilot Project established by the NCI and NHGRI. Information about TCGA and the investigators and institutions who constitute the TCGA research network can be found at http://cancergenome.nih.gov. S.J.R. is supported by the Mermaid/Eve Appeal, G.C.-T. and P.M.W. by the NHMRC, and P.A.F. by the Deutsche Krebshilfe; M.G. acknowledges NHS funding to the NIHR Biomedical Research Centre at the Royal Marsden Hospital; and D.F.E. is a Principal Research Fellow of Cancer Research UK.
Funding of the constituent studies was provided by the Danish Cancer Society, the Ovarian Cancer Research Fund (PPD/RPCI.07), the Roswell Park Cancer Institute Alliance Foundation, the US National Cancer Institute (CA58860, CA92044, P50CA105009, R01CA122443, R01CA126841-01, R01CA16056, R01CA61107, R01CA71766, R01CA054419, R01CA114343, R01CA87538, R01CA112523, R01CA58598, N01CN55424, N01PC35137, and Intramural research funds), the US Army Medical Research and Materiel Command (DAMD17-01-1-0729), Cancer Council Victoria, Cancer Council Queensland, Cancer Council New South Wales, Cancer Council South Australia, Cancer Council Tasmania and Cancer Foundation of Western Australia, the National Health and Medical Research Council of Australia (199600 and 400281), the German Federal Ministry of Education and Research Programme of Clinical Biomedical Research (01 GB 9401), the state of Baden-Württemberg through the Medical Faculty of the University of Ulm (P.685), the Mayo Foundation, the Lon V. Smith Foundation (LVS-39420), the Oak Foundation, the University College Hospital National Institute for Health Research Biomedical Research Centre and the Royal Marsden Hospital Biomedical Research Centre.
Author contributions: P.D.P.P., S.A.G., and D.F.E. designed the overall study and obtained financial support. P.D.P.P., S.A.G., S.J.R., and H.S. coordinated the studies used in phase 1 and phase 2. H.S., K.L.B., G.C.-T., and E.L.G. coordinated phase 3. J.T. and K.L.B. conducted primary phase 1 and phase 2 analysis and phase 3 SNP selection; K.L.B. conducted phase 3 and combined data statistical analyses; H.S., J.B., and J.M.C. conducted phase 3 genotyping. S.A.G., M.N., C.J. and T.S. designed and performed the functional analyses. The remaining authors coordinated contributing studies. K.L.B. and P.D.P.P. drafted the manuscript with substantial input from S.A.G., H.S., and S.J.R. All authors contributed to the final draft.