|Home | About | Journals | Submit | Contact Us | Français|
This study assesses the ability of multidrug resistance (MDR)-associated gene expression patterns to predict survival in patients with newly diagnosed carcinoma of the ovary. The scope of this research differs substantially from that of previous reports, as a very large set of genes was evaluated whose expression has been shown to affect response to chemotherapy.
We applied a customized TaqMan Low Density Array, a highly sensitive and specific assay, to study the expression profiles of 380 MDR-linked genes in 80 tumor specimens collected at initial surgery to debulk primary serous carcinoma. The RNA expression profiles of these drug resistance genes were correlated with clinical outcomes.
Leave-one-out cross-validation was used to estimate the ability of MDR gene expression to predict survival. Although gene expression alone does not predict overall survival (P=0.06), four covariates (age, stage, CA125 level and surgical debulking) do (P=0.03). When gene expression was added to the covariates, we found an 11-gene signature that provides a major improvement in overall survival prediction (log-rank statistic P<0.003). The predictive power of this 11-gene signature was confirmed by dividing high and low risk patient groups, as defined by their clinical covariates, into four specific risk groups based on expression levels.
This study reveals an 11-gene signature that allows a more precise prognosis for patients with serous cancer of the ovary treated with carboplatin- and paclitaxel-based therapy. These 11 new targets offer opportunities for new therapies to improve clinical outcome in ovarian cancer.
Ovarian cancer is the fifth most deadly cancer in women and has the highest mortality rate of all gynecological cancers, with approximately 16,000 deaths per year in the US (1). This may be explained since more than 70% of women with ovarian cancer are diagnosed at stages III and IV according to the International Federation of Gynecology and Obstetrics (FIGO) (2). The initial management of woman with advanced stage ovarian cancer is cytoreductive surgery, followed by platinum-based chemotherapy typically combined with paclitaxel (3). Although more than 50% of advanced stage patients will achieve a complete clinical response, almost all of these women still harbor subclinical disease with less than 20% alive 5 years after their primary diagnosis (2, 4).
This may be explained by resistance to chemotherapeutic treatment, which can be attributed to a number of mechanisms including decreased drug uptake, activation of detoxifying systems, activation of DNA repair mechanisms, intracellular changes that raise the apoptotic threshold, alteration of target proteins, or increased drug efflux. These mechanisms can act individually or synergistically, leading to multidrug resistance (MDR), in which cells become resistant to a variety of structurally and mechanistically unrelated drugs (5). Although extensive research has characterized many of the mechanisms involved in MDR in vitro,(5) translating this information to the clinic still represents a major conceptual and technical challenge (6). For example, conflicting reports on the role of drug efflux transporters in the clinic may be in part explained by the inability of high-throughput gene expression profiling platforms to accurately detect individual genes in a highly homologous gene superfamily (7). We have recently demonstrated the superiority of TaqMan-based qRT-PCR over established technologies in assessing the expression profiles of highly homologous ABC transporter genes (8).
We then used the Taqman Low Density Array (TLDA), a medium throughput Taqman-based qRT-PCR assay, to evaluate the clinical relevance of 380 MDR-related genes, selected in the literature published over the past 30 years (9). Most of these genes have been reported to be involved in multidrug resistance in studies of cultured cancer cells, and may be involved in either intrinsic or acquired MDR, in many cancer types.
In this study, we tested the hypothesis that expression of some of these genes may contribute to the intrinsic drug resistance of ovarian cancer. The gene expression profiles obtained from untreated ovarian primary serous carcinoma samples from 80 patients were correlated with clinical outcomes. Our data reveal that the expression level of 11 genes associated with MDR define an intrinsic drug resistance signature that significantly improves prediction of overall survival in patients with ovarian cancer compared to predictions based only on clinical covariates. This suggests that these genes may be useful targets to inhibit or circumvent these resistance mechanisms.
The collection of tumors for research and specifically molecular analysis was first approved by the institutional review board of each of the participating cancer treatment centers and written informed consent was obtained from the patients. We then obtained 80 specimens of untreated ovarian primary serous carcinoma from the Massachusetts General Hospital Tissue Repository, the M.D. Anderson Cancer Center Tumor Bank, and the Cleveland Clinic (Table 1). Pathological analysis at the home institutions confirmed that each sample contained at least 75% cancer cells. Data on clinical outcomes were obtained from patients' records. Treatment response was divided into four categories: Complete Response (CR), Partial Response (PR), Stable Disease (SD) and Progressive Disease (PD). All the patients analyzed received the same carboplatin/taxol regimen. The time periods of sample collection were as follows: MDA: 1995–2003, MGH: 2002–2007, Cleveland: 2001–2005,
Detailed procedures for the preparation of total RNA, reverse transcription, the use of TaqMan Low Density Arrays, data import and preprocessing including the use of 18s as quality control and median normalization were previously reported (8, 10).
Data from TaqMan Low Density Arrays was collected for each of the 80 untreated ovarian primary serous carcinomas, and is available in Gene Expression Omnibus (GEO; accession number GSE30009, http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30009).
The objective of our statistical analysis was to correlate survival (overall and progression free) with gene expression to estimate if a gene expression signature might provide prognostic information over and above well validated clinical variables. This analysis was carried out using BRB ArrayTools, a microarray-data statistical analysis tool (http://linus.nci.nih.gov/BRB-ArrayTools.html).
Genes expressed in fewer than 50% of the samples were filtered out, resulting in a final number of 356 genes that were used in all subsequent analyses.
We compared the prediction for overall survival (OS) and progression free survival (PFS) using three different models: a) a model with gene expressions alone used as factors, b) model with clinical covariates as factors, and c) a model using gene expression and clinical covariates as factors. Prediction of risk for PFS and OS was done using the built-in risk prediction function in BRB Array Tools based on the supervised principal component method described in Bair and Tibshirani (11). For each of the three models, a Cox Proportional Hazards (CPH) model was used to fit a weighted sum of the factors to the log-hazard. For the clinical covariates Age, Surgical debulking, Stage and CA125 level were used as the factors. For gene expression, we considered those genes that had expression correlated with survival (OS or PFS) with P<0.01 using the logrank statistic. The first four principal components of these genes were used as the factors in the model.
To get an unbiased estimate of the predictive ability of each of the three models, we used cross-validation of the entire algorithm. This algorithm has been shown to be an efficient alternative to having a separate set of samples for validation (11), and yields an unbiased estimate of the prediction accuracy. For a review on this subject, see Simon et al. (12). One sample (out of 80) was left out (the testing set). Using the remaining samples (the training set), we fitted each of the three models. For the model with clinical covariates only, we fitted a CPH model with the four covariates as the factors. For the model with gene expression only, we looked at the individual correlation of each gene with survival using a CPH model and selected genes with P<0.01. These genes were used to compute the first four principal components which were used as factors to fit a CPH model. For the model with covariates and gene expression, we looked at the individual correlation of each gene with survival using a CPH model that includes the four covariates and selected genes with P<0.01. Then we computed the first four principal components of these genes. The four principal components, along with the four clinical covariates were used as factors to fit the CPH model.
Each model results in selection of different genes (since the dataset used is different in each). Each model also gives a weight or coefficient to the factors used for its creation (covariates and/or gene principal components). We use the coefficients to calculate an estimate of the log hazard for each sample in the training set. The median of these log hazard estimates was used as a threshold. Then, to predict the risk for the left out sample, we computed its estimated log-hazard using the covariates and principal components derived from the training set and then used the threshold to predict the sample as low or high risk. This was repeated for each of the samples omitted in turn.
This procedure divides the samples into training and testing sets at each iteration to ensure that the same samples are not used to both create and test the model. The entire algorithm, including selection of genes used in the models, is repeated each time, to ensure unbiased estimation.
For each sample, we calculated a low/high prediction of risk using either: a) the clinical covariates alone or b) gene expression alone or c) clinical covariates along with gene expression. Each method results in a log-rank statistic and the statistics for the two methods can be compared with each other to decide whether there is a statistically significant difference between the two. A statistically significant difference implies that the gene expression data add significantly to the risk prediction accuracy. The calculation of statistical significance for the difference was done using BRB ArrayTools by randomly shuffling the survival times to estimate the null distribution of the log-rank differences. A p-value for the observed difference was calculated using this null distribution. The clinical covariates used in our study were age, stage, CA-125 level (13) and residual tumor (optimal vs. sub-optimal debulking).
We also used a two-step method to divide the subjects into four risk groups. In the first step, the clinical covariates alone were used to divide the subjects into low and high risk groups. Then within each risk group, the members were further divided into two risk sub-groups (denoted low/high) using the sum of the gene expressions weighted by their CPH coefficient. The median of this sum was used as a threshold to further subdivide each group into high and low. This gives a four-level risk prediction consisting of high/high, high/low, low/high and low/low groups.
The gene expression profiles of the genes correlated to OS alone do not effectively predict the OS of women with ovarian cancer (P=0.06) (Fig. 1A). Four covariates (age, stage, CA-125 level and surgical debulking), however, are highly significant discriminators of OS (P=0.03) (Fig. 1B). A univariate analysis of the clinical covariates is shown in Supplementary Table 1, while Kaplan-Meier curves for CA125 and debulking covariates are provided in Supplementary Fig. 1. When gene expression data were added to the clinical covariates, we identified an 11-gene signature (Table 2) whose measured expression significantly improves the power of the covariates to predict poor survival (log-rank statistic P<0.003) (Fig. 1C). These 11 genes had P<0.01 in a Cox proportional hazards model (see Methods section on statistical analysis of data for thorough explanation of this model) including the four covariates. All the genes present on the array were tested, but only 11 genes were selected, based on their p-value using a CPH model. The other genes were not found to be statistically significantly correlated with overall survival.
We validated the gene signature using a leave-one-out cross-validation scheme. The algorithm used to test the predictive ability of the covariates or genes was repeated with each sample left out in turn (resulting in 80 iterations). A final model with all the samples was used to find the 11-gene signature. Table 2 shows the percentage of times each of the 11 genes were selected in the 80 iterations.
We also found that neither gene expression profiles alone (P=0.2) (Fig. 1D), nor the clinical covariates alone (P=0.17) (Fig.1E), effectively predict the chance of progression-free survival of these patients. Adding the gene expression profiles to the clinical covariates does not improve the prognostic accuracy for this outcome (log-rank statistic P=0.85) (Fig. 1F).
The initial high and low risk groups defined using the clinical covariates alone were then refined using the gene signature. Fig. 2 shows the results from the two-step, four-level risk prediction method (splitting the clinical-covariates-based risk group “High” into “high/high” and “high/low”, and the clinical-covariates-based risk group “Low” into “low/high” and “low/low” groups). We observe that incorporating gene expression data provides additional predictive information to the clinical covariates. A statistical significance cannot be assigned to this Kaplan-Meier plot, since the Cox PH coefficients for the genes were calculated using all the data. However, the values for the median OS difference between the High and Low risk patients, without taking CA125 level into account are, in months : High/high 20.1, High/low 60.0, Low/high 34.6, Low/low NA (not applicable) (Fig. 2A), and including CA125 level: High/high 18, High/low 68, Low/high 32, and Low/low NA (Fig. 2B).
The lack of a standard approach to study the cancer transcriptome as well as in analysis of gene expression is a major impediment to formulating hypotheses about the etiology and effective treatment of cancer (for review, see Gillet et al. (14, 15)). The variety of gene expression profiling platforms and normalization processes that have been utilized by various researchers renders an integrative computational and analytical approach extremely challenging (16). A second drawback linked to the technology is insufficient specificity and sensitivity of the assay utilized (14, 15). This is a major problem, especially in studies investigating multidrug resistance mechanisms, as many of the mediators belong to highly homologous gene superfamilies. We and others recently showed that Taqman low-density arrays (TLDAs) provide the highest sensitivity and specificity in measuring ABC transporter gene-expression patterns, a superfamily of 48 highly homologous members, initially studied in the NCI-60 cancer cell line panel (8, 17). Therefore, we chose to configure such a platform to study the expression profiles of multidrug resistance-associated genes, selected through the literature published over the last thirty years, in clinical specimens and to assess their utility in survival prediction.
We found that the expression signature of a group of 11 genes statistically improves the prognostic power of four clinical covariates (age, stage, residual tumor status after cytoreductive surgery and CA-125 level) for overall survival. The predictive ability of the 11-gene signature with the four covariates is estimated using a leave-one-out cross-validation scheme. Note that the cross-validation is only used as a method to estimate the predictive ability. The final model (which would potentially be used on other samples or as a clinical diagnostic) is created using all the samples, without any samples left out. This final model is the one that contains the 11 genes.
To date, three clinical covariates, age, stage, and residual tumor status after cytoreductive surgery, have been validated as prognostic variables for ovarian cancer. Another variable, the presence of elevated serum levels of a protein known as CA-125 (the product of the MUC16 gene) is being used here as another prognostic factor. Although CA-125 has proven useful in the clinic to predict response to treatment and to detect the recurrence of ovarian cancer, the independent prognostic power of this antigen is more controversial (13). The specificity of the test for CA-125 is poor, as a number of benign and malignant conditions may result in falsely elevated CA-125 values, and while it is found to be elevated in the serum of approximately 80% of the patients with advanced stage ovarian cancer, it is found in only 50% of the patients with early stage disease. When CA-125 is removed from the statistical analysis, the addition of gene expression data does not result in any improvement in the prediction of overall survival (see supplementary Fig. 2), indicating that this marker adds prognostic power to our cohort, which was composed exclusively of advanced-stage (III and IV) ovarian cancer samples.
Clinically, 75 – 80% of the patients that present with ovarian cancer are stage III or IV. Some of these patients are much more responsive to treatment than others. If clinicians could have a better handle on the molecular profile of patients that will or will not respond to standard chemotherapy (if they know upfront which patients are likely not to respond to the standard chemotherapeutic regimens), they could provide them with an alternative initial treatment. For that reason, we attempted to better categorize the high and low risk patient groups predicted by the covariates only into more specific risk groups (i.e., high-high, high-low, low-high, low-low) using the expression levels of the 11 genes. We observed that the highest and lowest risk patient groups are confirmed by adding the gene expression signature. Remarkably, we also found that patients considered as high risk by clinical covariates have a better prognosis than the low risk patient group if the expression of the 11 genes is low. Similarly, the low risk patient group identified by clinical covariates has a worse prognosis if these patients highly express those 11 genes. Although it will be necessary to repeat this analysis in an independent set of samples from ovarian cancer patients, the statistical approach used in this work argues strongly that any appropriate sample size of patients at a similar disease stage, treated similarly, should yield a similar signature. In a much smaller group of 23 ovarian cancer patients from the Norwegian Radium Hospital, we were unable to confirm the 11-gene signature as a predictor of poor response to chemotherapy, but this is likely due to the small sample size and the differences in diagnosis and treatment between the United States and Norway (data not shown). The data also indicate that our gene expression profile alone is insufficient to provide significant prognostic or predictive information. Others have successfully correlated gene expression profiles with either overall or progression-free survival (18–21). However, these gene signatures, identified through whole genome microarrays, exhibit very little overlap. The lack of similarity does not necessarily preclude the value or robustness of these signatures, and can be explained in part by the technical limitations of microarray analysis previously mentioned as well as by the heterogeneity among the various cohorts analyzed. For example, Birrer and colleagues uncovered two prognostic signatures using two different strategies (19, 20). One of the signatures was found to be relevant for suboptimally debulked patients based on the analysis of 185 untreated late-stage serous ovarian cancer patients (19). Later, another prognostic signature was discovered from the study of 53 laser-captured microdissected samples from untreated late-stage serous ovarian cancer patients, further validated in 64 additional samples (20). Jazaeri et al. investigated 45 samples of serous ovarian cancer composed of 21 chemosensitive samples (defined as those with a complete response to chemotherapy and a platinum-free interval of ≥13 months) and 24 chemoresistant samples including patients who had either surgical debulking following chemotherapy (i.e., neoadjuvant chemotherapy) or residual tumor at the time of a second-look procedure (21). Berchuck et al. highlighted a predictive signature studying 101 samples composed of early stage, late stage and borderline cases of serous ovarian cancer (18), while Spentzos and colleagues studied a cohort of 68 samples mostly composed of stage III and IV serous ovarian cancer (22). To our knowledge, none of these predictors, some of which were poorly characterized open reading frames (ORFs), have yet been validated in prospective studies.
The 11-gene signature identified in the current study includes some genes for which a role as markers of poor prognosis in ovarian cancer had previously been suggested, but is still unclear. The EGFR and MAPK3 (ERK1) genes may affect numerous cell processes as mediators in cell signalling. A recent meta-analysis of 15 studies addressing the role of EGFR in ovarian cancer showed considerable publication bias and concluded that this receptor is unlikely to be useful as a prognostic marker in clinical practice (23). However, two other studies showed that the use of EGFR inhibitors stabilized disease in 11–44% of patients and produced objective regression in 4–6% (24, 25). It is known that MAPK3 (ERK1) is downstream from EGFR and activated mainly through Ras. Therefore, MAPK3 is a relevant target for patients in whom inhibitors against both EGFR and Ras are ineffective due to mutations in those proteins. Matrix Metallopeptidase 9 (MMP9), which is involved in cell invasion/adhesion, was found to be expressed in surgical specimens of ovarian cancer (26). Three genes are linked to the apoptotic pathway; BAG3 is an anti-apoptotic mediator, FASL and BNIP3 are pro-apoptotic markers. BNIP3 is related to the BH3-only family, which induces both cell death and autophagy (27). This gene was found to be strongly correlated with poor prognosis in non-small cell lung cancer (28). Further analysis showed BNIP sequestration in the nucleus, which was confirmed in primary glioblastoma multiforme tumors (29). Another study reported the inhibition of BNIP3 cell death activity by growth factors such as EGF and IGF in epithelial cells (30), which is consistent with the presence of EGFR in our 11-gene signature. This mechanism of inhibition may be explained by the conflicting signals promoting cell death and the survival mediators that may allow the selection of tumor cells that survive chronic BNIP3 overexpression. Fas ligand (FASL) induces apoptosis through the binding of its receptor, FAS. Recent studies have shown that Fas loss is associated with bad prognosis in patients with ovarian cancer (31, 32). The phase II metabolism enzyme Glutathione peroxidase 3 (GPX3) was found to be highly expressed in clear cell adenocarcinoma cell lines. Its down-regulation in these cells increased cisplatin sensitivity 3- to 4-fold (33). TAP is a heterodimeric protein complex consisting of TAP1 and 2 (ABCB2-B3) subunits, which transports antigens into the ER lumen for subsequent loading onto major histocompatibility complex (MHC) class I molecules (34). A recent study reported the upregulation of these genes in approximately 70% of the specimens profiled from 150 patients with invasive epithelial ovarian cancers (35). Defects in the antigen processing machinery (APM) may allow tumor cells to escape immune recognition. The ITGAE gene included in our signature was also in a gene signature that predicted platinum resistance in a set of 72 ovarian tumors (36). The current study confirms that expression of the genes mentioned above predicts poor outcome in patients with ovarian cancer, providing independent verification of their importance in this type of cancer.
Some of the eleven genes in our signature had not previously been linked to ovarian cancer. To our knowledge, there have been no previous reports on the role of either S100A10 or APC in ovarian cancer. The tumor suppressor adenomatosis polyposis coli (APC) has been confirmed to participate in the Wnt signaling pathway by downregulating beta-catenin and thereby controlling gene transcription and cell proliferation. Mutations inactivating the APC gene are found in approximately 80% of all human colon tumors (37). S100 proteins regulate cellular differentiation, energy metabolism, cytoskeletal membrane interactions and cell-cycle progression (38). They are commonly upregulated in tumors of epithelial origin. For example, S100A4 upregulation was found to be associated with an unfavourable outcome in early breast cancer (39) and its nuclear expression in advanced-stage ovarian carcinoma was associated with poor response to chemotherapy and with worse overall survival (40). Another example is S100A7, which was found to be upregulated in high-grade ductal carcinoma in situ (DCIS) and some invasive breast carcinomas (41, 42). Although one study reported the down-regulation of APC in both primary and metastatic serous ovarian carcinomas (43), the role of both APC and S100A10 as markers in ovarian cancer has yet to be elucidated. These genes, whose prognostic significance was previously unreported in regards to ovarian cancer, provide potential new targets for improved therapy.
We also sought to understand why the signature did not predict progression-free survival, the clinical surrogate for drug-resistant cancer. Although a high percentage of our patients (57%) failed to respond to carboplatin, indicating the advanced state of disease in these banked tumor specimens, we must acknowledge that a relatively small number of samples analyzed in our study were from patients presenting with progressive disease (22.5%), while more than half (56%) of the samples came from those who exhibited a complete response. This significant imbalance may explain the failure to find a statistically significant predictive gene signature. We must also recognize the dramatic effect of chemotherapy on gene expression profiling, which could not be assessed due to the lack of availability of samples taken after recurrence of the disease. Finally, we can exclude neither the potential role of any of the 380 genes in intrinsic drug resistance in individual primary ovarian serous carcinomas, nor the possibility of an undetected acquired drug resistance signature. The inter-individual specificity in gene expression, as well as the relevance of the 380 genes to platinum and taxane agents was confirmed experimentally in a study investigating 32 unpaired ovarian serous carcinoma effusion samples obtained at diagnosis or at disease recurrence following chemotherapy. Our analyses demonstrated that gene expression alone can effectively predict the survival outcome of women with ovarian serous carcinoma (OS: log-rank p=0.0000 and PFS: log-rank p=0.002) (10).
In summary, our study reveals that the expression level of 11 genes associated with MDR define an intrinsic drug resistance signature that significantly improves prediction of overall survival in patients with ovarian cancer compared to predictions based on clinical covariates only. The potential role of these genes in drug resistance or alternatively, how they support the malignant phenotype and/or growth rate of the tumor remains to be determined (Table 3).
Despite aggressive surgical cytoreduction and first line combination chemotherapy of carboplatin and paclitaxel, there has been no significant improvement in survival of patients with ovarian cancer. This high mortality rate may be attributed to drug-resistant primary tumors. Using a state-of-the-art TaqMan-based qRT-PCR platform, we evaluated a set of 380 genes whose expression has been shown to affect response to chemotherapy, in specimens consisting of at least 75% cancer cells, of untreated primary serous carcinoma of the ovary from 80 patients. This study reveals an 11-gene signature that considerably improves the prediction of overall survival in these patients. Indeed, using the expression level of this 11-gene signature, it was found that a subset of clinically high risk patients had a better outcome than would be predicted by purely clinical criteria (including age, stage, CA-125 level and surgical debulking).
We thank Drs. Michael Dean, Ding-Wu Shen, Matthew D. Hall and Mitsunori Okabe for their help in compiling the list of 380 MDR-linked genes. We thank George Leiman for editorial assistance. We would also like to thank Ms. Petra Sergent for preparation of the nucleotide samples from Massachusetts General Hospital.
This research was supported by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research. Anna Maria Calcagno was supported by the NIGMS Pharmacology Research Associate (PRAT) Program. Ben Davidson and Mari Bunkholt Elstrand were supported by the Inger and Jon Fredriksen Ovarian Cancer Research Foundation. Anil Sood and Aparna Kamat were supported by the University of Texas M.D. Anderson Cancer Center Ovarian Cancer Specialized Program of Research Excellence (P50 CA083639), and the Betty Ann Asche Murray Distinguished Professorship (Anil Sood). The sample collection and preparation from Massachusetts General Hospital was funded in part by the Advanced Medical Research Foundation, Ovarian Cancer Research Fund, Ovarian Cancer Education and Awareness Network (O.C.E.A.N.), Vincent Memorial Research Funds, MGH Cancer Center and the DFHCC Ovarian SPORE (5P50CA105009). Ram Ganapathi was supported by CCF Translational Research Core (P30CA043703), FISH fund and CARES grant.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors’ ContributionsJP Gillet, AM Calcagno, S Ambudkar and M Gottesman were responsible for the study concept and design. B Davidson, M Bunkholt Elstrand, R Ganapathi, A Kamat, A Soof, B Rueda, and M Seiden collected tumor specimens for molecular analysis. JP Gillet, AM Calcagno, S Ambudkar, B Davidson, M Bunkholt Elstrand, R Ganapathi, A Kamat, and A Sood analyzed the specimens and the data was interpreted by JP Gillet, M Gottesman, B Davidson, R Ganapathi, A Sood, S Ambudkar, M Seiden, and B Rueda. S Varma did the statistical analysis. JP Gillet, AM Calcagno, S Varma, M Gottesman, B Rueda, and A Sood drafted the report, which was critically reviewed for content by all authors.