|Home | About | Journals | Submit | Contact Us | Français|
To evaluate the potential of gene expression signatures to predict response to treatment in locally advanced cervical cancer treated with definitive chemotherapy and radiation.
Tissue biopsies were collected from patients participating in Radiation Therapy Oncology Group (RTOG) 0128, a phase II trial evaluating the benefit of celecoxib in addition to cisplatin chemotherapy and radiation for locally advanced cervical cancer. Gene expression profiling was done and signatures of pretreatment, mid-treatment (before the first implant), and “changed” gene expression patterns between pre- and mid-treatment samples were determined. The ability of the gene signatures to predict local control versus local failure was evaluated. Two-group t test was done to identify the initial gene set separating these end points. Supervised classification methods were used to enrich the gene sets. The results were further validated by leave-one-out and 2-fold cross-validation.
Twenty-two patients had suitable material from pretreatment samples for analysis, and 13 paired pre- and mid-treatment samples were obtained. The changed gene expression signatures between the pre- and mid-treatment biopsies predicted response to treatment, separating patients with local failures from those who achieved local control with a seven-gene signature. The in-sample prediction rate, leave-one-out prediction rate, and 2-fold prediction rate are 100% for this seven-gene signature. This signature was enriched for cell cycle genes.
Changed gene expression signatures during therapy in cervical cancer can predict outcome as measured by local control. After further validation, such findings could be applied to direct additional therapy for cervical cancer patients treated with chemotherapy and radiation.
Cervical cancer is the most prevalent form of cancer in women in developing countries and the second most common cause of cancer death. In the Unites States, because of the sensitivity and availability of Pap smears, the incidence is much lower, with just more than 11,000 cases of invasive cervical cancer diagnosed in 2007 (American Cancer Society Facts and Figures, 2007). Cervical cancer can be a lethal disease, however, with the survival rate of patients diagnosed with locally advanced cervical cancer (stage II–IV) of only 50% at 5 years.
Treatment for cervical cancer has improved through randomized trials showing the benefit of sensitizing chemotherapy added to radiotherapy for locally advanced disease (1–3). However, there remain a number of patients who still fail treatment, for whom salvage therapy has limited success because it is applied too late. Cancer treatment decisions are based on the results of trials evaluating a group of patients with a certain disease as a whole. Therefore, by following group guidelines, there is a subset of patients who will be overtreated (who perhaps did not need the full treatment), as well as a subset of patients who will be undertreated (who will fail even with the standard treatment). A test identifying these two groups could be used to individualize treatments, giving less aggressive therapy to the first group and more aggressive therapy to the second group. In other malignancies, such as breast cancer, biological markers are already used to guide treatment decisions.
The biological signature of a cancer can be used to identify tumors that behave clinically different. Often these signatures are based on gene expression profiling (also referred to as cDNA microarray profiling), in which the cellular mRNA levels are measured. Microarray profiling has been done for more than 15 years, and it generates enormous quantities of data that reflect at some level the innate biology of the tumors studied. This approach has been successful in some cancer types in identifying groups of patients with more aggressive tumors who are more likely to fail treatment, such as has been done in breast cancer (4). Gene expression profiling in cervical cancer has been done and can differentiate the pathologic subtypes of cancer versus benign conditions (5–7). There are additionally cervical cancer gene signatures identified that predict resistance to radiation alone (8, 9). A recent study evaluated patients by microarray that were treated with modern therapy consisting of chemotherapy and radiation (10). Although the sample numbers were small, they were able to identify gene expression patterns that were different in patients that did or did not fail treatment. However, a significant gene signature separating patients who failed and those who did not was not identified.
This article describes the results of gene expression analysis correlated without come from a phase II RTOG trial. We find that gene expression changes are able to predict response as measured by local-regional control in cervical cancer to standard treatment with chemotherapy and radiation. This seven-gene signature, predicting outcome from treatment, could be applied in the future to predict who with cervical cancer will fail standard treatment and may require adjuvant treatment or surgery to achieve cure. Whereas gene expression profiling is not novel, our finding that changes in gene expression can be correlated with outcome has not been previously shown.
We performed gene expression profiling using tumor tissue from a national cervical cancer trial, Radiation Therapy Oncology Group (RTOG) 0128. This study was a phase II trial evaluating the benefit of adding Celecoxib to standard 5-fluorouracil and cisplatin chemotherapy and radiotherapy for the treatment of locally advanced cervical cancer (stage IB2 and IIB-IVA). In this study, biopsies were obtained both before treatment (pretreatment samples) and during treatment (mid-treatment samples). Samples were obtained from 34 patients, and 22 pretreatment, 14 mid-treatment, and 13 paired samples had RNA of sufficient quality to perform gene expression profiling. As previously reported (11), ~ 9,000 genes were studied for expression changes, and no differences in gene expression were measurable in pretreatment samples when comparing stage, age, or ethnic groups. There was a gene signature that separated histologic subtypes, however. In addition, unsupervised cluster analysis was able to separate gene expression profiles between biopsies obtained pre- and mid-treatment in the great majority of cases, showing 91 genes up-regulated and 251 genes down-regulated. Gene expression patterns also clustered patients into groups, although at the time of that analysis, clinical data had not been collected, so correlation of these “gene signature groups” with outcome was not possible.
Because the clinical data from RTOG 0128 have now matured, in this report we were able to analyze the gene expression profiles and their correlation with clinical outcome. We studied gene expression patterns pretreatment, mid-treatment, and also the changes in gene expression between these two time points to identify gene expression signatures that would predict failure or success of current standard treatment consisting of chemotherapy and radiation therapy for locally advanced cervical cancer. Although we were unable to identify a gene signature from patients before treatment that predicted outcome, we were able to identify a statistically significant seven-gene signature of gene expression changes that separated patients by outcome, defined as local-regional control or local-regional failure. Although others have reported gene expression changes with treatment in cervical cancer (10), this is the first study that we know of to show that these changes can predict the response to treatment as measured by local control.
The tissue samples were obtained from patients that participated in clinical trial RTOG C0128, a phase II study of the cyclooxygenase-2 inhibitor Celebrex (celecoxib) and chemoradiation in patients with locally advanced cervix cancer and who consented to have their samples used for research purposes. Briefly, 5-fluorouracil and cisplatin chemotherapy was administered starting with the initiation of radiation, according to the experimental arm of RTOG 9001 (1, 12). Celecoxib was administered at 400 mg twice daily for 1 y. The radiation treatment consisted of 45 Gy given to the pelvis in 25 fractions delivered once daily followed by intracavitary radiation to a total dose of ~ 85 Gy to point A as is standard.
A biopsy of the cervix was done with two to three passes of Tischler biopsy forceps. Tumor samples were obtained before treatment, referred to as the pretreatment sample, and at the first implant, referred to as the mid-treatment sample. Fresh tissue was placed immediately into RNAlater solution and sent via overnight mail to the RTOG tissue repository. The tissue was divided, with the large portion weighed and frozen in a small aliquot of RNAlater and the small portion placed into a paraffin block for histologic analysis. Total RNA was extracted with TRIzol reagent from homogenized tumor tissue. Clean-up procedures were done on the total RNA using RNeasy Midi kit (Qiagen) to ensure removal of any remaining contaminants. Total RNA quality was assessed using a spectrometer, and ratio absorbance at 260 versus 280 nm was determined. The quality of the amplified RNA was assessed using a 2100 Bioanalyzer (Agilent Technologies) to evaluate the purity of the RNA. The ratios of the integrated 28S RNA peak to the 18S RNA peak were used as an indicator of RNA quality after amplification.
The hybridization assays and data collection of expression values were done by the Microarray Core Facility at Utah Health Sciences Center. This facility consists of an Amersham BioSciences GEN III Array Spotter and a Gen III Array Scanner. To print microarrays, cDNA clones made from mRNA are PCR amplified and deposited onto a chemically modified glass surface to form a high-density microscopic array of these genes. Hybridization of labeled RNA samples from cervical cancer tissue versus Universal Human Reference RNA to the microarray defines the genetic expression differences. Specialized microarray chips were designed to include ~ 9,000 particular genes of interest in reference to carcinoma, in duplicate.
Data extracted from the scanner were imported into R. Appropriate R BioConductor (package limma) procedures were used for background correction, intra-array normalization, and inter-array normalization. Specifically, background correction was done according to the method of Smyth (13). The corrected intensities are the expected values given the observed intensity values of the model of fitting a convolution of normal and exponential distributions of foreground intensities using the background intensities as a covariate. Within-array normalization was done by print-tip LOESS (14, 15). Data between arrays were further normalized by the scale normalization method (14–16). M values, short for log 2(R/G), were averaged for duplicate spots. These M values were used for further data analysis.
Unsupervised hierarchical clustering analyses were done with Cluster software and visualized with TreeView (17) using complete linkage with appropriate distance calculations indicated under corresponding figures.
Paired t tests were used to study the treatment-regulated gene expression between pretreatment samples and mid-treatment samples.
Two-group t tests were used to discover various gene expression signatures as candidates for toxicity outcomes and efficacy outcomes. Reported P values are the adjusted P values using the methods proposed by Benjamini and Hochberg (18) except in the case of the initial finding of candidate genes for local regression efficacy markers. The cutoff of P values is generally 0.01 unless otherwise specified. In addition, gene signatures exclude genes that do not have any raw R values >100.
Pretreatment gene expression, mid-treatment gene expression, and Mdiff values were evaluated in relation to toxicity and efficacy end points. Mdiff values are calculated as mid-treatment M values minus pretreatment M values for paired samples and thus are measures of the gene expression changes. The toxicity outcomes were dichotomized by (a) whether patients experienced any grade 4 toxicity; (b) whether patient’s median toxicity grade is ≥2; (c) whether patients experienced any grade 3 or higher gastrointestinal toxicity; and (d) whether patients experienced any genitourinary toxicity. The cutoff points for toxicity dichotomization are rather arbitrary with the goal to have a relatively balanced sample size in two groups. Efficacy binary end points include local-regional control (LRC) status, local control (LC) status, and patient survival status. Because the LRC status was completely correlated with patient survival status, we only report LRC in the results.
Supporting vector machine methods implemented in R have been used to validate and refine the gene signature. Leave-one-out and 2-fold cross-validations were done. For the Mdiff efficacy signature validation, the first 2-fold cross-validation subgroup includes all the possible test samples including at least one LC status 1 sample; the second subgroup includes all possible test samples with both samples having LC status 0 (local control). Because the sample is small, we can generate the whole permutation set by relabeling response groups using the same number of responses. In this study, for the permutation set of 13 samples total with 2 with LC status 1 (local failure), we have a permutation set of 78. For each subset of genes, we try to generate a supporting vector machine model for each of the 78 data sets. We define the permutation separation rate as how many data sets can have a supporting vector machine that totally separates the two response groups. We use this as a benchmark to curb overparameterization. Heuristic methods used to refine gene signatures will be elucidated with corresponding results.
To corroborate our results with biological plausibility, genes that met the P value cutoff criteria indexed by Unigen cluster id along the P values were uploaded into the Ingenuity Pathway Analysis (IPA) software. Most of the reported gene names were the IPA annotation mapped through Unigen Cluster id. Genes were associated with biological functions and/or diseases in the IPA knowledge base. Fisher’s exact test was used to calculate a P value determining the probability that each biological function and/or disease assigned to that data set is due to chance alone. The cutoff for P values is 0.05.
Biopsies were obtained both before (pretreatment samples) and during (mid-treatment samples) treatment from 34 patients, and 22 pretreatment, 14 mid-treatment, and 13 paired samples had RNA of sufficient quality to perform gene expression profiling. We began by evaluating whether pretreatment or mid-treatment gene expression patterns alone could predict binary treatment outcomes, namely the LRC, LC, and/or patient survival status. Whereas we found that there were patterns of gene expression that varied across these outcomes, we found no genes expressed differentially using the adjusted P value cutoff we applied to these studies. Therefore, there is not a significant gene signature that can predict outcome from either the pretreatment or mid-treatment samples alone in this study.
Using the 13 paired pre- and mid-treatment samples with a multiple comparison adjusted P value cutoff of 0.01, we discovered 29 genes whose expression changed 2- to 5-fold pre- to mid-treatment. The unsupervised hierarchical clustering grouped these genes into two distinct groups. The first group of genes, labeled as cluster 1 in Table 1, started with lower expressions than normal tissue controls (Universal Human Reference RNA) pretreatment; interestingly, the expression of these genes went further down by the mid-treatment biopsy (B versus A, Fig. 1). The mean fold of down-regulation was 2.43 with a range of 1.25 to 4.05. In contrast, the second group of genes, labeled as cluster 2 in Table 1, started with higher expression compared with normal tissue controls, and on treatment, their expression went further up (B versus A, Fig. 1). The mean fold up-regulation was 2.05 with a range of 1.33 to 3.07. The functional analysis revealed that the major molecular and cellular functions of these genes are cell cycle (12 of 29), DNA replication, recombination and repair (16 of 29), and cell death (12 of 29). Many of these genes are also known to be associated with cancer (16 of 29; Table 2).
In the 13 patients with paired pretreatment and posttreatment samples, the LC, LRC, and survival status classification overlapped. We therefore used only the LC classification in all of the following analysis. Two patients are in the LC status 1 class [indicating they failed locally (median time to failure, 18.8 months)] and 11 patients are in the LC status 0 class [indicating that they achieved cure (median follow-up, 26.6 months)]. We studied the Mdiff values, generated as the M value difference between mid-treatment and pretreatment gene expression, in relation to the LC classification.
Three hundred thirty-eight genes were identified that have different Mdiff values between the two LC status classes with a false discovery rate–adjusted P value cutoff of 0.01. Of these, 287 could be mapped to IPA id. The top molecular and cellular functions associated with these genes are cellular growth and proliferation (83), cell death (63), gene expression (63), cellular development (58), and cell cycle (38; data not shown). The top diseases and disorders that these genes are associated with include cancer (78), connective tissue disorders (48), and hematologic disease (33; data not shown).
We were able to define a smaller subset of 35 genes that met the adjusted P value cutoff of 0.001. Thirty-two of them were mapped to IPA ids. The top molecular and cellular functions associated with these genes are cell growth and proliferation (11), cell death (9), cell morphology (8), cellular movement (7), and cell-to-cell signaling and interaction (6). The top diseases and disorders these groups of genes are associated with are cancer (11) and reproductive disease (9; Table 3). The two LC classes of samples can be clearly separated into hierarchical clustering using 338 genes (image not shown) or 35 genes (Fig. 2).
We were able to further obtain a validated gene signature consisting of seven genes heuristically. Specifically, 31 data sets were set up by including the top 5 up to 35 genes ranked by ascending P values, respectively. For example, the first data set includes the top 5 genes, the second the top 6, etc. The self-prediction rate for all data sets is 100%. For each gene set, the prediction rate of leave-one-out cross-validation prediction rates for the two subgroups with 2-fold cross-validation and the permutation separation rate are depicted in Fig. 3A. The self-prediction rate, 2-fold cross-validation rates, increases to and plateaus at 100%. The permutation separation rate is low at seven genes and increases when we include more genes. Therefore, we chose the top seven genes as the final gene signature of seven. The hierarchical clustering could still cleanly separate the two efficacy classes with these seven genes (Fig. 3B).
In this study, we have analyzed the association of gene expression signatures in cervical cancer with outcome, measured as local control, in patients from a national trial treated with chemotherapy and radiation, the standard of care. Although we were unable to find a gene signature from either the pre- or mid-treatment biopsy as a group that could predict outcome, we were able to identify a seven-gene signature whose expression changes significantly predict local failure. This is the first study of gene expression changes that we are aware of that predicts response to cancer treatment. Although other tumors may not be as amenable to a biopsy midway through treatment as is cervical cancer, our findings support protocols to attempt this approach because it seems to allow the identification of significant gene signatures even with fewer patient numbers. Knowing the future outcome in a patient midway through treatment would allow the application of additional therapy to those who are destined to fail, such as surgical intervention, when possible, or additional adjuvant chemotherapy. These decisions will of course need to be made only with additional validation of the findings of this work.
In cervical cancer, because of the ease of obtaining tissue from patients and the need to better classify patients as poor risk, gene expression profiling has been previously done. Several groups have obtained microarray findings using techniques similar to those used in this study to predict resistance to radiation alone in the treatment of cervical cancer (8, 9). Both of the studies have more vigorous cross-validation than this study due to their bigger sample size and more balanced sample number in different response classes. However, none of them have used the Mdiff measures because both use biopsies from a single time point, pretreatment. In this study, we obtained a later tissue biopsy, and we were able to show that altered gene expression is a powerful predictor of response to treatment, even in this small study. We hypothesize that because there are several levels of translational control, perhaps changing gene expression levels might be more accurate surrogates for gene expression response than stable mRNA levels, allowing significant findings even with small patient numbers.
We were able to find similarities between our Mdiff signatures and findings from both studies predicting radioresistance. The caveat is that all the cDNA arrays are custom made and may not necessarily represent the same set of transcripts. For example, 16 genes are described in the study by Wong et al. (9), but only 7 of them are represented on our cDNA arrays based on National Center for Biotechnology Information accession number. Of these seven genes, DDB1 (damage-specific DNA binding protein 1) was also identified in our study with significant Mdiff values between the two response groups. The cDNA microarrays used in the Kitahara et al. (8) study have the most extensive gene presentation; also, the study was based on more patient samples. Our long Mdiff gene signatures do overlap with this study as well; both studies found SRF (serum response factor, c-fos serum response element-binding), FBLN5 (fibulin 5), SPARC (SPARC/osteonectin), CREM (cyclic AMP responsive element modulator), and DDX1 [DEAD/H(Asp-Glu-Ala-Asp/His) box polypeptide 1] as predictors of local control versus failure. Whereas the similarities in our findings are reassuring, we believe that in the era of standard therapy consisting of chemotherapy and radiation, our findings are currently most applicable to predict response to treatment.
A recent report from M. D. Anderson Cancer Center also described a study where biopsies were taken at two time points, with gene expression measured at both points in locally advanced cervical cancer treated with chemotherapy and radiation (10). In this study, there was a pretreatment biopsy and a biopsy obtained 3 days after the initiation of chemotherapy and radiation. The authors also confirmed gene expression changes, yet with unsupervised clustering, they were unable to identify gene signatures that predicted recurrence for their patients. In our study, the second biopsy occurred approximately halfway through treatment, and it is possible that the gene expression changes we are detecting reflect cell death, and this could account for our ability to use these gene expression changes as biomarkers of local control.
The cluster analysis of the raw gene list whose expression changes predicted local control versus failure identified several that grouped in cell cycle regulation. In fact, in our list of differentially expressed genes, 7 of the 35 gene signature are cell cycle associated, and the odds of 7 cell cycle genes appearing in a random list of 35 is P = 4.3 × 10e−7. These findings suggest that control of cell cycle after exposure to chemotherapy and radiation is an important factor in ultimate tumor cell death and local control in cervical cancer. The genes remaining in the seven-gene signature also were interesting. One example is CD58, which has linkages with T-cell immunity and, possibly, p53 function (19). High CD58 has been associated with progression of cervical intraepithelial neoplasia (20) to invasive cervical cancer. In addition, the WFS1 gene was found in our seven-gene signature and has been implicated in cancer and is part of the endoplasmic reticulum stress response (21). Finally, NKG7 was identified in the signature and has been shown to be induced by bovine papillomavirus type 1, suggesting that this gene might be regulated by papilloma virus (22). Because human papillomavirus infection has been shown be associated with a better response to radiation (23), one could hypothesize that one mechanism may be through regulation of NKG7. Some of these genes could be potential future targets to enhance response to therapy.
This study has some major limitations that should be noted, including our inability to study survival status separately from local control or to perform multivariate analysis due to the small sample size. The gene signature identified in this study will need to be validated in a large independent study. However, our findings do support the future study of gene expression changes in cervical cancer, as well as other cancers, as potential new ways to predict local control.
We thank Tony Magliocco for assistance in interpreting the gene findings.
Grant support: RTOG U10 CA21661, CCOP U10 CA37422, and Stat U10 CA32115 grants from the National Cancer Institute. J.B. Weidhaas was supported by K08 grant CA124484.
Note: The contents of this article are the sole responsibility of the authors and do not necessarily represent the official views of the National Cancer Institute.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.