Adjuvant radiotherapy is an important clinical treatment for most patients with gastric cancer, a common malignancy. However, radiotherapy is a double-edged sword, and a method is needed to predict which radiosensitive patients are most likely to benefit from it. Using the publicly available gastric cancer data from The Cancer Genome Atlas (TCGA), we developed a gene signature that predicts radiosensitive patients by estimating a new index, the nominal HR (nHR) (the HR product of sensitive genes), for each patient. In this study, we provide several results to validate our prediction. Cross-validation results showed that the predicted radiosensitive patients who received radiotherapy had significantly better survival than predicted radiosensitive patients who did not receive radiotherapy. After adjusting for other clinical factors, including age, sex, targeted therapy, histologic diagnosis and tumor stage, the benefit of radiotherapy for predicted radiosensitive patients remained significant. In addition, predicted radiosensitive patients who received radiotherapy had a significantly reduced rate of disease progression. Taken together, we have obtained a set of genes that identifies radiosensitive patients with gastric cancer. These genes may be potential biomarkers for the diagnosis and treatment of gastric cancer and could give new insight into the underlying mechanism of radiosensitivity in gastric cancer.
According to GLOBOCAN 2012, gastric cancer (GC) is the fifth most frequently diagnosed cancer and the third leading cause of cancer death worldwide (1). A recent study estimated that 679,100 new GC cases and 498,000 deaths occurred in China (2). Except in a few regions where screening is widely performed, GC usually presents at an advanced stage in most parts of the world. Surgery remains the primary treatment for patients with GC. Non-surgical treatments such as chemotherapy, radiotherapy and targeted therapies also play a key role in prolonging patient life (3).
Radiation therapy can be an important part of treatment for GC. Several randomized trials have assessed the clinical effect of radiation therapy (preoperative, postoperative or palliative) as a treatment for GC (4–7). Yet these studies have provided neither uniform nor consistently positive results. To reduce radiation toxicity as much as possible, intensity-modulated and 3-D conformal radiotherapy have been developed (8). The present study takes a different direction: we asked whether there is a set of genes that could predict the radiosensitivity of patients, which would allow these patients to obtain the maximum benefit from radiotherapy. The identification of molecular markers is a useful tool for the clinical management of GC patients, assisting in diagnosis, in the evaluation of response to treatment, and in the development of novel therapeutic modalities. Furthermore, identifying radiosensitive or non-radiosensitive signatures for GC would help guide radiotherapy in clinical practice.
We obtained the RNA sequence data for GC from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov/). An internal cross validation was developed using the cross-validated adaptive signature design. This design combined the gene signature development and the validation test into a single data set, as introduced by Freidlin et al (9), Freidlin and Simon (10) and Tang et al (11). Using this novel idea of combination as a guide, we extended the approach to the proportional hazard model and developed a radiosensitive gene signature for predicting radiosensitive patients with GC.
All data, including clinical information and RNA-seq expression, were downloaded from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov/, updated in March 2016). First, we combined the clinical information, including survival time, radiotherapy, chemotherapy, and other information, from the clinical files downloaded from TCGA. Clinical data were available for 445 patients. We then excluded patients with missing survival information, leaving data for 418 patients. After removing duplicated patients from the raw expression data (452 samples and 20,532 genes), we retained expression data for 418 patients and 20,502 genes with clearly identified gene names. We merged the clinical and expression data and obtained 393 patients for further analysis. Thereafter, we removed genes with a maximum expression value <10, as they showed almost no expression. Genes with a proportion of zero expression >75% were also removed. We calculated the variance of expression for each gene and kept the genes whose variance exceeded the 20% quantile. We then excluded patients with missing radiotherapy information and standardized the expression data, obtaining data for 371 patients with expression profiles of 16,125 genes for the final analysis. Lastly, we imputed the missing values in the clinical data by multiple imputation using the R package mice. The cleaned clinical data are summarized in Table I.
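The gene-level filtering described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `filter_and_standardize`, its parameter names, and the convention for locating the 20% quantile are hypothetical choices, and the expression matrix is represented as a plain dictionary.

```python
from statistics import mean, pstdev, pvariance


def filter_and_standardize(expr, min_max=10.0, max_zero_frac=0.75, var_quantile=0.20):
    """expr maps gene name -> list of expression values, one per patient.

    Hypothetical helper: thresholds mirror the text, the data layout is assumed.
    """
    # Drop genes that are essentially unexpressed (maximum < 10)
    # or that are zero in more than 75% of patients.
    kept = {
        gene: values
        for gene, values in expr.items()
        if max(values) >= min_max
        and sum(1 for v in values if v == 0) / len(values) <= max_zero_frac
    }
    if not kept:
        return {}

    # Keep genes whose variance lies above the chosen quantile of all variances.
    # The quantile position below is one simple convention among several.
    variances = {gene: pvariance(values) for gene, values in kept.items()}
    ordered = sorted(variances.values())
    cutoff = ordered[int(var_quantile * (len(ordered) - 1))]
    kept = {gene: values for gene, values in kept.items() if variances[gene] > cutoff}

    # Standardize each remaining gene to mean 0 and unit standard deviation.
    # Variance is strictly positive here because it exceeds a non-negative cutoff.
    standardized = {}
    for gene, values in kept.items():
        m, s = mean(values), pstdev(values)
        standardized[gene] = [(v - m) / s for v in values]
    return standardized
```

Calling the function on a small dictionary of toy values returns only the genes that pass all three filters, each rescaled to mean 0 and standard deviation 1.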
In the present study, radiosensitive patients were defined as the group of patients who had better survival if they received radiotherapy. To develop the radiosensitive gene signature for predicting radiosensitive patients, we used the following modeling assumption: there is a subset of S predictive (sensitive) genes that significantly interact with radiotherapy. Writing xj for the expression level of sensitive gene j, the survival benefit of radiotherapy is associated with these predictive genes through the Cox proportional hazards model with the following equation:

$$h(t \mid X) = h_0(t)\exp\left(r\lambda + \sum_{j=1}^{S} x_j b_j + r\sum_{j=1}^{S} x_j i_j\right)$$

where h0(t) is the baseline hazard function; λ is the effect of radiotherapy; r is an indicator for radiotherapy with '1' indicating radiotherapy and '0' otherwise; b1 to bS are the main effects for these S sensitive genes; and i1 to iS are radiotherapy-expression interaction effects that reflect the degree by which the effect of radiotherapy on survival is influenced by the expression levels of sensitive genes.
If the main effects and radiotherapy-expression interaction effects are negative, patients who overexpress the sensitive genes will have a higher survival probability with radiotherapy than without it. We assume that a fraction of the patient population overexpresses some, but not necessarily all, of the sensitive genes. For these patients, the total hazard ratio (HR) would tend to fall below a preset threshold value (such as 1). These patients, who have a relatively high probability of survival under radiotherapy, are called radiosensitive patients.
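As a worked step, the hazard ratio of radiotherapy versus no radiotherapy for a patient with a fixed expression profile can be read off the model. The derivation below is a sketch that assumes the model contains exactly the terms defined above (h0(t), λ, r, b1 to bS and i1 to iS), with x_j denoting the expression of sensitive gene j:

```latex
\mathrm{HR}(x)
  = \frac{h(t \mid r = 1, x)}{h(t \mid r = 0, x)}
  = \frac{h_0(t)\exp\left(\lambda + \sum_{j=1}^{S} x_j b_j + \sum_{j=1}^{S} x_j i_j\right)}
         {h_0(t)\exp\left(\sum_{j=1}^{S} x_j b_j\right)}
  = \exp\left(\lambda + \sum_{j=1}^{S} x_j i_j\right)
```

The baseline hazard and the main effects cancel in this ratio, so under these assumptions HR(x) < 1 exactly when λ + Σ x_j i_j < 0.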
Freidlin et al (9) and Freidlin and Simon (10) developed a novel cross-validated adaptive signature design to identify sensitive patients in clinical trials with binary outcomes. Following their framework, we extended and modified this approach to a proportional hazards model (11) and applied it to develop a radiosensitive gene signature for the present data. The K-fold cross-validated procedure for gene signature development consists of the following three steps.
Step 1. Training step. The data were randomly split into K parts of equal sample size (usually K=10). Then, K-1 parts were used as training data to fit the model and to predict the radiosensitive patients in the left-out part (validation data). In the training data, a Cox proportional hazards model was fit for each gene j using the following equation: $h(t \mid X) = h_0(t)\exp(r\lambda + x_j b_j + r x_j i_j)$. Then, the p-values for ij were used to rank and select the genes.
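The bookkeeping in the training step can be sketched as follows. The snippet assumes the per-gene interaction p-values have already been obtained from the single-gene Cox fits (the fitting itself is not shown), and the helper names `k_fold_indices` and `top_genes` are hypothetical.

```python
import random


def k_fold_indices(n_patients, k=10, seed=0):
    """Randomly partition patient indices 0..n_patients-1 into k near-equal folds."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives fold sizes that differ by at most one.
    return [indices[fold::k] for fold in range(k)]


def top_genes(interaction_pvalues, g):
    """Return the g genes with the smallest interaction p-values.

    interaction_pvalues maps gene name -> p-value of the radiotherapy-by-expression
    interaction term from that gene's single-gene Cox model (assumed precomputed).
    """
    return sorted(interaction_pvalues, key=interaction_pvalues.get)[:g]
```

Each fold in turn serves as validation data while the remaining K-1 folds form the training data on which the genes are ranked.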
Step 2. Prediction step. The top g significant genes were used to build a gene signature and to calculate an index called the nominal HR (nHR) using the equation

$$\mathrm{nHR} = \exp\left(\lambda + \sum_{j=1}^{g} x_j i_j\right)$$

for patients in the validation data (k-th part). Here, λ was the value averaged over the estimates from the g single-gene models. Patients in the validation set who had an nHR lower than a specified threshold R were classified as radiosensitive patients.
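A minimal sketch of the prediction step follows. It assumes that the nHR takes the form exp(λ + Σ x_j i_j) over the g selected genes, that is, the radiotherapy-versus-no-radiotherapy hazard ratio implied by the interaction model; this functional form is an assumption made for illustration, and the function names and toy coefficients are hypothetical.

```python
import math


def nominal_hr(expression, interactions, lam):
    """Assumed form of the nominal HR for one patient.

    expression:   gene -> standardized expression value for this patient
    interactions: gene -> estimated interaction effect i_j for the selected genes
    lam:          radiotherapy effect averaged over the single-gene models
    """
    return math.exp(lam + sum(i_j * expression[gene] for gene, i_j in interactions.items()))


def classify(patients, interactions, lam, threshold):
    """Label a patient radiosensitive (True) when the nHR is below the threshold R."""
    return {
        patient_id: nominal_hr(expression, interactions, lam) < threshold
        for patient_id, expression in patients.items()
    }
```

With negative interaction effects, a patient who overexpresses the selected genes obtains a small nHR and is therefore labeled radiosensitive.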
Step 3. Validation step. Steps 1 and 2 were cycled through so that each of the K parts served as the validation data in turn. Each patient therefore appeared in exactly one validation set, and after the cross validation each patient was classified as either radiosensitive or not radiosensitive. For the radiosensitive patients, a log-rank test was then performed to test the survival difference between the radiotherapy and non-radiotherapy groups at a specified significance level, such as 0.05. A significant result indicated that radiotherapy was beneficial for the predicted radiosensitive patients; in that case, the gene signature was considered potentially effective and the prediction of radiosensitive patients was considered accurate.
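The log-rank comparison used in the validation step can be sketched in pure Python. This is a textbook two-group log-rank statistic with a chi-square approximation on one degree of freedom, written for illustration; it is not the software used in the study, and the function name and argument layout are hypothetical.

```python
import math


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    groups: 1 for radiotherapy, 0 for no radiotherapy
    Returns (chi-square statistic, p-value).
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = sum(1 for tt, _, _ in data if tt >= t)
        at_risk_1 = sum(1 for tt, _, g in data if tt >= t and g == 1)
        deaths = sum(1 for tt, e, _ in data if tt == t and e == 1)
        deaths_1 = sum(1 for tt, e, g in data if tt == t and e == 1 and g == 1)
        # Observed deaths in group 1 minus the number expected under no difference.
        observed_minus_expected += deaths_1 - deaths * at_risk_1 / at_risk
        if at_risk > 1:
            frac = at_risk_1 / at_risk
            variance += deaths * frac * (1 - frac) * (at_risk - deaths) / (at_risk - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Applied to the predicted radiosensitive patients only, a small p-value would indicate a survival difference between the radiotherapy and non-radiotherapy groups.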
In the above procedure, there are two key tuning parameters g and R in the prediction step. The optimal values of the tuning parameters g and R are usually not known in advance. Therefore, all the possible combinations for g and R were tried and tested. We used a nested inner loop of K-fold cross-validation approach on the training data to select the best tuning parameter values without affecting statistical validity of the procedure. A similar procedure could be found in Freidlin et al (9) and Freidlin and Simon (10). More details with a work flow plot can also be found in the supplementary files in our previous study (11).
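The search over the tuning parameters can be sketched as an exhaustive grid search. The snippet assumes that each (g, R) combination is scored by a p-value obtained from the inner cross-validation loop and that the combination with the smallest p-value is retained; the `score` callable stands in for that inner loop and is hypothetical, as is the function name.

```python
from itertools import product


def select_tuning(g_values, r_values, score):
    """Return the (g, R) pair with the smallest score.

    score(g, R) is a hypothetical callable that runs the inner cross-validation
    on the training data and returns the resulting log-rank p-value.
    """
    return min(product(g_values, r_values), key=lambda pair: score(*pair))
```

Because the selection uses only the training data of each outer fold, the left-out validation data play no part in choosing g and R.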
We used 10-fold cross validation in the above procedure, as recommended, because it maximizes the portion of study patients contributing to the development of the diagnostic signature and minimizes the prediction error (12). Besides 10-fold cross validation, leave-one-out cross-validation (LOOCV) is often used for internal validation. LOOCV is known to provide results that are similar to, and as stable as, those of 10-fold cross-validation. However, LOOCV can be very time-consuming to implement (12).
Table I summarizes the clinical information. The median survival was 30.86 months (95% CI, 25.6–57.4). The 5-year survival rate was 36.60% (28.0–47.8%). Univariate analysis showed that several clinical factors, including radiotherapy, were significantly associated with overall survival. However, in the multivariate survival analysis, radiotherapy did not reach significance for overall survival, with an HR of 0.556 (0.307–1.004) and a p-value of 0.0516.
Following the proposed three-step procedure, we analyzed the present data to obtain the tuning parameters by 10-fold cross-validation. Then, the gene signature was developed for predicting radiosensitivity. Fig. 1 shows the corresponding p-values from the log-rank test between the radiotherapy and non-radiotherapy groups for predicted sensitive patients. Our results show that a gene signature including the top 11 significant genes provides a powerful prediction, with the smallest p-value of 3.810E-09. The smallest p-value appears when the tuning parameters g and R are 11 and 0.065, respectively. Table II summarizes the 11 genes included in the radiosensitive gene signature and their interaction effects with radiotherapy.
Following the validation procedure proposed herein, 189 patients were predicted to be radiosensitive, while the other 182 patients were considered non-radiosensitive. We compared the survival of radiosensitive patients who did and did not receive radiotherapy. Fig. 2A shows the survival curves for these predicted radiosensitive patients. The significant difference, with a p-value of 9.84e-09, suggested that the prediction of radiosensitive patients was reasonable, as they strongly benefited from radiotherapy. Fig. 2B shows the same comparison for non-radiosensitive patients with and without radiotherapy. No obvious difference was detected between the two groups, suggesting that radiotherapy may not provide the expected benefit for these non-radiosensitive patients. We further compared the survival of radiosensitive and non-radiosensitive patients when all of them received radiotherapy, as shown in Fig. 2C. As expected, a strong positive effect of radiotherapy on radiosensitive patients was observed. Taken together, the predictions of radiosensitive and non-radiosensitive patients were reasonable. The radiosensitive gene signature is predictive of the response to radiotherapy in both radiosensitive and non-radiosensitive patients. Fig. 2D shows that predicted radiosensitive patients had poorer survival than the predicted non-radiosensitive patients when none of them received radiotherapy.
We further performed univariate and multivariate analyses using Cox proportional hazards regression to quantify the prognostic benefit of radiotherapy. Fig. 3A demonstrates that radiotherapy was strongly associated with improved survival for radiosensitive patients, with an adjusted HR of 0.11 (0.04–0.30). For non-radiosensitive patients, radiotherapy may not improve overall survival, with an adjusted HR of 1.43 (0.72–2.84). We also compared the survival of predicted radiosensitive and non-radiosensitive patients when both groups received radiotherapy, as shown in Fig. 3B. There was a significant survival benefit for predicted radiosensitive patients as compared with non-radiosensitive patients, with an adjusted HR of 0.06 (0.01–0.22). These results suggest that the prediction of radiosensitive patients was accurate and that the positive effect of radiotherapy on predicted radiosensitive patients was validated as expected.
To further validate the predicted radiosensitive and non-radiosensitive patients, we performed an association study between radiotherapy and two clinical assessment indexes: the new tumor event and progressive disease. The new tumor event is an important clinical index of prognostic outcome. According to TCGA, the new tumor event is defined as a metastatic, recurrent or new primary tumor after initial treatment. The results of the Chi-square analyses are summarized in Fig. 4A for the new tumor event and in Fig. 4B for progressive disease. Although the rate of the new tumor event did not differ significantly among the groups, the rate of progressive disease was significantly lower for predicted radiosensitive patients who received radiotherapy (Fig. 4B). These results were consistent with the survival analysis and further validated our predicted sensitivity results.
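The kind of Chi-square comparison described above can be sketched for a single 2x2 table, for example radiotherapy (yes or no) against progressive disease (yes or no) within one predicted group. The snippet uses the Pearson statistic without continuity correction, which is an assumption since the exact variant is not stated, and the counts in any usage are hypothetical illustrations rather than study data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For small expected counts, an exact test would usually be preferred over this approximation.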
To find the association between predicted sensitivity and clinical factors, we performed univariate and multivariable logistic analyses. Table III summarizes the results. Both analyses suggested that only the T and M stages had a significant association with predicted radiosensitivity. We therefore performed a stratified analysis for the T and M stages. Log-rank tests suggested that the predicted radiosensitive patients in the radiotherapy group had better survival than those in the non-radiotherapy group, independent of T stage (Fig. 5A, C and E). For non-radiosensitive patients, radiotherapy may not be beneficial (Fig. 5B, D and F). A similar result was reached for the different M stages (Fig. 6).
We extracted the expression pattern of these 11 genes and performed a hierarchical cluster analysis using the R package pheatmap. As shown in Fig. 7, all patients were classified into two groups according to the hierarchical cluster analysis. The blue bar below the dendrogram denotes the predicted radiosensitive patients, while the yellow bar denotes the predicted non-radiosensitive patients. Interestingly, most of the predicted radiosensitive and non-radiosensitive patients matched the result of the hierarchical clustering based on the selected gene signature. The dendrogram shows 162 of the 189 predicted radiosensitive patients on the right branch and 128 of the 182 predicted non-radiosensitive patients on the left branch. This result further validated the predictions of radiosensitive patients and suggested that the radiosensitive gene signature we developed is predictive and reasonable.
GC is a radioresponsive cancer. Radiotherapy plays an important role in the treatment of locally advanced GC in preoperative, intraoperative, postoperative or palliative settings (5,7,13–18). The clinical trial conducted by Zhang et al suggested that preoperative radiotherapy improves resection rates and survival (4). However, radiotherapy as a single-modality treatment cannot improve survival in patients with locally unresectable GC (5). Several trials have evaluated radiotherapy as part of combined-modality treatment for GC. For instance, in 2001, the pivotal intergroup trial SWOG 9008/INT-0116 established postoperative chemoradiation therapy (CRT) as a standard in the management of GC (6). Its results suggested that postoperative CRT should be considered for all patients at high risk of GC recurrence who have undergone curative resection with D0 or D1 lymph node dissection. However, D2 lymph node dissection, which is considered the standard lymph node dissection for advanced GC, was not widely performed at the time. Therefore, the recently completed phase III ARTIST trial evaluated the role of postoperative CRT in patients who received curative resection with D2 lymph node dissection. Regrettably, its results showed that postoperative CRT did not significantly reduce recurrence (7).
The effect of radiotherapy in the treatment of GC remains controversial. In theory, radiotherapy is a double-edged sword, which can kill tumors but can also damage the surrounding normal tissue. In fact, 41% of patients in INT-0116 had grade 3 or 4 toxic effects, and 17% stopped treatment because of toxic effects (6). To address this problem, most studies have focused on how to deliver radiotherapy more accurately so as to reduce damage to normal tissue (8,19–21). In this study, we asked whether we could find a way to identify the patients who would be more sensitive to radiotherapy. These patients could then receive the maximum therapeutic effect with minimal toxicity. As advocated by the principles of precision medicine, treatments should be targeted to the needs of individual patients on the basis of their genetic characteristics (22). With continued progress in identifying biomarkers of radiotherapy response, the role of radiotherapy in GC treatment will likely become better defined. Therefore, we developed a radiosensitive gene signature for GC patients.
External validation is the best way to validate gene signatures, especially those developed from high-dimensional gene expression data (23–25). In this study, a nested inner loop 10-fold cross validation was used to find the radiosensitive gene signature and the radiosensitive patients. The 10-fold cross validation maximized the portion of patients in the study contributing to the development of the gene signature and minimized the prediction error (12). In addition, it maximized the size of the sensitive patient subset used to validate the signature (12). Moreover, the proposed cross-validation procedure could evaluate the major clinical outcomes, including side effects of radiotherapy on radiosensitive patients. Besides 10-fold cross validation, a split-sample method and LOOCV are frequently used for internal validation. For computationally burdensome analyses, 10-fold cross validation may be preferable to LOOCV (12). For the current nested inner loop cross-validation procedure, LOOCV would lengthen the time needed to complete the analysis. Usually, the results provided by 10-fold cross validation are very similar to those of LOOCV (12).
In this study, we developed a new index, the nominal HR, within a three-step procedure to identify GC patients who would be more sensitive to radiotherapy. The new index separates the patients clearly, which makes it easy to identify radiosensitive patients. In the original design (9,10), a patient is predicted to be sensitive if the estimated odds ratio (for a binary outcome) falls below a specified threshold (R) for at least g of the significant genes. Our model instead uses a product, which directly estimates the degree of sensitivity of each patient and improves the ability to predict. The proposed model evaluated the whole-genome expression data, and 11 genes were found to form the radiosensitive gene signature for GC patients.
Our results showed that the predicted radiosensitive patients who received radiotherapy had significantly better survival than both the radiosensitive patients without radiotherapy and the non-radiosensitive patients who received radiotherapy (Fig. 2A and C). However, radiotherapy did not improve the survival of predicted non-radiosensitive patients (Fig. 2B). After adjusting for other clinical factors, a multivariate analysis suggested that radiotherapy was an independent factor of benefit for the predicted radiosensitive patients (Fig. 3). Reduced rates of the new tumor event and of progressive disease were observed for predicted radiosensitive patients who received radiotherapy, which provided further positive evidence for our prediction (Fig. 4). Although the clinical stage was strongly associated with the predicted radiosensitivity, the survival of the predicted radiosensitive patients who received radiotherapy was significantly better than that of radiosensitive patients without radiotherapy, independent of stage (Figs. 5 and 6). The overlap between the results of the cluster analysis and the predicted radiosensitive and non-radiosensitive patients also validated the radiosensitive gene signature (Fig. 7). Taken together, these validation results suggest that the identified radiosensitive gene signature is a powerful biomarker for predicting which GC patients would benefit from radiotherapy.
Our analysis not only developed a radiosensitive gene signature, but also detected genes that may be associated with the molecular basis of GC. Based on the results, we found that several genes, including C9orf16, DNAL4, SPSB2 and ACD, are highly expressed in radiosensitive patients. Among these, C9orf16 is considered to be related to ovarian cancer (26), but it has not been studied in gastric cancer. DNAL4 encodes an axonemal dynein light chain that functions as a component of the outer dynein arm complex, acting as the molecular motor that provides the force to move cilia in an ATP-dependent manner. Furthermore, the protein encoded by ACD plays a key role in the assembly and stabilization of the telosome/shelterin telomeric complex, which functions to maintain telomere length and to protect telomere ends. SPSB2 encodes a member of a subfamily of proteins containing a central SPRY (repeats in splA and RyR) domain and a C-terminal suppressor of cytokine signaling (SOCS) box. This protein plays a role in cell signaling. Despite the lack of studies on these genes in gastric cancer, our results suggest that they may play a key role in the progression of gastric cancer. The detailed mechanism by which these genes function in gastric cancer will be the main aim of our next study.
We acknowledge the contributions of the TCGA Research Network. This study was supported by the National Natural Science Foundation of China (no. 81573253 and 81773541) to Z.-X.T., a project funded by Jiangsu Provincial Medical Youth Talent to J.Z., a project funded by Suzhou Science and Technology Bureau (no. SYS201672) to H.-G.H., and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions at Soochow University.