We identified predictive gene signatures associated with both OS and 5-year OS that were independent of IGCCCG risk classification as prognostic/predictive factors. Good outcome was associated with two broad gene sets: immune function (in particular, immunoglobulins) and repression of differentiation. Conversely, poor outcome was associated with genes involved in active differentiation. The gene signatures stratified intermediate- and poor-risk patients into highly curable and highly resistant groups. Anecdotally, the gene signature also may have the ability to stratify good-risk patients, because the only good-risk patient who died as a result of disease was predicted to have a poor outcome; however, this pattern clearly will require additional studies to establish.
Although B cells seem to be the likely source of immune gene expression, preliminary studies to determine whether B cells or tumor cells are the source of immunoglobulin gene expression were inconclusive. No discernible differences in the numbers of infiltrating B cells in high- and low-expressing tumors were observed by immunohistochemistry with pan–B-cell markers. However, the immunoglobulins were not significantly expressed in GCT cell lines, and polymerase chain reaction analysis showed that the immunoglobulin gene rearrangements were not clonal in nature (data not shown). There also have been reports that immunoglobulin gene rearrangement and expression can occur early in the ontogeny of germ cells,19 which suggests that the tumors also could express these transcripts. Notably, in a previous study of 19 tumors, expression of immunoglobulin genes also was observed in both EC and YST.20
Interestingly, SEM often shows lymphocytic infiltration, and SEM that we profiled also expressed many of these genes. When our gene predictor was applied to 14 pure SEM specimens (unpublished data), all were correctly predicted to have good outcome, which suggests that the model may also be applicable to SEM. However, we currently lack SEM specimens with poor outcome necessary to rigorously test the model.
Genes predictive of poor outcome were enriched for associations with active differentiation processes, such as kidney, skeletal, and, most prominently, neural differentiation. Although these genes may be only markers of outcome, we believe there may be functional significance to the differentiation signatures. ZIC1, which interacts with GLI proteins in the Hedgehog pathway,21,22 was strongly associated with poor outcome and has been implicated in medulloblastomas and endometrial cancer.23,24 We previously found that induction of differentiation of the pluripotent EC cell line NT2/D1 into neural lineages25 results in resistance (unpublished data) and elevation of ZIC1 expression. Activation of smoothened signaling within the Hedgehog pathway also was evident in these tumors; smoothened plays an essential role during normal CNS development26 and can lead to development of cancer, including brain and skin tumors,21 when aberrantly activated. Interestingly, the neural genotype was observed in all histologic subtypes, even in the absence of a neural phenotype. These results suggest that differentiation, particularly into neural lineages, is associated with poor outcome and may result in cisplatin resistance.
Synaptopodin 2 (SYNPO2), a putative tumor suppressor gene also known as myopodin, had the highest predictive value in the 5-year OS analysis. We recently showed frequent downregulation of SYNPO2 as a result of 4q loss,27 which occurs frequently in GCTs and which is consistent with a possible role as a tumor suppressor. SYNPO2 expression has been reported to suppress tumor growth, its cellular localization is stage dependent, and loss of expression was associated with a poor outcome in both prostate28 and bladder cancers.29
These data imply a relationship between the process of differentiation and chemotherapy resistance in GCT and are consistent with clinical observations. SEM is more sensitive to chemotherapy than NSGCT, and this observation is embedded in the IGCCCG guidelines.1 All teratomas (TERs) are resistant to chemotherapy, and neural and skeletal differentiation are among the most common forms of malignant transformation. AFP and HCG are biochemical markers of yolk sac and trophoblastic differentiation, respectively, and higher values of each are associated with a worse outcome.1 These data implicate somatic, yolk sac, and trophoblastic differentiation pathways in development of drug resistance.
Notably absent from the outcome-associated gene lists were genes involved in DNA repair. Previous studies have postulated that reduced levels of expression of DNA repair genes are responsible for the chemotherapy sensitivity of GCTs,30 but we did not observe enrichment of DNA repair genes in the outcome signatures.
One potential criticism of this analysis is the heterogeneity of the tumor set. This heterogeneity, however, represents the clinical setting that confronts clinicians. GCTs often present with mixed histologies and at different primary sites (ie, gonadal or mediastinal), and only one gene was significantly differentially expressed according to site. Moreover, residual tumors resected after chemotherapy represent those resistant to therapy. Although there were 37 genes that were changed with chemotherapy, none were in the top 25 predictive genes, and many were associated with pluripotency and were more highly expressed in untreated tumors. In a previous study, loss of expression of OCT3/4, a core transcription factor required for pluripotency, was associated with cisplatin resistance.31 These observations are consistent with the hypotheses that undifferentiated GCTs are more sensitive to cisplatin and that residual tumor after cisplatin treatment is depleted of sensitive, pluripotent elements and is enriched for resistant, differentiated elements. We believe that the inherent heterogeneity of GCT enhanced our ability to detect molecular signatures associated with outcome in tumors and accurately reflects those seen in a clinical setting.
There are several likely explanations for the lower classification rate of the training set compared with the validation set. In several instances, we profiled multiple tumors from patients with poor outcome, and one was randomly chosen for the outcome analysis. In three such instances, the patient's outcome was misclassified, but we subsequently observed that other tumors from these patients carried poor outcome signatures; replacement of the original tumor with the second tumor resulted in correct classification of all three patients (data not shown). This suggests that, when multiple tumors are available from a patient, all should be profiled, because one metastasis may carry a poor outcome signature although the others may not. Other unusual occurrences included two good-outcome patients who died after the 5-year cutoff; one was predicted to have poor outcome. Two other good-outcome patients who were predicted by PAM to be poor-outcome patients experienced late relapses after 10 years. Similarly, two good-outcome patients with PAM-predicted poor outcome developed second primary tumors. Because bilateral GCTs are thought to represent separate tumors,1 it is possible that the gene expression signature may have identified a field defect in these patients. Surgical cure is a final confounding problem, because resection of residual disease improves outcome after chemotherapy but cannot be predicted by a gene signature.11,32,33 This may explain why some patients with poor-outcome signatures had good outcomes. In fact, such gene signatures could be an indication for surgical intervention in some scenarios.
Some patients with pure TER specimens had poor outcome signatures; several of these patients died as a result of disease, particularly if secondary malignant transformation was present. Pure TER usually is considered a benign disease, despite its resistance to chemotherapy. Hence, TER should not automatically be viewed as benign disease. We previously have reported wide variations in p53 and Ki67 expression in TER.34 Rather, pure TER may harbor residual, malignant GCT not observed pathologically or may be programmed for malignant transformation. These data set the stage for gene signature studies that may permit more precise decision making regarding surgery after chemotherapy.
In conclusion, we identified gene signatures associated with outcome in patients with GCTs. Our results indicate that signatures representing immune function and repression of differentiation are associated with good outcome, whereas signatures representing active differentiation, particularly into neural lineages, and loss of the pluripotent genotype are associated with poor outcome. Adaptation of subsets of these genes for use in clinical assays could result in improved outcome prediction and risk stratification. We have initiated studies to extend our observations in the following ways: validation of these signatures in independent sets from other institutions; development of a diagnostic subset for use on paraffin tissue; and evaluation of serum MDK expression to determine whether this marker is independent of AFP and HCG levels and is representative of the differentiated state.
Expression values were generated by using the robust multiarray average method.5
Because the training and validation sets were generated at different times and on different batches of microarrays, each probe set in the validation set was adjusted such that the median expression value was the same as the corresponding probe set in the training set. Follow-up time was calculated as the difference between the start date of chemotherapy and the date of last follow-up. The median and range of follow-up times in were calculated only on those patients who were still alive.
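The batch adjustment described above can be illustrated with a minimal sketch. It assumes that expression values are on a log scale and that the adjustment is an additive shift per probe set; the text does not state either point. The function name align_medians and the toy probe-set IDs and values are hypothetical.

```python
from statistics import median

def align_medians(train, valid):
    """Shift each validation probe set so its median equals the training median.

    train, valid: dict mapping probe-set ID -> list of log-scale expression
    values (one value per sample). Returns a new dict for the validation set.
    Assumption: the adjustment is additive on the log scale.
    """
    adjusted = {}
    for probe, values in valid.items():
        shift = median(train[probe]) - median(values)
        adjusted[probe] = [v + shift for v in values]
    return adjusted

# Hypothetical toy data: two probe sets measured in two batches.
train = {"probe_a": [7.0, 8.0, 9.0], "probe_b": [4.0, 5.0, 6.5]}
valid = {"probe_a": [8.5, 9.5, 10.5, 11.5], "probe_b": [3.0, 4.0, 4.5, 6.0]}
adjusted = align_medians(train, valid)
```

After this shift, each validation probe set has the same median as its training counterpart, while the spread of values within the validation set is unchanged.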
For the binary analysis of 5-year OS, a modified version of the PAM algorithm was used.7 The modification was that the shrinkage threshold was a parameter in the cross-validation rather than chosen on all the data. The classification rate was estimated on the basis of 10-fold cross-validation that was repeated 25 times. Because the threshold with the lowest overall misclassification rate gave a trivial result on the test set (ie, all samples were classified as having good outcome), we examined thresholds with slightly higher error rates that contained between 50 and 200 transcripts in the classifier. The models then were applied to the validation set for testing of performance. Survival curves that compared the predicted good- and poor-risk groups were generated by using the Kaplan-Meier method, and the difference in survival between these two groups in the validation set was tested by using the log-rank test. The predictive gene set was included in a multivariate logistic regression model with IGCCCG risk stratification, which was treated as a continuous variable.
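For readers unfamiliar with the Kaplan-Meier method used for the survival curves, the product-limit estimate can be sketched as follows. This is an illustrative implementation, not the software used in the study; the function name kaplan_meier and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve

# Hypothetical follow-up times (months) and event indicators.
times = [6, 12, 12, 20, 30, 45]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Patients censored at a given time are counted as at risk at that time, which is the usual convention. In the setting described above, one such curve would be computed for each predicted risk group.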
For analysis of the OS end point, a predictive score was developed by using the weighted sum of the most significant genes. Specifically, the genes were ordered on the basis of the likelihood ratio test for the univariate Cox model, and the weights were the coefficients in that model.8 Predictive accuracy was measured by using the concordance index. The concordance index could range between 0.5 (ie, random prediction) and 1 (ie, perfect prediction). The best model on the training set was the one with the highest concordance index, and 10-fold cross-validation was used to protect against overfitting. For generation of Kaplan-Meier curves, the test samples were split into two groups on the basis of the median score. Differences in survival between the groups were evaluated by using the log-rank test. For testing in a combined model with IGCCCG risk stratification, the multivariate Cox model was employed. For all analyses, a cutoff for significance of .05 was used.
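The weighted-sum score, the concordance index, and the median split can be sketched together. This is a minimal illustration under stated assumptions: the concordance index is computed in the pairwise form commonly attributed to Harrell, the weights are placeholders standing in for univariate Cox coefficients, and the gene names, weights, and patient data are hypothetical.

```python
from statistics import median

def risk_score(expression, weights):
    """Weighted sum of expression values; weights stand in for Cox coefficients."""
    return sum(weights[g] * expression[g] for g in weights)

def concordance_index(times, events, scores):
    """Fraction of usable pairs in which the higher score belongs to the
    patient who died earlier; ties in score count as one half."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i died before patient j's
            # last follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable

# Hypothetical weights and patients.
weights = {"gene_x": 0.8, "gene_y": -0.5}
patients = [
    {"gene_x": 2.0, "gene_y": 1.0},
    {"gene_x": 1.0, "gene_y": 2.0},
    {"gene_x": 1.5, "gene_y": 1.0},
    {"gene_x": 0.5, "gene_y": 2.0},
]
times = [5, 40, 12, 30]
events = [1, 0, 1, 1]

scores = [risk_score(p, weights) for p in patients]
c_index = concordance_index(times, events, scores)

# Median split into predicted poor-risk (True) and good-risk (False) groups.
cutoff = median(scores)
high_risk = [s > cutoff for s in scores]
```

In this toy example, five of the six usable pairs are concordant, so the index is 5/6. The one discordant pair arises because the fourth patient died at 30 months despite having the lowest score.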
To identify genes associated with AFP, LDH, and HCG levels, patients were divided into high- and low-expressing serum groups, and then gene expression levels for those groups were compared by using the MaxT function9 within the multtest package in Bioconductor.6 This method was used to adjust for multiple comparisons. MaxT was run with 1,000 permutations to give adjusted P values. Patients were considered to have high serum marker levels by using the following conservative criteria: AFP greater than 100 ng/mL; HCG greater than 100 mIU/mL or HCG, nicked greater than 100 ng/mL; and LDH greater than 400 U/L. Genes were considered significant if they had adjusted P values less than .05.
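The permutation-based adjustment can be illustrated with a simplified sketch. It is not the Bioconductor implementation: it uses a single-step maxT adjustment rather than a step-down procedure, and an absolute difference in group means rather than a t statistic. The function names, the seed, and the toy serum and expression values are hypothetical.

```python
import random
from statistics import mean

def abs_mean_diff(values, is_high):
    """Absolute difference in mean expression between high and low serum groups."""
    high = [v for v, h in zip(values, is_high) if h]
    low = [v for v, h in zip(values, is_high) if not h]
    return abs(mean(high) - mean(low))

def single_step_maxt(expression, is_high, n_perm=1000, seed=0):
    """Single-step maxT adjusted P values (simplified illustration).

    expression: dict mapping gene -> list of values, one per patient.
    is_high: list of booleans marking patients in the high serum group.
    """
    rng = random.Random(seed)
    observed = {g: abs_mean_diff(v, is_high) for g, v in expression.items()}
    exceed = {g: 0 for g in expression}
    labels = list(is_high)
    for _ in range(n_perm):
        # Shuffling keeps the group sizes fixed while breaking any association.
        rng.shuffle(labels)
        max_stat = max(abs_mean_diff(v, labels) for v in expression.values())
        for g in expression:
            if max_stat >= observed[g]:
                exceed[g] += 1
    # The add-one correction counts the observed labelling as one permutation.
    return {g: (exceed[g] + 1) / (n_perm + 1) for g in expression}

# Hypothetical serum AFP values (ng/mL), dichotomized at the 100 ng/mL cutoff.
afp = [250.0, 12.0, 980.0, 3.0, 40.0, 1500.0, 8.0, 310.0]
is_high = [a > 100 for a in afp]
expression = {
    "gene_a": [9.1, 5.0, 8.7, 5.2, 4.9, 9.4, 5.1, 8.9],
    "gene_b": [6.0, 6.2, 5.9, 6.1, 6.3, 6.0, 5.8, 6.1],
}
adjusted_p = single_step_maxt(expression, is_high)
```

Because each gene is compared against the permutation distribution of the maximum statistic across all genes, a gene with a larger observed statistic can never receive a larger adjusted P value than a gene with a smaller one.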