STS form a heterogeneous group of malignancies comprising more than fifty distinct diagnostic categories [28]. Histologic criteria are useful in predicting outcome in STS. For example, in a comparative study of the NCI and FNCLCC grading systems in 410 adult patients with non-metastatic STS [17], both systems were of prognostic value for predicting metastases and overall survival by univariate analysis. By multivariate analysis, high tumor grade (regardless of the system used), large tumor size (10 cm or more), and deep location each had independent prognostic value. Importantly, however, STS often exhibit heterogeneity of biological behavior even within diagnostic categories. This heterogeneity makes the clinical care of patients with these diseases challenging, and may also confound the development of drugs to treat these diseases. Thus, there is a need to move past simple histologic examination in STS trials.
The classification of STS has traditionally been determined by light microscopic examination of H&E-stained tissues, in which recognizable characteristics are identified in the tumors, and more recently also by the use of genetic techniques (reviewed in [28]). Classification of tumors by gene expression profiles has the potential to provide additional useful information that is free of observer bias and variability, and to aid in tumor classification and diagnosis. Analysis of gene expression by microarray also has the potential to identify heterogeneity among STS [4]. While one approach is to search for genes that correlate with clinical outcome, heterogeneity of the sample sets may complicate this search, as predictive genes may differ in different types of tumors. The search for genes that predict a particular behavior may also be complicated by genes whose expression is not relevant to the question of interest. Where possible, restricting the number of genes in an analysis by eliminating irrelevant signals may help. In addition, relevant signals may or may not differ among histological categories.
In the current report, potential subsets of STS were identified by gene expression profiles independent of knowledge of biological behavior using gene sets that differed between two subgroups of ccRCC, OVCA, and AF. The current report confirms that gene expression patterns can be used to identify subsets of STS directly, without searching for differences based on clinical correlates. This approach may also allow the identification of potential subsets that could be obscured by searching for patterns that discriminate between two predefined groups determined by a particular clinical outcome. Given the simplicity of the technology, there may be no need to identify the smallest gene set that can be used as a reproducible prognostic factor. Indeed, prognostic information may be lost by such an approach. The differences in gene expression observed between subgroups may reflect intrinsic differences in the tumor cells, differences in the host response to the tumor, or both. It is possible that these differences in gene expression patterns reflect differences in biology of these tumors, although we do not have adequate clinical outcome data to confirm this possibility. It is important to note that hierarchical clustering of any random set of samples using a random gene set may yield what appears to be two major clusters; further studies will be required to determine the reproducibility of the current findings and their potential practical utility.
Our findings, though limited by small sample size, suggest the existence of distinct subgroups within the MFH set. As tumors evolve, sequential mutations and/or epigenetic changes may result in increasing divergence in gene expression and biology from the original tumor. We previously reported that hierarchical clustering of gene expression patterns appeared to place some MFH samples near samples of liposarcoma, while others appeared to form more distinct clusters [9]. It is possible that analysis of a larger number of samples will identify additional clinically and biologically relevant subsets of MFH.
The current report also demonstrates a wide range among STS subgroups in the expression of genes that code for targets of some therapies. For example, PRAME, CTAG1, and CTAG2 have been reported to be over-expressed in liposarcoma compared with a variety of normal tissues [7], and recent studies have reported the expression of these and other CTAGs in several sarcoma subtypes [34]. The current results, obtained with the newer U_133 array, confirmed our earlier report, based on the U_95 array, of over-expression of PRAME, CTAG1, and CTAG2 in liposarcomas [7], and also demonstrated that the expression of these potential targets of immunotherapy is heterogeneous among STS. Thus, studies in STS of agents with known targets would be strengthened by stratification by expression of target genes.
Notably, the three earlier studies that suggested the existence of two major subsets of AF, RCC, and OVCA, from which the gene sets used herein were derived, found differences in the expression of many extracellular matrix (ECM) genes between the respective subsets in each of the three diseases. This could be important for several reasons. First, the interaction of tumor cells with ECM proteins can have profound effects on cell biology, regulating signal transduction, apoptosis/anoikis, morphology, and tissue architecture. In this regard, expression of the ECM protein TGFBI has recently been reported to influence paclitaxel sensitivity of ovarian carcinoma cells [36]. ECM proteins can regulate interactions between growth factors and ovarian hormones in mammary epithelial cells, and laminin inhibits estrogen-induced proliferation of breast cancer cells [37]. Interactions with ECM components have been reported to alter TNF-alpha-induced changes in endothelial permeability [38], and the ECM proteoglycan decorin can inhibit growth of pancreatic carcinoma cells [39]. Similarly, fibrin clots can promote motility of fibroblasts and endothelial cells, and fibrinogen degradation products may be angiogenic [40]. Second, the response of the host to the tumor may also play an important role in tumor biology, and ECM expression, by either tumor cells or host stromal cells, could reflect the local host response. For example, stromal fibroblasts could produce factors that alter tumor cell growth/biology [42
It would be reasonable to stratify patients entering clinical trials using a technique similar to that described herein, in an attempt to reduce heterogeneity of the study population. Even if the analysis is not performed in real time, tumor samples should be saved for a later analysis of this type. Since the results of a clustering analysis depend upon the composition of the sample set analyzed, such an approach would use a standard reference group of STS samples, with clustering performed after the new test sample is added to that group.
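The reference-group idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this report: the distance measure (one minus Pearson correlation), the average-linkage rule, the cut at two clusters, the function names, and the synthetic expression values are all hypothetical choices made for the example.

```python
import math
import random


def correlation_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # assumption: treat a flat profile as uncorrelated
    return 1.0 - cov / math.sqrt(var_a * var_b)


def two_major_clusters(profiles):
    """Average-linkage agglomerative clustering, stopped at two clusters.

    Returns a list of clusters, each a list of sample indices.
    """
    n = len(profiles)
    dist = [[correlation_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 2:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Average pairwise distance between the two candidate clusters.
                total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def assign_test_sample(reference, test_profile):
    """Cluster the fixed reference set plus one test sample.

    Returns the sorted reference indices that share the test sample's cluster.
    """
    profiles = list(reference) + [test_profile]
    test_index = len(reference)
    for cluster in two_major_clusters(profiles):
        if test_index in cluster:
            return sorted(i for i in cluster if i != test_index)
    return []


# Synthetic data (hypothetical): two reference subgroups with opposite patterns.
random.seed(0)
pattern_a = [1.0] * 10 + [-1.0] * 10
pattern_b = [-1.0] * 10 + [1.0] * 10


def noisy(pattern):
    """Add Gaussian noise to an idealized expression pattern."""
    return [v + random.gauss(0, 0.2) for v in pattern]


reference = [noisy(pattern_a) for _ in range(5)] + [noisy(pattern_b) for _ in range(5)]
test_sample = noisy(pattern_a)
co_clustered = assign_test_sample(reference, test_sample)
print(co_clustered)
```

Because the reference set is held fixed and only one test sample is added per run, the cluster a new tumor joins is not influenced by whichever other trial samples happen to be analyzed alongside it. In practice the reference set would itself need to be validated, since any clustering cut at two groups will return two groups regardless of whether they are biologically meaningful.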
When comparing histologic grading of soft tissue sarcomas with their gene expression, it is important to take several facts into consideration. Histologic grading is an imperfect exercise; when the reproducibility of the European system was tested by 15 pathologists [43], agreement was reached in 81% of the cases for tumor necrosis, 74% for tumor differentiation, 73% for mitotic rate, and 75% for overall tumor grade, although the agreement for histologic type was only 61%. In a study in which the NCI and FNCLCC grading systems were compared, there were discrepancies in ~35% of the cases [17]. Compared to the NCI system, the FNCLCC system produced a greater number of grade 3 tumors, a lower number of grade 2 tumors, and had better correlation with metastasis-free and overall survival. Despite the limitations in reproducibility, numerous studies have confirmed the prognostic value of histologic grading of soft tissue sarcomas [44
Any new technique that attempts to establish new pathways for prognostication should be compared to currently available techniques, and its superiority should be documented. Gene expression profiles, in addition to providing potentially useful prognostic information, may yield insight into two important aspects of sarcoma: first, the identification of therapeutic targets, leading to more individualized therapies, and second, a better understanding of the genesis, progression, and biology of these tumors.
The current study supports the use of gene expression patterns as a complementary set of data that may augment the use of light microscopy to help classify STS. Analysis of a larger number of samples and correlation of biological phenotypes with gene expression patterns may identify clinically meaningful characteristics of the subsets identified herein.