This study uses the available genome-scale gene expression analyses of glioblastomas (GBMs) to further investigate the observed effect of age on patient survival, and it creates a unified dataset for further exploration of gene expression correlates in glioblastomas. We add to the literature 86 GBM biopsy gene expression profiles performed on the U133A and U133 Plus 2.0 platforms. We confirm that within histologically defined GBMs there are robust and repeatedly observed groups defined by gene expression signatures, which together provide a classification scheme within GBMs. Within this aggregate dataset of 267 GBMs, the three previously defined molecular glioma subgroups were robustly detected, each defined by over-expression of a series of related genes: Neurogenesis/HC1A (a.k.a. ProNeural, PN), Mitotic/HC2A (a.k.a. Proliferative, Pro), and Extra-Cellular Matrix related/HC2B (a.k.a. Mesenchymal, Mes). From the larger dataset, we observe evidence of a new subtype that highly co-expresses genes of both the Pro and Mes groups, which we term "ProMes". Of these four molecularly distinct groups, only PN portends a favourable prognosis relative to the other expression-based groups.
In this study, we find that younger patients diagnosed with GBMs survive longer than older GBM patients because younger patients more commonly develop the favourable PN GBM type. The reason for this observed age effect is not clear at this time, but it may relate to the precursor cell that develops into GBMs changing over time. One could hypothesize that the precursor cell that gives rise to the PN type diminishes in abundance in the CNS with advancing age. However, given that GBM incidence increases greatly with age, the absolute number of PN GBMs is actually higher in older patient groups. Thus, we favour a model in which the precursor cell types that give rise to the Pro and Mes types of GBMs become increasingly likely to turn neoplastic over time, while this effect is less pronounced in the PN type precursors.
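The distinction between the PN fraction and the absolute PN count can be made concrete with a small arithmetic sketch. The age bands, case counts, and PN fractions below are hypothetical values chosen only for illustration; none of them are taken from this dataset.

```python
# Hypothetical illustration only: none of these numbers come from the study.
# Each entry is (age band, GBM cases in some population, fraction that are PN).
AGE_GROUPS = [
    ("20-39", 100, 0.50),
    ("40-59", 400, 0.30),
    ("60+", 900, 0.20),
]


def pn_counts(groups):
    """Return (age band, PN fraction, absolute PN count) for each age band."""
    return [(label, frac, round(total * frac)) for label, total, frac in groups]


for label, frac, count in pn_counts(AGE_GROUPS):
    print(f"{label}: PN fraction {frac:.0%}, PN cases {count}")
```

Under these assumed inputs the PN fraction falls with age while the PN case count rises, which is the pattern that argues against a simple depletion of PN precursor cells.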
PN type GBMs represent a unique tumour aetiology whose idiopathic molecular mechanisms manifest in survival periods spanning from two to ten years, in contrast to ten to fifteen months for the Pro, Mes and ProMes types of GBMs. Historically, the percentage of GBM patients who demonstrate long-term survival of 3 years or more has been reported to be approximately 5% [18]. This 5% incidence rate matches well with the observations reported here, in which 8% (18/267) of our PN GBM patients are observed to survive 3 years or longer. The identification of the PN subgroup is important for patient management and for stratification in small phase II clinical trials of experimental therapeutics, as uneven representation of PN GBM diagnoses would greatly alter observed survival times irrespective of potentially active agents [19]. Identification of the gene expression subtype is clearly more important than age for patient stratification within clinical trials.
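The effect of uneven PN representation on a small trial can be sketched as follows. This is a minimal illustration under stated assumptions: the survival times are invented values in months, chosen only to fall within the ranges quoted above, and `build_arm` is a hypothetical helper rather than part of any analysis performed in this study. Neither arm receives an active agent.

```python
from statistics import median

# Hypothetical survival times in months; these are not data from this study.
NON_PN = [10, 11, 12, 13, 14, 15, 12, 13]
PN = [24, 30, 36, 48, 60]


def build_arm(n_pn, n_total):
    """Build an arm of n_total patients that contains n_pn PN cases."""
    return PN[:n_pn] + NON_PN[: n_total - n_pn]


arm_a = build_arm(n_pn=2, n_total=10)  # 20% PN
arm_b = build_arm(n_pn=5, n_total=10)  # 50% PN

print(f"Arm A median survival: {median(arm_a)} months")
print(f"Arm B median survival: {median(arm_b)} months")
```

With these assumed inputs the median survival is 13 months in arm A and 19 months in arm B, a six-month difference that arises from subtype composition alone.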
It is likely that the use of genome-wide expression-based molecular classification will result in less variation in tumour diagnoses and provide more specific guidance to clinicians. The agglomeration of gene expression datasets permits meta-analyses that were insufficiently powered in multiple individual publications. Resources for the sharing of genome-scale expression datasets have been set up at ArrayExpress, Gene Expression Omnibus, and Celsius [20]. Critical to the sharing of microarray data is providing raw microarray data as opposed to processed data. To facilitate this sharing, the NIH Neuroscience Microarray Consortium has established Celsius, a community resource of CEL (image) files generated on the Affymetrix platform and made available for public distribution through programmatic tools. At the writing of this manuscript, the Celsius database contains CEL files from human experiments performed on U95Av2 arrays (n = 5 006), U133A arrays (n = 13 818), and U133 Plus 2.0 arrays (n = 10 376). To fully capture and leverage the value of microarray expression data, a greater commitment must be made to capture and share clinical covariates and raw expression data. For instance, in this study of 267 glioblastomas, a total of 433 GBM CEL files were initially identified across the U95Av2, U133A, and U133 Plus 2.0 platforms. Of these, thirty-six percent (160/433) were immediately removed from this study due to a lack of any clinical data. Additionally, not all microarray CEL files or their matched clinical data points are systematically retrievable.
For the samples gathered at UCLA, we also collected additional covariates such as extent of surgical resection, Karnofsky Performance Scores (KPS), lesion locations, and MRI scans. We have begun to make all of these data available through a web interface in order to promote data sharing and exploration of gene expression differences in gliomas. All of the data added here are deposited in Gene Expression Omnibus (GEO) and can be explored with our real-time, survival-synchronized search engine "Probeset Analyzer" [22].
Clinical Decision Impact and Improved Public Disclosure
In large academic hospitals, tumours come from a wide variety of patients from different cities, states, or countries. In contrast, local hospitals treat their regional constituencies, so demographically biased patient populations and biased tumour subsets are possible. These trends can reinforce particular treatment strategies at local institutions over time. For example, if patients from a community highly populated by retirees (e.g. southern Florida) presented with a GBM, clinicians would be apt to predict that these older patients would succumb to their malignancies within one year. Current treatment for patients diagnosed with high grade gliomas consists of surgical resection followed by toxic and expensive therapy schedules that are minimally effective. But if these elderly patients were suffering from a PN tumour, they would have a high likelihood of surviving at least two to three years. Expression-based subtyping could then distinguish these patients from patients who otherwise present identically under the microscope or according to their patient biographical sketch, which would permit time to enrol in potentially beneficial clinical trials. Thus, if grade and age alone were considered for prognosis, these factors would lead clinicians to prescribe unnecessary treatments due to trends reinforced by regional sampling biases.