Identification of more specific and sensitive markers for outcome prediction and response to therapy is required in order to further improve the choice of risk-related therapy for children with NB. Using a carefully selected set of 59 prognostic genes based on an innovative data-mining strategy, we performed a gene expression study on the largest NB patient series to date, covering 579 patients in total. Our robust prognostic multigene expression signature was tested on a large set of SIOPEN tumours from uniformly treated patients and validated on an independent set of COG tumours. The signature is a strong independent risk predictor, able to identify patients with increased risk in the current risk groups.
Our study is unique in that a carefully selected set of only 59 genes was tested on a large panel of 579 tumour samples, thus increasing statistical power and robustness through this high patient/gene ratio. Several previous studies have attempted to identify prognostic signatures in NB based on genome-wide mRNA expression profiles. However, an important limitation of most published gene expression studies is the lack of statistical power due to an extremely low patient/gene ratio. As such, there are inherent but often overlooked statistical issues, such as data over-fitting, unstable gene lists, and lack of study power [21]. Consequently, for any small set of tumours, a gene classifier can easily be established, but it has little or no utility unless it is validated on an independent patient cohort.
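The over-fitting risk that comes with a low patient/gene ratio can be illustrated with a small simulation. The sketch below is not based on the study data: it uses purely synthetic noise, and the cohort sizes, the median-threshold "classifier", and the function name `simulate_overfitting` are assumptions chosen only for illustration. It selects the single best-looking "gene" out of 2000 random ones in 20 patients and then applies it to an independent synthetic cohort.

```python
import random
import statistics


def simulate_overfitting(n_train=20, n_test=1000, n_genes=2000, seed=0):
    """Pick the best of many pure-noise genes on a small cohort,
    then evaluate that gene on an independent noise cohort."""
    rng = random.Random(seed)
    # Balanced outcome labels for the small training cohort.
    train_labels = [0] * (n_train // 2) + [1] * (n_train - n_train // 2)

    best = None  # (training accuracy, threshold, direction)
    for _ in range(n_genes):
        expr = [rng.gauss(0.0, 1.0) for _ in range(n_train)]
        thr = statistics.median(expr)
        acc = sum((x > thr) == bool(y) for x, y in zip(expr, train_labels)) / n_train
        direction = 1
        if acc < 0.5:  # allow "low expression = event" as well
            acc, direction = 1.0 - acc, -1
        if best is None or acc > best[0]:
            best = (acc, thr, direction)

    train_acc, thr, direction = best

    # Independent cohort: the selected gene is still noise, unrelated to outcome.
    correct = 0
    for _ in range(n_test):
        x = rng.gauss(0.0, 1.0)
        y = rng.randint(0, 1)
        pred = (x > thr) if direction == 1 else (x <= thr)
        correct += pred == bool(y)
    return train_acc, correct / n_test


train_acc, test_acc = simulate_overfitting()
print(f"training accuracy: {train_acc:.2f}, independent accuracy: {test_acc:.2f}")
```

Even though no gene carries any information, the selected one separates the small cohort well, while its accuracy on the independent cohort stays close to chance. This is the behaviour that independent validation is meant to expose.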
After having established and successfully tested our robust prognostic signature on the total patient cohort, we next assessed the value of the signature in relation to the currently used risk factors, using multivariate logistic regression analysis and survival analysis after stratification of the patients based on the currently used risk factors. The signature significantly discriminates patients in most of the clinical risk subgroups. Possible reasons for the absence of discrimination in some subgroups might be the relatively low number of patients in these subgroups or insufficiently long follow-up times. Most importantly, the multivariate analysis attributes independent significant value to the signature. Based on this signature, patients with a higher risk of death from disease can be identified (odds ratio of 19·32 (95% CI: 6·50–57·43)), indicating that our gene signature clearly outperforms the other risk factors. This demonstrates the potential of this gene expression signature for improved clinical management of NB patients.
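As a reading aid for the reported odds ratio, the sketch below converts the odds ratio and its 95% confidence interval back to the log-odds scale on which logistic regression coefficients are estimated. It assumes that the interval is a symmetric Wald interval on the log scale, which the text does not state; the function name `log_odds_summary` is illustrative, and the decimal points replace the journal's raised decimal dots.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover coefficient, standard error, z and two-sided p-value
    from an odds ratio and its CI, assuming a Wald interval."""
    beta = math.log(odds_ratio)  # logistic regression coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return beta, se, z, p_two_sided


beta, se, z, p = log_odds_summary(19.32, 6.50, 57.43)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")

# Consistency check: for a Wald interval the odds ratio is the
# geometric mean of the two confidence limits.
print(f"sqrt(6.50 * 57.43) = {math.sqrt(6.50 * 57.43):.2f}")
```

The geometric mean of the reported limits reproduces the odds ratio to two decimals, which is consistent with the Wald assumption. The width of the interval also shows that, although the estimate is large, it is not very precise.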
Of further interest is that survival analyses within the groups of patients treated according to the current European treatment protocols clearly demonstrate that the signature discriminates between patients with different disease outcomes. This is an important finding, especially within the current high-risk category of patients treated according to the HR-NBL1 protocol, as no information on genomic aberrations or other factors is currently available to identify a group of patients with worse outcome in this subgroup of NB patients. Strikingly, all but one of the patients who achieved a second complete response in this subgroup were classified as having low molecular risk and had late relapses (median: 31·6, mean: 32·2 months), whereas all but two of the patients who did not achieve a second complete response were classified as having high molecular risk and had early relapses (median: 10·0, mean: 13·6 months). Further, death from disease also occurred earlier in the group of patients with high molecular risk (mean: 12·3, median: 9·6 months) than in patients with low molecular risk (mean: 27·1, median: 27·1 months). Consequently, patients with high molecular risk within this subgroup could be future candidates for new and hopefully more effective targeted therapies, and some of those with low molecular risk might possibly be spared transplantation. On the other hand, patients who have a high molecular risk and who are currently treated with surgery alone or mild chemotherapy might benefit from more appropriate therapies, i.e. according to the current HR-NBL1 protocol.
An essential step in the validation procedure of our signature is its performance assessment on an independent set of COG tumours whereby all analyses were performed blinded to clinical and outcome data. Similar performance of the expression signature was observed, indicating that the signature can yield reproducible results in independent patient cohorts. Moreover, irrespective of possible confounding factors related to patient ethnicity, treatment with other drugs, and RNA extraction with different standard operating procedures, the success of this validation study also confirms the robustness of the signature. In comparison to existing NB classifiers, our signature has comparable performance within the total cohort of patients, but here, for the first time to our knowledge, the added value of the signature in comparison to currently used risk stratification systems has been confirmed on a totally independent set of tumours in a blind study.
In order to reduce the gene set to a smaller robust gene subset, we used several methods: Spearman's rank correlation clustering followed by selection of one or two genes in each gene cluster, top-ranking univariate Cox and logistic regression analyses, and the rank product method. Although similar classification performance could be obtained, the 59-gene list always slightly outperformed the reduced lists.
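The first of these reduction strategies, correlation clustering followed by selection of a representative per cluster, can be sketched as follows. This is a simplified illustration and not the procedure used in the study: it replaces a full clustering algorithm with a greedy grouping, and the correlation cut-off `min_rho`, the per-gene `scores`, the placeholder gene names, and the toy expression values are all assumptions.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pick_representatives(expr, scores, min_rho=0.8):
    """Greedy grouping: genes are visited by decreasing score; a gene joins the
    first cluster whose representative it correlates with, else founds one."""
    clusters = {}  # representative gene -> list of member genes
    for gene in sorted(expr, key=lambda g: scores[g], reverse=True):
        for rep in clusters:
            if spearman(expr[gene], expr[rep]) >= min_rho:
                clusters[rep].append(gene)
                break
        else:
            clusters[gene] = [gene]
    return clusters


# Toy data: placeholder genes measured in six hypothetical samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "GENE_B": [1.1, 1.9, 3.5, 3.9, 5.2, 7.0],  # same ordering as GENE_A
    "GENE_C": [3.0, 1.0, 6.0, 2.0, 5.0, 4.0],
}
scores = {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": 0.8}  # hypothetical univariate scores
clusters = pick_representatives(expr, scores)
print(clusters)
```

In this toy example GENE_B is absorbed into the cluster represented by GENE_A, so the reduced list would keep GENE_A and GENE_C. Keeping one gene per correlated group removes redundancy, which may explain why the reduced lists perform similarly to, but not better than, the full list.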
Inspection of the genes being part of the signature reveals seven genes that have previously been linked to NB biology (MYCN, NTRK1, ODC1 [22]) or have been proposed as positional candidate genes including CAMTA1 [23] on 1p, BIRC5 [25,26] on 17q, and CADM1 on 11q. Gene Ontology analysis of the signature and comparison of the gene list with the super PCNA gene (proliferation signature) (unpublished data; Detours et al.) revealed that only very few genes are involved in cell cycle regulation and proliferation. This remarkable finding is in contrast to signatures in many other cancer entities (e.g. breast cancer [29]) in which typically more than two thirds of the genes are implicated in proliferation. In line with this, there appear to be very few genes involved in inflammation, also typically seen in other cancers [30].
Additional Gene Ontology analysis of the prognostic gene list showed that genes implicated in neuronal differentiation, such as PTN, NRCAM, DPYSL3, SCG2, DDC, FYN, NTRK1, MAPT, PMP22, CHD5, and MTSS1, are enriched amongst the genes more highly expressed in low-risk tumours. Further scrutiny of the prognostic gene list and functional analyses might reveal genes that play a role in neuroblastoma pathogenesis and could therefore serve as potential therapeutic targets, or might point to cancer pathways that could be targeted by new therapies.
Important features of the applied RT-qPCR quantification strategy for marker gene expression analysis are speed, accuracy, cost-effectiveness, applicability in routine laboratories, and the requirement for only minimal amounts of RNA. As the tumour sample size is often very limited, the applied RNA amplification procedure greatly improves the availability of material for diagnostic and prognostic work-up. Another key success factor of our strategy is the possibility of using universally applicable, quantifiable absolute standards. These synthetic standards not only allow careful monitoring and correction of inter-run variation but also enable the exchange of data between different laboratories, irrespective of the PCR instrument or reagents used. Further validation of this strategy will enable large multicentre studies conducted at different sites (unpublished data; Vermeulen et al.). A critical issue in all gene expression studies is RNA quality. The accuracy of gene expression profiling might indeed be influenced by RNA quality, depending on the quantification method used, the number of genes included in the classifier, expression differences, intra-group variability, and expression levels of the marker genes. The impact of RNA quality on gene expression results has been discussed extensively in the literature, and conflicting data exist. In order not to compromise the conclusions of this study, we applied stringent RNA quality and purity requirements and excluded ~20% of the samples. Moving forward, further studies should evaluate the impact of RNA quality on classification performance and establish a cut-off designating sufficient quality for reliable class prediction. At the same time, standard operating procedures should be introduced to ensure the extraction and storage of high-quality RNA.
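One common way in which absolute standards of known copy number are used in qPCR is a standard curve that converts quantification cycle (Cq) values into copy numbers. Whether the study used its synthetic standards in exactly this way is not stated, so the sketch below is an assumption. The dilution series, the Cq values, and the function names are invented for illustration.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical ten-fold dilution series of a synthetic standard in one run.
standard_copies = [1e5, 1e4, 1e3, 1e2]
standard_cq = [18.00, 21.32, 24.64, 27.96]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"efficiency = {amplification_efficiency(slope):.3f}")
print(f"sample at Cq 24.64: {copies_from_cq(24.64, slope, intercept):.0f} copies")
```

Because each run is calibrated against the same standards, the resulting copy numbers are expressed on a common scale. That shared scale is what makes comparison across runs, instruments, and laboratories possible in principle.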
In conclusion, we established and validated a robust prognostic multigene expression signature in the largest NB population to date. The signature can act as an independent risk predictor, enabling the identification of patients with increased risk in the current treatment groups. Important advantages of this signature compared to previously published gene expression classifiers are the need for smaller amounts of starting material, the lower number of genes, the higher cost-efficiency and speed of the quantification method, and the possibility of cross-laboratory data comparison. This study should form the basis for future investigations such as large, well-defined prospective studies with international collaboration. A further challenge is an integrated analysis to determine the prognostic performance of combining this expression signature with other genomic features of the tumour, including microRNA and gene copy number profiling and epigenetic markers, along with the currently used clinicobiological factors for risk stratification.