Summary of the evidence
We used a meta-analysis method to identify genes that are shared among tumor types and important in metastasis. The fact that these genes are involved in pathways previously implicated in metastasis supports their involvement in the metastatic process. These genes may be useful as potential therapeutic targets or as predictors of clinical outcome. Because these genes and pathways are enriched in various tumors, they may be targets that can be exploited in many different tumor types. Drug discovery could proceed by finding inhibitors of the identified pathways, such as FAK inhibitors.
One promising in silico approach to drug discovery is to use the Connectivity Map to find drugs that can reverse a gene signature, such as the common metastatic signature, and thereby potentially reverse the metastatic phenotype [51]. In a preliminary Connectivity Map analysis of the metastatic signature, the top-ranked molecule by permutation analysis for reversing the common metastatic signature was camptothecin. Camptothecin is a topoisomerase I inhibitor that has been shown to induce apoptosis in tumor cells. Irinotecan and topotecan, which are analogs of camptothecin, are currently used to treat several cancers, including colon cancer, ovarian cancer, and gliomas [53]. Interestingly, when the drugs were ranked by Anatomical Therapeutic Chemical (ATC) code, the top three codes were all groups of antipsychotics, which may be related to the ability of some of these compounds to induce autophagy in experimental models [54]. The molecules in these groups were significantly associated with reversal of the metastatic signature, with good specificity for the signature. These may represent known and readily available drugs that could be applied to a new indication without the many years of testing needed to develop a new compound. The results of this in silico analysis can only be confirmed with further analyses and experiments; however, they show the promise of applying this signature to the prediction and therapy of metastatic cancers to improve patient outcomes.
Although every gene in our signature was differentially expressed in more studies than would be expected by chance alone, it is important to note that no gene in the common metastatic signature was present in more than 8 of our 18 datasets. This could be caused by many factors, such as heterogeneity of metastatic tumors, dataset quality, and the use of different platforms without uniform representation of the genes of interest. This may explain the difficulty many investigators have had in identifying overlapping genes across multiple datasets examining metastasis [55]. In addition, more overlapping genes might have been identified were it not for a potential lack of power caused by the stringent Q-value cutoff of 0.1. Nevertheless, this highlights the usefulness of the meta-analysis approach in identifying significant metastatic genes that recur more often than expected by chance and that may not be identified when datasets are initially compared.
In our analysis, we also noted that the number of down-regulated genes was much greater than the number of up-regulated genes. This intriguing observation suggests that overcoming metastasis suppression may be a critical or common step in tumor progression. Alternatively, the genes involved in metastasis suppression may be more similar and more widely shared among solid tumors than those involved in metastasis activation. It has previously been shown that down-regulation of certain genes, such as KISS1, RhoGDI2, and nm23-H1, is important in metastasis [56]. At least one of our identified down-regulated genes, CDGF, is a recognized metastasis suppressor gene [4]. In addition, two of the most highly dysregulated pathways, the actin cytoskeleton signaling pathway and the regulation of actin motility by Rho, have been associated with multiple metastasis suppressor genes [56]. We expect that further functional studies of the down-regulated genes in our common signature will reveal novel metastasis suppressor genes.
The future applications of this meta-analysis method are numerous and will grow as the number of gene expression datasets increases. For instance, in the field of metastasis, the method could be used to compare primary tumors that metastasized with primary tumors that did not. This may add to the information gained from the present study. At the time the study was started, the Oncomine database did not provide enough detailed clinical information to perform this analysis; however, it may become feasible in the future.
As with any meta-analysis, the results depend on the reliability of the original data [6]. However, it was difficult to test the validity of the original experiments without raw data. This quality issue was partially overcome by our criteria for selecting studies and by the meta-analysis approach itself. Our selection criteria excluded outliers, such as datasets without any significant genes at a Q-value less than 0.1 and datasets in which more than 50% of the tested genes were significant, which we hypothesize might reflect systematic bias rather than true differences. Additionally, combining different studies into one analysis should theoretically minimize the effect of confounders or quality issues that may be present in individual studies. Since any gene in our signature had to recur in multiple studies (i.e., in four or more for the down-regulated genes), no single study could completely invalidate our gene list. Clinical meta-analyses often test for heterogeneity among studies, but this approach has not been extended to the meta-analysis of genomic studies.
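The two dataset selection criteria described above can be illustrated with a minimal Python sketch. It assumes that each dataset is available as a plain list of per-gene Q-values; the function names and the input layout are hypothetical and are not taken from the study or from Oncomine.

```python
from typing import Dict, List

Q_CUTOFF = 0.1          # Q-value threshold stated in the text
MAX_SIG_FRACTION = 0.5  # datasets above this fraction are treated as outliers


def keep_dataset(q_values: List[float]) -> bool:
    """Return True if a dataset passes both selection criteria.

    q_values holds one Q-value per tested gene (hypothetical input layout).
    """
    if not q_values:
        return False
    n_significant = sum(1 for q in q_values if q < Q_CUTOFF)
    if n_significant == 0:
        # Criterion 1: no significant genes at the cutoff, so exclude.
        return False
    # Criterion 2: exclude if more than 50% of tested genes are significant.
    return n_significant / len(q_values) <= MAX_SIG_FRACTION


def select_datasets(datasets: Dict[str, List[float]]) -> List[str]:
    """Return the sorted names of the datasets that pass the filter."""
    return sorted(name for name, qs in datasets.items() if keep_dataset(qs))
```

For example, a dataset in which three of four tested genes have Q-values below 0.1 would be dropped under the second criterion, while one with a single significant gene out of four would be kept.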
There are several areas for improvement that could be addressed in future studies. One limitation was that it was not possible to download the complete datasets from the Oncomine database, and we were unable to find raw data for most of the older studies included in our signature. Therefore, we could not compare the complete lists of genes tested, and we could not apply more advanced meta-analysis methods that might capture genes represented on only a small number of platforms, specifically methods that weight each study by the number of genes and samples in the initial experiment, such as the weighted z-method [57]. Limiting the study to only those datasets with available raw data would have substantially reduced the number of studies and possibly the power to detect genes of interest. The counting method used in this study depended only on the lists of significant genes, which allowed us to use the maximum number of array studies. In the future, as more datasets become readily available for download, this problem may be overcome. This limitation, however, does not affect the conclusion that the genes identified in this study are likely to be important. We conclude that this method has good specificity but may have lower sensitivity (a higher false negative rate) than other meta-analysis approaches.
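For readers unfamiliar with the weighted z-method mentioned above, the following Python sketch shows one common formulation (a weighted Stouffer combination) for a single gene. It was not used in this study. The choice of the square root of the sample size as the weight, the use of one-sided p-values, and the function name are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple


def weighted_z(p_values: Sequence[float],
               sample_sizes: Sequence[int]) -> Tuple[float, float]:
    """Combine one-sided p-values for one gene across studies.

    Assumption: each study is weighted by sqrt(sample size). Returns the
    combined z-score and its one-sided p-value.
    """
    if len(p_values) != len(sample_sizes) or not p_values:
        raise ValueError("need one sample size per p-value")
    normal = NormalDist()  # standard normal distribution
    weights = [sqrt(n) for n in sample_sizes]
    # Convert each p-value to a z-score; inv_cdf requires 0 < p < 1.
    z_scores = [normal.inv_cdf(1.0 - p) for p in p_values]
    combined = (sum(w * z for w, z in zip(weights, z_scores))
                / sqrt(sum(w * w for w in weights)))
    return combined, 1.0 - normal.cdf(combined)
```

Unlike the counting method, this calculation needs a p-value and a sample size for every study in which the gene was tested, which is why it requires access to complete datasets rather than lists of significant genes.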
Inconsistent gene annotation also complicated the analysis in this study. Since the Oncomine database provides only the gene symbol and one other gene identifier, which could not be matched for every dataset, the only common identifier between our validation datasets and the datasets used in the meta-analysis was the gene symbol. However, a gene symbol may map to multiple probes, so we could be counting the results of different probes in each dataset. We ensured that each unique gene symbol was counted only once in each direction by manually removing duplicates from our extracted data before running the meta-analysis. In addition, the use of gene symbols forced us to exclude from our comparison many ESTs that might have been found in multiple studies. This highlights the need for a common identifier standardized across platforms, such as Entrez Gene IDs, which would help to identify more common metastatic genes.
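The deduplication and counting steps can be sketched in Python as follows. This is an illustrative outline only: in the study, duplicates were removed manually, and the input layout (one list of symbol and direction pairs per dataset), the function names, and the gene names in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

GeneCall = Tuple[str, str]  # (gene symbol, "up" or "down")


def count_votes(datasets: Dict[str, Iterable[GeneCall]]) -> Counter:
    """Count the datasets in which each symbol is significant, per direction.

    Several probes mapping to the same symbol in one dataset count once.
    """
    votes: Counter = Counter()
    for calls in datasets.values():
        # A set collapses duplicate (symbol, direction) pairs in a dataset.
        unique_calls = {(symbol.strip(), direction)
                        for symbol, direction in calls}
        votes.update(unique_calls)
    return votes


def signature(votes: Counter, direction: str, min_count: int) -> List[str]:
    """Return symbols significant in at least min_count datasets."""
    return sorted(symbol for (symbol, d), n in votes.items()
                  if d == direction and n >= min_count)
```

With a threshold of four datasets for down-regulated genes, as stated earlier, `signature(votes, "down", 4)` would return only the symbols that recur in four or more datasets, regardless of how many probes represented each symbol within a single dataset.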
Lastly, the majority of studies used in the meta-analysis were of epithelial tumors, reflecting their predominance in the population and, hence, in microarray studies. Attempts to remove these epithelial cancer datasets, such as the prostate cancer datasets, resulted in a lack of power to identify significant metastatic genes. This could be due to a more dramatic biological effect of metastasis in epithelial tumors, or to the power of the studies themselves, for example larger sample sizes and less tissue heterogeneity. This is a limitation of our study that stems from the availability of eligible datasets in the Oncomine database. As more datasets accumulate for non-epithelial, non-adenocarcinoma tumors, future studies may be able to incorporate them and identify a more refined common signature of metastasis that applies to even more tumor types.