In bladder cancer, clinical grade and stage fail to capture outcome. We developed a clinically applicable quantitative polymerase chain reaction (QPCR) gene signature to predict progression in non-muscle-invasive bladder cancer. Comparative meta-profiling of twelve DNA microarray datasets (comprising 631 samples, 241,298 probe-sets) identified 96 genes which demonstrated differential expression in seven clinical outcome categories, or were identified as outliers, historic markers, or housekeeping genes. QPCR was performed to determine messenger RNA (mRNA) expression from 96 bladder tumors. 57 genes differentiated T2 from non-T2 tumors (p<0.05). Principal components analysis and Cox regression models were used to predict probability of T2 progression for non-T2 patients, placing them into high- and low-risk groups based on their gene expression. At two years, high-risk patients exhibited greater T2 progression (45% for high-risk patients vs. 12% for low-risk patients, p = 0.003, log-rank test). This difference remained significant within T1 (61% for high-risk vs. 22% for low-risk, p =0.02) and Ta tumors (29% for high-risk vs. 0% for low-risk, p=0.03). The best multivariate Cox model included stage and gender, and this signature provided predictive improvement over both (p=0.002, likelihood ratio test). Immunohistochemistry was performed for two genes in the signature not previously described in bladder cancer, ACTN1 (actinin) and CDC25B (cell division cycle 25B), corroborating their up-regulation at the protein level with disease progression. Thus, we identified a 57-gene QPCR panel to help predict progression of non-muscle-invasive bladder cancers and delineate a systematic, generalizable approach to converting microarray data into a multiplex assay for cancer progression.
Approximately 75% of newly-diagnosed patients with bladder cancer will have disease confined to the urothelium or lamina propria (stages Ta, Tis, and T1). These non-muscle-invasive tumors account for significant morbidity, given recurrence rates of 50–70% (1) and need for cystoscopic surveillance. Furthermore, 10–15% of these tumors will progress to muscle invasion or higher (T2-4) (2), with worsened prognosis and 5-year overall survival rates of 50–60% (3). To date, there has been no reliable means of predicting tumor progression other than clinical judgment, published risk estimates, or burgeoning clinical nomograms (2, 4).
In parallel to these clinical questions, there has been significant maturation of DNA microarray gene expression analysis over the last decade. Microarray analysis has become a high-throughput method of measuring the cancer transcriptome and can distinguish cancer from normal tissues, identify cancer subtypes, and predict recurrence or treatment response. For example, breast cancer has been studied extensively with microarray analysis, generating gene signatures to guide clinical management (5, 6). Other cancers, such as bladder cancer, have been investigated infrequently with microarray analysis. A recent query of the Affymetrix publication database and PubMed confirms the disparity in microarray attention between bladder and breast cancer: bladder cancer is linked to 102 Affymetrix and 223 PubMed publications, while breast cancer is linked to 757 Affymetrix and 1677 PubMed citations. Even accounting for the increased incidence of breast cancer in 2008 (184,000 versus 69,000 for bladder), there are fewer bladder cancer microarray studies performed.
Furthermore, clinical application of microarray gene signatures has been difficult given the lack of reproducibility. Small cohorts and variable microarray platforms may explain the minimal overlap between signatures. Ultimately, a gene signature that will be used for risk stratification must be well-validated across various, independent patient populations. Previously, we sought to overcome the limitations of varied analyses through comparative meta-profiling of microarray datasets to characterize a common transcriptional profile across cancer types (8). Comparative meta-profiling generates gene signatures from the overlap of independent microarray datasets, limiting the noise of spuriously identified genes and accentuating true underlying signature patterns. Furthermore, quantitative polymerase chain reaction (QPCR), relative to microarrays, is more reproducible, possesses a larger dynamic range, and is a clinically more tractable platform for diagnostics and prognostics development.
The goal of this multi-phase study was to utilize preexisting microarray datasets to develop a gene signature that would help predict progression for non-muscle-invasive bladder cancers. In Phase I (Comparative Meta-Profiling and Creation of Meta-Signature), we used comparative meta-profiling to analyze published bladder cancer microarray datasets and determine genes associated with cancer development, recurrence, progression, and outcome. We then sought to tailor the large number of genes to a smaller, robust metasignature of 96 genes associated with aggressive behavior in bladder cancer. In Phase II (Sample Selection and QPCR for Development of Gene Signature), these 96 genes were pre-configured onto a clinically applicable, high-throughput QPCR card. Gene expression values were quantified for 96 frozen tumor tissue specimens. Ultimately, 57 genes were selected which differentiated between non-muscle-invasive and muscle-invasive tumors. In Phase III (Evaluation of Gene Signature Predictive Ability and Biologic Networking), we assessed the ability of the 57-gene signature to predict probability of progression of non-muscle-invasive bladder tumors to T2 disease, and investigated the set’s overlap with biologic networks. In Phase IV (Immunohistochemical Confirmation of Sample Genes), we confirmed protein expression for two gene signature members, actinin (ACTN1) and cell division cycle 25B (CDC25B), utilizing a bladder cancer tissue microarray.
Ultimately, this signature may aid in the identification of non-muscle-invasive bladder cancers that are more likely to progress, and for which earlier definitive therapy like cystectomy may be offered. More generally, we present a systematic approach to utilizing publicly available cancer microarray datasets and converting them into a clinically applicable platform.
Nine previously published bladder cancer microarray profiling datasets and three multi-cancer microarray profiling datasets were identified, comprising 631 samples and 241,298 probe-sets (Supplementary Table 1). These publicly available microarray data sets were uploaded into Oncomine (9), an online compendium and advanced analysis platform for gene expression datasets. The flow diagram of comparative meta-profiling leading to the creation of a Taqman Low Density Array (TLDA) card is detailed in Figure 1A.
For each of the microarray profiling studies, we reviewed clinical information for profiled samples, including cancer grade and stage, recurrence, local or distant progression, and patient death. Ultimately, six clinical categories were defined: cancer grade, muscle-invasion, recurrence, progression to higher stage, positive lymph node status, and death from disease (Supplementary Table 2). A seventh clinical category for overall aggressiveness was devised, combining progression, positive lymph nodes, or death from disease. Individual samples were assigned to classes for each analysis, and in each study, genes were assessed in Oncomine for differential expression between these classes with Student’s t-test, to create meta-profiles for each clinical category (see Supplementary Methods). Genes were selected as candidates for the TLDA card if they were significantly over-expressed in at least four clinical category meta-profiles or under-expressed in at least three, to increase the likelihood that they reflected significant processes in bladder cancer (Fig. 1B); from there, the list was further tailored by choosing genes with available TLDA primers, thus resulting in 50 over-expressed and 15 under-expressed genes. Six outlier genes in the datasets were also identified by Oncomine analysis and included in the meta-signature, as well as six housekeeping genes and 19 historic markers. This resulted in a meta-signature of 96 genes of interest (Supplementary Table 3). These 96 genes were then preloaded onto a 96A-well format TLDA card (Applied Biosystems, Inc., Foster City, CA), which allows for multiplex high-throughput QPCR measurements. Five batches of ten cards each were constructed.
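The candidate-selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Oncomine workflow itself: the function name `select_candidates`, the input layout (per-gene counts of clinical category meta-profiles in which the gene was significantly over- or under-expressed), and the gene labels are hypothetical.

```python
from typing import Dict, List, Tuple

# Thresholds taken from the text: over-expressed in at least four clinical
# category meta-profiles, or under-expressed in at least three.
MIN_OVER = 4
MIN_UNDER = 3


def select_candidates(
    counts: Dict[str, Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Split genes into over- and under-expressed candidate lists.

    `counts` maps a gene label to (n_over, n_under): the number of clinical
    category meta-profiles in which the gene was significantly over- or
    under-expressed. This input layout is an assumption for illustration.
    """
    over = sorted(g for g, (n_over, _) in counts.items() if n_over >= MIN_OVER)
    under = sorted(g for g, (_, n_under) in counts.items() if n_under >= MIN_UNDER)
    return over, under


# Made-up counts, for illustration only.
example = {
    "GENE_A": (5, 0),  # over-expressed in 5 meta-profiles -> candidate
    "GENE_B": (3, 0),  # only 3 -> not selected
    "GENE_C": (0, 3),  # under-expressed in 3 -> candidate
    "GENE_D": (1, 2),  # neither threshold met
}
over, under = select_candidates(example)
print(over, under)
```

In the study, the resulting lists were further restricted to genes with available TLDA primers; that filtering step is not modeled here.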
Cases with available frozen bladder cancer tissue from time of transurethral resection of the bladder tumor (TURBT) were selected from those patients enrolled in the bladder cancer database at the University of Michigan. All samples were collected with the informed consent of the patients and prior institutional review board approval. To be included in the bladder cancer tumor bank, samples had been previously pathologically reviewed to ensure adequate tissue and tumor representation as well as confirm stage and grade (according to modified World Health Organization/International Society of Urologic Pathology standards). Samples were selected based on pathologic stage at time of TURBT (Ta, T1, or T2), presence of transitional cell carcinoma, lack of mixed or variant histology, and no previous intravesical or systemic therapy within one year of TURBT. Overall, 100 samples fit these criteria, and clinical information was collected regarding initial tumor grade and stage at time of TURBT, recurrence, local or distant progression, and disease-specific and overall mortality. Patients were characterized into two groups: non-muscle-invasive (Ta, T1) cancers with no evidence of progression to T2 disease during follow-up, and any stage tumors that were pathologic T2 at TURBT or demonstrated progression to T2 disease, local or distant metastasis, or cancer-specific death during follow-up.
Each frozen tissue sample was sectioned into seven 20-micron sections, and RNA isolation was performed using Trizol extraction (Invitrogen, Carlsbad, CA). QPCR was performed using Taqman dye on the Applied Biosystems 7900HT Fast Real-Time PCR system. Reproducibility across batches was investigated by performing repeat gene expression measurement of 16 tumor samples (see Supplementary Methods).
Additionally, twelve benign bladder frozen specimens were identified from adjacent benign tissue in radical cystectomy cases, as obtained from the frozen tissue bank and tissue procurement service at the University of Michigan, and RNA extraction was performed from dissected epithelium-rich areas. These samples were also run on the TLDA cards (see Supplementary Methods).
RNA yield quantification was performed with the Nanodrop 1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA).
Gene expression was normalized relative to the average of four housekeeping genes [ACTB (beta-actin), CYCS (Cytochrome C), GAPDH (Glyceraldehyde-3-phosphate Dehydrogenase), and SDHA (Succinate Dehydrogenase Complex, subunit A)]; the values were then log2-transformed. 18S (18S rRNA gene) was excluded from this average as raw threshold cycle (Ct) values consistently ran in the 2–3 cycle range. Samples were excluded from analysis if they demonstrated weak QPCR signal [Ct value for GAPDH >28, or housekeeping average >31]; overall five samples were excluded (four tumor, one benign).
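The quality-control thresholds and housekeeping normalization above can be illustrated with a short Python sketch. The QC cutoffs come from the text; the relative-expression formula (2 raised to the housekeeping mean Ct minus the gene Ct, i.e. the common delta-Ct convention) is an assumption, since the exact calculation is deferred to the Supplementary Methods. Function names and Ct values are hypothetical.

```python
import math
from statistics import mean
from typing import Dict

# The four housekeeping genes averaged in the text (18S is excluded).
HOUSEKEEPING = ("ACTB", "CYCS", "GAPDH", "SDHA")
GAPDH_CT_MAX = 28.0         # weak signal if GAPDH Ct > 28
HOUSEKEEPING_CT_MAX = 31.0  # weak signal if housekeeping average Ct > 31


def housekeeping_mean(ct: Dict[str, float]) -> float:
    """Average raw Ct over the four housekeeping genes."""
    return mean(ct[g] for g in HOUSEKEEPING)


def passes_qc(ct: Dict[str, float]) -> bool:
    """Apply the two exclusion rules stated in the text."""
    return ct["GAPDH"] <= GAPDH_CT_MAX and housekeeping_mean(ct) <= HOUSEKEEPING_CT_MAX


def relative_expression(ct: Dict[str, float], gene: str) -> float:
    """ASSUMPTION: delta-Ct convention, 2 ** (housekeeping mean Ct - gene Ct)."""
    return 2.0 ** (housekeeping_mean(ct) - ct[gene])


# Made-up Ct values for one sample.
good = {"ACTB": 20.0, "CYCS": 24.0, "GAPDH": 22.0, "SDHA": 26.0, "ACTN1": 25.0}
weak = dict(good, GAPDH=29.0)  # same sample with a weak GAPDH signal

print(passes_qc(good), passes_qc(weak))
rel = relative_expression(good, "ACTN1")
print(rel, math.log2(rel))
```

Under this convention the log2 value is simply the housekeeping mean Ct minus the gene Ct, so a gene that crosses threshold two cycles later than the housekeeping average has a log2 relative expression of -2.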
To obtain the gene signature, univariate Wilcoxon rank-sum tests were used to identify genes differentiating T2 from non-T2 tumors, with a two-sided p-value < 0.05 considered statistically significant. These expression values were log-transformed using the transformation log(expression+1). Gene expression values for which the raw Ct was missing had log-values imputed as zero, implying no expression of that gene relative to housekeeping genes. To reduce outlier influence, the distribution of each gene’s expression values was truncated at three standard deviations above the mean. Principal components analysis was used to reduce this gene set into a smaller number of variables explaining >75% of the data variance, and principal components were used as predictors in a multivariate Cox regression model for T2 progression. Patients who had already progressed to T2 at TURBT were coded as having time-to-event=0. The best Cox model was chosen using a backwards selection algorithm incorporating the Akaike Information Criterion (AIC) for model comparison.
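The per-gene preprocessing and the choice of how many principal components to keep can be sketched as follows. Several details are assumptions, because the text does not specify them: the natural logarithm is used, the upper limit is computed on the log scale after imputation using the sample standard deviation, and "truncated" is interpreted as capping values at the limit rather than discarding them. The function names are hypothetical, and the component variances would come from a separate principal components analysis that is not shown.

```python
import math
from statistics import mean, stdev
from typing import List, Optional, Sequence


def preprocess_gene(values: Sequence[Optional[float]]) -> List[float]:
    """Log-transform one gene's expression values and cap high outliers.

    None marks a missing raw Ct; its log-value is imputed as zero.
    ASSUMPTIONS: natural log, cap at mean + 3 sample SDs on the log scale.
    """
    logged = [0.0 if v is None else math.log(v + 1.0) for v in values]
    cap = mean(logged) + 3.0 * stdev(logged)
    return [min(x, cap) for x in logged]


def n_components_for_variance(variances: Sequence[float], threshold: float = 0.75) -> int:
    """Smallest number of leading components whose cumulative share of the
    total variance exceeds `threshold` (0.75 in the text)."""
    total = sum(variances)
    running = 0.0
    for k, var in enumerate(sorted(variances, reverse=True), start=1):
        running += var
        if running / total > threshold:
            return k
    return len(variances)


# Made-up data: 19 samples with no expression and one extreme value.
capped = preprocess_gene([0.0] * 19 + [math.exp(5) - 1.0])
print(max(capped))

# Made-up component variances: the first two explain 80% of the total.
print(n_components_for_variance([5.0, 3.0, 1.0, 1.0]))
```

The retained components would then serve as covariates in the Cox model; fitting that model requires a survival analysis library and is outside the scope of this sketch.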
In order to evaluate the signature’s predictive power, leave-one-out cross-validation was performed, resulting in a predicted probability of T2 progression for each Ta and T1 patient at TURBT. These cross-validated predictions were used to stratify non-T2 patients into high- and low-risk groups for T2 progression, using the median predicted probability as the cutoff. Differences in outcome were evaluated using Kaplan-Meier curves and log-rank tests. Additionally, AIC was used to select a best multivariate Cox regression model for progression to T2 using age, gender, CIS, stage, and grade as possible predictors, and the likelihood ratio test was used to evaluate the significance of the signature when added to this clinical model. Associations between these clinical variables and T2 progression were assessed using univariate Cox models and likelihood ratio tests.
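As a rough illustration of the risk stratification and outcome comparison, the sketch below splits patients at the median predicted probability and computes a Kaplan-Meier estimate of progression for one group. It is not the analysis code used in the study: the predicted probabilities and follow-up data are made up, the function names are hypothetical, and assigning patients exactly at the median to the low-risk group is an assumption (the text does not say how ties were handled). The log-rank test is omitted.

```python
from statistics import median
from typing import List, Sequence, Tuple


def median_split(probs: Sequence[float]) -> List[str]:
    """Label each patient high- or low-risk relative to the median probability.

    ASSUMPTION: values equal to the median are labeled low-risk.
    """
    cutoff = median(probs)
    return ["high" if p > cutoff else "low" for p in probs]


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, progression-free probability) at each distinct event time.

    events[i] is True if patient i progressed at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        progressed = 0
        leaving = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            progressed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if progressed:
            surv *= 1.0 - progressed / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def progression_by(curve: Sequence[Tuple[float, float]], t: float) -> float:
    """Estimated cumulative probability of progression by time t."""
    surv = 1.0
    for time, s in curve:
        if time > t:
            break
        surv = s
    return 1.0 - surv


# Made-up cross-validated probabilities for four patients.
labels = median_split([0.1, 0.4, 0.6, 0.9])
print(labels)

# Made-up follow-up (years) for one risk group; False marks censoring.
curve = kaplan_meier([0.5, 1.0, 1.5, 2.5, 3.0], [True, False, True, False, False])
print(progression_by(curve, 2.0))
```

Reading the curve at two years gives the kind of group-level progression rate reported in the Results, with censored patients contributing to the at-risk count only until their last follow-up.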
Molecular concepts map (MCM) analysis computes pairwise associations between gene sets to create an ‘enrichment network’ of associations across all available signatures, arising from a variety of cancer types, pathways, mechanisms and drugs (11). This compendium of >14,000 ‘molecular concepts,’ or sets of biologically connected genes, is available at http://private.molecularconcepts.org. A gene set of interest can then be investigated for its functional overlap with other gene sets and biologic concepts (see Supplementary Methods).
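One common way to score the association between two gene sets is a one-sided hypergeometric (Fisher exact) test on their overlap. The sketch below shows that calculation only as an illustration; this section does not state which statistic the MCM analysis uses, so the choice of test, the function name, and the example numbers are all assumptions.

```python
from math import comb


def overlap_p_value(universe: int, size_a: int, size_b: int, overlap: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric.

    universe: number of genes measured in total
    size_a, size_b: sizes of the two gene sets
    overlap: number of genes shared by the two sets
    """
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


# Made-up example: two 5-gene sets drawn from 10 genes that overlap completely.
print(overlap_p_value(10, 5, 5, 5))
```

A small p-value indicates that the two sets share more genes than expected by chance, which is the sense in which a signature "overlaps" with a pathway or another published signature.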
Two genes, actinin (ACTN1) and cell division cycle 25B (CDC25B), were identified from the metasignature; these had available antibodies and were chosen for immunohistochemistry (IHC) analysis utilizing a bladder cancer progression tissue microarray (TMA). This TMA was constructed from 41 cases derived from 40 patients, representing benign bladder tissue, bladder CIS (carcinoma in situ), bladder cancer (non-invasive and invasive), and bladder cancer lymph node metastases. Three cores (0.6 mm in diameter) were taken from each tumor focus confirmed by two surgical pathologists (R.M. and L.P.K.). All tissues were derived from our institutional bladder cancer database with informed consent of the patients and prior institutional review board approval; there was minimal overlap of cases used for TMA construction and mRNA extraction.
Immunohistochemistry was performed on the TMA using mouse monoclonal antibodies against CDC25B (Labvision 1188-p1; 1 in 50 dilution) and ACTN1 (Santa Cruz sc-17829; 1 in 50 dilution) proteins and standard avidin-biotin complex techniques, as described previously (12). Details of the TMA construction and IHC staining are provided in Supplementary Methods.
For IHC analysis, one-way ANOVA was used to compare distributions of the median product scores by group. F-tests were used to compare competing models, and comparisons between groups were made using Tukey’s Honest Significant Difference (HSD) procedure (for pairwise comparisons) and Scheffe’s method (other comparisons).
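The one-way ANOVA comparison of staining scores across groups reduces to a ratio of between-group to within-group variance. The following sketch computes that F statistic from first principles; the group scores are made up, the function name is hypothetical, and the p-value and the Tukey and Scheffe follow-up comparisons are not implemented.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-group df, within-group df)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up median product scores for three hypothetical groups
# (for example benign, non-invasive, and invasive).
scores = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df1, df2 = one_way_anova_f(scores)
print(f_stat, df1, df2)
```

A large F statistic relative to the F distribution with the reported degrees of freedom indicates that at least one group mean differs, after which pairwise procedures identify which groups drive the difference.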
All statistical analyses were performed using R, version 2.7.0 (http://www.r-project.org).
Frozen tumor sections were available for all 100 patients selected from the tissue bank and 12 benign bladder specimens (total n=112). One benign and four tumor samples were eliminated from final analysis, secondary to low gene expression. The final cohort consisted of 107 samples: 96 tumor samples (42 non-progressing tumors and 54 progressing or T2 tumors) and 11 benign bladder samples. There was high QPCR reproducibility across batches (Supplementary Fig. 1). Patient demographics for the final 96 tumor samples are listed in Table 1. Median follow-up in non-T2 patients for whom predictions were made was 2.4 years. Overall, 5/31 Ta tumors and 15/31 T1 tumors progressed to T2 during follow-up.
Univariate analysis revealed that pathologic stage (T1 vs. Ta) was the only significant clinical predictor of T2 progression (p = 0.01). Grade (high vs. low) approached significance as a predictor of T2 progression (p=0.08), but a within-stage analysis revealed grade was not predictive of progression (Supplementary Table 4).
The 107 bladder samples were run on the pre-configured 96-element TLDA cards. Fifty-seven genes demonstrated differential expression between T2 and non-T2 tumors (p<0.05), with an estimated false discovery rate of 1.1%. This set consisted of 37 over-expressed and 20 under-expressed genes in T2 tumors. Further gene signature details are available in Supplementary Results.
We sought to determine whether this gene signature was associated with progression of Ta and T1 tumors to muscle-invasive disease. Five-year outcomes for the Ta and T1 tumors are demonstrated in Figure 2. Using the 57-gene signature to divide this population into high- and low-risk groups, high-risk patients exhibited a higher rate of progression to T2 disease within two years (45% for high-risk vs. 12% for low-risk, p = 0.003; Fig. 2A). As expected, stage alone was a significant predictor of T2 progression, with more T1 patients experiencing T2 progression than Ta patients (Fig. 2B, p=0.007). Importantly, however, the gene signature prediction maintained significance within T1 tumors (61% progression for high-risk vs. 22% progression for low-risk, p = 0.02; Fig. 2C) and Ta tumors (29% progression for high-risk vs. 0% progression for low-risk, p = 0.03; Fig. 2D), demonstrating that this gene signature provides additional risk stratification beyond stage alone. This difference in outcomes is most pronounced for T1 patients in the first year of follow-up, during which 7% of predicted low-risk T1 patients progressed to T2 disease, versus 61% of predicted high-risk T1 patients.
Several clinical variables (age, gender, pathologic stage, histological grade, and associated CIS) were investigated with univariate analysis; the only significant predictor of progression was pathologic stage (p=0.01; Supplementary Table 4). Although histologic grade was marginally associated with T2 progression (p = 0.08), this could be explained by a strong association between histologic grade and pathologic stage in this cohort (Fisher exact test: p < 0.0001). Indeed, in a multivariate Cox model utilizing clinical parameters, the best model retained only stage and gender as significant predictors of T2 progression (Table 2). The gene signature, however, provided significant ability to predict progression, independent of stage and gender (p=0.002, likelihood ratio test).
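The likelihood ratio test used to assess the added value of the signature compares the log-likelihood of the clinical model with that of the clinical model plus the signature. The sketch below shows the arithmetic under stated assumptions: the log-likelihood values are made up, the degrees of freedom (the number of signature terms added) are not given in this section and are set to 1 purely for illustration, and the function names are hypothetical. The chi-square tail probability is computed with the standard recurrence for integer degrees of freedom.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # closed form for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    for _ in range((df - 1) // 2):
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float, df: int) -> Tuple[float, float]:
    """Return (test statistic, p-value) for nested models.

    df is the number of extra parameters in the full model.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)


# Made-up log-likelihoods: clinical model vs. clinical model plus signature.
stat, p = likelihood_ratio_test(loglik_reduced=-80.0, loglik_full=-75.0, df=1)
print(stat, p)
```

A small p-value means the richer model fits significantly better than would be expected from adding uninformative terms, which is the sense in which the signature is described as predictive independent of stage and gender.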
A heat map comparing the 57 genes and all samples is shown in Figure 3. Hierarchical clustering of genes demonstrated the 37 over-expressed (Cluster 1) and 20 under-expressed genes (Cluster 2) in T2 disease. In the over-expressed gene set, two smaller gene subsets can be appreciated from hierarchical clustering (Clusters 1A and 1B). The 20 under-expressed genes in T2 disease are also relatively over-expressed in Ta patients without progression. The benign samples demonstrate under-expression of Cluster 1B and Cluster 2 genes. Cluster 2 genes are up-regulated with progression from benign to Ta disease, but down-regulated again in transition to T2 disease.
To integrate the selected genes into a functional framework, the gene sets were investigated in the context of MCM analysis. This concepts-based analysis of the 57-gene signature demonstrated enrichment of cell adhesion and extracellular matrix invasion pathways as well as cell cycle regulation and mitosis in the up-regulated genes, confirming the importance of these programs in bladder cancer progression (13). The under-expressed genes demonstrated overlap with under-expressed genes in poorly differentiated lung and invasive breast carcinomas; the list of down-regulated genes also includes several well-known tumor suppressors, including p53 and RB1 (retinoblastoma 1). See Supplementary Results and Supplementary Figures 2 and 3 for further details.
To further validate the components of the 57-gene signature with protein expression, we identified genes for which IHC-compatible antibodies were available. Antibodies to ACTN1 and CDC25B were identified and IHC was performed to investigate protein expression in situ on a bladder cancer progression TMA (Fig. 4). Both markers demonstrated homogenous staining, with predominantly cytoplasmic expression for ACTN1 and nuclear expression for CDC25B. ACTN1 showed the most significant individual group comparison difference between the non-invasive and either invasive or metastatic groups (Tukey’s HSD: p=0.003 for each) (Fig. 4B). CDC25B expression demonstrated a more linear trend with disease severity (p=0.0002) (Fig. 4C). More detailed information is available in Supplementary Results.
The accurate designation of Ta and T1 bladder cancers which will progress to muscle invasion has yet to be perfected, and currently relies upon pathologic review and surveillance with gold standards of cystoscopy and urine cytology. Urine cytology, however, lacks sensitivity for low grade tumors (14, 15), and cystoscopic detection may occur months after muscle invasion, depending on the interval. Earlier detection of progression to muscle-invasive disease may provide a survival benefit given decreased long-term survival of patients with muscle-invasive disease (likely due to the presence of concomitant micrometastasis). Additionally, patients with non-muscle invasive cancers who progress to T2 on surveillance demonstrate similarly poor survival after cystectomy as patients presenting with T2 disease (16). While urine-based tests exist to detect incipient or recurrent bladder cancer (14, 17), there are no widely-used modalities to risk-stratify patients beyond initial detection. Nomograms exist to estimate risks of recurrence and progression, but have not gained wide acceptance in the United States; these often require clinical information not readily available, such as tumor multiplicity and size (18), or incorporate single bladder tumor markers (4). A tumor-specific multi-gene signature can provide a more comprehensive picture of tumor aggressiveness.
Microarray gene-expression profiling is difficult to translate into a clinical prognostic tool given the large number of genes involved (19) and required time and expertise. QPCR is more clinically applicable, especially when working with a small group of highly-selected genes. Our methodology is applicable across many cancer types—namely, compiling microarray data for bioinformatics analysis, generating a larger list of robust genes involved in aggressive behavior, and deriving a smaller QPCR gene signature to predict an outcome of interest. In this study, we summarized the most essential transcripts from available bladder cancer microarray data into a prognostic gene set of 57 genes. This resulted in a clinically feasible test, utilizing a small amount of frozen bladder tumor available from TURBT, to provide a gene signature that helps predict progression in non-muscle invasive cancers.
Specifically, patients who were designated as high-risk by the gene signature were more likely to demonstrate progression to T2 disease than low-risk patients; this predictive ability surpassed information provided by pathologic stage alone. This provided evidence that a gene signature can provide additional risk stratification beyond pathology, particularly because inter-observer variability exists in tumor staging and grading for bladder (20, 21) and other cancers (22). Given the use of electrocautery during TURBT, it can also be difficult to assess margin status accurately, and a re-staging TURBT is standard for pathologic T1 disease to assess for missed muscle invasion (23). Specifically, this gene signature possessed excellent predictive ability in the T1 tumor cohort during the first year following TURBT, and for Ta patients, those with progression were classified appropriately as high-risk. The availability of tumor-specific gene expression with clinical parameters at the time of TURBT can provide better patient counseling. A more expedited offering of cystectomy as an alternative to intravesical bacillus Calmette-Guérin (BCG) therapy or surveillance, or following initial BCG failure, might be offered to T1 patients with high-risk clinicopathologic and gene signature features; in this dataset, more than 60% of high-risk T1 patients demonstrated progression in the first year. Some high-risk Ta patients may not progress fully to T2 disease, but the clinician’s threshold for suspecting progression may be lowered such that alternative treatments are discussed sooner.
Two genes from the signature whose protein expression has not been described explicitly in bladder cancer were chosen for IHC analysis. ACTN1 has been shown to possess different splicing patterns in T2 versus Ta tumors (24), suggesting an ability to utilize ACTN1 in stage separation; in fact, ACTN1 protein expression was significantly different between non-muscle-invasive and invasive or metastatic bladder cancers. CDC25B has been shown to be up-regulated in progressing bladder tumors in a previous microarray study, correlating with its gradually up-regulated protein expression on IHC. Many of the other genes in the signature have also been studied in bladder cancer: TIMP2 in bladder cancer metastases (26), and p53 and RB1 in bladder cancer development and progression (27, 28). Furthermore, others have shown that cell cycle dysregulation is necessary for uroepithelial transformation and cell adhesion dysregulation is commonly found in uroepithelial tumor progression (13). This gene signature compiles the most essential genes from previous microarray studies and may prompt further investigation into their complex interactions necessary for progression.
Limitations of this study include the retrospective nature of sample collection, small sample size, and non-standardized follow-up. A larger sample size was difficult to accrue given our institution’s tertiary referral pattern, although this issue plagues many single-institution bladder cancer microarray studies. Non-invasive tumors could have progressed later than the time of follow-up for some patients, thus creating false-negative predictions. However, poor outcomes in bladder cancer are more likely to manifest themselves early with a shorter natural history than prostate cancer, for example, and predicted low-risk patients would be maintained on standardized cystoscopic and imaging surveillance. We acknowledge that false-positive results could prompt more invasive treatment modalities earlier, and emphasize the need to integrate this gene signature with clinical parameters.
Ideally, a prospective multi-institutional study with standardized follow-up and a larger sample size would be required. This type of study would include collection of tissue and urine, with the ultimate goal of creating a non-invasive test. Also, further studies may result in concentration of this gene set into a smaller essential set of genes. From a technical perspective, samples utilized in this study were grossly dissected, and the distinction between epithelial and stromal components was not made. Also, while reliance upon a manufactured card streamlines QPCR, gene primers are limited to those commercially available and may not account for splicing or fusion variants. This list inevitably excluded other promising candidates in bladder cancer, such as the Ral family of GTPases (29), KiSS-1 (metastin) (30), or PTEN (31, 32).
In conclusion, we utilized comparative meta-profiling of existing bladder cancer microarray datasets to define genes involved in aggressive behavior, and refined this to a final 57-gene signature that was significantly associated with the risk of progression to muscle invasion. This signature can be pre-loaded onto a commercially available QPCR card, and with prospective validation, could become a clinically applicable point-of-care tool for risk stratification in non-muscle-invasive bladder cancers. The broader implications of this study are that we established a systematic “pipeline” for converting multiple independent microarray studies into a high-throughput QPCR platform more amenable to clinical translation.
We would like to thank M. Vinco (UM Cancer Center Core Grant) for preparation of the benign bladder tissue samples. Oncomine is freely available to the academic community and its commercial use has been licensed to Compendia Biosciences, in which A.M.C. is a co-founder.
Financial Support: Early Detection Research Network (UO1 CA111275-01 to A.M.C.). A.M.C. is supported by a Clinical Translational Research Award from the Burroughs Wellcome Foundation. S.A.T. is a fellow of the Medical Scientist Training Program.
Conflicts of Interest: A.M.C., R.W., D.S.M., and S.A.T. have submitted a patent application for the gene signature described in the manuscript.