While prostate cancer is a leading cause of cancer death, most men die with and not from their disease, underscoring the urgency to distinguish potentially lethal from indolent prostate cancer. We tested the prognostic value of a previously identified multigene signature of prostate cancer progression to predict cancer-specific death. The Örebro Watchful Waiting Cohort included 172 men with localized prostate cancer of whom 40 died of prostate cancer. We quantified protein expression of the markers in tumor tissue by immunohistochemistry, and stratified the cohort by quintiles according to risk classification. We accounted for clinical parameters (age, Gleason, nuclear grade, tumor volume) using Cox regression, and calculated receiver operating characteristic (ROC) curves to compare discriminatory ability. The hazard ratio of prostate cancer death increased with increasing risk classification by the multigene model, with a 16-fold greater risk comparing highest versus lowest risk strata, and predicted outcome independent of clinical factors (p=0.002). The best discrimination came from combining information from the multigene markers and clinical data, which perfectly classified the lowest risk stratum where no one developed lethal disease; using the two lowest risk groups as referent, the hazard ratio (95% confidence interval) was 11.3 (4.0–32.8) for the highest risk group and difference in mortality at 15 years was 60% (50–70%). The combined model provided greater discriminatory ability (AUC 0.78) than the clinical model alone (AUC 0.71), p=0.04. Molecular tumor markers can add to clinical parameters to help distinguish lethal and indolent prostate cancer, and hold promise to guide treatment decisions.
Upon diagnosis with localized prostate cancer, patients and clinicians are faced with the decision of whether to treat or to defer treatment. On one hand, prostate cancer is a leading cause of cancer death among men in westernized countries (1), and deaths occur even 20 years after diagnosis (2). On the other, treatment has adverse effects (3) and is often unneeded, as most men do not die from their cancer, and many harbor tumors which are indolent even in the absence of therapy (2, 4, 5).
Treatment of localized disease can reduce cancer-specific mortality, but in the only randomized trial of radical prostatectomy versus watchful waiting (6, 7), the number needed to treat to prevent one cancer death was 19. That trial predated screening by prostate specific antigen (PSA); 30–60% of PSA-detected cancers have been characterized as over diagnosed (8, 9) and therefore the number needed to treat may be greater for a screened population.
There is a clear need for tools to distinguish potentially lethal from indolent disease at diagnosis to guide treatment decisions. Clinical nomograms characterize risk of progression using pretreatment clinical markers: PSA levels, biopsy Gleason scores, tumor extent, and clinical stage (10–13). These scoring systems have significant predictive power, but molecular tumor markers hold promise to improve prediction (14). A 12-gene molecular signature of advanced prostate cancer was recently identified through integration of proteomic and expression array data, comparing benign prostate, localized prostate cancer and metastatic disease (15). A set of 36 markers, which showed differential expression at both the RNA and protein level, plus five additional genes, was immunostained on a prostate cancer progression array. Through linear discriminant analysis, a multigene model was identified that was significantly associated with PSA failure after prostatectomy in a small cohort. However, most men with PSA recurrence do not develop lethal disease (16). We tested the prognostic value of this molecular signature in relation to cancer-specific death within a cohort of men diagnosed with clinically localized prostate cancer and followed prospectively over 28 years.
The population-based Örebro Watchful Waiting Cohort (2, 17) comprises men with localized (T1a/T1b, NX, M0) prostate cancer diagnosed by transurethral resection of the prostate (TURP) for symptomatic benign prostatic hyperplasia. Cases were diagnosed between 1977 and 1991, prior to the widespread use of PSA screening, in Örebro, Sweden, within the University Hospital’s catchment area. In accordance with standard treatment, the men were initially followed expectantly with careful monitoring by clinical exams, laboratory tests and bone scans every 6 months during the first 2 years post-diagnosis, and yearly thereafter. Hormonal therapy was initiated upon demonstrated progression to symptomatic disease.
During this time, 252 men were diagnosed with prostate cancer at the University Hospital by TURP and followed by watchful waiting. The current study was nested among the 172 men for whom tumor tissue was available. We noted no difference in the distribution of Gleason scores or incidence of prostate cancer death among those with and without tumor tissue. Follow-up of the cohort is 100% complete through March 2006. Metastases were diagnosed by bone scan. Deaths were identified using the Swedish Death Register, and medical records were reviewed by the study investigators to confirm cause.
We retrieved archival formalin-fixed, paraffin embedded TURP specimens to construct tissue microarrays (TMA) using a manual tissue arrayer (17). The study pathologists reviewed H&E slides for each case to provide uniform Gleason grading. We found a 30% discordance comparing Gleason scoring re-review with the initial pathology review, with generally lower scores in the initial reports. This grade migration has also been described by Albertsen (4). The pathologist determined the dominant prostate cancer nodule or nodule with the highest Gleason pattern, and two 0.6 mm tissue cores from tumor areas were transferred to the recipient array blocks.
We assayed protein expression of the markers (Table 1) in the multigene model on the Örebro Watchful Waiting TMA using immunohistochemistry. The TPD52 antibody could not be obtained; thus 11 markers were assessed. One 5 micron section of the TMA block was cut for each protein. Incubations and dilutions for each antibody were optimized while minimizing background (Table 1). Secondary antibodies linked to streptavidin-biotin were used to visualize staining.
Protein expression was determined on scanned digital images of TMA cores (18) using a semi-automated image analysis system (Chromavision) with high reproducibility (19) that assessed staining intensity (0–255) and percent of positive stained area (0–100%). The study pathologist electronically circled areas of histologically recognizable prostate cancer to capture tumor expression.
Presence of the TMPRSS2:ERG fusion was previously evaluated on a subset of cases (N=107) using a fluorescence in situ hybridization (FISH) assay (20). In an earlier publication of the cohort, we showed that presence of the TMPRSS2:ERG fusion was associated with an almost 3-fold increased risk of cancer death (20).
TMA blocks were constructed without prior knowledge of clinical outcomes, and the pathologist remained blinded to outcome during immunohistochemistry evaluation.
A priori we used protein intensity for markers that stained primarily in the cytoplasm and percent staining for those staining primarily in the nucleus (Table 1). For individuals missing data on specific markers, we imputed values using the k-nearest-neighbor algorithm, which assigns missing data based on the values of a case's nearest neighbors, as defined by the other markers. We selected k=3, so that each case was compared to its 3 nearest neighbors and assigned the mean value of the three. More than 93% of our cases had complete markers, and 3% were missing expression for only 1 marker.
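The imputation step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the distance metric (root-mean-square difference over the markers observed in both cases), the absence of any rescaling between intensity and percent-staining markers, and the function name `knn_impute` are all assumptions made for the example.

```python
import math

def knn_impute(rows, k=3):
    """Fill missing entries (None) with the mean of the k nearest neighbors.

    rows: list of per-case marker lists; None marks a missing value.
    Assumption: distance is the root-mean-square difference over the
    markers observed in both cases (the source does not name a metric).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different case with marker j observed.
                if m == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                # Assign the mean value of the k nearest neighbors.
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical expression values for five cases and three markers.
example = [
    [1.0, 2.0, None],
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [0.9, 1.9, 14.0],
    [9.0, 9.0, 100.0],
]
imputed = knn_impute(example, k=3)
```

In this toy data the first case's three nearest neighbors are the second, third and fourth cases, so its missing marker is filled with their mean; the distant fifth case does not contribute.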
To create the molecular signature score, we divided expression of each marker into quartiles, based on the cohort distribution. For markers whose expression is upregulated in metastatic vs. localized cancer, based on prior data (Table 1), we assigned a score of 1 to those in the highest quartile of expression, and 0 otherwise. For markers with downregulated expression, a score of 1 was given to those in the lowest quartile of expression, and 0 otherwise. We calculated a weighted risk score across the 11 markers, multiplying the protein expression coding value (0 or 1) for each gene by its coefficient from the linear discriminant analysis (15), thereby prioritizing genes that provided the greatest discrimination in the original article.
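A minimal sketch of this scoring scheme is given below. The marker names and weights are hypothetical placeholders, not the linear discriminant coefficients from reference 15, and two details are assumptions: quartile cutoffs use linear interpolation between order statistics, and values exactly equal to a cutoff are scored 0.

```python
def quartile_cutoffs(values):
    """Return the 25th and 75th percentiles (linear interpolation)."""
    ordered = sorted(values)

    def percentile(p):
        pos = p * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return percentile(0.25), percentile(0.75)

def signature_score(expression, direction, weights):
    """Weighted sum of 0/1 marker codes for each man.

    expression: {marker: [value for each man]}
    direction:  {marker: "up" or "down"} relative to metastatic disease
    weights:    {marker: coefficient} (hypothetical values in this sketch)
    """
    n_men = len(next(iter(expression.values())))
    scores = [0.0] * n_men
    for marker, values in expression.items():
        q1, q3 = quartile_cutoffs(values)
        for i, v in enumerate(values):
            if direction[marker] == "up":
                code = 1 if v > q3 else 0   # highest quartile scores 1
            else:
                code = 1 if v < q1 else 0   # lowest quartile scores 1
            scores[i] += weights[marker] * code
    return scores

# Hypothetical two-marker example with eight men.
expression = {
    "marker_up": [1, 2, 3, 4, 5, 6, 7, 8],
    "marker_down": [5, 1, 7, 3, 8, 6, 4, 2],
}
direction = {"marker_up": "up", "marker_down": "down"}
weights = {"marker_up": 2.0, "marker_down": 1.0}
scores = signature_score(expression, direction, weights)
```

Because each marker contributes either zero or its full weight, markers with larger coefficients dominate the score, which is the prioritization the paragraph above describes.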
As described previously (17), we generated a weighted risk score that incorporated clinical predictors available on the cohort: age at diagnosis (continuous), Gleason grade (categorically, 2–5, 6, 7, 8–10), nuclear grade (categorically, grade I, II, III) (21) and tumor extent (22) (defined categorically as the proportion of chips with tumor, <5%, 5–24.9%, 25–49.9%, 50%+). The cohort was assembled prior to the introduction of PSA screening, and thus PSA levels at diagnosis were not available. Finally, we created a combined risk score of the multigene molecular signature and clinical markers to examine their joint predictive value. Men were then classified as having high, intermediate or low risk of lethal cancer based on their molecular signature score, their clinical risk score or their combined clinical/molecular risk score, each divided into quintiles. In separate analyses, we added TMPRSS2:ERG fusion status to the molecular and molecular-clinical risk scores to evaluate the additional prognostic information provided by the fusion, in combination with other markers, among the subset of men with fusion data.
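The quintile stratification of a risk score can be illustrated with a short rank-based sketch. The function name and the handling of tied scores (ties are split by sort order) are assumptions for the example; the source does not describe how ties were resolved.

```python
def quintile_groups(scores):
    """Assign each score to a risk stratum from 1 (lowest) to 5 (highest).

    Strata are formed by rank so that each holds about one fifth of the
    cohort. Assumption: tied scores are split by their sort order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = rank * 5 // len(scores) + 1
    return groups

# Hypothetical risk scores for ten men.
risk_scores = [0.3, 2.1, 1.0, 0.0, 4.2, 3.3, 0.7, 1.8, 2.9, 5.0]
strata = quintile_groups(risk_scores)
```

The same function applies unchanged to the molecular, clinical or combined score, since only the ordering of the scores matters.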
We used time-to-event analyses to evaluate the ability of the gene signature to predict prostate cancer death during follow-up. Person-time was calculated from the date of cancer diagnosis to the date of development of metastases or cancer death, with censoring at death from other causes or at the end of follow-up (March 2006). Hazard ratios (HR) and cumulative incidence differences, with 95% confidence intervals (CI), were estimated as effect measures using the Cox proportional hazards model.
Competing causes of death could play an important role in prostate cancer survival analyses. Men who died of another cause soon after diagnosis are less informative about prognosis, since some would have progressed had they lived longer. We categorized men as: lethal phenotype (men who developed metastases during follow-up, n=40), indolent phenotype (men who lived at least 10 years after diagnosis without metastases, n=49), and indeterminate phenotype (men who died of competing causes within 10 years of diagnosis, n=83), and compared clinical characteristics of the three groups. We estimated the cumulative incidence of lethal disease accounting for competing risks (23) using a publicly available SAS macro (24). Rather than treating other causes of death as censored observations, this method simultaneously analyzes multiple cause-specific hazards. We fit Cox models stratified by failure type (lethal cancer vs. other cause of death) and adjusted for clinical covariates.
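To illustrate why competing causes of death are not simply censored, the sketch below computes a cumulative incidence curve with a simple nonparametric estimator. This is a swapped-in technique chosen for brevity: it is not the covariate-adjusted, Cox-based SAS macro used in the study, and the function name, cause codes and example data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Nonparametric cumulative incidence under competing risks.

    causes: 0 = censored, 1 = lethal prostate cancer, 2 = other-cause death
    (codes are assumptions for this sketch).
    Returns a list of (time, cumulative incidence) at each event time.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0   # probability of no event of any cause so far
    incidence = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        while j < len(data) and data[j][0] == t:
            j += 1
        tied = [cause for _, cause in data[i:j]]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        if d_any:
            # Only men still event-free can fail from the cause of interest.
            incidence += event_free * d_interest / at_risk
            event_free *= 1 - d_any / at_risk
            curve.append((t, incidence))
        at_risk -= len(tied)
        i = j
    return curve

# Hypothetical follow-up times (years) and outcomes for five men.
curve = cumulative_incidence([1, 2, 3, 4, 5], [1, 2, 0, 1, 2])
```

In this toy example the estimated cumulative incidence of lethal disease reaches 0.5 by year 5. Treating the other-cause deaths as censored and taking one minus the Kaplan-Meier estimate would give 0.6 for the same data, because men who have already died of another cause would be treated as still able to develop lethal disease.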
In addition to estimating hazard ratios and cumulative incidence differences, we calculated receiver operating characteristic (ROC) curves for the three models (the molecular signature, the clinical model, and the combined molecular-clinical model), plotting sensitivity versus 1 − specificity to predict lethal disease at 15 years. We compared the area under the curve (AUC), where a value of 1.0 indicates perfect discrimination and 0.5 is no better than chance alone (25).
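The AUC has a direct interpretation as the probability that a randomly chosen man with lethal disease has a higher risk score than a randomly chosen man without it. The sketch below computes it that way; it treats the 15-year outcome as a simple binary label and ignores censoring, which is a simplifying assumption and may differ from the analysis performed in SAS.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = lethal disease by 15 years, 0 = not (binary outcome is an
    assumption of this sketch; censoring is ignored).
    Tied scores count as half a correct ordering.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case in each outcome group")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Hypothetical risk scores and outcomes for four men.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A score that ranks every lethal case above every non-lethal case yields 1.0, and a score that carries no information about outcome yields about 0.5, matching the interpretation given above.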
Analyses were undertaken using SAS statistical software (version 9.1). The research protocol was approved by the institutional review boards at the collaborating US and Swedish institutions.
Of 172 men with localized prostate cancer, 40% had high grade tumors and 19% had tumor volume greater than 25% (Table 2). During 28 years of follow-up, 40 men died of cancer (N=39) or were alive with bone metastases (N=1); 49 were long-term survivors who lived >10 years after their diagnosis without development of metastases; and 83 died of causes other than prostate cancer. Mean follow-up to development of metastatic disease was 7.6 years (range 0.1–27.1), and from metastasis to death was 2.0 years.
Men classified as having lethal disease tended to have tumors with higher Gleason grade, higher nuclear grade, and greater tumor extent than men with the indolent phenotype (Table 2). Men with the lethal phenotype were also more likely to have fusion-positive tumors. Men classified as indeterminate had clinical characteristics between those of the lethal and indolent phenotypes, reflecting a mixture in this group of men with indolent disease and men who would have developed lethal disease had they lived long enough.
Expression of Jagged1 and MTA1 was most strongly correlated with expression of other markers, showing correlation coefficients of 0.3 to 0.4 for positive correlations and −0.3 to −0.4 for inverse correlations; no one marker was correlated with all others. We evaluated each specific marker to predict prostate cancer death, adjusted for clinical parameters. The strongest molecular predictors (HR, 95% CI) of prostate cancer death were MTA1 (3.4, 1.2–9.2), p63 (1.8, 0.8–4.2), Jagged1 (1.8, 0.7–4.5) and ABP280 (1.6, 0.7–3.6). Interestingly, these markers were among the strongest discriminators of metastatic vs. localized disease in the publication by Bismar et al. (15).
Using the molecular markers, the age-adjusted hazard ratio of prostate cancer death increased with increasing risk group classification, with a 16-fold increased risk of cancer death comparing the highest versus lowest risk groups (Table 3). The multigene signature remained a significant predictor of lethal prostate cancer even controlling for clinical parameters: the hazard ratio of developing lethal disease was 12.3 (95% CI 1.5–100.7) comparing extreme risk groups, and there was increased risk for all risk categories compared to the lowest (p for trend = 0.0015). Moreover, the molecular signature was a significant predictor of lethal disease among men with low grade (Gleason score 4–6) tumors (HR = 16.9, p=0.007).
Gleason grade, tumor volume and nuclear grade were each independent predictors of prostate cancer prognosis. Men classified as highest risk based on the clinical markers were 13 times (95% CI 4.3–40.5) more likely to die of prostate cancer compared to the lowest risk group (Table 3). Interestingly, among men characterized as low or intermediate risk based on clinical parameters, the multigene signature could further stratify who would have good or bad prognosis (p for trend 0.028).
While both the molecular and clinical signatures independently predicted lethal phenotype, the best discrimination came from a score combining the multigene and clinical information. No man classified as lowest risk in the combined score developed metastasis or died of his disease (Table 3). As a result, we combined the two lowest risk strata as the referent category to calculate hazard ratios. With this comparison, the hazard ratio of developing lethal prostate cancer was 11-fold higher (95% CI 4.0–32.8).
Figure 1 shows cumulative incidence of lethal prostate cancer at 5, 10, 15 and 20 years of follow-up based on risk according to the combined multigene and clinical parameters. Even at 5 years, higher risk groups identified those who developed lethal disease (cumulative incidence difference 28.7%, 95% CI 17.4–40.0%). With continued follow-up, the difference in cumulative incidence of lethal cancer between the lowest and highest risk groups increased. Although the greatest discrimination in prediction was in contrasting the highest and lowest risk groups, the intermediate risk groups were also predictive of outcome.
ROC curves are presented in Figure 2. At 15 years follow-up, the predictive ability of the molecular signature alone (AUC 0.68) was similar to that of the clinical markers alone (AUC 0.71). The model that combined the molecular and clinical parameters provided the greatest discrimination (AUC 0.78), with a 10% improvement over the clinical markers alone (p=0.04). The highest risk score based on clinical parameters was a better classifier (higher sensitivity) than the molecular signature of those who would develop lethal disease. However, 10% of the lowest risk men based on clinical markers died of cancer or developed metastasis during follow-up, compared to 3% classified as low risk by the molecular signature and 0% classified by the molecular-clinical model, suggesting the molecular data could improve classification of those who would have a good prognosis.
Information was previously collected on the presence or absence of the TMPRSS2:ERG fusion on a subset of 107 men in the cohort (20). Information on TMPRSS2:ERG fusion status improved prognostication of the multigene model. At 15 years of follow-up, the AUC for the multigene signature + fusion data was 0.79, and for the combined molecular/clinical + fusion data was 0.83.
In this population-based cohort of men with initially untreated localized prostate cancer, we tested and validated a proposed multigene signature to predict lethal disease or long term survival. The overall probability of developing lethal prostate cancer was 1 in 5. The multigene model was a significant predictor of cancer prognosis, independent of clinical parameters, such that the probability of developing lethal disease was 1 in 20 for those classified at lowest risk, but 1 in 2 for those classified as highest risk. The signature distinguished lethal and indolent disease even among men with tumors Gleason <7. These data demonstrate that tumor markers at diagnosis can predict outcome more than 20 years hence, and suggest that in part the biologic phenotype of prostate tumors to have a lethal or indolent course is set early in the disease development.
The discriminatory ability of the molecular signature and clinical model were similar based on the ROC curves. However, the clinical model was a worse classifier for the low-risk group, and misclassified a greater proportion of men as indolent who in reality developed metastasis or died of their disease. In assessing classification, one should consider misclassification of truly lethal disease to be a more hazardous occurrence.
The combination of molecular and clinical data provided the greatest outcome discrimination. None of the lowest risk men (20% of the total) developed lethal disease, whereas by the end of follow-up, almost three-quarters of those classified as highest risk had died of their cancer or developed metastasis. While few would suggest active surveillance for a man diagnosed with Gleason 8 or higher tumors, molecular markers may be most informative in guiding treatment decisions among men with Gleason 6–7 tumors or where other clinical parameters are suggestive of low to mid risk. The improvement in the AUC for the combined multigene/clinical model compared to the clinical model alone suggests that prostate cancer prediction models may seek to combine both molecular and clinical data.
These data provide a proof of concept and demonstrate the potential utility of molecular signatures of lethal prostate cancer. The signature was imperfect, however; not all men with the multigene signature died of the disease. Moreover, the majority of deaths occurred in the middle risk groups, with mixed discriminatory ability, reflecting the need for better markers to classify outcomes. Nonetheless, the ability to predict accurately a man’s outcome from prostate cancer at the extreme quintiles could be of great clinical utility. Moreover, our data suggest that the recently identified TMPRSS2:ERG fusion may provide even greater improvement in prognostication, in combination with other markers. A set of molecular markers has the added potential benefit of being developed into a standardized and objective test. Clinical parameters such as Gleason grading involve a level of subjectivity, as demonstrated in the apparent Gleason score reclassification which has occurred over time (4).
For validation of biomarkers of prostate cancer prognosis, cancer-specific death or metastasis is the optimal outcome. While PSA recurrence is associated with an increased risk of prostate cancer death, most men with recurrences do not die of cancer (26, 27), so studies based on intermediary measures may be misleading. Long-term and complete follow-up is critical, since prostate-specific deaths can occur even 20 years after diagnosis (2, 4). The Örebro cohort has been followed prospectively with careful clinical annotation (2).
The cohort was followed by watchful waiting, and thus was initially treatment naïve, which provides an opportunity to characterize a man’s cancer as indolent even in the absence of therapy. Our study population derived from a well-defined catchment area, with similar clinical care for all patients, thus reducing potential selection biases. We applied a standardized histopathologic review for Gleason grading to avoid potential grade migration over time (4). Although the Örebro cohort was assembled in the pre-PSA era, the cancers were incidentally detected and likely resemble PSA-detected cases given the distribution of Gleason grade and stage. These TURP-detected tumors tended to be in the transitional zone, as opposed to peripheral tumors, so there might be concern that TURP-detected and PSA-detected tumors have different molecular phenotypes such that the current findings cannot be generalized to current clinical practices. However, there is little evidence to suggest meaningful differences in the biology of tumors in these zones and among different modes of presentation. Indeed, the multigene signature, developed on primarily peripheral zone specimens, was predictive of outcome among our cohort. We had no baseline PSA levels, a clinical predictor of outcome (28–30), but given that PSA level is not a strong prognostic predictor among men who opt for watchful waiting following diagnosis of localized prostate cancer (31), such information would likely provide only a small improvement in the predictive probability of the multigene/clinical risk score.
Our findings suggest that evaluation of prostate tumor biomarkers at diagnosis can enhance prediction models to aid in counseling patients and guide clinical practice. The signature can identify men at lowest risk of progression, for whom active surveillance may be most appropriate. Although prediction of the middle risk group is not perfect, the molecular tools can identify men for whom aggressive therapy would be indicated and thus substantially reduce the number needed to treat to avoid one prostate cancer death. The future challenge is to improve the molecular signature so that a greater proportion of men can be classified as low or high risk with similar or better discrimination.
The authors are grateful to Kelly Lamb and Lela Schumacher for technical support critical to this study, to Ryan Lee for his expert advice in statistical programming, and to David Havelick for expert editorial assistance. The tissue microarrays were constructed at the Dana-Farber/Harvard Cancer Center Tissue Microarray Core facility.
The project was supported by the NIH/NCI Prostate SPORE at the Dana-Farber/Harvard Cancer Center (NCI P50 CA090381), NIH T32 Training Grant CA009001 (LAM), NIH R01AG21404 (MAR, FD), and Deutsche Forschungsgemeinschaft DFG PE1179/1-1 (SP).