Multivariate prognostic instruments aim to predict risk of recurrence among patients with localized prostate cancer. We sought to devise a novel risk assessment tool which would be a strong predictor of outcome across various levels of risk, and which could be easily applied and intuitively understood.
We studied 1,439 men diagnosed between 1992 and 2001 who had undergone radical prostatectomy and were followed in the CaPSURE database (a longitudinal, community-based disease registry of prostate cancer patients). Disease recurrence was defined as prostate specific antigen (PSA) ≥0.2 ng/ml on 2 consecutive occasions following prostatectomy, or a second cancer treatment more than six months after surgery. The UCSF-CAPRA score was developed using pre-operative PSA, Gleason score, clinical T-stage, biopsy results, and age. The index was developed and validated using Cox proportional hazards and life table analyses.
210 patients (15%) recurred, 145 by PSA criteria and 65 by second treatment. Based on the results of the Cox analysis, points were assigned based on PSA (0-4 points), Gleason score (0-3), T stage (0-1), age (0-1), and biopsy data (0-1). The CAPRA score range is 0 to 10, with roughly double the risk of recurrence for each 2-point increase in score. Recurrence-free survival at 5 years ranged from 85% for a CAPRA score of 0-1 (95% CI 73-92%) to 8% for a score of 7-10 (95% CI 0-28%). The concordance index for the CAPRA score was 0.66.
The UCSF-CAPRA score is a straightforward yet powerful preoperative risk assessment tool. It must be externally validated in future studies.
An estimated 230,110 new cases of prostate cancer are predicted for 2004; 29,900 men are expected to die of the disease this year.1 While this mortality figure is second only to lung cancer, the high diagnosis to mortality ratio underscores the well-documented quandary presented by the fact that many men diagnosed with prostate cancer do not die of the disease.2 Definitive local therapy yields excellent long-term survival rates, and has been shown to reduce prostate cancer metastases and cause-specific mortality.3 All available active treatments, however, may exert a significant impact on patient health-related quality of life (HRQOL).4 Clinicians must therefore attempt to determine at the time of diagnosis who among their patients might do well with active surveillance, who should receive immediate local treatment, who require aggressive multimodal therapy, and who should be treated presumptively for advanced disease.
Numerous algorithms and nomograms have been developed which aim to predict either pathological stage or biochemical recurrence following treatment.5 These range in complexity from a three-level categorization published by D'Amico et al6 to the nomogram devised by Kattan et al7 which calculates likelihood of recurrence as a continuous variable but requires a multi-step paper tool or a computer program to use. We aimed to devise a predictive tool for decision-making before primary treatment which would provide more detailed and accurate risk stratification data than the D'Amico classification, yet offer easier calculation and application than the Kattan nomogram.
CaPSURE™ (Cancer of the Prostate Strategic Urologic Research Endeavor) is a longitudinal, observational database of men with biopsy-proven prostate adenocarcinoma, recruited from 40 primarily community-based urology practices across the United States. Newly-diagnosed prostate cancer patients are recruited consecutively by participating urologists, who report complete clinical data and follow-up information on diagnostic tests and treatments. Informed consent is obtained from each patient under institutional review board supervision. Patients are treated according to their physicians' usual practices, and are followed until time of death or withdrawal from the study.8, 9
The prostate specific antigen (PSA) value used was the highest PSA value recorded in the nine months prior to diagnosis. 2002 clinical TNM stage was the highest reported from 1 month prior to 3 months after the date of diagnosis. Gleason scores were recorded from the diagnostic biopsy site with the highest total and highest primary scores. Percent positive biopsies (PPB) was calculated from detailed reported biopsy data. Disease recurrence after radical prostatectomy (RP) was defined as two consecutive PSA values ≥0.2 ng/ml at any time postoperatively or any additional treatment more than six months after RP. The date of recurrence was defined as the earlier of the second PSA ≥0.2 ng/ml or the date additional treatment was initiated. If disease recurrence did not occur, the patient's follow-up time was censored at the date of the last recorded PSA.
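The recurrence and censoring rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the function name `recurrence_status`, the input layout, and the approximation of "six months" as 183 days are all hypothetical choices made for this sketch.

```python
from datetime import date, timedelta

PSA_THRESHOLD = 0.2               # ng/ml, as stated in the text
SIX_MONTHS = timedelta(days=183)  # assumption: "six months" taken as 183 days


def recurrence_status(rp_date, psa_series, second_treatment_date=None):
    """Return (recurred, event_or_censoring_date) for one patient.

    rp_date: date of radical prostatectomy.
    psa_series: list of (date, value) pairs for postoperative PSA tests.
    second_treatment_date: date additional treatment started, or None.
    """
    psas = sorted(psa_series)

    # PSA criterion: two consecutive values at or above the threshold;
    # the recurrence date is the date of the second such value.
    psa_date = None
    for (_, prev_value), (cur_date, cur_value) in zip(psas, psas[1:]):
        if prev_value >= PSA_THRESHOLD and cur_value >= PSA_THRESHOLD:
            psa_date = cur_date
            break

    # Treatment criterion: additional treatment more than six months after RP.
    tx_date = None
    if (second_treatment_date is not None
            and second_treatment_date - rp_date > SIX_MONTHS):
        tx_date = second_treatment_date

    event_dates = [d for d in (psa_date, tx_date) if d is not None]
    if event_dates:
        return True, min(event_dates)  # earlier of the two criteria

    # No recurrence: censor at the last recorded PSA (None if no PSA exists).
    last_psa_date = psas[-1][0] if psas else None
    return False, last_psa_date


if __name__ == "__main__":
    rp = date(2000, 1, 1)
    series = [(date(2000, 4, 1), 0.1),
              (date(2000, 10, 1), 0.2),
              (date(2001, 4, 1), 0.3)]
    print(recurrence_status(rp, series))
```

Treatment started within six months of surgery is ignored by this sketch, which matches the definition in the text but says nothing about how such patients were handled elsewhere in the analysis.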
As of July 2003, 10,018 patients were enrolled in CaPSURE. 4128 of these elected RP as primary treatment for their prostate cancer. We included patients diagnosed between 1992 and 2001 with clinically localized disease (clinical stage T1c-3a, N0/x, M0/x) who did not receive neoadjuvant or adjuvant radiation or hormonal therapy (N=2154). We excluded patients with unknown PSA, Gleason score, clinical T-stage, or PPB. We also limited the analysis to patients with at least a sextant biopsy, PSA ≥2 ng/ml at diagnosis, and at least two follow-up PSAs or evidence of additional treatment more than 6 months after RP. 1439 patients meeting these criteria constituted our analytic dataset.
Our goal in developing this predictive index was to maximize the ability of the score to predict disease-free survival (DFS) while maintaining the simplicity and clinical applicability of the tool. The variables initially considered for inclusion in the index included PSA, Gleason score, T stage, PPB, age at diagnosis, and ethnicity. We began by including all of these variables in a Cox proportional hazards model with detailed categories (PSA as 2-4, 4.1-6, 6.1-8, 8.1-10, 10.1-20, 20.1-30, >30; Gleason as 1-2/1-2, 1-2/3, 3/1-2, 3/3, 1-3/4-5, 4-5/1-3, 4-5/4-5; T-stage as T1c, T2a, T2b, T2c, T3a; PPB as <15%, 15-25%, 26-33%, 34-50%, 51-66%, 67-79%, ≥80%; age as <50, 50-54, 55-59, 60-64, 65-69, ≥70; and ethnicity as African-American, Caucasian, other).
The results of the initial model were reviewed to determine whether any variables could be eliminated and which category levels could be collapsed. Ethnicity was not a significant predictor of DFS [hazard ratio (HR) for African-American ethnicity 1.16, p=.53, HR for other/unknown ethnicity 0.31, p=.10]. Further refinement of the model was accomplished primarily through an examination of the model's parameter estimates (PEs): category levels within each variable with similar PEs could be combined to reduce the model. This process was repeated iteratively until the final model was reached.
We also built models incorporating various mathematical transformations of PSA, including linear, logarithmic, truncated, sigmoidal, cubic spline, and piecewise linear. There was in fact a slight improvement in the performance of the model using the piecewise linear PSA function rather than categorized PSA levels. However, the relatively small gain in accuracy would not be worth the large loss in simplicity and ease of use; therefore categorized PSA was retained in the final model. Finally, in addition to the various categorizations of PPB, we examined the percent of biopsy cores positive from the more involved side of the prostate only, as well as the absolute number—rather than percentage—of cores positive. These variables, however, did not provide any additional predictive information.
Once the final model was specified, the PEs were used to assign points for each level of the variables in the model. We decided that each 2-point increase in the final index should represent approximately a doubling of risk for the outcome of disease recurrence. Proportional hazards model PEs are calculated on the log scale; thus a PE of 0.7 would result in a doubling of risk, and each 0.35 increase in PE would be worth 1 point. Points were thus assigned to each level of the final variables in the index. The final CAPRA score for each patient was calculated by summing the points for each variable in the model.
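The point-assignment rule can be illustrated with a short Python sketch. The 0.35 step comes from the text; the function name `pe_to_points` and the choice to round to the nearest whole point are assumptions made here, since the text does not say how fractional values were resolved.

```python
import math

PE_PER_POINT = 0.35  # from the text: roughly half of ln(2), which is about 0.693


def pe_to_points(parameter_estimate):
    """Convert a Cox parameter estimate (log hazard ratio) to index points.

    Assumption: fractional results are rounded to the nearest integer.
    """
    return int(round(parameter_estimate / PE_PER_POINT))


if __name__ == "__main__":
    # Two points correspond to a PE of 0.70, i.e. a hazard ratio near 2.
    print(pe_to_points(0.70), math.exp(0.70))
    for pe in (0.0, 0.40, 1.05):
        print(pe, pe_to_points(pe))
```

Because the hazard ratio is the exponential of the PE, summing points across variables corresponds to multiplying hazard ratios, which is what allows a simple additive score to approximate the fitted model.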
CAPRA scores were calculated for the men in the analytic population and were included in Cox proportional hazards regression models, with HR calculated for each CAPRA score. Life table and Kaplan-Meier analysis were used to determine the probability of DFS at 3 and 5 years for each CAPRA score level.
We calculated Kattan nomogram scores7 for each man in the dataset, and classified each according to a modification of D'Amico et al's risk groupings.6 A patient with PSA <10 ng/ml, no Gleason pattern 4 or 5 disease, and a clinical stage of T1 or T2a was low risk. Intermediate risk patients were those with PSA 10.1-20 ng/ml, Gleason 7 or <7 with 4 or 5 as the secondary pattern, or clinical stage T2b-c. High risk patients had PSA >20 ng/ml, Gleason >7 or 4 or 5 as the primary pattern, or clinical stage T3a.
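As a rough illustration, the modified risk grouping described above could be coded as follows. The function name `damico_risk` and the stage strings are hypothetical. The text leaves PSA values from 10 up to 10.1 ng/ml unassigned; this sketch assumes a PSA of exactly 10 is low risk and anything above 10 is at least intermediate.

```python
def damico_risk(psa, gleason_primary, gleason_secondary, t_stage):
    """Classify a patient as 'low', 'intermediate', or 'high' risk.

    psa: ng/ml at diagnosis.
    gleason_primary, gleason_secondary: Gleason patterns (1-5).
    t_stage: clinical stage string such as 'T1c', 'T2a', 'T2b', 'T2c', 'T3a'.
    """
    gleason_sum = gleason_primary + gleason_secondary

    # High risk: any single high-risk feature is sufficient.
    if (psa > 20 or gleason_sum > 7 or gleason_primary >= 4
            or t_stage == "T3a"):
        return "high"

    # Intermediate risk: any intermediate feature, with no high-risk feature.
    # Assumption: PSA above 10 (not only 10.1 and over) counts here.
    if (psa > 10 or gleason_sum == 7 or gleason_secondary >= 4
            or t_stage in ("T2b", "T2c")):
        return "intermediate"

    # Low risk: PSA not above 10, no pattern 4 or 5, stage T1 or T2a.
    return "low"


if __name__ == "__main__":
    print(damico_risk(5.0, 3, 3, "T1c"))
    print(damico_risk(15.0, 3, 3, "T2a"))
    print(damico_risk(5.0, 4, 3, "T1c"))
```

Checking the high-risk criteria first means a patient is always placed in the highest category for which any one criterion is met, regardless of how many lower-risk features are also present.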
Relationships among the CAPRA, Kattan, and D'Amico scores were assessed by Pearson correlation coefficients (r), frequency analysis of D'Amico categories by CAPRA level, and mean Kattan nomogram score per CAPRA level. We also calculated the concordance index (c-index) for each algorithm. The c-index in survival analysis is the proportion of randomly paired patients for whom the patient with the higher probability of recurrence (higher CAPRA score, higher D'Amico risk category, lower Kattan nomogram score) also had the earlier observed disease recurrence. The concordance index ranges from 0 to 1, with 1 indicating perfect concordance and 0.5 indicating discrimination no better than chance.
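The idea behind the c-index can be shown with a small pairwise computation. This is a sketch of a Harrell-type concordance index for censored data, not the S-Plus routine used in the study; the function name, the half credit given to tied scores, and the rule for which pairs are usable are assumptions of the sketch.

```python
def concordance_index(times, events, risk_scores):
    """Proportion of usable patient pairs ranked correctly by risk score.

    times: follow-up time for each patient (to recurrence or censoring).
    events: True/1 if recurrence was observed, False/0 if censored.
    risk_scores: higher value means higher predicted risk of recurrence.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only when patient i had an observed
            # recurrence strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # assumption: ties get half credit
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


if __name__ == "__main__":
    follow_up = [10, 20, 30]   # months
    recurred = [1, 1, 0]
    scores = [5, 3, 1]         # e.g. CAPRA-like scores
    print(concordance_index(follow_up, recurred, scores))
```

For an instrument such as the Kattan nomogram, where a lower score indicates a higher probability of recurrence, the scores would need to be negated before being passed to a function like this one.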
All analyses were performed using Statistical Analysis Software (SAS) version 8.2, except for the c-index, which was calculated using S-Plus version 6.0.
Of the 1439 patients, 210 (15%) recurred at a median of 21 months. 145 (69%) of these failed by PSA criteria, and 65 (31%) by second treatment. Those who failed by the second treatment criterion did so at a median 45 months following prostatectomy, with a median PSA value of 0.3 at time of second treatment. The remaining 1229 patients were censored at a median of 24 months. The mean patient age was 62. 22 patients (8.4%) were African-American, 52 (3.6%) were of other or unknown ethnicity, and the remainder were Caucasian.
The final UCSF-CAPRA scoring system is presented in Table 1, which also illustrates the distribution of patients and crude recurrence rates across the levels of these variables. Points for each variable are totaled to yield a final CAPRA score of 0-10. The parameter estimates and HRs for each are given in Table 2.
Table 3 presents the distribution of patients across the possible CAPRA scores, as well as the crude recurrence rates for each score. 88% of the patients had scores in the range of 1-4; only 2% had scores above 6, and none had a score of 10. Crude recurrence rates were 10.9% for patients with scores 0-3 and 78.6% for those with scores ≥7. In the Cox regression analysis, the PE for each incremental point in the index was 0.42 (p<.0001), with a HR for recurrence of 1.5 (95% CI 1.4-1.6). This result indicates roughly a doubling of risk of recurrence on average with every two-point increase in CAPRA score. The PEs and HR values with 95% CI for each individual score are presented in Table 4. This table also includes the results of the Kaplan-Meier analysis: patients with CAPRA scores of 0 or 1 had 3- and 5-year recurrence-free survival (RFS) rates of 91% and 85%, respectively. For those with CAPRA scores of 7 or more, these rates were 24% and 8%, respectively. Figure 1 presents the survival curves.
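The link between the per-point PE and the "doubling every two points" statement can be checked with a few lines of Python. The value 0.42 is taken from the text; the function name `hazard_ratio` is hypothetical, and the calculation assumes the per-point effect is constant across the score range, as the single-coefficient Cox model implies.

```python
import math

PE_PER_CAPRA_POINT = 0.42  # from the text: Cox PE per incremental point


def hazard_ratio(point_difference, pe_per_point=PE_PER_CAPRA_POINT):
    """Hazard ratio implied by a given difference in score points."""
    return math.exp(pe_per_point * point_difference)


if __name__ == "__main__":
    for diff in (1, 2, 4):
        print(diff, round(hazard_ratio(diff), 2))
```

A one-point difference gives a hazard ratio of about 1.5, matching the reported value, and a two-point difference gives a ratio somewhat above 2, consistent with the approximate doubling described.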
Correlation among the UCSF-CAPRA and the D'Amico and Kattan instruments is as follows: between CAPRA and Kattan, r=.77; between CAPRA and D'Amico, r=.74; and between Kattan and D'Amico, r=.71 (all p<.0001). Table 5 presents the median Kattan score and distribution of D'Amico risk groups at each CAPRA score. The c-indices for the three systems were 0.63 (0.55-0.72) for D'Amico, 0.65 (0.56-0.75) for Kattan, and 0.66 (0.57-0.75) for CAPRA.
As of April 2000, Ross et al had counted 42 nomograms published for evaluation of the various stages of prostate cancer;5 others have been published more recently. Another offering to this already rich literature certainly must be justified. The D'Amico classification performs well in identifying patients at low risk, but there is significant overlap among patients classified to the intermediate and high risk categories.10 This is not surprising, as the classification does not account for multiple adverse risk factors, such that a patient with PSA 19 ng/ml, Gleason 4+3, stage T2b disease, for example, would be considered at lower risk than one with PSA 5 ng/ml, Gleason 4+4, stage T1c.
Other instruments in current use, such as the Kattan nomogram or the tables published by Partin et al,11 better integrate multiple risk factors, but are difficult to apply without having paper copies of the instruments at hand. A hand-held computer version of the Kattan nomogram is available, but with this approach the derivation of the predicted outcome score—and the relative weights given each variable—becomes completely opaque to both clinician and patient. We have also shown recently that the Kattan nomogram may overestimate likelihood of recurrence-free survival in the community setting, especially among patients with relatively low risk tumors.12
The CAPRA score performed well in this community-based cohort, with a concordance index comparable to that of the Kattan nomogram. As should be expected from an RP cohort, the distribution of scores was shifted toward low scores, representing lower risk. Consistent with other nomograms, the greatest predictors of risk remain the PSA and Gleason score. There is known to be natural fluctuation in PSA levels in the absence of active treatment. The decision to use the highest PSA prior to treatment was an arbitrary one, but the highest and last PSA levels were different among only 4.6% of the patients; moreover only 1.6% would have a different CAPRA score were the last rather than highest PSA used. We chose to exclude 75 patients with a PSA <2 ng/ml, who constitute a very small fraction of the CaPSURE patients and do not represent typical prostate cancer patients. They had unusually indolent tumors, and to include them would have precluded the development of a 0-10 score applicable to the large majority of contemporary newly-diagnosed patients.
Similar reasoning led to the exclusion of patients with fewer than 6 biopsy cores taken. Again, we aimed to design an instrument germane to prostate cancer patients newly diagnosed in 2004; to include those with fewer cores would have significantly distorted the analysis of PPB. The importance of systematic biopsy is increasingly well-recognized among CaPSURE urologists: by 2002, only 2.8% of patients in our cohort had fewer than 6 biopsy cores taken. We therefore excluded these patients, at the cost of excluding more patients diagnosed earlier in the study period and thus shortening our mean follow-up time.
Clinical T stage as assessed by digital rectal exam was not a significant predictor of outcome in our model except in the case of palpable extracapsular extension (stage cT3a) which raises the score by 1. Age at diagnosis is not a significant variable in most extant nomograms. In our dataset, although there was not a statistically significant difference, the parameter estimate for men under 50 compared to over 50 at diagnosis was large enough to merit inclusion in the CAPRA score. 4% of patients in our cohort were under 50 and were half as likely to recur as men over 50. This finding is consonant with recent reports from academic series which report better pathological and biochemical outcomes among men under 50.13, 14
Evidence from numerous studies over the past several years indicates that information derived from the results of the diagnostic biopsy contributes significantly to accurate risk assessment among patients with newly diagnosed localized disease.15-18 In the CAPRA model, PPB only contributed one point despite analysis of this variable in multiple forms. It may be the case that variability in the community setting in urologists' biopsy techniques and/or pathologists' interpretations of specimens, or some other unexplained factor, reduces the predictive power of this variable relative to that found in academic series. Of note, a recent attempt at improving the Kattan nomogram by including PPB and other biopsy data produced only minor improvements in its predictive power.19 PPB may also not be the best measure of tumor burden on biopsy; Freedland et al, for example, demonstrated that the total percentage of biopsy tissue involved was a better predictor of pathological and biochemical outcomes than the percentage of cores positive.20 This variable is not yet collected in CaPSURE, however, and therefore could not be considered for inclusion in CAPRA.
Within this large cohort of primarily community-based RP patients, the UCSF-CAPRA score proved to have predictive accuracy for biochemical recurrence comparable to the Kattan nomogram, and is significantly easier to calculate and apply both for clinical and research purposes. This novel instrument clearly must be validated in other cohorts of RP patients; we also plan to test the instrument in cohorts of radiotherapy patients. We further recognize that biochemical recurrence is not an ideal proxy for either disease-specific or overall mortality, and we will investigate the CAPRA score's ability to predict these outcomes as more of the patients in CaPSURE reach them. Clearly, no nomogram can replace clinical decision-making in men with localized prostate cancer: patients and physicians must weigh such factors as patients' preferences for various quality of life outcomes as well as risk of disease recurrence or progression. Nonetheless, the UCSF-CAPRA offers practical guidance for these decisions, and will also serve as a useful risk-stratification tool for future prostate cancer studies.
CaPSURE™ is a research collaboration between UCSF and TAP Pharmaceutical Products, Inc., in Lake Forest, IL (TAP). Funding for this research project was provided by TAP and the National Institutes of Health/National Cancer Institute University of California-San Francisco SPORE Special Program of Research Excellence p50 c89520.