Pretreatment prostate-specific antigen (PSA) dynamics (PSA velocity and PSA doubling time) are widely advocated as useful prognostic markers in prostate cancer. We aimed to assess the published evidence for the clinical utility of PSA dynamics in this population.
We conducted a systematic review of studies published before March 2007 in which a PSA dynamic (velocity or doubling time) was calculated in patients before definitive treatment, a subsequent event (such as biopsy or recurrence) was ascertained, and the association between the two was analyzed. Our principal end point was the type of analysis reported, particularly whether the predictive accuracy of a statistical model that included both absolute PSA level and a PSA dynamic was compared with that of a model that included only PSA.
Eighty-seven articles were eligible for analysis. The most common end points were biopsy (42 articles), and either recurrence (14 articles) or metastases or death (14 articles) after definitive therapy. Although PSA dynamics were generally found to be associated with outcome, only one article compared predictive accuracy of models with and without a PSA dynamic: this reported that PSA velocity improved prediction slightly (from 0.81 to 0.83), but was subject to verification bias. No article used decision analytic methods to examine the clinical impact of PSA dynamics.
There is little evidence that calculation of PSA velocity or doubling time in untreated patients provides predictive information beyond that provided by absolute PSA level alone. We see no justification for the use of PSA dynamics in clinical decision making before treatment in early-stage prostate cancer.
Prostate-specific antigen (PSA) is one of the few molecular markers routinely used for detection, prognostication, and monitoring of a common cancer. Most widely known as a screening test to detect prostate cancer, PSA is also known to be of value to risk stratify patients at the time of surgery, provide an early indication of disease recurrence, and monitor response to therapy in patients with advanced disease.
Cancer is a growth process, and it seems reasonable to suppose that the rate of change of a tumor marker would be a more sensitive marker of disease aggressiveness than an absolute level: one might presume, for example, that a patient with a small but rapidly growing tumor is more likely to die of cancer than a patient with a larger burden of more indolent disease. Accordingly, since Carter et al1 introduced the concept in 1992, the rate of change in PSA—PSA dynamics—has become the focus of intense research activity in prostate cancer. Two common metrics are PSA velocity and PSA doubling time. PSA velocity is the change in PSA over time, typically given as nanograms per milliliter per year; PSA doubling time is the number of months for a certain level of PSA to increase by a factor of two. Despite their apparent simplicity, PSA velocity and doubling time have been defined in a large number of different ways, with investigators varying with regard to the minimum number of PSA measures needed to calculate a dynamic, the minimum and maximum time between measures, and the statistical method for estimating change when there are more than two PSA levels.
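As a rough illustration of the two metrics, the sketch below computes PSA velocity as the least-squares slope of PSA against time, and PSA doubling time as ln 2 divided by the slope of log PSA against time. This is only one of the many definitions in use. The function names and the example values are hypothetical and are not taken from any of the studies reviewed here.

```python
import math


def ols_slope(times, values):
    """Least-squares slope of values regressed on times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(times) / len(times)
    mean_v = sum(values) / len(values)
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("measurements must be taken at different times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def psa_velocity(times_years, psa_ng_ml):
    """PSA velocity in ng/mL/yr (assumed definition: linear regression slope)."""
    return ols_slope(times_years, psa_ng_ml)


def psa_doubling_time_months(times_years, psa_ng_ml):
    """PSA doubling time in months, assuming exponential growth of PSA."""
    slope = ols_slope(times_years, [math.log(p) for p in psa_ng_ml])
    if slope <= 0:
        return math.inf  # PSA is flat or falling, so it never doubles
    return 12 * math.log(2) / slope


# Hypothetical annual measurements, for illustration only.
print(psa_velocity([0, 1, 2], [2.0, 3.0, 4.0]))
print(psa_doubling_time_months([0, 1, 2], [1.0, 2.0, 4.0]))
```

With more than two measurements, a regression slope and a simple first-to-last difference can give different answers, which is one reason the definitions used across studies are hard to compare.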
PSA dynamics have been advocated as a marker for a wide range of different end points in prostate cancer. An important distinction needs to be made between pretreatment and post-treatment PSA dynamics. In the latter case—which includes, for example, PSA doubling time at recurrence after radical prostatectomy—patients do not have an intact prostate, PSA levels derive only from cancer, and PSA is thus a sensitive marker of cancer burden. As such, PSA dynamics would be expected to match cancer growth rates closely. Indeed, there are data suggesting that PSA dynamics can predict outcome of salvage radiation therapy,2 the probability of a positive bone scan at biochemical recurrence,3 and overall survival in patients with hormone-refractory disease.4
Pretreatment PSA levels depend on both malignant and nonmalignant processes in the untreated prostate. PSA dynamics are therefore related to cancer growth rates only in part. Nonetheless, pretreatment PSA dynamics have been claimed to aid long-term prediction of cancer diagnosis,5 cancer detection,6 prediction of biochemical7 and clinical8 recurrence after curative therapy, and prediction of progression in patients on active surveillance.9 Pretreatment PSA dynamics are also starting to become incorporated into clinical practice guidelines. For example, the National Comprehensive Cancer Network 2007 guidelines for prostate cancer detection10 include a recommendation that men with a PSA velocity greater than 0.35 ng/mL/yr should consider biopsy, even if their PSA level is low.
We recently completed two separate studies of PSA dynamics before treatment, and in neither case did we find evidence that the rate of change of PSA was of value. In our first study, we found that PSA velocity did not improve on PSA alone for predicting long-term risk of prostate cancer;11 in the second study, PSA dynamics did not help predict either recurrence or prostate cancer mortality after radical prostatectomy.12 This led us to reassess the evidence base for pretreatment PSA velocity and doubling time as prostate cancer markers. Here, we report a systematic review of studies investigating pretreatment PSA dynamics.
We searched MEDLINE to the end of February 2007 for articles on prostate cancer and PSA dynamics. The search terms are shown in Table 1. We supplemented our searches by writing to researchers who had published on PSA dynamics, asking for details of additional studies.
To be eligible for analysis, articles had to meet four criteria. First, our interest was pretreatment PSA dynamics, so we specified that patients must have had an intact prostate at the time of the final PSA measure required for calculation of the PSA dynamic. Second, the study had to include at least one of the following end points: diagnosis of prostate cancer; stage or grade of prostate cancer; biochemical recurrence after radiotherapy or radical prostatectomy; progression for patients on active surveillance; metastases or death from prostate cancer. Third, the study had to include at least two measures of PSA before a clearly defined point in the patient's clinical history, such as biopsy for studies of cancer detection, or radical prostatectomy for a study looking at biochemical recurrence. This was a particular concern for active surveillance studies: in many of these studies, PSA dynamics were calculated not at a specified landmark time, such as 1 year after diagnosis, but at the patient's last follow-up before progression or censoring. Because a value calculated in this way cannot be used for prediction, such studies were excluded. Fourth, the study had to evaluate the association between the PSA dynamic and the end point.
Articles that did not meet the eligibility criteria were categorized according to the reason: not human subjects; no original data (eg, review article); no PSA dynamic measured; no eligible study end point; patients with treated prostate cancer at the time of evaluation of prostate cancer dynamics; no association between dynamic and study end point assessed; dynamic not assessed at a defined landmark time before progression on active surveillance protocols; other reason for exclusion.
Data were extracted from each study using a standardized data extraction form. The following variables were documented: the study end points; PSA dynamic assessed (velocity or doubling time); type of cohort; number of patients; calculation of the PSA dynamic; use of cut points; number of patients experiencing the study end point; type of statistical analysis; authors’ conclusion.
The central focus of data extraction concerned the statistical methods. We used the schema of a previously published systematic review of molecular marker studies that follows a hierarchy of evidence regarding the clinical value of a molecular marker.13 We first documented whether a P value was reported for a test of association between the PSA dynamic and the study end point, either univariately or in a multivariate model that included the level of PSA. Next, we documented whether a measure of predictive accuracy was reported for either the univariate or multivariate model, including, but not limited to, sensitivity, specificity, area under the curve, negative or positive predictive value, and concordance index (C-index). Our key question concerned comparisons of predictive accuracy: whether the predictive accuracy of PSA alone was compared with that of a PSA dynamic or, alternatively, whether the predictive accuracy of a model that included PSA was compared with a model that included both PSA and a PSA dynamic. Other predictors, such as stage or grade, could be included if the additional predictors were identical in the models with and without the PSA dynamic. We also assessed whether any attempt was made to evaluate the clinical impact of incorporating PSA dynamics in clinical decisions, for example, by estimating the number of additional cancers identified or biopsies avoided.
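The key comparison described above can be sketched as follows, using the area under the curve as the accuracy measure for a binary end point. The predicted risks are hypothetical numbers standing in for the output of two already-fitted models (PSA alone, and PSA plus a PSA dynamic); the function names are illustrative assumptions, not a method reported by any reviewed study.

```python
def auc(scores, outcomes):
    """Area under the ROC curve: the probability that a randomly chosen
    patient with the event scores higher than one without (ties count 0.5)."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def accuracy_gain(risk_base, risk_extended, outcomes):
    """Increment in AUC from adding the marker to the base model."""
    return auc(risk_extended, outcomes) - auc(risk_base, outcomes)


# Hypothetical predicted risks for four patients, for illustration only.
outcomes = [0, 0, 1, 1]
risk_psa_only = [0.10, 0.40, 0.35, 0.80]
risk_psa_plus_dynamic = [0.10, 0.30, 0.35, 0.80]
print(auc(risk_psa_only, outcomes))
print(accuracy_gain(risk_psa_only, risk_psa_plus_dynamic, outcomes))
```

In practice the two sets of risks should come from predictions on data not used to fit the models (for example, by cross-validation), because in-sample accuracy tends to favor the model with more predictors.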
Authors’ conclusions were defined as positive (1) if at least one PSA dynamic was reported to be clinically useful, or to improve prediction, for at least one end point, or (2) if the PSA dynamic was reported to be associated with at least one end point and neither clinical value nor improvement in prediction was evaluated. Conclusions were categorized as negative if PSA dynamics were found to be unassociated with outcome, did not improve prediction, or were not clinically useful.
Our primary planned analyses were descriptive: proportions for categoric variables and medians and interquartile ranges for data such as sample size. We also planned to test whether there was an association between the statistical analyses used and the conclusions drawn by study authors. We hypothesized that authors might be more willing to draw positive conclusions about a PSA dynamic if they only assessed the statistical significance of the association between the dynamic and outcome, as compared to if they evaluated predictive accuracy. Finally, we planned to describe in detail all articles that reported whether a PSA dynamic increased the predictive accuracy of a model that included PSA alone. In these analyses, we were particularly interested in assessing “verification bias”:14 in screening studies, men with low PSA level are unlikely to undergo biopsy; assuming that such men are cancer free leads to bias when assessing the relationship between PSA and prostate cancer.
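A small numerical sketch may help show how verification bias arises. The counts below are entirely hypothetical. The naive calculation treats every unbiopsied man as cancer free; the weighted calculation instead weights each biopsied man by the inverse of the biopsy fraction in his PSA stratum, which assumes that biopsied men are representative of their stratum. This is a simple inverse-probability weighting illustration and is not necessarily the correction method cited above.

```python
def naive_sensitivity(strata):
    """Sensitivity of a 'high PSA' test when unbiopsied men are
    assumed to be cancer free."""
    found_high = strata["high"]["cancers_found"]
    found_low = strata["low"]["cancers_found"]
    return found_high / (found_high + found_low)


def weighted_sensitivity(strata):
    """Sensitivity after weighting biopsied men by the inverse of the
    biopsy fraction in their PSA stratum (assumes biopsy is unrelated
    to cancer status within a stratum)."""
    estimated = {}
    for name, s in strata.items():
        weight = s["men"] / s["biopsied"]
        estimated[name] = s["cancers_found"] * weight
    return estimated["high"] / (estimated["high"] + estimated["low"])


# Hypothetical cohort: all men with high PSA are biopsied, few with low PSA.
cohort = {
    "high": {"men": 100, "biopsied": 100, "cancers_found": 30},
    "low": {"men": 900, "biopsied": 50, "cancers_found": 5},
}
print(naive_sensitivity(cohort))
print(weighted_sensitivity(cohort))
```

In this made-up cohort the naive estimate is far higher than the weighted one, and the weighted estimate rests on only five cancers found in the low-PSA stratum, so it is itself unstable.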
Inclusion and exclusion of articles was decided by one reviewer (M.F.O. or C.S.) and confirmed by a second (A.J.V.). The review methods were first piloted on a sample of 10 articles by one reviewer (M.F.O.). After review of the protocol, all articles were then assessed by a different reviewer (C.S.) and checked independently (A.J.V.), with disagreements resolved by consensus.
A total of 1,882 articles were retrieved by our MEDLINE search, of which 87 were found to be eligible. Reasons for exclusion are listed in Table 2.
Of the included articles, PSA velocity was studied in 64 articles, PSA doubling time was studied in 17 articles, and both metrics were studied in six articles. Five articles included an idiosyncratic definition of PSA dynamics in addition to doubling time or velocity. The median number of patients was 295 (interquartile range, 86 to 1,095 patients). Cohorts were categorized as population-based for 35 articles (40%), referral for 22 articles (25%), and clinical for 30 articles (34%). Approximately half of studies (40 articles) evaluated PSA dynamics as continuous variables in all analyses; of the remainder, 12 articles used a data-dependent cut point, 14 articles used a prespecified cut point, and in 21 articles, the rationale for the choice of cut point was unclear.
Table 3 describes the end points assessed in the different studies. Slightly more than half of studies focused on the detection of prostate cancer. A total of 110 end points were analyzed in the 87 articles. The median number of events for each end point (number of events missing for five end points) was 38 (interquartile range, 17 to 120 events), but was sometimes low: 16 end points included 10 or fewer events.
Table 4 shows the statistical methods used to evaluate the association between PSA dynamics and study end points. There was an overwhelming focus on hypothesis testing, whether in the univariate or multivariate context. Only a minority of articles reported on the predictive accuracy of PSA dynamics, and only two articles compared the predictive accuracy of a model including both a PSA dynamic and PSA with the accuracy of a model including PSA without the PSA dynamic. In one of these two articles,15 the authors stated that “inclusion of PSA velocity did not improve the out-of-sample prediction of prostate cancer risk,” but did not provide data. Accordingly, we did not formally categorize this article as comparing accuracy with and without PSA dynamic.
Forty-seven articles (54%) were classified as reporting positive results, with 30 articles (34%) reporting negative results, and 10 articles (11%) reporting that results were unclear. There was no significant relationship between the statistical methods used in an article and whether the authors reported positive or negative results (P > .2 by Fisher's exact test).
We originally planned to describe in more detail only those studies that compared the accuracy of a model incorporating both PSA and a PSA dynamic with that of a model including PSA alone. However, only one study16 formally met this criterion: Table 5 therefore additionally gives details of studies that compared the accuracy of PSA with that of a PSA dynamic. In general, studies either found PSA to be a more accurate predictor than PSA dynamics, found only trivial differences in favor of PSA dynamics, or were associated with serious methodologic shortcomings, such as verification bias24 or small sample sizes.21 There seems to be some suggestion from Table 5 that PSA velocity might play a role in men with prior negative biopsy, and we intend to address this question empirically.
We have reported on the statistical methods used in the studies of pretreatment PSA dynamics as a marker for prostate cancer. Our key question concerned the value of adding information about a PSA dynamic to that of PSA alone. Consider the case of a clinician who needs to make a decision about a particular patient. The clinician will naturally have the patient's most recent laboratory report on hand, including the PSA level. If the clinician also wanted to use PSA dynamics to inform the clinical decision, he or she would have to look up prior PSA levels—perhaps contacting another clinic if the patient had recently moved—and then perform a calculation, which might be complex. In theory, this process could be computerized; however, doing so would still need to be well motivated. Moreover, some patients obtain PSA levels from different laboratories, complicating the clinician's task. To show that it is worth going to the time and trouble to use PSA dynamics, it would have to be demonstrated that doing so would improve clinical decision making. At a minimum, this would require that predictions based on PSA plus PSA dynamics are more accurate than those based on PSA alone; ideally, a decision analysis would also show that clinical outcome would be improved: for example, a demonstration that use of PSA dynamics would importantly reduce the rate of unnecessary biopsy without missing an excessive number of cancers.
We have found that, although PSA dynamics are associated with many end points, there is a near complete lack of evidence that pretreatment PSA dynamics are of clinical value for early-stage prostate cancer. Only two studies compared the accuracy of a statistical model incorporating both PSA and a PSA dynamic with the accuracy of a model that included PSA without the PSA dynamic: one showed no improvement in accuracy associated with PSA velocity; the other showed some minor improvements but was subject to verification bias. Studies comparing the accuracy of PSA with PSA dynamics similarly failed to show clear evidence in favor of PSA dynamics. We therefore conclude that calls to use PSA dynamics in clinical practice are not supported by current clinical evidence. Such calls would include both direct clinical recommendations, such as recommending biopsy for men with low PSA but a PSA velocity greater than 0.35 ng/mL/yr,10 and inclusion of PSA dynamics as an inclusion criterion for a clinical trial. This is on the grounds that, were such a trial to be successful, whatever PSA dynamic cut point was used would determine which patients should receive the study agent in clinical practice.
Comparing the accuracy of statistical models with and without PSA dynamics is not a complex statistical procedure. It therefore came somewhat as a surprise to us that only two studies did so. Moreover, only a third of articles included any evaluation of accuracy at all, with the majority of articles focusing purely on P values and hypothesis testing. The time-honored distinction between clinical and statistical significance is of particular relevance here: it is perfectly possible for a marker to be a statistically significant predictor of an end point, but to add little clinical information (indeed, this is what we saw in one of our own studies11).
Accordingly, we make the following recommendations for future research on PSA dynamics. First, efforts should be made to avoid verification bias; for example, researchers should avoid defining men not undergoing biopsy as cancer free. Methods to correct for verification bias have been published,14 but these have poor properties if there are few false negatives, and this is exactly what tends to happen in screening studies: men with low PSA velocities will generally not reach a high enough PSA level to undergo biopsy and so will not be found to have cancer during the course of the study. Second, the end point and the marker should be independent. As an example, in some active surveillance studies, a high PSA level defines progression. A high PSA velocity will inevitably lead to a high PSA; this inevitably creates a statistical relationship between PSA velocity and progression. Third, both PSA and PSA dynamics are continuous variables, and risk is unlikely to be homogeneous on either side of any particular threshold. As such, both should be entered into analysis as continuous variables. Fourth, it is quite possible that PSA dynamics have a nonlinear relationship with outcome. It might be, for instance, that the risk of prostate cancer is low if PSA velocity is negative, suggesting no tumor growth, but also if PSA velocity is high, suggesting prostatitis. Accordingly, researchers should consider modeling PSA and PSA dynamics using nonlinear terms, such as splines or polynomials. Finally, and perhaps most importantly, researchers should assess whether PSA dynamics add to existing clinical information. The most straightforward approach is to calculate predictive accuracy for a model that includes both PSA and PSA dynamics and compare this with accuracy of a model that includes PSA but not PSA dynamics. In some cases, such as predicting recurrence after surgery, it would be appropriate to include other predictors, such as stage or grade.
Researchers should also consider decision analytic techniques to determine whether using PSA dynamics to influence clinical decision making improves patient outcome.
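As one possible illustration of a decision analytic calculation, the sketch below computes net benefit, a measure that is not named in this article and is used here only as an assumed example. It credits each true positive and penalizes each false positive by the odds of the risk threshold at which a man would opt for biopsy. All risks, outcomes, and the threshold are hypothetical.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of biopsying men whose predicted risk is at or above
    the threshold: true positives per man, minus false positives per man
    weighted by the odds of the threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical cohort of five men; 1 = cancer found, 0 = no cancer.
outcomes = [1, 1, 0, 0, 0]
model_risks = [0.60, 0.30, 0.40, 0.10, 0.05]
biopsy_all = [1.0] * len(outcomes)  # strategy of biopsying every man

print(net_benefit(model_risks, outcomes, 0.20))
print(net_benefit(biopsy_all, outcomes, 0.20))
```

A model that includes a PSA dynamic would be of clinical interest only if its net benefit exceeded that of the PSA-only model, and that of biopsying everyone or no one, across a plausible range of thresholds.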
Absence of evidence is not evidence of absence,27 and we would not want to be interpreted as claiming that pretreatment PSA dynamics are not of value as prostate cancer markers. On the contrary, it seems highly possible that PSA dynamics might help predict at least one of the end points we study here. An obvious example would be a man undergoing screening with 6-monthly PSA levels of 1.2, 1.5, 1.6, and then 17 ng/mL: the PSA velocity of 15.4 ng/mL/yr would be indicative of prostatitis rather than prostate cancer. Moreover, we regard post-treatment PSA dynamics, such as the PSA velocity at the time of recurrence, as being of proven value.
In summary, we have found little evidence that pretreatment PSA velocity or PSA doubling time is of value for early-stage prostate cancer. There is therefore no justification for the use of PSA dynamics in the clinical setting or as an inclusion criterion for clinical trials in this population.
The author(s) indicated no potential conflicts of interest.
Conception and design: Andrew J. Vickers, M. Frank O'Brien, Hans Lilja
Financial support: Andrew J. Vickers
Collection and assembly of data: Andrew J. Vickers, Caroline Savage, M. Frank O'Brien
Data analysis and interpretation: Andrew J. Vickers, Caroline Savage
Manuscript writing: Andrew J. Vickers, Caroline Savage
Final approval of manuscript: Andrew J. Vickers, Caroline Savage, M. Frank O'Brien, Hans Lilja
Published online ahead of print at www.jco.org on December 8, 2008.
Supported in part by a P50-CA92629 SPORE grant from the National Cancer Institute and by the Allbritton Fund and the Koch Foundation.
The funding bodies had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Authors’ disclosures of potential conflicts of interest and author contributions are found at the end of this article.