The US Preventive Services Task Force recently concluded that the harms of existing prostate-specific antigen (PSA) screening strategies outweigh benefits.
To evaluate comparative effectiveness of alternative PSA screening strategies.
Microsimulation model of prostate cancer incidence and mortality quantifying harms and lives saved for alternative PSA screening strategies.
National and trial data on PSA growth, screening and biopsy patterns, incidence, treatment distributions, treatment efficacy, and mortality.
A contemporary cohort of US men.
35 screening strategies that vary by start/stop ages, inter-screening intervals, and thresholds for biopsy referral.
PSA tests, false positive tests, cancers detected, overdiagnoses, prostate cancer deaths, lives saved, and months of life saved.
Without screening, the risk of prostate cancer death is 2.86%. A reference strategy that screens men aged 50–74 annually with a PSA threshold for biopsy referral of 4 μg/L reduces the risk of prostate cancer death to 2.15% with risk of overdiagnosis of 3.3%. A strategy that uses higher PSA thresholds for biopsy referral in older men achieves a similar risk of prostate cancer death (2.23%) but reduces the risk of overdiagnosis to 2.3%. A strategy that screens biennially with longer inter-screen intervals for men with low PSA levels achieves similar risks of prostate cancer death (2.27%) and overdiagnosis (2.4%) but reduces total tests by 59% and false positive tests by 50%.
Varying incidence inputs or reducing the survival improvement due to screening did not change conclusions.
The model is a simplification of prostate cancer natural history, and the survival improvement due to screening is uncertain.
PSA screening strategies that use higher thresholds for biopsy referral for older men and that screen men with low PSA levels less frequently can reduce harms while preserving lives saved compared to standard screening.
National Cancer Institute.
Prostate cancer screening is one of the most controversial topics in public health policy. Although PSA testing is ubiquitous in the US, there has always been uncertainty about its efficacy and effectiveness. Sustained declines in prostate cancer mortality since the first wave of screening in the early 1990s suggest benefit but are not conclusive, as improvements in prostate cancer treatment may also explain the decrease in prostate cancer deaths.
Results from randomized screening trials conducted in Europe and the US have only stoked the controversy. The Prostate, Lung, Colorectal, and Ovarian (PLCO) cancer screening trial in the US showed no difference between prostate cancer mortality rates in intervention and control arms (1), while the European Randomized Study of Screening for Prostate Cancer (ERSPC) showed a significant mortality reduction but documented a high frequency of overdiagnosis per life saved (2). Updated results from both studies confirmed their original findings (3, 4), with fewer overdiagnoses per life saved in the ERSPC under additional follow-up. The trial results have been extensively debated, and it is now clear that the PLCO results reflect a comparison of organized annual screening versus opportunistic screening rather than screening versus no screening (3, 5). Still, largely on the basis of these trial results, the US Preventive Services Task Force (USPSTF) recently recommended against routine PSA-based screening (6).
Other organizations have updated or are in the process of updating their guidelines in light of the trial results. To date, no other published guideline recommends against PSA screening, with many encouraging informed decision making at an individual level. However, Welch (7) points out that such informed decision making carries an enormous burden and argues that strategies that make the harm-benefit tradeoff more favorable are urgently needed.
The USPSTF recommendation also identifies the need for additional research to “evaluate the benefits and harms of modifications of the use of existing prostate cancer screening tools” and to “optimize the benefits while minimizing the harms” (6). In this article we take up that challenge and address the following question: Can we identify strategies that reduce the harms of screening while preserving its impact on detection and survival? In other words, can we screen smarter for prostate cancer?
There are many potential avenues to smarter screening because there are many parameters that define a screening strategy: ages to start and stop screening, the inter-screening interval, and the threshold for biopsy referral, and all but the starting age may depend on prior results. All parameters have been topics of debate, but it is unlikely that novel combinations will be explored in a prospective randomized setting (8, 9). An alternative is to model disease incidence and mortality under observed screening practices, then study model-projected outcomes under alternative screening strategies.
The Fred Hutchinson Cancer Research Center (FHCRC) microsimulation model of prostate cancer was developed as part of the Cancer Intervention and Surveillance Modeling Network (http://cisnet.cancer.gov), a consortium of investigators whose goal is to use modeling to understand the roles of different cancer interventions in explaining trends in cancer incidence and mortality.
The incidence component of the model consists of two linked parts: PSA growth and disease progression. PSA growth is based on data from the control arm of the Prostate Cancer Prevention Trial (Figure 1(a)). Disease progression consists of tumor onset, metastatic spread, and clinical diagnosis that would occur in the absence of PSA screening, with the risks of events after onset indicated by PSA levels (Figure 1(b)). To calibrate the model we superimpose PSA screening according to observed US screening patterns and obtain model-projected disease incidence. We then identify rates of onset, metastasis, and clinical diagnosis so that model-projected incidence matches observed incidence (10, 11). The calibrated model closely replicates observed age-adjusted incidence rates by stage and grade.
The mortality component of the model consists of disease-specific and other-cause survival. Disease-specific survival depends on age, stage (local-regional or distant), and grade (Gleason 2–7 or 8–10) at diagnosis. For local-regional cases, disease-specific survival also depends on primary treatment (radiation or surgery), which is assumed to be administered according to patterns observed in the 9 core areas of the Surveillance, Epidemiology, and End Results (SEER) program in 2005.
Screening that identifies non-overdiagnosed disease prior to clinical diagnosis results in the identification of earlier-stage tumors than might be identified without screening, leading to a reduction in prostate cancer mortality (Appendix Figure 1). We refer to this as a “stage-shift model” for the impact of screening on prostate cancer mortality.
The candidate screening strategies we consider are all 32 combinations of: (1) two ages to start (40 or 50 years) and stop (69 or 74 years) screening, (2) two inter-screening intervals (annual or biennial), and (3) four thresholds for biopsy referral (PSA 4.0 μg/L; PSA 2.5 μg/L; PSA 4.0 μg/L or PSA velocity 0.35 μg/L/year; or PSA > 95th percentile for age (PSA 2.5 for ages 40–49, 3.5 for 50–59, 4.5 for 60–69, and 6.5 for 70–74 years)). The strategies are motivated by contemporary controversies. Some studies advocate lowering the screening start age from 50 to 40 years and others argue for lowering the PSA threshold for biopsy (12-14). Some studies suggest reducing the frequency of screening for men with low PSA levels (14); we operationalize this idea by evaluating an adaptive strategy (8) which screens biennially but increases the screening interval to 5 years if PSA is below its age-specific median (15).
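The factorial part of this design can be enumerated directly. The following is a minimal sketch rather than the authors' code: the constant names, the rule labels, and the function `age_specific_threshold` are illustrative assumptions, and only the ages, intervals, and PSA cutoffs are taken from the text.

```python
from itertools import product

# Design factors as described in the text; the label strings are illustrative.
START_AGES = (40, 50)
STOP_AGES = (69, 74)
INTERVALS_YEARS = (1, 2)  # annual or biennial
THRESHOLD_RULES = (
    "psa_4.0",
    "psa_2.5",
    "psa_4.0_or_velocity_0.35",
    "psa_age_95th_percentile",
)


def age_specific_threshold(age):
    """PSA cutoff (ug/L) under the age-dependent rule, using the cutoffs in the text."""
    if age < 40 or age >= 75:
        raise ValueError("age outside the 40-74 screening range")
    if age < 50:
        return 2.5
    if age < 60:
        return 3.5
    if age < 70:
        return 4.5
    return 6.5


# Every combination of start age, stop age, interval, and threshold rule.
strategies = list(product(START_AGES, STOP_AGES, INTERVALS_YEARS, THRESHOLD_RULES))
print(len(strategies))  # 2 * 2 * 2 * 4 = 32
```

The adaptive strategy and the guideline-based strategies are not part of this grid, because their intervals or thresholds change with a man's PSA history.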
We also evaluate strategies that have been recommended by guidelines groups. Based on recommendations from the American Cancer Society (ACS), we evaluate a strategy that changes the inter-screening interval from annual to biennial if PSA is below 2.5 μg/L (16). Based on recommendations from the National Comprehensive Cancer Network (NCCN) (17), we evaluate a strategy with annual screening starting at age 40 but that expands to 5-year intervals if baseline PSA is below 1.0 μg/L, changes to annual screening at age 50, and refers to biopsy if PSA exceeds 2.5 μg/L or PSA velocity exceeds 0.35 μg/L/year. We do not include a strategy from the American Urological Association as they recommend starting screening at age 40 but do not indicate a screening interval or biopsy threshold. Our reference strategy is annual screening ages 50–74 with a PSA threshold of 4.0 μg/L for biopsy referral.
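The PSA-dependent interval rules can be written as small functions. The sketch below covers the ACS-based rule and the adaptive rule described earlier. It is illustrative only: the function names are assumptions, the PSA trajectory is a hypothetical callable, and the age-specific median is passed in as a parameter because its values are not given in the text.

```python
def acs_interval_years(psa):
    """ACS-based rule as described: annual, relaxed to biennial if PSA < 2.5 ug/L."""
    return 2 if psa < 2.5 else 1


def adaptive_interval_years(psa, age_median_psa):
    """Adaptive rule: biennial, lengthened to 5 years if PSA is below the age-specific median."""
    return 5 if psa < age_median_psa else 2


def screening_ages(start_age, stop_age, psa_at_age, interval_rule):
    """Ages at which one man is screened under a PSA-dependent interval rule.

    psa_at_age is a hypothetical callable returning the man's PSA at a given age.
    """
    ages = []
    age = start_age
    while age <= stop_age:
        ages.append(age)
        age += interval_rule(psa_at_age(age))
    return ages


# A man whose PSA stays at 1.0 ug/L is screened every other year under the ACS-based rule.
low_psa = screening_ages(50, 74, lambda age: 1.0, acs_interval_years)
# A man whose PSA stays at 3.0 ug/L remains on annual screening.
high_psa = screening_ages(50, 74, lambda age: 3.0, acs_interval_years)
print(len(low_psa), len(high_psa))
```

With these flat, hypothetical trajectories the first man has 13 tests between ages 50 and 74 and the second has 25, which shows how such a rule lowers the number of tests without changing the biopsy threshold.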
For each candidate screening strategy, we project a range of negative (number of tests, false positive tests, overdiagnoses, and prostate cancer deaths) and positive (cancers detected, lives saved, and months of life saved) outcomes. Since unnecessary biopsy- and treatment-related complications represent fixed fractions of false-positive tests and overdiagnoses, we do not present these outcomes separately. Projected outcomes are presented for a contemporary man aged 40 based on a simulated cohort of 100 million men for each screening strategy. Outcomes are reported as the mean number of events or the lifetime probability of each outcome. We also calculate the additional number needed to detect (NND) to prevent one prostate cancer death, which represents overdiagnoses per life saved and has become established as a summary measure of the harm-benefit tradeoff in prostate cancer screening (18, 19).
We previously calibrated the FHCRC prostate model using data through the year 2000 (10, 11). To validate the incidence component of the model, we compare observed and model-projected age-adjusted incidence by stage through the year 2005. To validate the mortality component of our model, we simulate the ERSPC which, based on the protocol at most ERSPC centers, screened men every 4 years and biopsied 86% of men with a PSA above 3.0 μg/L (4). Using this framework, we calculate model-projected prostate cancer mortality rates for screened versus unscreened cohorts age 55–69 after 11 years of follow-up and compare the projected absolute and relative mortality reductions with observed.
Recognizing that the incidence and mortality model inputs are subject to uncertainty, we conduct a sensitivity analysis to determine the robustness of our findings across a range of plausible values. In this analysis, we focus on the inputs that are unobservable, namely the rates of disease onset, metastasis, and clinical detection in the incidence model and the extent of screening impact in the mortality model. Our previous work calibrating the incidence model to US prostate cancer incidence trends yielded a range of values for each parameter, and we run the model 100 times under each screening strategy, each time sampling the parameters from their respective ranges to determine the variability in results that would be induced by varying these inputs. In addition, we run all screening strategies under several settings for the survival impact of screening, ranging from no impact to the impact consistent with the stage-shift model (Appendix Figure 1).
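The repeated-sampling step of this analysis can be sketched as follows. This is not the FHCRC model: `run_model` is a stand-in callable, the parameter names and ranges are hypothetical, and uniform sampling within each range is an assumption, since the text does not state the sampling distribution.

```python
import random


def sensitivity_runs(run_model, param_ranges, n_runs=100, seed=0):
    """Run a model n_runs times, drawing each parameter from its (low, high) range.

    Uniform sampling is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in param_ranges.items()
        }
        results.append(run_model(params))
    return results


# Hypothetical ranges, expressed as multipliers on calibrated rates.
HYPOTHETICAL_RANGES = {
    "onset_rate": (0.8, 1.2),
    "metastasis_rate": (0.8, 1.2),
    "clinical_diagnosis_rate": (0.8, 1.2),
}


def toy_model(params):
    """Placeholder outcome: the sum of the sampled multipliers."""
    return sum(params.values())


outcomes = sensitivity_runs(toy_model, HYPOTHETICAL_RANGES)
print(min(outcomes), max(outcomes))
```

The spread between the smallest and largest outcome across the 100 runs is the kind of variability the analysis is designed to expose.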
This study was funded by the National Cancer Institute and the Centers for Disease Control, which had no role in the design or execution of this study.
Under no screening, the model projects a lifetime chance of a prostate cancer diagnosis of 12.0% and a lifetime chance of dying of prostate cancer of 2.86%. The chance of diagnosis is higher than estimates from the pre-PSA era of 9% (20), but our projected probability of diagnosis assumes contemporary biopsy practices, which are more sensitive than pre-PSA protocols (21).
Model-projected age-adjusted incidence closely matches observed incidence through the year 2005 (Appendix Figure 2), indicating that without any parameter change the model predicts incidence reasonably well beyond its years of calibration (1975–2000). The simulation of the ERSPC projects that screening reduces mortality by 28% after 11 years of follow-up, close to the reduction of 29% estimated by trial investigators after correction for non-compliance (4). It also projects an absolute mortality reduction of 2.08 per 1000 men enrolled after 11 years of follow-up, higher than the 1.07 per 1000 men enrolled observed in the trial (4). At least part of the discrepancy is likely due to crossover to screening in the control arm of the trial (22).
Table 1 summarizes lifetime outcomes projected under all 35 screening strategies, numbered in descending order by probability of life saved; the reference strategy (annual screening for ages 50–74 with PSA threshold for biopsy referral 4.0 μg/L) ranks 8th. Appendix Figure 3 demonstrates how outcomes under that reference strategy change when any one of its parameters (screening ages, inter-screening interval, PSA level threshold, PSA velocity threshold) changes.
The reference strategy yields a 15.3% lifetime chance of diagnosis, a 3.3% lifetime chance of overdiagnosis, and a 2.15% lifetime chance of prostate cancer death, a relative reduction of 24.8% compared to the 2.86% chance of prostate cancer death with no screening. Under this reference strategy, the lifetime chance of a false positive test is 21% and the NND is 4.7, which is similar to other long-term estimates (23, 24). Unless otherwise stated all results are presented relative to this strategy.
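The relative reduction and the NND follow from the reported percentages. The sketch below recomputes them from the rounded figures in the text; the function names are illustrative, and the NND obtained this way differs slightly from the reported 4.7, presumably because the published value was computed from unrounded model output.

```python
def relative_reduction(baseline, screened):
    """Relative reduction in risk when moving from baseline to screened."""
    return (baseline - screened) / baseline


def nnd(overdiagnosis, baseline_death, screened_death):
    """Overdiagnoses per life saved (additional number needed to detect)."""
    return overdiagnosis / (baseline_death - screened_death)


# Lifetime probabilities (percent) as reported in the text.
NO_SCREENING_DEATH = 2.86
REFERENCE_DEATH = 2.15
REFERENCE_OVERDIAGNOSIS = 3.3

reduction_pct = 100 * relative_reduction(NO_SCREENING_DEATH, REFERENCE_DEATH)
reference_nnd = nnd(REFERENCE_OVERDIAGNOSIS, NO_SCREENING_DEATH, REFERENCE_DEATH)
print(round(reduction_pct, 1))  # 24.8
print(round(reference_nnd, 2))  # 4.65
```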
The NCCN strategy (Strategy 1) saves the most lives. However, the lifetime risks of a false positive test and of overdiagnosis are nearly doubled compared with the reference strategy. In general, lowering the PSA threshold or adding a velocity threshold generates substantial harms relative to incremental lives saved (Strategies 3 and 5).
Varying ages to start and stop screening has a substantial impact on lives saved and on overdiagnoses. Lowering the starting age to 40 (Strategy 6) increases the probability of life saved and overdiagnosis and substantially increases the number of PSA tests. Lowering the stopping age to 69 (Strategy 26) reduces the probability of life saved by a relative 27%, but the probability of overdiagnosis is nearly halved and the probability of a false positive PSA decreases by nearly 20%. The latter finding reflects two facts. A significant proportion of men diagnosed with lethal prostate cancer in the absence of screening are over 70, and these men have the potential to be detected early. However, many more men in this age group have cancers that would not have affected their life expectancy, so screening this age group substantially increases the number overdiagnosed. Screening men up to age 74 but increasing the threshold for biopsy referral via an age-dependent PSA cutoff (Strategy 20) reduces overdiagnoses by nearly one-third (from 3.3% to 2.3%) while only slightly raising the risk of prostate cancer death (from 2.15% to 2.23%). Therefore, one approach to preserve the impact of screening on mortality while controlling overdiagnosis may be to screen older men more conservatively (stopping at age 69 or increasing the PSA threshold for biopsy referral for ages 70–74).
The performance of the ACS strategy (Strategy 9) exactly parallels the reference strategy with no reduction in overdiagnoses and equivalent lives saved. The only impact of this strategy relative to the reference strategy is to reduce the number of tests conducted. This suggests that, holding starting age and PSA threshold fixed, if PSA is low, the interval between PSA assessments can be increased to biennial examinations without affecting other outcomes. Screening every five years rather than every two years when PSA is below the median for PSA within 10-year age groups (Strategy 22) lowers the average number of tests by one-third and overdiagnoses by one-quarter relative to a biennial strategy while only reducing the chance of life saved by a relative 17%.
Figure 2 illustrates the tradeoffs between the probability of life saved (Y value) and the probability of overdiagnosis (X value) for selected screening strategies. Projections under the base case survival impact correspond to the 29% mortality reduction observed in the ERSPC after 11 years of follow-up (corrected for non-compliance) and are connected by the darkest line at the top. The NND for each strategy is the ratio X/Y, and dashed lines originating from the origin (representing no screening) illustrate fixed NND values of 5, 10, and 20 for reference. Strategies 1, 3, and 5 have NND between 5 and 10 because they fall between the radiating lines NND=5 and NND=10; remaining strategies under the 29% mortality reduction assumption all have NND<5. The figure illustrates that relative to stopping screening at age 69 (Strategy 26), continuing screening through age 74 but with age-dependent PSA thresholds for biopsy (Strategy 20) increases probability of life saved (absolute increase 0.1%) much more than it increases overdiagnosis (absolute increase 0.05%). The figure also shows results obtained in analyses of sensitivity to the survival impact (i.e., for mortality reductions of 20%, 10%, and 0%).
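Reading a strategy's NND off the figure amounts to finding which pair of rays X = k·Y its point falls between, where X is the probability of overdiagnosis and Y the probability of life saved. The sketch below does this for the reference strategy using the rounded percentages reported elsewhere in the article; the function name `nnd_band` and the second, hypothetical point are illustrative assumptions.

```python
def nnd_band(lives_saved, overdiagnosis, rays=(5, 10, 20)):
    """Return the (lower, upper) NND rays that bracket a point on the plot.

    A ray for a fixed NND of k is the line overdiagnosis = k * lives_saved.
    The upper bound is None when the point lies beyond the steepest ray.
    """
    ratio = overdiagnosis / lives_saved
    lower = 0
    for k in rays:
        if ratio < k:
            return (lower, k)
        lower = k
    return (lower, None)


# Reference strategy: lives saved = 2.86 - 2.15, overdiagnosis = 3.3 (percent).
reference_band = nnd_band(2.86 - 2.15, 3.3)
# Hypothetical point with 6 overdiagnoses per life saved.
hypothetical_band = nnd_band(0.5, 3.0)
print(reference_band, hypothetical_band)
```

The reference strategy lands below the NND=5 ray, consistent with its reported NND of 4.7.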
Varying the incidence model inputs produces very little variation in absolute model-projected outcomes (results not shown). Further, overall conclusions regarding tradeoffs across candidate strategies are robust to our sensitivity analysis on assumed survival impact. Less intensive strategies—i.e., those with fewer screens or higher thresholds for biopsy referral among older men—generally produce a considerably lower risk of overdiagnosis with modest impact on relative rankings of disease-specific deaths or lives saved (Figure 2).
Since the advent of PSA screening, there has been uncertainty about screening benefit and concern about screening harms. The recent USPSTF recommendation against PSA screening for prostate cancer has raised awareness of the harms of existing screening strategies. In response, we sought to identify smarter screening strategies using microsimulation modeling.
The use of modeling in policy development is becoming more accepted (25, 26). The USPSTF relied on modeling to determine strategies for breast (27) and colorectal cancer screening (28). Numerous models have also been developed to study prostate cancer screening (29-32). Indeed, a recent publication considered six different strategies for prostate cancer screening (24). However, like other existing prostate screening models, it did not conceptualize the disease process in a way that permits comprehensive evaluation of all screening strategy parameters. Our model is unique in that it not only represents individual PSA over time but also explicitly links PSA growth with disease progression, which is linked with mortality. As a consequence, we can explore outcomes due to varying PSA thresholds for biopsy referral as well as variations in screening ages and intervals, which may change dynamically depending on PSA levels. By quantifying the likelihood of a false positive test, overdiagnosis, or life saved associated with a broad range of screening strategies, we can identify strategies that reduce harms but preserve the impact of early detection on prostate cancer mortality.
Our results yield several important conclusions. First, we find that aggressive screening strategies, particularly those that lower the PSA threshold for biopsy, do reduce prostate cancer mortality relative to the reference strategy. However, the harms of unnecessary biopsies, diagnoses, and treatments may be unacceptable. Quantifying the magnitude of these harms relative to potential gains in lives saved is critical for determining whether the projected harms are acceptable.
Second, we find substantial improvements in the harm-benefit tradeoff of PSA screening with less frequent testing and more conservative criteria for biopsy referral in older men. These approaches preserve the majority of the survival impact and markedly reduce screening harms compared with the reference strategy. In particular, using age-specific PSA thresholds for biopsy referral (Strategy 20) reduces false positive tests by a relative 25% and overdiagnoses by 30% while preserving 87% of lives saved under the reference strategy. Alternatively, using longer inter-screen intervals for men with low PSA levels (Strategy 22) reduces false positive tests by a relative 50% and overdiagnoses by 27% while preserving 83% of lives saved under the reference strategy. These adaptive, personalized strategies represent prototypes for a smarter approach to screening.
When smarter screening strategies achieve similar absolute probabilities of life saved, the choice between them depends on relative weighting of overdiagnosis and other harms. Using these two prototype strategies as an example, Strategy 22 reduces total tests by a relative 59% and false positive tests by 33% but increases overdiagnoses by 5% relative to Strategy 20. In general, the relative weighting of harms, like the relative weighting of benefits and harms, may depend on whether one adopts an individual or societal perspective. If an individual perspective is adopted, preferences may be variable across the population.
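The comparison between the two prototype strategies can be recovered from their changes relative to the reference strategy. The sketch below is a worked check using the rounded percentages reported for Strategies 20 and 22; the names are illustrative. The overdiagnosis figure comes out slightly below the reported 5%, presumably because the published value used unrounded model output.

```python
def relative_change(new_fraction, old_fraction):
    """Relative change when moving from old_fraction to new_fraction."""
    return new_fraction / old_fraction - 1


# Fraction of the reference-strategy value that each strategy retains.
STRATEGY_20 = {"false_positives": 1 - 0.25, "overdiagnoses": 1 - 0.30}
STRATEGY_22 = {"false_positives": 1 - 0.50, "overdiagnoses": 1 - 0.27}

# Strategy 22 expressed relative to Strategy 20.
changes = {
    outcome: relative_change(STRATEGY_22[outcome], STRATEGY_20[outcome])
    for outcome in STRATEGY_20
}
for outcome, change in changes.items():
    print(outcome, f"{100 * change:+.0f}%")
# false_positives -33%
# overdiagnoses +4%
```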
Other investigators have recommended personalized strategies for PSA screening as a means to reduce harms while preserving benefit. Carter et al. (14) suggested that the inter-screening interval should be lengthened in men with low PSA. The risk calculator from the Prostate Cancer Prevention Trial produces a personalized prediction of the risk of occult disease based on PSA, age, race, and family history (33). In principle we could compare an approach based on this calculator with other personalized strategies, but this would require adding race and family history to the model, recalibrating the model accordingly, and determining a reasonable risk threshold for biopsy referral. This is possible in principle but beyond the scope of the present study.
We recognize that every model is necessarily a simplification of reality and is limited by its assumptions. Our model is no exception. We allow the likelihood of developing high-grade disease to vary with age but do not model grade progression. Due to limitations in the SEER data used to calibrate the model, we are limited to two stages (SEER local-regional or distant stage) and two grades (Gleason 2–7 or 8–10). We model survival benefit via a stage-shift mechanism, which is likely also a simplification. Yet the close match between our calibrated model and observed incidence, and between the projected and observed absolute and relative mortality reductions in a simulated ERSPC, gives us confidence that we are producing a valid representation of the likely tradeoffs involved in screening for a complex heterogeneous disease. Our model also does not incorporate utilities and does not produce quality-adjusted estimates of the impact of screening on survival. However, existing data on utilities associated with prostate cancer screening and post-diagnosis health states are extremely limited (34), and we do not feel that they are sufficiently reliable for modeling at this time. Future versions of the model will include other elements that are missing in the present version, including utilities once adequate data become available, costs, and race-specific disease progression.
In his recent editorial (7), Welch concludes that “In the case of the prostate, for the past two decades we’ve been looking too damn hard. That’s what’s led to so many biopsies and so much overdiagnosis.” By screening smarter, we look less hard, particularly in older men at the highest risk of overdiagnosis. As demonstrated in the PLCO trial and supported by our model results across a broad range of alternative strategies, there are diminishing returns to intensive screening. If we recognize that realistic screening strategies must achieve an acceptable balance of benefits and harms as opposed to unconditionally maximizing benefits, we can improve on the effectiveness of existing PSA-based screening for prostate cancer.
The authors thank Jeffrey Katcher for developing a flexible interface for specifying candidate PSA screening strategies and Drs. Jeanne Mandelblatt and Andrew Vickers for helpful comments on an earlier draft.
Grant support: This work was supported by Award Numbers R01 CA131874 and U01 CA88160 from the National Cancer Institute and Award Number U01 CA157224 from the National Cancer Institute and the Centers for Disease Control.
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute, the National Institutes of Health, or the Centers for Disease Control.
Roman Gulati, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, M2-B230, P.O. Box 19024, Seattle, WA 98109-1024. Tel: +1.206.667.7795. Fax: +1.206.667.7264. Email: rgulati/at/fhcrc.org.
John L. Gore, Department of Urology, University of Washington, 1959 NE Pacific St, Box 356510, Seattle, WA 98195-6510. Tel: +184.108.40.20630. Fax: +1.206.543.3272. Email: jlgore/at/u.washington.edu.
Ruth Etzioni, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, M2-B230, P.O. Box 19024, Seattle, WA 98109-1024. Tel: +1.206.667.6561. Fax: +1.206.667.7264. Email: retzioni/at/fhcrc.org.