Advances in understanding the biology and genetics of renal cell carcinoma have led to novel approaches for treatment of mRCC that target the VEGF receptor. With the growing therapeutic arsenal against mRCC, it is now feasible for patients to receive multiple lines of potentially beneficial treatment. Indeed, a recent trial reported on a study population that had received three to five prior lines of therapy (Motzer et al, 2010). With the increasing number of effective treatments available (Soulieres, 2009), the effects of first-line therapies on OS are more likely to be confounded by the effects of subsequent therapies. The question of whether PFS/TTP rather than OS should be employed as a primary outcome measure in pivotal studies of new treatments for mRCC is therefore important. This situation is similar to that with metastatic colorectal cancer, in which the rapid development of novel treatments necessitated consideration of PFS as a surrogate for OS in pivotal studies (Buyse et al, 2007). Although several novel treatments for mRCC have been approved for use in the United States with TTP or PFS as the primary end point in pivotal studies, and results of population-based historical cohort studies of sunitinib and sorafenib have demonstrated that the introduction of these treatments has resulted in increased survival (Heng et al, 2009a; Warren et al, 2009), a rigorous examination of the association between PFS/TTP end points and OS has yet to be undertaken.
The analysis presented here suggests that treatment effects on measures of PFS/TTP are strongly associated with treatment effects on OS in patients with mRCC. However, the proportion of variability in treatment effects on OS that was explained by treatment effects on PFS/TTP was modest. In particular, the adjusted R² was 0.63 for the association between −ln(HR_PFS/TTP) and −ln(HR_OS). This value is within the range reported in prior analyses of the relationship between treatment effects on PFS/TTP and OS (Sherrill et al, 2008). A high R² is not a necessary criterion for surrogacy, however, as some of the unexplained variation may reflect the sampling error in each trial due to small sample size. Even for a perfect surrogate end point, therefore, R² will be less than one in a set of trials with small samples (Tang et al, 2007). The trials examined in this evaluation were relatively small (median of 96 patients per arm). Moreover, there is no standard value above which an R² (or correlation coefficient) can be claimed to be sufficient. The adjusted R² for the association between differences in median PFS/TTP and differences in median OS was only 0.28. While the difference in median survival times may be a more appropriate measure of treatment effect than HRs if the proportional hazards assumption is violated, median survival times represent only a single point on the survival distribution and are potentially imprecise. It is not surprising, therefore, that the amount of unexplained variation is greater when treatment effects are measured in terms of differences in median survival. Despite the relatively low R² from this regression, it is useful to note that the results from the regression analysis presented here suggest that, on average, there is a slightly better than 1-month gain in median OS associated with a 1-month gain in median PFS/TTP. This is consistent with the hypothesis that treatment effects on post-progression survival are uncorrelated with treatment effects on PFS/TTP (Bowater et al, 2008).
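The point that sampling error alone keeps R² below one, even for a perfect surrogate, can be illustrated with a small simulation. The sketch below is not the analysis performed in this study: it assumes an unweighted simple linear regression, and the number of trials, the spread of true effects and the standard error are hypothetical values chosen only for illustration. The function names are likewise illustrative.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def adjusted_r2(x, y):
    """Adjusted R^2 for a regression with a single predictor."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    my = sum(y) / n
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


def simulate_perfect_surrogate(n_trials=30, se=0.0, seed=0):
    """Simulate trials in which the true -ln(HR) is identical for both end points.

    Each observed effect is the shared true effect plus independent
    sampling error with standard deviation `se` (hypothetical values).
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0.2, 0.3) for _ in range(n_trials)]
    obs_pfs = [t + rng.gauss(0.0, se) for t in true_effect]
    obs_os = [t + rng.gauss(0.0, se) for t in true_effect]
    return obs_pfs, obs_os


if __name__ == "__main__":
    for se in (0.0, 0.1, 0.25):
        pfs, os_ = simulate_perfect_surrogate(se=se)
        print(f"se={se:.2f}  adjusted R^2={adjusted_r2(pfs, os_):.2f}")
```

With no sampling error the two sets of observed effects coincide and the adjusted R² equals one; as the assumed standard error grows, the adjusted R² falls even though the underlying surrogate relationship is unchanged.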
Not surprisingly, the association between treatment effects was stronger in studies that did not allow crossover to active treatment. Additionally, the association between treatment effects on PFS/TTP and OS was weaker in trials conducted after 2005, when targeted therapies for treatment of mRCC were more likely to be available as potential off-study second-line treatments. Estimates of the association between treatment effects on PFS/TTP and OS based on the entire sample of trials may therefore be conservative. An increase in response rate was also correlated with OS, although the association was not as strong as that with treatment effects on PFS/TTP measured in terms of −ln(HR).
Limitations of this study should be noted. First, this study was based on published results of controlled trials, which may be subject to publication bias. To the extent that only studies showing positive effects on both PFS and OS were published, our estimates may overstate the true association between PFS and OS. However, a funnel plot analysis of −ln(HR_OS) provided no strong evidence of publication bias (the plot was symmetric around the mean effect size and Egger's test was not significant).
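For readers unfamiliar with Egger's test, a minimal sketch of its classic regression form follows. It assumes that each trial contributes an effect estimate, such as −ln(HR) for OS, and its standard error; whether the test in this study was computed in exactly this way is not stated, and the function name and inputs are illustrative. The standardised effect (effect divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry.

```python
import math


def egger_regression(effects, std_errors):
    """Return (intercept, t statistic) from Egger's regression.

    Regresses effect / SE on 1 / SE by ordinary least squares.
    Requires at least three trials and some residual scatter.
    """
    y = [e / s for e, s in zip(effects, std_errors)]
    x = [1.0 / s for s in std_errors]
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = ss_res / (n - 2)  # residual variance
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mx ** 2 / sxx))
    return intercept, intercept / se_intercept
```

The t statistic would be compared with a t distribution on n − 2 degrees of freedom to obtain a P value; that step is omitted here to keep the sketch free of external dependencies.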
Ideally, the assessment of association of PFS/TTP and OS should be demonstrated over different stages of the disease (as the causal pathways of the disease process might differ depending on the stage) and across classes of drug (as drugs with different modes of action may have different pathways of intervention) (Fleming and DeMets, 1996). It is possible that the association reported here could only apply to specific recognised prognostic groups, but analyses by prognostic groups were infeasible based on data reported in study publications (Molina and Motzer, 2008; Heng et al, 2009b). The majority of studies included in this analysis involved comparisons of two or more cytokine therapies. The association between treatment effects on PFS/TTP and those on OS was significant in trials evaluating targeted and non-targeted therapies. The association between treatment effects on PFS/TTP and OS was not significant for comparisons involving VEGF inhibitors, although there was a trend towards an association (P=0.0510). The number of such comparisons was small, however, and these comparisons may have been more likely to have been confounded by crossover and receipt of other non-study therapies post progression. It is reasonable to assume that results presented here can be generalised to evaluations of agents, such as axitinib, that have similar mechanisms of action to the therapies included in this analysis (Rugo et al, 2005; Rini et al, 2007; Rixe et al, 2007).
For studies that allowed for crossover from control to active therapy, we used the reported measure of treatment effect that was considered to be least likely to be subject to confounding by such crossover. While it would be desirable to use a common measure of treatment effect for all studies, it is well established that crossover from control to active treatment may attenuate observed treatment effects on OS relative to what would have been observed in the absence of such crossover (Finkelstein and Schoenfeld, 2011; Saad and Buyse, 2012). To include results of studies with extensive crossover without controlling for crossover would add no useful information to the analyses. The RPSFT and IPCW methods used in the analyses of everolimus (Korhonen and Malangone, 2010; Korhonen et al, 2011; Wiederkehr et al, 2011) and pazopanib (Sternberg et al, 2010b) are useful methods for analysing OS in the context of selective crossover (Finkelstein and Schoenfeld, 2011; Morden et al, 2011; Rimawi and Hilsenbeck, 2012).
In unblinded trials, there may be a motivation for clinicians to call a patient's disease progression earlier if the patient is in the control arm than if the same patient had been in the experimental arm (Dodd et al, 2008). To the extent that this inflates the treatment effects on PFS, the association between treatment effects on PFS/TTP and treatment effect on OS might be attenuated (because OS is not impacted by this bias). The use of blinded independent central review (BICR) may reduce any such bias. However, retrospective BICR may necessitate informative censoring on local assessment of progression, which may bias the comparison in favour of control patients (Dodd et al, 2008). This also would attenuate the observed association between treatment effect on PFS/TTP and treatment effect on OS. Treatment assignment was blinded in only six of the studies included in the analyses. Independent review of progression was employed in six studies. As studies that used blinded treatment assignment and/or review of progression tended to be those evaluating novel targeted agents, assessment of the independent effects of blinding of treatment assignment and/or BICR on the association between treatment effects on PFS and treatment effects on OS was infeasible.
Information from the trial reports on the frequency of assessments, the criteria used to assess response and/or progression, or the duration of treatment was not extracted. It therefore was not feasible in this analysis to assess how these and other unmeasured factors might affect the association between treatment effects on PFS and treatment effects on OS. Differences in these factors might help explain some variability in observed associations between treatment effects on PFS/TTP and on OS.
As the searches upon which this study was based were conducted in 2010, results of additional randomised controlled trials of systemic therapies for mRCC may have been published since then. One such trial is the Renal EFFECT trial, a randomised controlled trial of intermittent vs continuous sunitinib (Motzer et al, 2012). It may be worthwhile in future research to update these analyses using results of this and other recently published studies, and to explore in multivariate analysis the independent effects of study design and other factors on the associations between treatment effects on PFS/TTP and treatment effects on OS.
In conclusion, results presented in this study suggest that treatment effects on disease progression end points are strongly associated with treatment effects on OS. Further research is required to establish whether disease progression end points may be used as surrogate end points for OS in clinical trials of novel treatments for mRCC.