We investigated the relationships between progression-free survival (PFS), response, confirmed response, and failure-free survival (FFS) with overall survival (OS) to assess their suitability as primary endpoints in phase II (P2) trials for advanced NSCLC.
Individual data of 284 patients from 4 P2 trials were pooled. Progression status and response were modeled as time dependent variables in a multivariable (adjusted for baseline age, gender, stage, and performance status) Cox proportional hazards (PH) model for OS, stratified by trial. Subsequently, Cox PH models were used to assess the impact of PFS, response, confirmed response and FFS on subsequent survival, using landmark analysis at 8, 12, 16, 20, and 24 weeks. Model discrimination was evaluated using the concordance (c) index.
The overall median OS, PFS and FFS were 9.6, 3.7 and 2.8 months, and the response and confirmed response rates were 21% and 15% respectively. Both progression status and response as time dependent covariates were significantly associated with OS (p<0.0001; p=0.009). PFS and FFS at 12 weeks significantly predicted for subsequent survival with the strongest c-index and hazard ratio (HR) combination in landmark analyses (HR, c-index: PFS - 0.39, 0.67; FFS - 0.37, 0.67). The c-indices for response and confirmed response were low (0.59-0.60), indicating their inability to sufficiently discriminate subsequent patient survival outcomes.
Failure-free survival or progression-free survival at 12 weeks is a stronger predictor of subsequent patient survival compared to tumor response, and should be routinely used as endpoints in phase II trials for advanced NSCLC.
Approximately 39% of non-small cell lung cancer (NSCLC) patients have advanced disease (stage IIIB with a positive pleural effusion or stage IV) at diagnosis and are generally considered incurable.1 Although progress has been made, the survival of patients with advanced NSCLC remains poor, with a median overall survival (OS) of 6 to 12 months, and 1-year survival rates between 20% and 50%.2,3 More recently, a phase II trial of chemotherapy with targeted agents (Bevacizumab and Cetuximab) demonstrated a median OS of 14 months and a median progression-free survival (PFS) of 7 months.4 A phase III trial with this combination is currently in development. Given the dismal prognosis of this disease, it is critical to rapidly screen new treatments and move forward promising therapies for definitive results. While OS remains the gold standard endpoint that unequivocally assesses the benefit of a new treatment relative to the current standard of care, it requires more follow-up in some cases making it a “long” endpoint to assess, especially in a phase II setting where time is of the essence. The ability to rapidly identify patients not benefiting from the current therapy and give them alternate “effective” therapies is in the best interest of the patient. Another potential challenge to OS as an endpoint is the inability to effectively assess crossover effects and subsequent therapies upon disease progression. Thus, it is critical to identify valid alternative endpoints to replace OS as the primary endpoint in phase II clinical trials. We undertook this investigation using data from previously conducted phase II trials in advanced NSCLC.
Controversies surrounding tumor burden assessment have been reported in the literature, specifically, the high inter- and intra-observer variability in NSCLC lesion measurement,5 and a lack of correlation between response and OS.6 Nevertheless, tumor response is a historically accepted endpoint to assess clinical benefit in phase II trials. The Response Evaluation Criteria in Solid Tumors (RECIST) was implemented in an effort to standardize assessment of tumor response and has been widely used in cancer clinical trials since 2000.7 Per RECIST, measurable target lesions (up to a maximum of 10) representative of all the involved organs are identified, recorded and measured at baseline using uni-dimensional tumor measurements. The overall objective status is then determined based on the assessment of the target lesions, non-target lesions and new lesions. Best response is defined as the best objective status (i.e., complete or partial response, stable disease, or progression) on treatment. Confirmed response is defined as two consecutive assessments of complete or partial response assessed at least 4 weeks apart. Thus, by definition, confirmed response, in contrast to best response, requires that the response status of the patient is sustained for at least a period of 4 weeks, thus avoiding to some extent overestimation of the observed response rate due to the repeat assessment. This is particularly important in non-randomized trials where tumor response is the primary endpoint.
With the advent of targeted therapies that prolong disease stabilization, patients typically experience stable disease (SD) rather than tumor shrinkage. It has been shown that patients with SD also achieve clinical benefit,8 and hence it is not appropriate to ignore SD when assessing treatment efficacy. Therefore, the progression-free survival (PFS) rate has become an accepted alternate endpoint in assessing treatment efficacy, as it counts a patient who achieves SD for an extended period of time as a success, in addition to those who achieve complete or partial response. The time point at which PFS status is evaluated in a phase II trial for advanced NSCLC is between 8 and 24 weeks (typically around 16 to 24 weeks), given the median PFS for this disease population. PFS is defined as the time from registration or randomization to the earlier of disease progression or death from any cause. By virtue of this definition, PFS, unlike OS, is a measure of both the efficacy and tumor growth associated with the initial therapy, unaffected by any subsequent treatment upon disease progression. In a disease with poor prognosis such as advanced NSCLC, it can be argued that the true endpoint of OS is probably mostly determined by the progression status of the disease. However, issues pertaining to ascertainment bias in an open label trial, imbalance in assessment dates across the different arms, missing assessments, treatment holidays, and/or occurrence of progression in the middle of a long disease evaluation interval can affect the accuracy and validity of PFS as an endpoint, and need to be carefully considered. Another endpoint similar in principle to PFS is failure-free survival (FFS), where failure is defined as treatment termination from any cause (i.e., death, progression, adverse events, patient refusal, or any other unspecified reasons). FFS, by definition, therefore requires patients to be on treatment to be considered a success.
Thus, the differences between FFS and PFS as defined here are: 1) PFS, unlike FFS, defines success as progression-free and alive regardless of the patient being on or off study treatment, and 2) FFS, unlike PFS, includes any reason to terminate treatment as a failure.
In this pooled analysis, we formally investigated the relationships between PFS, response, confirmed response and FFS with overall survival using individual patient data from 4 phase II trials conducted by the North Central Cancer Treatment Group (NCCTG). Specifically, we evaluated the impact of these endpoints at different time points during treatment on patient survival.
Individual patient data were pooled from four consecutive NCCTG phase II trials (N0026, N0323, N0326, and N0426) in advanced NSCLC (Stage IIIB with pleural effusion and Stage IV) conducted between 2001 and 2007.9-12 The trial regimens included Docetaxel, Gemcitabine, Pemetrexed, Temsirolimus, Sorafenib, and Bevacizumab. All trials except N0026 were negative for the protocol defined primary endpoint. All trials utilized the RECIST7 criteria for tumor response assessment and the evaluations were performed by the local treating physician.
Patients enrolled on trials N0026 and N0426 were on an every 3 weeks treatment schedule, with tumor assessments done at least every 6 weeks. Patients enrolled on trials N0323 and N0326 were on an every 4 weeks treatment schedule with tumor assessments done at least every 8 weeks. See Table 1 for information on the individual trial characteristics, and Table 2 for the expected number of assessments and the mean and range of the actual number of assessments by trial at different time points. Due to the treatment and tumor assessment schedules, objective status at 8 weeks was based on only one post-baseline assessment on all trials, and objective status at 12 weeks was based only on one post-baseline assessment in 2 of the 4 trials. However, all trials required a tumor assessment on a patient at any time during a treatment cycle when disease progression was clinically suspected. The reporting of disease progression was therefore more real time than just at the pre-defined time points. The 8 and 12 week time points could be considered too early for a meaningful analysis of response, but are included here for the sake of completeness when comparing with the PFS and FFS endpoints.
Objective status was assessed using the RECIST7 criteria in all trials. A success for response included either partial or complete response as the best objective status at any time while on treatment, and a success for confirmed response included 2 consecutive assessments of complete or partial response at least 4 weeks apart. OS is defined as the time from registration/randomization to death from any cause. PFS is defined as the time from registration/randomization to the date of first documented disease progression or death, regardless of patient being on or off study treatment. FFS is defined as the time from registration/randomization to treatment termination due to any cause (i.e., death, disease progression, patient refusal, adverse events etc.).
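The endpoint definitions above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in SAS and S-Plus); the function names, the month conversion factor, and the rule of censoring at last follow-up when no event is recorded are all assumptions made for illustration. The confirmed-response check reads "2 consecutive assessments" as two adjacent entries in a chronologically ordered assessment list.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

DAYS_PER_MONTH = 30.4375  # assumption: average month length used for reporting


def time_to_first_event(
    start: date,
    event_dates: Sequence[Optional[date]],
    last_followup: date,
) -> Tuple[float, bool]:
    """Return (months from start, event observed).

    If none of the event dates is known, the patient is censored at
    last_followup (a conventional choice, assumed here).
    """
    known = [d for d in event_dates if d is not None]
    end = min(known) if known else last_followup
    return (end - start).days / DAYS_PER_MONTH, bool(known)


def overall_survival(start, death, last_followup):
    # OS: registration/randomization to death from any cause.
    return time_to_first_event(start, [death], last_followup)


def progression_free_survival(start, progression, death, last_followup):
    # PFS: earlier of progression or death, on or off study treatment.
    return time_to_first_event(start, [progression, death], last_followup)


def failure_free_survival(start, progression, death, treatment_end, last_followup):
    # FFS: any reason for ending treatment counts as a failure.
    return time_to_first_event(
        start, [progression, death, treatment_end], last_followup
    )


def has_confirmed_response(assessments, min_gap_days=28):
    """assessments: chronologically ordered (date, status) pairs,
    with status one of "CR", "PR", "SD", "PD"."""
    responders = ("CR", "PR")
    for (d1, s1), (d2, s2) in zip(assessments, assessments[1:]):
        if s1 in responders and s2 in responders and (d2 - d1).days >= min_gap_days:
            return True
    return False
```

Because the FFS event set contains every PFS event plus treatment termination, a patient's FFS time can never exceed the PFS time under these definitions, which in turn can never exceed the OS time.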
As the first step, both progression status and response were modeled as time dependent variables in multivariable (adjusted for stage, age, gender, and performance status (PS)) Cox proportional hazards (PH) models for OS, stratified for the patient's trial. This was done to assess whether patients who remained progression-free or had achieved a response at any time during treatment survived significantly longer than those who had progressed or not responded to treatment. Subsequently, univariable and multivariable Cox PH models, stratified for the patient's trial, were used to assess the prognostic impact of PFS, response, confirmed response, and FFS on subsequent survival using a landmark analysis (with 8, 12, 16, 20 and 24 weeks post registration as baseline). The hazard ratios (HR) and 95% confidence intervals are reported.
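The landmark approach described above can be sketched as a data-preparation step, shown here for PFS status. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the `Patient` and `LandmarkRow` types and the `landmark_rows` function are hypothetical names, times are in weeks from registration, and patients whose follow-up ended at or before the landmark (death or censoring) are excluded.

```python
from typing import List, NamedTuple, Optional


class Patient(NamedTuple):
    os_weeks: float                     # registration to death or last follow-up
    died: bool                          # True if os_weeks ends in death
    progression_weeks: Optional[float]  # None if no documented progression


class LandmarkRow(NamedTuple):
    progression_free: bool  # covariate, fixed at the landmark
    residual_weeks: float   # survival time measured from the landmark
    died: bool


def landmark_rows(patients: List[Patient], landmark_weeks: float) -> List[LandmarkRow]:
    rows = []
    for p in patients:
        if p.os_weeks <= landmark_weeks:
            # Not under follow-up past the landmark: excluded from this analysis.
            continue
        progressed = (
            p.progression_weeks is not None
            and p.progression_weeks <= landmark_weeks
        )
        rows.append(
            LandmarkRow(
                progression_free=not progressed,
                residual_weeks=p.os_weeks - landmark_weeks,
                died=p.died,
            )
        )
    return rows
```

Each landmark (8, 12, 16, 20, or 24 weeks) yields its own dataset, and the covariate is fixed at the landmark: a patient who progresses at week 18 is still counted as progression-free in the 12-week analysis. The resulting rows would then be passed to a Cox model fitted on `residual_weeks`.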
Model discrimination (i.e., ability to discriminate patients with different survival times) was evaluated using the concordance index (c-index) for the landmark analyses.13 The c-index computes the probability that, for a pair of randomly chosen comparable patients, the patient with the higher risk prediction will experience an event before the lower risk patient. A completely random prediction would have a c-index of 0.5, and a perfect rule will have a concordance of 1.0. All analyses were performed using SAS v9.13, and Splus 8.01.
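The pairwise definition of the c-index given above can be made concrete with a small Python sketch. It assumes the usual convention for censored data (a pair is comparable only when the shorter observed time ends in an event) and counts tied risk predictions as one half; pairs with tied survival times are skipped for simplicity. The function name is hypothetical, and the software used in the study may handle ties differently.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs in which the patient with the
    higher predicted risk has the shorter observed survival time."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue  # simplification: tied survival times are skipped
            first, second = (i, j) if times[i] < times[j] else (j, i)
            if not events[first]:
                continue  # shorter time is censored, so the order is unknown
            comparable += 1
            if risks[first] > risks[second]:
                concordant += 1.0
            elif risks[first] == risks[second]:
                concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this tie convention, a model whose only predictor is binary (for example, progressed versus progression-free at a landmark) assigns the same risk to many pairs, and each such pair contributes one half, so its c-index cannot reach 1.0 even when the two groups separate perfectly.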
Data were frozen on March 25, 2009. One patient was excluded due to withdrawal from study prior to receiving study treatment; thus, a total of 284 patients were included in this analysis. Table 3 summarizes the baseline characteristics and outcomes of the 4 trials included in this pooled analysis. The median age was 65 years (range, 39-85), with 61% of patients being males. The majority (87%) of patients had stage IV disease and good performance scores (PS) (39% PS =0, 57% PS =1). Information on histology was collected only on one trial (N0026): 21% of patients had squamous and 79% of patients had non-squamous histology as determined by a central pathology review.
The median follow-up time for the 24 patients still alive was 2.5 years (range, 0.3 – 6.0). At the time of this analysis, 92% of patients had died (7% died on treatment) and 87% of patients had disease progression (13% died without disease progression). The primary causes of death in the patients who died without disease progression included clinical deterioration without documented evidence of disease progression (59%) and adverse events unrelated to the disease (41%). The overall median OS, PFS, and FFS were 9.6 months (95% CI, 8.7-11.1), 3.7 months (95% CI, 2.9-4.3), and 2.8 months (95% CI, 2.1-3.3) respectively. See Figures 1a, 1b, and 1c for Kaplan-Meier curves for OS, PFS, and FFS. The overall response and confirmed response rates were 21% (95% CI, 16.9-26.7) and 15% (95% CI, 10.9-19.5) respectively.
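For readers less familiar with how a median is read off a Kaplan-Meier curve, the following Python sketch shows the product-limit calculation. It is illustrative only: the function names are hypothetical, exact fractions are used to avoid floating-point rounding at the 0.5 threshold, and the median is taken as the first event time at which the estimated survival probability is at or below 0.5 (some software uses a slightly different convention when the curve equals 0.5 exactly).

```python
from fractions import Fraction
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, Fraction]]:
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations that share this time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            surv *= Fraction(at_risk - deaths, at_risk)
            curve.append((t, surv))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


def median_survival(curve: List[Tuple[float, Fraction]]) -> Optional[float]:
    """First time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None
```

Censored patients reduce the number at risk without lowering the curve, which is why the median can be undefined (returned here as `None`) when follow-up is short.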
Approximately 26.4%, 34.9%, 42.6%, 52.5%, and 56.3% of patients had disease progression by 8, 12, 16, 20 and 24 weeks post study entry, respectively. Ten patients had identical progression and survival times, i.e., died of disease progression. For the 246 patients who had progressed at the time of this analysis, the median time from progression to death was 5.7 months (95% CI, 4.6-7.1), with 52.9% of patients dying within 6 months after disease progression. The median time from treatment termination to death was 5.7 months (95% CI, 4.6-7.5). Approximately 10.7% of patients who ended study treatment by 12 weeks for reasons other than disease progression (refusal, adverse event, alternate treatment, and other medical problems) remained progression free at 12 weeks.
The results from the time dependent models for progression status and response are summarized in Table 4. At any given time, the hazard for a patient who had disease progression by that time was 7.0 times the hazard for a patient who had not progressed, under the assumption that the two patients were otherwise similar in age, stage, PS and gender at baseline (Model 1, Table 4). While response as a time dependent covariate was also significantly associated with OS, the effect was more modest in terms of HR, p-value, and chi-square statistic (Model 2, Table 4). In addition, PS was a significant predictor with PS 2 and PS 1 patients having a worse outcome compared to PS 0 patients in both models; and age and gender were significant predictors in model 2 (for response) with older patients and males having a worse prognosis.
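A time dependent covariate of this kind is commonly supplied to Cox regression software in a counting-process (start, stop] layout, in which a patient's follow-up is split at the time of progression. The Python sketch below illustrates that layout; it is an assumption about data preparation, not a description of the authors' SAS or S-Plus code. The `Interval` type and `split_at_progression` function are hypothetical, and a patient whose progression and death times coincide is kept as a single unprogressed interval, which is only one of several possible ways to handle such ties.

```python
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    start: float      # interval opens just after this time
    stop: float       # interval closes at this time
    progressed: bool  # covariate value during the interval
    died: bool        # True only if death occurs at `stop`


def split_at_progression(
    os_time: float, died: bool, progression_time: Optional[float]
) -> List[Interval]:
    """Split one patient's follow-up into progression-free and progressed parts."""
    if progression_time is None or progression_time >= os_time:
        # No time at risk after progression: a single unprogressed interval.
        return [Interval(0.0, os_time, False, died)]
    if progression_time <= 0.0:
        # Progressed at study entry: a single progressed interval.
        return [Interval(0.0, os_time, True, died)]
    return [
        Interval(0.0, progression_time, False, False),
        Interval(progression_time, os_time, True, died),
    ]
```

With this layout, the hazard ratio compares, at each death time, patients who have already progressed with those still progression-free at that moment, so a patient contributes to the progression-free group until the recorded progression date. This avoids crediting the pre-progression survival time of eventual progressors to the progressed group.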
The landmark analyses at 8, 12, 16, 20, and 24 weeks post study entry included the 259, 242, 227, 215 and 197 patients, respectively, who were alive at those time points.
While patients who had achieved a response or confirmed response at some of the landmark time points had significantly longer subsequent survival, the c-indices of these models were low (0.53-0.54), indicating an inability of these metrics to discriminate patients with different survival times. In contrast, patients who were alive and progression-free (in the case of PFS), or who continued to stay on treatment (in the case of FFS) at the different landmark time points did significantly better in terms of subsequent survival compared to those who had progressed, died, or ended treatment for any reason. The c-indices ranged from 0.58 to 0.63, and the HRs ranged from 0.37 to 0.46. PFS status at 12, 16, 20 and 24 weeks were each predictive of subsequent survival with c-index of 0.61, 0.61, 0.63, and 0.63 respectively. FFS status at 12 weeks had the highest c-index of 0.62.
After adjusting for baseline age, gender, stage and PS, models including confirmed response by 16 and 20 weeks as predictors had a c-index of 0.60 and p-value ≤ 0.05; however only 12% and 16% of patients had achieved a confirmed response by 16 and 20 weeks (Model 2, Table 5). The results for PFS and FFS were consistent with the univariable analysis (Models 3 and 4, Table 5). The PFS as well as the FFS status at all of the landmark time points were significantly associated with OS.
The c-indices were similar for PFS status at 12, 16, 20 and 24 weeks (0.66-0.68), and for FFS status at all the landmark time points (0.64-0.67). The HRs for PFS and FFS ranged from 0.39-0.49, and 0.37-0.46 respectively. Looking at a combination of HR and c-index, PFS status at 12 weeks and FFS status at 12 weeks were the most predictive of subsequent survival. Patients who remained alive and progression-free at 12 weeks (PFS), or who continued to be on treatment at 12 weeks (FFS) had a significantly better prognosis in terms of subsequent survival. See Table 5 for the detailed multivariable landmark analysis results.
In summary, PFS or FFS status as early as 12 weeks is a strong predictor of subsequent survival. Figures 2a and 2b show the Kaplan-Meier curves for OS split by the landmark analysis subgroups: PFS status at 12 weeks and FFS status at 12 weeks. The median OS for patients who were progression-free at 12 weeks was significantly higher compared to patients who had progressed by 12 weeks (median OS: 12.2 vs. 3.7 months; p<0.0001). Similarly the median OS for patients who remained on treatment at 12 weeks was higher compared to those who had discontinued treatment prior to 12 weeks (median OS: 12.8 vs. 4.7 months; p<0.0001).
In this pooled analysis, we formally investigated the appropriateness of response, confirmed response, PFS and FFS as alternative endpoints to OS in phase II trials for advanced NSCLC. All trials included in this analysis used the RECIST criteria to determine the objective status, and followed either an every 6 or 8 week schedule for tumor assessments. Both progression status and response, when considered as time dependent covariates, were significantly associated with OS. Based on the landmark analysis, PFS or FFS status as early as 12 weeks were strongly predictive of subsequent survival, with patients who continued to be on treatment at 12 weeks in the case of FFS, or alive and progression-free at 12 weeks in the case of PFS having a significantly better prognosis in terms of overall survival. After adjusting for patient age, gender, stage and PS, the c-index was 0.67 for both PFS and FFS status at 12 weeks, thus indicating good discriminative ability to predict subsequent patient survival outcomes.
Not surprisingly, both PFS and FFS endpoints produced similar results as the primary reason for patients ending treatment in a disease like advanced NSCLC is disease progression. Note that only 10.7% of patients in this analysis who terminated treatment by 12 weeks for reasons other than disease progression remained progression-free at 12 weeks. This brings up an interesting question of using either PFS or FFS as an endpoint in this disease setting. Within the framework of this analysis, this question cannot be clearly answered as the same definitions for disease progression were used in the FFS endpoint as the PFS endpoint.
There are some limitations to the present analysis. All the trials except N0026 included in our analysis were negative for the primary endpoint per protocol, which could influence our conclusions regarding the predictive ability of the different metrics (PFS, FFS, response, and confirmed response) on overall survival. The relatively small sample size (N = 284), different tumor response evaluation schedules (every 6 versus 8 weeks), and the lack of a control arm (for example, platinum containing regimen) restricted the scope of this analysis. Specifically, a formal surrogate endpoint validation which requires both a patient-level and trial-level validation on data from multiple controlled phase III trials was not possible. Surrogacy validation stipulates that an ideal surrogate endpoint must not only highly correlate with the true endpoint but also fully capture the net effect of a treatment on the true endpoint or the effect of treatment on the surrogate endpoint must accurately predict the effect of treatment on the true endpoint.14,15 However, the reality is that the majority of phase II trials are single arm trials, which makes surrogacy validation based on phase II data difficult, if not impossible. In such cases, the analytical approach similar to what is done in this pooled analysis is valid and appropriate.16
While surrogate endpoint validation has been extensively studied in colorectal cancer, breast cancer and glioblastomas, there has not been a consistent effort to look at alternative endpoints to OS in advanced NSCLC in either the phase III or phase II setting.17-21 A large landmark survival analysis in advanced NSCLC (N = 984) concluded that disease control rate at 8 weeks (defined as the rate of non-progression) is a stronger predictor of subsequent survival outcomes after platinum-based chemotherapy than the traditional best response.16 While our current analysis also indicated that PFS status (or disease control rate) at 8 weeks is a significant predictor of subsequent patient survival, the c-index and the HR for PFS status at 12 weeks were stronger. Another recent study explored the surrogacy of PFS for OS in advanced NSCLC (N = 2,838, randomized in 7 trials) by assessing the association between these two time-to-event endpoints and between the treatment effects on these endpoints estimated by hazard ratios.22 This study demonstrated that PFS is a good candidate surrogate endpoint for OS in advanced NSCLC trials. Our present findings (based on data from phase II trials) are consistent with the above results and demonstrate that PFS (and FFS) is a significant predictor of patient survival in advanced NSCLC.
In conclusion, this pooled analysis using individual patient data from multiple phase II trials demonstrated that progression status of a patient, when considered as a time dependent covariate to account for the time to disease progression, is significantly associated with overall survival. Progression-free survival status or failure-free survival status at 12 weeks were superior endpoints to tumor response in predicting for subsequent survival in advanced NSCLC, with patients who continued to be on treatment at 12 weeks in the case of FFS, or alive and progression-free at 12 weeks regardless of being on or off study treatment in the case of PFS having a significantly better prognosis in terms of overall survival. Given that disease progression is the primary reason for patients with advanced NSCLC to terminate treatment, we recommend either endpoint, PFS or FFS, in place of tumor response in phase II trials for advanced NSCLC.
Previously Presented at the 2008 Annual Meeting of the American Society of Clinical Oncology
Supported in part by Public Health Service grants CA-25224, CA-37404, CA-63849, CA-35113, CA-35103, CA-37417, CA-35269, CA-35448, CA-35101, CA-35272, CA-35415, CA-52352.
1This study was conducted as a collaborative trial of the North Central Cancer Treatment Group (NCCTG) and Mayo Clinic