Development of effective therapies for recurrent glioblastoma multiforme (GBM) and reliable, timely evaluation of their benefit are needed. Understanding the relationship between objective response (OR) and survival is important for determining whether OR can provide an early signal of treatment activity in clinical trials. We performed a landmark analysis to evaluate the association between OR and survival at 9, 18, and 26 weeks for 167 patients with recurrent GBM who participated in BRAIN, a phase II trial that evaluated the efficacy of bevacizumab alone or in combination with irinotecan, using Cox regression models adjusted for age, baseline Karnofsky performance score, first vs second relapse, and treatment arm. Hazard ratios (HRs) and P-values for survival between responders and nonresponders were calculated. Additional analyses were performed to test robustness, validity, fit, and accuracy of the models. The relationships between progression-free survival (PFS) and survival and between OR and PFS were also explored. There were 55 responders and 112 nonresponders across the 2 treatment arms in BRAIN. OR status at 9, 18, and 26 weeks was a statistically significant predictor of survival (HR ≤ 0.52, P < .01). PFS was also a statistically significant predictor of survival at each landmark (HR ≤ 0.25, P < .0001). The association between OR and PFS was not statistically significant, likely due to inadequate statistical power for the analysis. Clarifying the relationship of OR and survival is important for determining whether OR can be a reliable predictor of the benefit of a therapeutic agent in patients with recurrent GBM.
Glioblastoma multiforme (GBM) is the most aggressive malignant primary brain tumor in adults. Advances in treatment for newly diagnosed GBM have led to the current 5-year survival rate of 9.8%.1 Despite therapy, once GBM progresses, the outcome is uniformly fatal, with median overall survival (OS) historically ≤30 weeks.2,3
Development of effective therapies for recurrent GBM and reliable, time-efficient evaluation of their benefit in clinical trials are needed. Although OS is the gold standard for evaluating therapeutic efficacy in clinical trials, 6-month progression-free survival (PFS) is considered to be of clinically meaningful benefit in this rapidly progressing disease and often serves as a primary endpoint in phase II trials. For instance, in an analysis of 345 patients with recurrent GBM who participated in North Central Cancer Treatment Group (NCCTG) trials and were alive 6 months after study entry, those who had disease progression by that time were more than twice as likely to die in a given period compared with those whose disease had not progressed.2 Likewise, data from 437 patients with GBM who participated in North American Brain Tumor Consortium (NABTC) phase II protocols indicated that PFS at 9, 18, and 26 weeks was a statistically significant predictor of survival.4
Another efficacy endpoint often used in phase II trials is objective response (OR). According to accepted and proposed neuro-oncology criteria,5,6 radiographic OR corresponds to a ≥50% reduction in size of enhancing tumor on consecutive CT or MRI scans that endures for at least 4 weeks with stable or reduced steroid dose. Historically, few GBM trials have demonstrated OR rates of significant magnitude (ie, 15%–20%, according to the FDA/American Association for Cancer Research/American Society of Clinical Oncology Public Workshop on Brain Tumor Clinical Trial Endpoints, North Bethesda, MD, January 2006) to perform a robust evaluation of OR as a predictor of survival in recurrent GBM. A review of NCCTG trials demonstrated a 7% OR rate in patients with recurrent GBM, and although OR had a predictive trend for both PFS and OS, it did not reach statistical significance.7
Increased OR rates observed in patients with recurrent GBM treated with antiangiogenic agents, such as bevacizumab (BEV),8,9 have revived interest in OR as a predictor of survival benefit. Understanding the relationship of OR to survival in recurrent GBM is important for determining whether OR can provide an additional early signal of treatment activity and a reliable indication of the potential benefit of a therapeutic agent. An early indication of treatment efficacy would give physicians the option to tailor treatment early in the course of the disease.
BEV is a humanized monoclonal antibody that inhibits vascular endothelial growth factor (VEGF) activity and was recently approved by the FDA for use in patients with GBM with progressive disease following prior therapy. In the recent BRAIN trial of patients with recurrent GBM, patients who received treatment with BEV alone or in combination with irinotecan (CPT-11) demonstrated clinically meaningful improvements in the OR rate and 6-month PFS, with encouraging OS.8 BRAIN offers the single largest data set of patients with recurrent GBM treated with an antiangiogenic agent. The high response rates in BRAIN allowed us to perform a landmark analysis to evaluate OR at 9, 18, and 26 weeks as a predictor of survival in patients with recurrent GBM. Additionally, to confirm previously reported findings of the predictive value of PFS on OS2,4 in a BEV-treated population, we examined the PFS–OS relationship in the BRAIN study. Finally, to provide a more comprehensive assessment of OR and PFS as efficacy endpoints, the OR–PFS relationship was also explored.
Data from 167 patients ≥18 years of age with histologically confirmed GBM in the first or second relapse who had received prior standard radiotherapy and chemotherapy for GBM and were randomized to receive BEV (n = 85) or BEV + CPT-11 (n = 82) in the BRAIN study were included. The primary efficacy endpoints of BRAIN were the OR rate and 6-month PFS, based on the response assessments by an independent radiology facility (IRF; RadPharm, Inc.) blinded to treatment arm.
All patients underwent MRI assessments every 6 weeks (ie, prior to beginning each treatment cycle). Progression and OR were assessed by the IRF according to World Health Organization Response Evaluation Criteria,10 taking corticosteroid dose into account (ie, Macdonald criteria5). In addition to meeting MRI criteria for complete response (ie, disappearance of all contrast-enhancing and noncontrast-enhancing tumors), a patient could not be taking corticosteroids above physiologic levels (ie, equivalent to 20 mg/day hydrocortisone) at the time of MRI. In addition to meeting MRI criteria for partial response (ie, >50% reduction in the sum of products of diameter), the corticosteroid dose at the time of MRI could not be greater than the maximum dose taken during the first 6 weeks of study treatment. The corticosteroid dose did not affect determination of stable and progressive disease. Complete and partial responses were classified according to confirmatory MRI performed ≥4 weeks after an observed response. Only contrast-enhancing lesions were measured. Noncontrast-enhancing lesions were considered nontarget lesions in tumor assessment. Contrast-enhancing lesions that were too small to measure were also considered nontarget lesions. Progression (ie, ≥25% increase in the sum of products of diameter) was determined by target and nontarget lesions. In addition to the standard Macdonald criteria, any new area of nonenhancing T2 or fluid-attenuated inversion recovery (FLAIR) signal consistent with tumor was considered progressive disease. Index lesions were not considered in the qualitative assessment of enhancement intensity. In the absence of radiographic documentation, clinical progression, assessed by the investigator according to his/her judgment of neurological progression, was used to determine disease progression. All patients were followed until discontinuation from the study, loss to follow-up, study termination, or death.
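The per-scan decision rules described above can be summarized in a minimal Python sketch. This is an illustration only, not the IRF's procedure: the function names, the choice of reference measurement for the percent change, and the expression of steroid doses in hydrocortisone-equivalent mg/day are all assumptions. The sketch looks only at measured contrast-enhancing lesions plus a single flag for new nonenhancing T2/FLAIR disease, and it omits nontarget lesions, clinical progression, and the confirmatory scan ≥4 weeks later.

```python
from typing import List, Tuple

# Physiologic corticosteroid ceiling stated in the text (mg/day hydrocortisone equivalent).
PHYSIOLOGIC_DOSE = 20.0


def sum_of_products(lesions: List[Tuple[float, float]]) -> float:
    """Sum of products of perpendicular diameters over measured enhancing lesions."""
    return sum(long_d * perp_d for long_d, perp_d in lesions)


def classify_scan(reference_spd: float,
                  current_spd: float,
                  steroid_dose: float,
                  max_dose_first_6_weeks: float,
                  new_nonenhancing_lesion: bool = False) -> str:
    """Return an unconfirmed per-scan category: 'CR', 'PR', 'SD', or 'PD'.

    reference_spd is the comparison measurement (hypothetical parameter; which
    scan serves as the reference is an assumption of this sketch).
    """
    if reference_spd <= 0:
        raise ValueError("reference_spd must be positive")
    # New nonenhancing T2/FLAIR signal consistent with tumor counts as progression.
    if new_nonenhancing_lesion:
        return "PD"
    change = (current_spd - reference_spd) / reference_spd
    # Progression: >=25% increase; steroid dose does not affect this call.
    if change >= 0.25:
        return "PD"
    # Complete response: no measurable tumor and steroids at or below physiologic dose.
    if current_spd == 0 and steroid_dose <= PHYSIOLOGIC_DOSE:
        return "CR"
    # Partial response: >50% reduction and steroids not above the early maximum.
    if change < -0.50 and steroid_dose <= max_dose_first_6_weeks:
        return "PR"
    return "SD"


if __name__ == "__main__":
    baseline = sum_of_products([(4.0, 2.5)])   # 10.0
    followup = sum_of_products([(2.0, 2.0)])   # 4.0
    print(classify_scan(baseline, followup, steroid_dose=10.0,
                        max_dose_first_6_weeks=20.0))
```

In this sketch a scan that would otherwise qualify as a response is downgraded to stable disease when the steroid condition is not met, which mirrors the statement that steroid dose did not affect the determination of stable or progressive disease.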
OR was defined as a complete or partial response on 2 consecutive MRIs obtained ≥4 weeks apart, with reduced or stable doses of corticosteroids. PFS was defined as time from randomization to documented disease progression or death from any cause, whichever occurred first. Data for patients who received alternative antitumor therapy prior to disease progression were censored at the last tumor assessment date prior to receiving the alternative therapy; data for patients who experienced disease progression or died more than 6 weeks (1 tumor assessment) after the last dose of study drug were censored at the date of the last tumor assessment prior to the last dose of study drug plus 6 weeks. Six-month PFS was defined as the percentage of patients who remained alive and progression free at 24 weeks and was estimated using the Kaplan–Meier method.11 OS was measured from randomization to death. Patients who were alive at the time of the data cutoff for the final analysis were censored at their last contact date.
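As an illustration of how a quantity such as 6-month PFS can be read off a Kaplan–Meier curve in the presence of censoring, the following is a minimal product-limit sketch in Python. The function names and the toy data are hypothetical and are not BRAIN data; the study's own estimates would have come from standard statistical software.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate.

    times  : follow-up time for each patient (e.g., weeks from randomization)
    events : 1 if progression/death was observed at that time, 0 if censored
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this time; censored patients at t
        # are still counted as at risk at t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Step-function value of the estimated curve at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


if __name__ == "__main__":
    # Hypothetical toy data (weeks); 0 marks a censored observation.
    weeks = [6, 12, 12, 18, 24, 30, 36]
    event = [1, 1, 0, 1, 0, 1, 0]
    curve = kaplan_meier(weeks, event)
    print(round(survival_at(curve, 24), 4))  # 15/28 -> 0.5357
```

The censoring rules described above determine which `events` entries are 0 and what time is recorded for them; the estimator itself is unchanged.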
For the analyses presented here, patient data from the BEV and BEV + CPT-11 treatment arms were pooled to maximize the number of ORs and, consequently, the statistical power of the analyses. Our primary analysis assessed the predictive value of OR on survival using landmark analyses, with methods similar to the previously reported studies.2,4,7 To alleviate bias due to selecting any individual landmark, 3 landmarks (ie, weeks 9, 18, and 26) were chosen; and each analysis included only those patients who were alive at a particular time point. For each analysis, a Cox proportional hazards model was used to determine whether response status of patients (ie, responders vs nonresponders) prior to a particular landmark predicted survival beyond that landmark. For instance, patients with an OR at the week 6 MRI assessment that was confirmed at the week 12 MRI assessment were considered responders in the week 9 landmark analysis. Patients who did not achieve an OR at any time while on study and those who achieved an OR after week 9 were considered nonresponders in the week 9 analysis. Survival, the outcome of interest, was defined as time from landmark to death and is hereafter referred to as “residual survival.” All models were adjusted for treatment and important baseline prognostic factors, including age, Karnofsky performance score (KPS), and first vs second relapse. Hazard ratios (HRs), 95% confidence intervals (CIs), and P-values corresponding to each of the variables in the model were calculated. The Kaplan–Meier methods were used to estimate the median residual survival in the responder and the nonresponder groups.
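The construction of each landmark data set can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the `Patient` fields and the `landmark_cohort` function are hypothetical, the response is dated at the first scan that showed a subsequently confirmed OR (consistent with the week 6/week 12 example above), and patients censored before the landmark are excluded along with those who died. Fitting the adjusted Cox model to the resulting rows is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    survival_weeks: float           # randomization to death or censoring
    died: bool                      # True if death observed, False if censored
    response_week: Optional[float]  # first scan showing a later-confirmed OR; None if never


def landmark_cohort(patients: List[Patient],
                    landmark_week: float) -> List[Tuple[bool, float, bool]]:
    """Build rows of (responder_by_landmark, residual_survival_weeks, died).

    Only patients still under follow-up beyond the landmark are kept, which is
    what removes the guaranteed survival time enjoyed by eventual responders.
    """
    rows = []
    for p in patients:
        if p.survival_weeks <= landmark_week:
            continue  # died (or, by assumption, censored) before the landmark
        # A response first seen after the landmark counts as nonresponse here.
        responder = (p.response_week is not None
                     and p.response_week <= landmark_week)
        rows.append((responder, p.survival_weeks - landmark_week, p.died))
    return rows


if __name__ == "__main__":
    # Hypothetical patients, not BRAIN data.
    cohort = [
        Patient(40.0, True, 6.0),    # early responder
        Patient(5.0, True, None),    # died before week 9: excluded
        Patient(30.0, False, 12.0),  # responds after week 9, before week 18
    ]
    for week in (9, 18, 26):
        print(week, landmark_cohort(cohort, week))
```

Note how the third hypothetical patient is a nonresponder in the week 9 analysis and a responder in the week 18 analysis, which is why using several landmarks guards against conclusions that depend on a single cut point.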
A number of sensitivity analyses were performed to test the robustness of the model results. These included performing the analysis with only the BEV group included; treating OR status as a time-dependent covariate; using investigator-determined, rather than IRF-determined, OR status; adding additional covariates, including gender, baseline corticosteroid use, time from last radiotherapy to randomization, and extent of surgery; stratifying by baseline prognostic factors; including age as a continuous variable; including only responders who responded beyond a particular landmark; and including KPS at the time of response rather than KPS at baseline.
Several additional analyses were undertaken to ensure the validity, fit, and accuracy of the Cox models. These included a log-cumulative hazard plot to assess the validity of the proportional hazards assumption; a log-cumulative hazard plot of the Cox–Snell residuals12 to assess the model fit; and calculation of a concordance index13 to assess the predictive accuracy of the model. Finally, internal validation of the prediction model parameter estimates was performed using 400 bootstrap samples of 167 observations each.13,14 As an exploratory analysis, we evaluated the association between response and 1-year OS, using a 2 × 2 contingency table. The analysis was not adjusted for baseline factors and constituted a naïve evaluation of the distribution of <1-year survival and ≥1-year survival among responders and nonresponders. A kappa statistic15 was calculated to ascertain the level of agreement between OR and 1-year OS.
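For readers unfamiliar with the concordance index, the following minimal Python sketch shows one common formulation for right-censored data (Harrell's C). It is offered as an illustration only; the exact variant and software used in the analysis are not specified here, and the function name, inputs, and toy values are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed death strictly
    before patient j's follow-up time. The pair is concordant when the model
    assigned i the higher risk score; tied scores count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data: residual survival (weeks), death indicator,
    # and a model-based risk score (higher = predicted to die sooner).
    t = [5, 10, 15, 20]
    e = [1, 1, 0, 1]
    risk = [0.9, 0.3, 0.2, 0.4]
    print(concordance_index(t, e, risk))  # 4 of 5 comparable pairs -> 0.8
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering of patients by predicted risk.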
The relationships between PFS and residual survival and between OR and residual PFS at 9, 18, and 24 weeks were explored using the same landmark analysis and the Cox models that were employed in the primary analysis outlined above.
In BRAIN, 24 of 85 (28.2%) BEV patients and 31 of 82 (37.8%) BEV + CPT-11 patients had an IRF-determined OR. Six-month PFS was 42.6% in the BEV group and 50.3% in the BEV + CPT-11 group. These OR rates and 6-month PFS far exceeded those of historical controls.2,3,16 Moreover, ORs endured for a median of 5.6 months in the BEV group and 4.3 months in the BEV + CPT-11 group. A comprehensive list of demographics and baseline characteristics by treatment of patients who participated in BRAIN has been published.8 For the present analyses, select patient baseline characteristics are presented by response status. Overall, there were 55 responders and 112 nonresponders across the 2 treatment arms in BRAIN. Responders and nonresponders were similar in terms of baseline characteristics, with the exception that a larger proportion of nonresponders than responders had a KPS of 70–80 (Table 1).
The analysis population for the 9-, 18-, and 26-week landmark analyses consisted of 157, 147, and 123 patients, respectively, who had survived up to the respective landmark out of the total 167 study patients. The results of the landmark analyses indicated that nonresponders were twice as likely as responders to die in a given time period and that, after adjusting for treatment, age, KPS, and relapse status, OR was a statistically significant predictor of residual survival (Table 2). Median residual survival among responders was higher than that of nonresponders at each landmark, with no overlap in CIs.
The probability of survival was substantially higher and the hazard of death lower for responders compared with nonresponders at each time point, as evidenced by the clear separation of the Kaplan–Meier curves (Fig. 1).
The results of all sensitivity analyses were consistent with those of the primary analysis and demonstrated that, in general, responders derived a greater survival benefit compared with nonresponders (Table 3).
An exploratory evaluation indicated a moderate relationship (kappa = 0.48) between response status and 1-year OS (Table 4). Seventy-one percent of responders survived for 12 months and beyond compared with 21% of nonresponders.
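The kappa calculation for a 2 × 2 table can be illustrated with a short Python sketch. The cell counts below are not taken from Table 4; they are approximations back-calculated from the percentages reported above (about 71% of 55 responders and about 21% of 112 nonresponders surviving ≥1 year) and ignore any handling of censored patients, so they should be read as an assumption made for illustration.

```python
def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2 x 2 table.

    a: responder,    survived >= 1 year
    b: responder,    survived <  1 year
    c: nonresponder, survived >= 1 year
    d: nonresponder, survived <  1 year
    """
    n = a + b + c + d
    # Observed agreement: response paired with long survival,
    # nonresponse paired with short survival.
    p_observed = (a + d) / n
    # Agreement expected by chance from the row and column totals.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Approximate counts reconstructed from the reported percentages
    # (assumption, not the published table).
    kappa = cohens_kappa(a=39, b=16, c=24, d=88)
    print(round(kappa, 2))  # 0.48
```

With these reconstructed counts the statistic comes out near the reported value of 0.48, which suggests the reconstruction is at least consistent with the published summary.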
The results of multiple validation analyses ensured the fit, accuracy, and validity of the Cox model (Fig. 2). In general, the assumption of proportional hazards was satisfied at the 9-, 18-, and 26-week landmarks. The alignment of the Cox–Snell residuals to the reference line indicated good fit of the model to the data. The concordance index ranged between 83% and 87% across the 3 landmarks and suggested good predictive accuracy of the model. Parameter estimates and standard errors from the original model were similar to those from the bootstrap samples, indicating validity and reproducibility of the model estimates.
The analysis population for the 9-, 18-, and 26-week landmark analyses consisted of 157, 147, and 127 patients, respectively, who had survived up to the respective landmark out of the total 167 study patients. The results of the exploratory landmark analyses suggest that PFS was a statistically significant predictor of survival in the BRAIN study, after adjusting for treatment, age, KPS, and relapse status (Table 5). HRs between 0.18 and 0.25 correspond, taking the reciprocal, to an approximately 4- to 5.6-fold increased risk of death for patients who had progressed within a given time period compared with those who were progression free.
The probability of survival was substantially higher and the hazard of death lower for patients who were progression free at a particular landmark compared with those who had progressed at that landmark, as evidenced by the clear separation of the Kaplan–Meier curves (Fig. 3).
OR was not a statistically significant predictor of PFS in BRAIN after adjusting for treatment, age, KPS, and relapse status (Table 6). Notably, there was a substantial decrease in the number of nonresponders included in the week 18 and 26 analyses, since fewer of those patients demonstrated PFS beyond those landmarks.
The results of the primary landmark analysis demonstrate that OR was a statistically significant predictor of survival in BRAIN. Patients who had an OR (responders) during study treatment with BEV or BEV + CPT-11 tended to have longer survival in a given time period compared with those who did not have an OR (nonresponders). This is the first report of a statistically significant association between OR and survival in patients with recurrent GBM. Historically, evaluating the predictive value of OR on survival in trials of recurrent GBM was not feasible due to inadequate statistical power resulting from low response rates (ie, <10%) with previously available treatments. However, patients with relapsed GBM who were treated with the VEGF inhibitor BEV (or BEV + CPT-11) in BRAIN demonstrated OR rates that exceeded those of historical controls and provided adequate statistical power to examine the association of OR and survival.
We employed appropriate methods to formulate and validate the statistical model we used and establish its strength, predictive accuracy, and goodness of fit, while minimizing potential survivorship and selection biases. Survivorship bias results from responders being required to survive long enough to be assessed for a response, whereas there is no such requirement for nonresponders, thereby overestimating survival among responders.17 The use of landmark analyses, which included only those patients who had survived beyond a particular landmark when computing survival for that landmark (ie, residual survival), mitigated the potential for survivorship bias. Selection bias occurs when responders have baseline characteristics that predispose them to longer survival and may overestimate differences in survival between responders and nonresponders. Adjusting the Cox model for differences in patient characteristics that are known to be clinically relevant18 mitigated the potential for selection bias. Furthermore, we performed a series of sensitivity analyses to explore the impact of various additional clinical and demographic variables on estimates from the primary model, and the model remained robust with inclusion/exclusion of those variables. HR estimates from the sensitivity analysis that included only patients in the BEV arm were consistent with the primary analysis; however, statistical significance was not reached at the 9- and 18-week landmarks, possibly due to the smaller number of responders at those time points.
A potential limitation of our analyses is that BEV, like other agents that modify signal transduction through the VEGF signaling pathways, produces a normalization of tumor vasculature and a consequent decrease in radiographic enhancement (ie, pseudoresponse).6,19 In the BRAIN study, OR was based on the quantitative assessments of contrast-enhancing tumors and qualitative assessments of tumors that were too small to measure and non-contrast-enhancing tumors that were hyperintense on T2/FLAIR sequences and required that patients have a sustained (ie, ≥4 weeks) response. On the basis of the rigorous criteria used for assessment of response and progression and the observed durability of these responses, pseudoresponse is unlikely to account for the robust OR rate observed in BRAIN.
There was a statistically significant association between PFS and survival in the BRAIN study, and patients who were alive and progression free at each time point examined (ie, 9, 18, and 24 weeks) were less likely to die in a given time period compared with patients whose disease had progressed. Six-month PFS is a widely accepted endpoint in GBM trials. Our results provide further support for the clinical relevance of PFS. They also demonstrate that the predictive value of PFS may be observed as early as 9 weeks after the commencement of treatment with BEV, which is consistent with the findings of Lamborn et al.4 from their review of NABTC trials of patients with high-grade gliomas.
The results of our exploratory analysis indicated that OR was not a statistically significant predictor of PFS in BRAIN. Previously, Grant et al.20 demonstrated the lack of a relationship between radiographic response and time to progression in patients with recurrent high-grade glioma. On the other hand, Hess et al.21 reported that response in recurrent glioma was associated with a significant reduction in progression rates, and Jaeckle et al.7 reported that response showed a nonsignificant trend as a predictor of PFS. Since our analyses included only those patients who were alive and progression free at the respective landmarks, there was a sizable decrease in the number of nonresponder and PFS events in the week 18 and 24 analyses. Consequently, the statistical power was not adequate to detect an association between OR and PFS. Thus, evaluating the relationship between OR and PFS will require a larger data set than BRAIN provides.
It should be noted that BRAIN was a small, noncomparative study in which all participants were treated with BEV. Although this allowed us to pool data across treatment arms and to maximize the statistical power of our analyses, the lack of a control arm makes interpretation of time-to-event endpoints, such as OS and PFS, challenging. Likewise, it is important to emphasize that it is not possible to generalize these findings to non-BEV treatment settings. These analyses are a best attempt to evaluate the association between OR and survival in relapsed GBM. They are based on a single study of 167 patients and, as such, do not establish OR as a surrogate for survival in the relapsed GBM setting. Establishing surrogacy would require a comprehensive evaluation of data from multiple large trials. The P-values from these analyses were not adjusted for multiple comparisons, and the observed differences in survival could potentially be confounded by unmeasured/unobserved factors that were neither accounted for nor adjusted for. Ongoing phase III trials (eg, Chinot et al.22; RTOG 0825, ClinicalTrials.gov NCT00884741) with concurrent control arms will enable us to fully characterize the survival benefits of BEV in the absence of an OR.
In summary, we demonstrated that OR and PFS (which included patients with stable disease) were statistically significant predictors of survival in BRAIN. Understanding the relationship of OR to survival is important for determining whether OR, in addition to PFS, can provide an early indicator of treatment activity and a reliable predictor of the potential benefit of a therapeutic agent in patients with recurrent GBM. The ultimate goal of therapy is to extend survival. Determining the demographic and clinical characteristics of responders and the molecular characteristics of their tumors will be helpful in designing trials and identifying those patients who are likely to obtain the greatest benefit from BEV therapy.
Conflict of interest statement. M.P. is an advisor of Genentech without compensation; T.C., P.Y.W., and H.S.F. have a paid consulting relationship with Genentech; M.S. is a full-time employee of Genentech, Inc.; L.F. is a Roche stockholder and Genentech is a member of Roche; T.M. is a moderator, Advisory Board; D.S. served on Genentech advisory board; L.E.A. received honoraria from Genentech; N.P. has a paid consulting relationship with Scientific Advisory Board, Genentech; M.K.N. served on 2 advisory boards for Genentech, the goal of each being discussion of the emerging role of Avastin in GBM; A.D. is an employee of Genentech.
This work was supported by Genentech, Inc.