The North American Brain Tumor Consortium (NABTC) uses 6-month progression-free survival (6moPFS) as the efficacy end point of therapy trials for adult patients with recurrent high-grade gliomas. In this study, we investigated whether progression status at 6 months predicts survival from that time, implying the potential for prolonged survival if progression could be delayed. We also evaluated earlier time points to determine whether the time of progression assessment alters the strength of the prediction. Data were from 596 patient enrollments (159 with grade III gliomas and 437 with grade IV tumors) in NABTC phase II protocols between February 1998 and December 2002. Outcome was assessed statistically using Kaplan-Meier curves and Cox proportional hazards models. Median survivals were 39 and 30 weeks for patients with grade III and grade IV tumors, respectively. Twenty-eight percent of patients with grade III and 16% of patients with grade IV tumors had progression-free survival of >26 weeks. Progression status at 9, 18, and 26 weeks predicted survival from those times for patients with grade III or grade IV tumors (p < 0.001 and hazard ratios < 0.5 in all cases). Including KPS, age, number of prior chemotherapies, and response in a multivariate model did not substantively change the results. Progression status at 6 months is a strong predictor of survival, and 6moPFS is a valid end point for trials of therapy for recurrent malignant glioma. Earlier assessments of progression status also predicted survival and may be incorporated in the design of future clinical trials.
End points for clinical trials may serve a number of purposes, including assessment of safety, biological activity, and clinical benefit. From a regulatory standpoint, survival and symptom improvement are most likely to lead to drug approval. Currently, however, a U.S. Food and Drug Administration (FDA) initiative is evaluating alternative end points, including objective response rates, progression-free survival (PFS), event-free survival, and improvement in a patient’s quality of life.1
The North American Brain Tumor Consortium (NABTC) currently uses 6-month PFS (6moPFS) as the primary end point in phase II trials for treatment of patients with recurrent malignant gliomas. The NABTC treats patients in a multiinstitutional setting, with an emphasis on early-phase studies generally for patients with recurrent tumors. Patients in these studies have documented progression before treatment, based on MRI. This provides a baseline that can be used to judge the effectiveness of the therapy by examining time to further progression. Objective response, should it occur, is also assessed, but the success of the therapy is defined by the lack of further progression.
PFS may have several advantages over other efficacy end points, such as those based on radiographic response or symptom assessment. Objective radiographic response, usually determined by area or volume changes in contrast enhancement on sequential MRI, is unfortunately an imprecise method of determining tumor burden and change over time. MRI, the best tool available for assessing response, provides only an estimate of the actual tumor burden; it often represents a combination of tumor and treatment effects and cannot directly measure infiltrating disease. The sensitivity and specificity of MR-based imaging may improve over time using newer techniques, but those techniques have not yet been validated for defining treatment response. Objective radiographic response is rare in brain-tumor clinical trials, and other end points need to be considered. For instance, patients treated with temozolomide, a cytotoxic chemotherapy agent tested extensively in recurrent glioblastoma multiforme, did not show significant changes in radiographic response compared with a concurrent control group treated with procarbazine or compared with historical controls.2 Yet, this drug has now become a standard of care in newly diagnosed disease.
End points based on symptom assessment as a measure of efficacy are also problematic in this patient population. Symptom assessment may be strongly influenced by tumor location and size. Small tumors may cause serious neurological deficits when located in critical areas of the brain, and conversely, patients with large tumors in “silent” areas may not manifest any signs of disease. In the latter case, the lack of symptoms over time may not directly relate to changes in tumor burden during cancer treatment. Toxicity is treatment specific and is routinely measured, and clearly it may influence symptoms as well as tumor burden and location. PFS is another end point option and may represent a clinically meaningful outcome, as deferral of progression would likely have a secondary benefit of deferral of progressive neurological decline or reduction in the need for corticosteroids to manage symptoms.
Ideally, evaluation of efficacy should be based on a multidimensional end point, taking into account imaging, symptoms, and progression intervals. However, until this methodology is worked out, for the reasons described above, the NABTC has chosen PFS as the primary end point for its studies.
The problems of using time-to-event analyses are well described in FDA guidance on end points for approval of cancer drugs and biologics.3 When PFS is used as the primary efficacy end point, a fixed time point (in this case 6 months) reduces time-dependent assessment bias, such as that caused by visit or imaging frequency. The use of the 6-month fixed time point has also allowed the use of historical data from a separate database of research studies in patients with high-grade glioma conducted at several institutions to provide a historical control.4 Using these historical data, the NABTC was able to define success or failure of a therapy without the immediate need for concurrent or randomized controlled studies. Because of the aggressive nature of these tumors, 6moPFS was also thought to be a clinically meaningful goal.
Ultimately, the goal of treatment is to improve survival. If time to progression is correlated with survival, then this would support the hypothesis that lengthening the time to progression would also lengthen survival time. We evaluated patients with progressive tumors treated in clinical trials conducted by the NABTC. The primary goal of the study was to determine, for this patient group, whether progression status at 6 months predicted survival from that time. We also wished to determine whether information on progression from earlier assessments could suggest possible changes for the design of future clinical trials.
All patients treated in NABTC phase II trials between February 1998 and December 2002 were included in this study (Table 1). Some studies included both a phase I and a phase II component. For the purpose of this analysis, all patients treated with the recommended phase II dose who met the eligibility entry criteria for phase II were included even if they were enrolled in the phase I portion. Patients treated with other phase I doses were excluded.
Standard entry criteria included confirmed high-grade glioma (grades III and IV) and KPS ≥ 60. All protocols required central pathology review. In the few cases where tissue was not available for central review, local pathology designation was accepted. The diagnosis was based on the most recent surgery for which data were available at the time of protocol registration. Protocols specified stratification by tumor grade. Thus, all analyses were performed separately for patients with grade III and grade IV tumors. Grade III tumors included all histological subtypes, including pure and mixed tumors. Because the protocols did not distinguish among subtypes, the analyses reported in this study were also performed without regard to subtype. Patients could be entered in more than one protocol and were included for each protocol in which they were enrolled. Analyses were repeated including these patients only once, either for the first or for the last protocol in which they were enrolled. Because the results were substantially the same, only one analysis is presented.
For all studies, response and progression were defined using the Macdonald criteria.5 Because the primary end point for these studies was 6moPFS, evaluable disease (unidimensionally measurable lesions with margins not clearly defined) or measurable disease (bidimensionally measurable lesions with clearly defined margins) was allowed, and patients having a recent resection for progressive tumor were permitted to enroll if that resection indicated the presence of tumor. In the latter situation, there was no requirement that residual tumor be present after resection. Objective responses were centrally reviewed. Confirmation by repeat imaging was not required. Objective response for measurable disease required a decrease in tumor size of 50% or greater in the setting of stable neurological findings and no increase in steroid dose. Response for evaluable disease was based on a subjective 7-point scale requiring at least a +2, or definitely better, response. Progression was determined by the local institutional investigator and was defined as an increase in tumor size of 25% or greater for measurable disease and clear worsening, or a −2 response, for evaluable disease. Failure to return for evaluation due to death or deteriorating condition was considered to represent progression. In this case, progression date was the date the patient was declared off-study due to progression.
PFS and overall survival were measured from time of registration unless the protocol included a surgery as part of the study, in which case the date of first postoperative treatment was used as the baseline date. For patients who died, survival was the time between registration and date of death. Patients not known to have died were censored for survival as of the last date known alive. In general, the studies mandated repeat scans every 8 weeks. To allow for some variability in timing of the scans, we analyzed PFS status at 9, 18, and 26 weeks. If a patient was removed from a study for a reason other than progression, the patient was censored for further evaluation of progression in that study as of the date of starting other therapy, if that was known. Otherwise, the date the patient was removed from the study was used. If the patient was followed routinely for progression after being removed from the study and had progression without further therapy, that progression date was used. In cases where follow-up for progression was not consistent off-study, patients were censored for progression at the time they went off-study.
PFS and survival were estimated by using the Kaplan-Meier method. The primary purpose of this study was to assess the ability of progression status to predict survival, and this was done using landmark analysis. For each time point evaluated, all patients alive with known progression status at that time were included in the analysis. Survival was measured from that time. Survival curves comparing outcome based on progression status were created using Kaplan-Meier curves and tested using the log-rank test. The results were confirmed using analyses stratified by protocol and stratified by whether or not temozolomide was included as one of the treating agents. The Cox proportional hazards model was also used to allow for incorporation of the putative and known prognostic markers of age, KPS, and number of prior chemotherapy regimens. A further analysis was conducted including response. For that analysis, responders comprised patients who had been declared a responder at or before the specified time point. The conclusions were the same for the univariate and supplemental analyses; therefore, only the univariate results are presented here. All p values presented are two-tailed.
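The landmark approach described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the `Patient` record, its field names, and the six example patients are hypothetical, and the product-limit estimator is written out by hand to keep the example self-contained. Patients who died or were lost to follow-up by the landmark, or whose progression status at the landmark is unknown, are excluded, and survival is re-measured from the landmark.

```python
from typing import List, NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    # All times are weeks from registration; field names are illustrative.
    progression_week: Optional[float]  # None if no progression was observed
    pfs_followup_week: float           # last week progression status was assessed
    survival_week: float               # week of death or of last contact
    died: bool                         # False means censored for survival


Sample = Tuple[float, bool]  # (weeks survived after the landmark, died)


def landmark_split(patients: List[Patient], landmark: float):
    """Return (progressed, progression_free) groups for a landmark analysis."""
    progressed: List[Sample] = []
    progression_free: List[Sample] = []
    for p in patients:
        if p.survival_week <= landmark:
            continue  # died or lost to follow-up by the landmark
        residual = (p.survival_week - landmark, p.died)
        if p.progression_week is not None and p.progression_week <= landmark:
            progressed.append(residual)
        elif p.pfs_followup_week >= landmark:
            progression_free.append(residual)
        # otherwise progression status at the landmark is unknown: excluded
    return progressed, progression_free


def kaplan_meier(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate at each distinct death time."""
    curve, surv = [], 1.0
    at_risk = len(samples)
    for t in sorted({time for time, _ in samples}):
        deaths = sum(1 for time, died in samples if time == t and died)
        leaving = sum(1 for time, _ in samples if time == t)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Hypothetical patients, for illustration only.
patients = [
    Patient(8, 8, 20, True),       # died before week 26: excluded
    Patient(16, 16, 40, True),     # progressed by week 26
    Patient(30, 30, 60, True),     # progressed after week 26
    Patient(None, 52, 52, False),  # progression-free, alive at last contact
    Patient(None, 10, 45, True),   # censored for progression at week 10: excluded
    Patient(12, 12, 24, True),     # died before week 26: excluded
]
progressed, progression_free = landmark_split(patients, landmark=26)
curve_progressed = kaplan_meier(progressed)
curve_progression_free = kaplan_meier(progression_free)
```

Repeating the split with `landmark=9` or `landmark=18` gives the earlier analyses; comparing the two curves with a log-rank test or a Cox model would require a statistics package and is not shown here.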
The study population comprised 596 patient enrollments (159 with grade III and 437 with grade IV tumors). Of the grade III tumors, 101 were anaplastic astrocytoma, 39 were anaplastic oligodendroglioma, and 19 were anaplastic mixed glioma. Forty-seven patients (12 with grade III and 35 with grade IV tumors on first enrollment) were enrolled in two protocols. Of the patients who were initially enrolled with a grade III tumor, two had a grade IV tumor at the time of their second enrollment. No patients were enrolled in more than two protocols. Table 2 describes the patient population.
As would be expected, PFS and overall survival tended to be longer for patients with grade III tumors (p < 0.01 and p < 0.001, respectively, log-rank test stratified by protocol). Table 3 presents estimated PFS and survival by grade for selected time points. Six-month PFS was 28% for patients with grade III tumors and 16% for those with grade IV tumors. The 6-, 12-, and 18-month survival rates were 66%, 44%, and 27%, respectively, for patients with grade III tumors and 55%, 25%, and 13% for those with grade IV tumors. Twenty-nine patients with grade III tumors and 59 patients with grade IV tumors were censored for progression. However, only 13 and 39 patients (grades III and IV, respectively) were censored before the primary end point of 6 months.
Of the agents included in these trials, temozolomide is the most accepted treatment for gliomas, and it was part of the treatment regimen for about one-third of the patients studied. Because temozolomide is now the standard of care for newly diagnosed disease and is not likely to be included in future salvage studies, a separate summary of patient outcomes that takes into account administration of temozolomide is included in Table 3.
Table 4 presents survival as a function of progression status for three time points. For this table, patients were included in a specified analysis only if they were known to be alive beyond that time point and it was known whether they had progressed by that time. Survival was measured from that time. The number of patients excluded because they had died before the specified time point is provided. Patients listed under “status unknown” are those for whom either progression or survival status for that time point was unknown. For example, of the patients with grade IV tumors, 223 were excluded from the 26-week analysis: 195 had died before 6 months, 27 were excluded because progression status was unknown, and one patient had disease progression but unknown survival status at 6 months. Patients censored for survival are those known to have been alive at the specified time point but for whom date of death is unknown.
Progression status at 9, 18, and 26 weeks following protocol registration was a strong predictor of survival for both tumor grades. The Cox proportional hazards model was used to estimate the hazard ratio for survival as a function of whether or not a patient had progressed by the specified time point. These hazard ratio estimates were consistent across time for each grade. For patients with grade IV tumors, the hazard ratio was between 0.36 and 0.46, indicating a reduction in hazard of greater than 50% for those who had not yet had disease progression. The estimated hazard ratios for patients with grade III tumors were lower, ranging from 0.30 to 0.33. Figures 1 and 2 show Kaplan-Meier curves for survival from 6 months for patients with grade III and grade IV tumors, respectively, based on progression status at 6 months. The curves from 9 and 18 weeks showed very similar survival patterns and are not presented here.
Historically, response has been considered an indication of an agent’s activity, although association of response with survival has often been difficult to confirm when appropriate statistical procedures have been used to adjust for time biases. In this study there were few responses. Fourteen patients with grade III tumors responded (9% of those evaluable for response); median time to response was 13 weeks (range, 8–67). Thirty partial responses were observed for patients with grade IV tumors (7%); median time to response was 12 weeks (range, 4–58). Three (21%) of the grade III responders did not have successful treatment based on 6moPFS. One was censored at 10 weeks, and two had disease progression before 6 months (8 and 12 weeks after their initial response designation). Twelve (40%) of the grade IV responders did not have successful treatment based on 6moPFS. Of these 12, three were censored. The remaining nine had disease progression before 6 months, seven within 9 weeks of their initial response designation. When we performed a landmark analysis for each grade and time point, in a proportional hazards model with both response and progression status at that time included in the model, response was not close to statistical significance and did not substantively change the predictive strength of progression status.
We present here the results from an analysis of 12 NABTC phase II clinical trials of patients with recurrent high-grade glioma. Our goal was to determine whether progression was a predictor of subsequent survival time and also to evaluate the potential usefulness of progression status at earlier time points. In this study, we were able to document that progression status at 26 weeks was a strong predictor of survival, with a similar pattern for progression status at 9 and 18 weeks.
Clearly, there are limitations to the use of a single time point assessment of progression. If the actual time to progression is known, use of that information can increase statistical power. There is a risk of missing earlier, potentially relevant differences not observed at the prespecified time point. Objective response, which in some cases may ultimately be proven to be an important surrogate for survival, is not factored into the design. Finally, interpretation can be complicated if data are missing due to patients being removed from the study without progression prior to the fixed time point. These patients may have begun new therapies without progression, died without an assessment of progression, or become lost to follow-up. On the basis of the principle of intent-to-treat, the usual calculation includes in the denominator all patients enrolled and in the numerator only those known to be alive and progression-free at the specified time point. This is a conservative strategy. In the case of this patient population, if a therapy is successful, few patients should have missing data for the reasons stated above; therefore, it is not likely to be a major problem in interpretation. Specification of a fixed point in time also reduces the effect of differences in timing of intermediate scans on data interpretation. The fixed time point also ensures that the results for the study will be known no later than 6 months after the last patient is enrolled.
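As a small illustration of why the intent-to-treat calculation is conservative, the following Python sketch compares it with a complete-case estimate. The function name and the trial counts are hypothetical and are not taken from the NABTC data.

```python
def itt_pfs_rate(n_enrolled: int, n_alive_progression_free: int) -> float:
    """Intent-to-treat PFS proportion at a fixed time point.

    Patients whose status at the time point is missing stay in the
    denominator, so they are effectively counted as failures.
    """
    if n_enrolled <= 0 or not 0 <= n_alive_progression_free <= n_enrolled:
        raise ValueError("need enrolled > 0 and 0 <= successes <= enrolled")
    return n_alive_progression_free / n_enrolled


# Hypothetical trial: 40 enrolled, 10 known alive and progression-free at
# 26 weeks, 4 removed early with unknown status at 26 weeks.
n_enrolled, n_success, n_unknown = 40, 10, 4
conservative = itt_pfs_rate(n_enrolled, n_success)
complete_case = n_success / (n_enrolled - n_unknown)  # drops the unknowns
```

The intent-to-treat figure can never exceed the complete-case figure, and the two coincide when no patient has missing status, which is the situation the text expects for a successful therapy.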
The results of this study confirm the need to stratify based on tumor grade. Patients whose last surgery indicated a grade III tumor had a better outcome as a group than did patients with a diagnosis of a grade IV tumor. The grade III patient group undoubtedly included patients whose tumor had upgraded since the last histological diagnosis because of the infrequency of additional surgeries after the initial treatment. If a more accurate diagnosis of current histology were available at the time of protocol enrollment, the difference in outcome between the patient groups would be expected to be still larger.
For 6moPFS to be an appropriate end point for these clinical trials, it should not only be reproducible but also have clinical relevance. In this study, we were able to document that progression status at 26 weeks was a strong predictor of survival from that time point. We saw a similar pattern for progression status at 9 and 18 weeks.
A number of questions remain. Although the analyses were done post hoc, survival was closely associated with progression status at time points before 6 months. This raises the question of whether the earlier time points could be used either in addition to or in place of 6moPFS. If it were possible to substitute 9-week PFS for 6moPFS, results from trials could be determined earlier. It would also be more practical to consider multistage designs. Our current criterion for a successful/unsuccessful trial is 35% versus 15% 6moPFS for patients with grade IV tumors, our primary test population. From this study we have an overall estimate of approximately 45% PFS at 9 weeks for this patient group. A corresponding increase of 20 percentage points in 9-week PFS would be from 45% to 65%. Whereas 32 patients are sufficient to have 90% power (with a one-tailed α = 0.1) for the 6moPFS end point, 44 patients (an increase of 38%) would be required to have similar power for the earlier time point. (If we assume an exponential distribution, then a 15% vs. 35% difference at 6 months would correspond to 52% vs. 70% PFS at 9 weeks, and the required sample size for 90% power would be still larger, at 53 patients.) Also, an improvement in 6moPFS ensures some degree of durability of effect.
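The exponential conversion in the parenthetical above follows from S(t) = exp(−λt), which gives S(9) = S(26)^(9/26). A minimal Python sketch of that arithmetic is shown below; the function name is illustrative, and taking 6 months as 26 weeks follows the time points used in this study.

```python
def pfs_at(weeks: float, pfs_ref: float, ref_weeks: float = 26.0) -> float:
    """PFS at `weeks` implied by PFS at `ref_weeks` under a constant hazard."""
    # S(t) = exp(-lam * t)  implies  S(t) = S(ref) ** (t / ref)
    return pfs_ref ** (weeks / ref_weeks)


null_9wk = pfs_at(9, 0.15)  # roughly 0.52
alt_9wk = pfs_at(9, 0.35)   # roughly 0.70
difference_9wk = alt_9wk - null_9wk
```

Under this assumption the 20-point difference at 6 months shrinks to about 18 points at 9 weeks, which is consistent with the larger sample size quoted for the earlier time point.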
While it does not seem appropriate to replace 6moPFS with PFS at 9 weeks as the primary end point, with the information available on the expected PFS rate at 9 weeks and knowledge that PFS status at 9 weeks predicts survival, we can now consider using that information to create early stopping rules should the PFS be less than would be expected. This would allow early stopping for trials in which the therapy is clearly not meeting expectations. Such an approach would not be applicable for all situations. For example, trials of a targeted therapy might require the full patient number to determine if the therapy was differentially effective depending on a specific tumor marker. On the other hand, if patient entry had required the presence of a tumor marker that was expected to lead to high success with targeted therapy, an early stopping rule based on 9-week PFS estimates would be appropriate. The best way to incorporate this information needs to be evaluated further, as do possible uses for the intermediate (18-week) results.
Unfortunately, the objective response rate remains so low that we believe little has been lost by not considering this end point to date. Newer clinical trial methodologies can potentially allow for multiple end points,6 for example, response and 6moPFS, and these need to be investigated.
The study as conducted demonstrated that a patient’s progression status at a specified point in time is a strong predictor of survival from that time. This provides documentation supporting the general impression of clinicians that delay in progression is positive, not only because it delays the complications that accompany progression (which have a real clinical effect on the patients) but also because it indicates the likelihood of longer survival.
The documentation that progression predicts survival leads us to believe that treatments that extend PFS are likely to increase overall survival, but this remains to be proven. The studies included here were single-arm phase II trials, limiting what can be concluded with regard to 6moPFS as a surrogate for survival in comparing treatments. However, the results are consistent with the recently reported experience of the North Central Cancer Treatment Group.7 In that report, the authors noted a limitation in the conclusions owing to the results of all trials being negative. Our current analysis included several trials that used temozolomide, a treatment that has subsequently been determined to be effective based on a survival end point (albeit in the treatment of patients with newly diagnosed tumors). While a comparison of the studies with and without temozolomide does not take into account any possible differences in prognostic factors between studies, it is encouraging that patients in studies including temozolomide showed a much improved 6moPFS and survival compared with patients treated in studies not including temozolomide, indicating that 6moPFS could be effective in distinguishing this agent from other generally less effective therapies in phase II trials.
In addition to the use of 6moPFS as an end point for phase II trials, the question arises as to whether this end point could be used as a substitute for survival either for full approval or for accelerated approval of a new treatment. For reasons stated above, we believe that the data support a statement that increasing 6moPFS can reasonably be expected to increase survival. As with survival, there is clearly heterogeneity in PFS that is not related to treatment. The nature of these factors needs to be further explored, and we plan to utilize these data to perform such evaluations. However, because it cannot be expected that all factors affecting survival and/or PFS at time of recurrence can be identified, randomized clinical trials would be required to confirm treatment effect.
The use of PFS for accelerated approval has the potential to substantially shorten the time to access a promising treatment. For example, to have 90% power for a hazard ratio of 1.8 consistent with the treatment effect criterion used in the NABTC phase II trials (an increase in 6moPFS for grade IV patients from 15% to 35%), assuming accrual requires 1 year with additional follow-up of 6 months and that PFS follows an exponential distribution, a randomized trial would require 134 patients and could be completed in 1.5 years if PFS were the end point. For the same power, same hazard ratio, and same number of patients, using survival as an end point would require 3.5 years (assuming 15% survival at 1.5 years, consistent with the historical data). Because patient heterogeneity and additional treatment options might result in a hazard ratio closer to 1 for a survival comparison than for a PFS comparison, the study based on a survival comparison might need to be still longer in order to have enough events for adequate power.
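The hazard ratio of 1.8 quoted above can be recovered from the 15% and 35% 6moPFS rates under the stated exponential assumption, since the hazard is then proportional to −ln S(t). The Python sketch below shows that step and adds a rough event count from Schoenfeld's approximation for a 1:1 randomized comparison. The two-sided α of 0.05 is an assumption made here for illustration; the text does not state the significance level, so the event count is not expected to reproduce the 134-patient figure.

```python
import math
from statistics import NormalDist


def exponential_hazard_ratio(pfs_control: float, pfs_treated: float) -> float:
    """Hazard ratio (control vs. treated) from PFS at a common time point."""
    # S(t) = exp(-lam * t), so lam is proportional to -ln S(t).
    return math.log(pfs_control) / math.log(pfs_treated)


def schoenfeld_events(hazard_ratio: float, alpha_two_sided: float, power: float) -> int:
    """Approximate number of events for a 1:1 randomized log-rank comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


hr = exponential_hazard_ratio(0.15, 0.35)  # roughly 1.8
# alpha_two_sided=0.05 is an assumed value, not one given in the text.
events = schoenfeld_events(hr, alpha_two_sided=0.05, power=0.90)
```

The required number of events is fixed by the hazard ratio, α, and power, which is why an end point whose events accrue more slowly (survival rather than progression) lengthens the trial for the same number of patients.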
The strength of the results observed reassures us that for phase II trials, 6moPFS is a useful end point for evaluating new therapies. Status at earlier time points may also provide important information, particularly as a guide to reduce patient sample size in the setting of studies with negative results. Further research is needed to validate these observations. Until there are validated instruments for the assessment of symptomatic end points, it is unlikely that time to symptom deterioration will become a commonly used end point in neurooncology clinical trials. On the basis of the results, we are convinced that progression is an important determinant of survival and that PFS may more directly define the benefit of the treatment being tested than does survival. Given the large database and numbers of trials used in the analysis, we believe that this end point is relevant to the patients we serve and that our results support the idea of evaluating new research methodologies using this end point coupled with other measures of clinical benefit.
This study was presented in part at the American Association for Cancer Research/FDA Public Workshop on Clinical Trial End Points in Primary Brain Tumors, January 20, 2006. We thank Ilona Garner, Department of Neurological Surgery, University of California San Francisco, for editorial support. This study was supported by the following grants: NABTC grants CA62399, CA62422, CA62412, U01CA62407-08, CA62455-08, U01CA62405, CA62426, U01CA62399 022030 (for NABTC98-03 only), U01CA62399, 5-U01CA62399-09, and U01CA62421-08; and General Clinical Research Center grants M01-RR00079, CA16672, M01-RR00633, M01-RR00056, M01-RR0865, M01-RR00042, and M01-RR03186.