In this paper we evaluate the impact of different phase II study strategies on drug development, using the expected total time to completion, E[T], and the expected total number of patients, E[N], as measures of that impact. We first discuss the single arm phase II study design. In this type of study PFS is the primary endpoint and is used as an early indicator of activity. PFS data from patients given the experimental treatment are compared to the historical experience. If this single arm study is promising, a randomized phase III study based on OS is organized. Although it is possible to plan such studies using specific historical controls and taking into account the number of such controls, this is rarely done. Usually historical data (often from small studies) are used to specify a null comparison level of activity. For PFS data, this comparison level may represent PFS at a landmark time, or median PFS for an exponential distribution.
Because historical controls may not be prognostically comparable to patients accrued to the phase II trial, the specified null level of PFS may not be correct. When the specified null PFS rate is larger than the true rate for the population under study, the benefit of the new treatment will be underestimated, reducing the probability of finding activity in the phase II study and continuing on to the phase III study. Ultimately, this reduces the probability of finding a significant benefit on OS. Conversely, a treatment that has no benefit on PFS is more likely to appear active when the specified null rate is smaller than the true rate for the population under study. This will result in continuing to a phase III study with probability greater than the specified type I error, thus increasing E[T] and E[N]. We study the effect of over- or under-specifying the null PFS rate in a single arm phase II study.
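The effect of a misspecified null level can be illustrated with a rough calculation. The sketch below is not the method used in this paper: it assumes exponential PFS, a one-sided test on the log median with a normal approximation (variance of about 1/d for d observed progressions), and the pancreatic cancer design values of a 3 month null median and a 4.5 month alternative. The function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

_norm = NormalDist()

def events_for_single_arm(ratio, alpha=0.10, power=0.90):
    """Events needed for a one-sided level-alpha test of an exponential
    median, using a normal approximation on the log scale (an assumption)."""
    z = _norm.inv_cdf(1 - alpha) + _norm.inv_cdf(power)
    return (z / log(ratio)) ** 2

def prob_go(true_median, null_median, events, alpha=0.10):
    """Approximate probability that the single arm study declares activity
    when the treated patients have the given true median PFS."""
    shift = sqrt(events) * log(true_median / null_median)
    return _norm.cdf(shift - _norm.inv_cdf(1 - alpha))

# Design: null median 3 months, alternative 4.5 months (ratio 1.5).
d = events_for_single_arm(1.5)

# An inactive treatment: its median equals the population's true control median.
for true_control_median in (2.0, 3.0, 4.0):
    p = prob_go(true_control_median, 3.0, d)
    print(f"true median {true_control_median}: P(continue to phase III) = {p:.3f}")
```

Under these assumptions the probability of continuing equals the nominal level only when the specified null median matches the true one. It falls below the nominal level when the null is set too high and rises above it when the null is set too low.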
The problem of incorrectly specifying the null PFS rate in a single arm phase II study can be alleviated by performing a randomized phase II study comparing the new treatment to the control regimen using PFS as the endpoint. If the new treatment appears better than the control based on PFS, then a phase III trial comparing the new treatment to the control regimen using OS as the endpoint is organized.
Although the randomized phase II study alleviates the need to specify a null rate, it does require more patients than a single arm study. Therefore, in order to address the increase in sample size, we consider integrating a randomized phase II study into a phase III study. With this approach, accrual to a randomized phase II study is designed to continue on into a phase III study if a specified criterion is met. The endpoint used for the phase II evaluation will differ from that used for the phase III analysis (as in the single arm study and sequence of studies), but data from patients accrued during the phase II study are used in the phase III study. Goldman et al. [19] have described these designs as a phase III study with an interim futility analysis using an intermediate endpoint. Finally, we consider a strategy of skipping the phase II study and performing a single randomized study with survival as the endpoint and including an interim futility analysis based on survival.
In this paper we wish to evaluate the strategies by comparing the expected total number of patients (E[N]), both phase II and phase III, and the expected total time until completion (E[T]) under the null and alternative hypotheses, using parameters from the pancreatic cancer example for illustration. Appendix A provides equations for the calculation of E[N] and E[T]. In the calculation of E[N] and E[T] the sample size and length of accrual for a phase III study are included, and the same phase III study design is used for all strategies. The sample size and length of accrual of the phase III study are based on a design that has a primary endpoint of OS and 90% power for a 2-sided .05 level test.
The pancreatic cancer literature suggests the median OS is 6 months. For the sample size calculations of the phase III study an improvement in median OS to 7.8 months is used (hazard ratio of 1.3). Although this improvement appears small, it is likely to be of interest: the study is for an advanced disease population, and even a small OS improvement would justify studying the drug in earlier stages of disease. Assuming an accrual rate of 15 patients per month with a minimum follow-up of 6 months, the design would require 46.1 months of accrual, or 692 patients.
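The phase III accrual figures can be approximated with a short calculation. The sketch below is one common approach and is not confirmed as the procedure used here: it assumes Schoenfeld's approximation for the required number of deaths under 1:1 randomization, exponential OS in both arms, uniform accrual, and analysis after the stated minimum follow-up. Function names are hypothetical.

```python
from math import exp, log
from statistics import NormalDist

def required_events(hazard_ratio, alpha=0.05, power=0.90):
    """Schoenfeld approximation: deaths needed for a 2-sided level-alpha
    log-rank test with 1:1 randomization."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    return 4 * z ** 2 / log(hazard_ratio) ** 2

def event_prob(median, accrual, followup):
    """P(death is observed) for exponential survival with uniform accrual
    over `accrual` months and `followup` months of minimum follow-up."""
    lam = log(2) / median
    return 1 - (exp(-lam * followup) - exp(-lam * (accrual + followup))) / (lam * accrual)

def accrual_months(events, rate, med_control, med_treat, followup):
    """Bisection for the accrual duration that yields the required deaths."""
    lo, hi = 1.0, 500.0
    for _ in range(100):
        mid = (lo + hi) / 2
        p = 0.5 * (event_prob(med_control, mid, followup)
                   + event_prob(med_treat, mid, followup))
        if rate * mid * p < events:
            lo = mid
        else:
            hi = mid
    return hi

d = required_events(1.3)
a = accrual_months(d, rate=15, med_control=6.0, med_treat=7.8, followup=6.0)
print(f"deaths needed: {d:.0f}, accrual: {a:.1f} months, patients: {15 * a:.0f}")
```

Under these assumptions the result lands close to the 46.1 months and 692 patients quoted above. Other sample size formulas or accrual assumptions would give somewhat different values.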
For the two strategies that have independent phase II and phase III studies (i.e. the single arm study and the randomized phase II study), the phase II primary endpoint will be PFS and the study will be designed to have 90% power using a 1-sided .1 level test. We continue with the pancreatic cancer example to design the phase II studies based on PFS. The literature suggests that the median PFS for pancreatic cancer is between 2 and 4 months, so for the single arm study we specify 3 months as the null median PFS, and for the randomized study we base the sample size calculation on a control arm median PFS of 3 months. We power both studies to detect an improvement in the median PFS to 4.5 months (hazard ratio of 1.5).
For the integrated phase II/III study design, patients will be accrued until time t1. At t1 accrual will be suspended and patients will be followed for a minimum time f1. After t1+f1 a comparison of the treated versus control groups based on progression-free survival (PFS) will be performed. If the p-value for PFS in this interim analysis is not less than a specified threshold, α1, accrual will terminate and no claims for the new treatment will be made. Otherwise, accrual will resume until a total of M patients are accrued. After accruing M patients, follow-up will continue for an additional minimum time fo. At the end of the study OS will be evaluated on all M patients. The total sample size M is that of the phase III study.
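Because the integrated design has only two possible outcomes at the interim analysis, E[N] and E[T] can be written as weighted averages once the probability of continuing is known. The sketch below illustrates this under stated assumptions and does not reproduce the Appendix A equations: accrual is taken to proceed at a constant rate, study time is taken to include the suspension period f1, and the values t1 = 12 months, f1 = 3 months, and α1 = 0.2 are hypothetical. Under the global null the probability of continuing is taken to be approximately α1.

```python
def integrated_design_expectations(p_continue, t1, f1, total_n, f_final, accrual_rate):
    """Expected sample size and duration for a two-outcome integrated
    phase II/III design (illustrative; not the paper's Appendix A)."""
    n_interim = accrual_rate * t1

    # Outcome 1: stop at the interim PFS analysis.
    n_stop, t_stop = n_interim, t1 + f1

    # Outcome 2: resume accrual to total_n, then follow for f_final months.
    t_go = t1 + f1 + (total_n - n_interim) / accrual_rate + f_final

    exp_n = (1 - p_continue) * n_stop + p_continue * total_n
    exp_t = (1 - p_continue) * t_stop + p_continue * t_go
    return exp_n, exp_t

# Hypothetical interim settings; M = 692, 15 patients/month, 6 months final follow-up.
alpha1 = 0.2
exp_n, exp_t = integrated_design_expectations(
    p_continue=alpha1, t1=12.0, f1=3.0, total_n=692, f_final=6.0, accrual_rate=15.0
)
print(f"global null: E[N] = {exp_n:.1f}, E[T] = {exp_t:.1f} months")
```

Under an alternative hypothesis the probability of continuing depends on the interim power for PFS and on the correlation between PFS and OS, so it has to be estimated rather than set equal to α1.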
The strategy of skipping the phase II study and performing an interim futility analysis on OS requires the specification of t1 (the time of the interim analysis) and α1 (the criterion for continuing). That is, if the p-value for the comparison of OS is less than α1, the study will continue.
For the integrated phase II/III design and for the phase III design with a futility analysis, we determine t1 and α1 so that the overall study power (the probability of concluding a benefit on OS when starting from phase II) is maintained at 81%. Note that 81% is the power for the strategy of a randomized phase II study with 90% power for PFS followed by a randomized phase III study with 90% power for OS (0.90 × 0.90 = 0.81). For the integrated phase II/III design and the futility design we evaluate E[N] and E[T] for different α1 values but always adjust t1 to maintain 81% power.
We evaluated the designs under: (i) no treatment effect on either PFS or OS (global null); (ii) a treatment effect on both PFS and OS (global alternative). When we evaluate the single arm study we use the equations in Appendix A and assume PFS and OS follow exponential distributions. Since the two studies are independent, E[T] and E[N] can be calculated analytically. In the integrated II/III design, data from the same patient are used for both the PFS and OS analyses. The data will therefore be correlated, and the correlation needs to be accounted for when evaluating E[T] and E[N]. The correlation structure we assume makes analytic results difficult to calculate, since the PFS data no longer follow an exponential distribution; hence, computer simulations were conducted to evaluate E[N] and E[T]. Since the integrated II/III design is compared to the separate randomized phase II strategy and to the strategy with a futility analysis on OS, simulated data were used to evaluate E[N] and E[T] for these designs as well. (Equations for the calculation of E[N] and E[T] are found in Appendix A.)
In the simulations we generate correlated PFS and OS data as follows. The distribution of OS for the control group was taken as exponential with median mo months. The treatment effect for OS is specified by a parameter Δo, resulting in an exponential distribution of OS for the treatment group with median moΔo months. Provisional PFS times were generated for control and treatment group patients using exponential distributions with median values mp months and Δ1mp months, respectively. For a patient with overall survival value Yo and provisional PFS value Y1, the actual PFS time was set as Yp = min(Y1, Yo). This introduction of correlation between PFS and OS means that PFS times do not have an exponential distribution. If the medians of OS and PFS are very different, then the correlation is very small and PFS will have an approximately exponential distribution. In the simulations Δ1 and Δo were varied. All simulations were performed with 10,000 replications.
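The data-generating step described above can be sketched as follows. This is a minimal illustration and not the simulation code used for the results: it generates event times only, omits staggered entry, censoring, and the interim and final tests, and uses the pancreatic cancer medians with Δo = 1.3 and Δ1 = 1.5 as example inputs. The function name, seed, and arm size are hypothetical.

```python
import random
from math import log
from statistics import median

def simulate_arm(n, os_median, pfs_median, rng):
    """Return a list of (Yp, Yo) pairs for one arm: exponential OS, exponential
    provisional PFS, and actual PFS set to the minimum of the two."""
    os_rate = log(2) / os_median     # exponential rate implied by the median
    pfs_rate = log(2) / pfs_median
    patients = []
    for _ in range(n):
        y_o = rng.expovariate(os_rate)    # overall survival, Yo
        y_1 = rng.expovariate(pfs_rate)   # provisional PFS, Y1
        y_p = min(y_1, y_o)               # actual PFS, Yp = min(Y1, Yo)
        patients.append((y_p, y_o))
    return patients

rng = random.Random(2024)            # fixed seed so the sketch is repeatable
m_o, m_p = 6.0, 3.0                  # control medians for OS and PFS (months)
delta_o, delta_1 = 1.3, 1.5          # treatment effects on OS and PFS

control = simulate_arm(5000, m_o, m_p, rng)
treated = simulate_arm(5000, m_o * delta_o, m_p * delta_1, rng)

for label, arm in (("control", control), ("treated", treated)):
    pfs_med = median(p for p, _ in arm)
    os_med = median(o for _, o in arm)
    print(f"{label}: median PFS = {pfs_med:.2f}, median OS = {os_med:.2f}")
```

Because Yp can never exceed Yo, the observed median PFS in each arm falls below the nominal value used to generate the provisional times, which is one visible consequence of the induced dependence between the two endpoints.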