Low accrual to adult oncology clinical trials is a major obstacle to progress in cancer therapy. Research on barriers to trial accrual has focused primarily on physician and patient deterrents to trial participation, that is, deterrents that affect accrual after a trial has opened.1-7 Research on processes prior to trial activation that may affect accrual has been far less common. Operational studies of trial development processes have illustrated the lengthy sequence of steps involved in the progression from concept submission to trial activation.8 Factors related to trial design may also affect accrual.9 Although these factors have not been well studied, we believe that identifying accrual-related factors present before a trial opens has relevant and immediate ramifications for clinical trial conduct.
We conducted a survey of study chairs and lead statisticians involved in phase III cancer trials, in order to define issues that may affect clinical trial accrual but are recognizable during protocol development. This survey complements data collected as part of a larger study, entitled the Oncology Clinical Trial Accrual Study (OCTAS), which entails a systematic evaluation of combined phase III trial experiences from five National Cancer Institute (NCI) - sponsored Clinical Trials Cooperative Groups (CTCG) performed over a ten-year time period. The perceptions of national clinical trial leaders on trial design processes and accrual influences have not been previously studied. Survey questions concentrated on investigator experience, trial design elements, accrual prediction practices, and perceived accrual influences. These responses were then evaluated in light of each trial's actual accrual experience.
The study population was created by identifying the study chair and lead statistician for each phase III trial open between January 1, 1993 and December 31, 2002, by one of five participating CTCGs. A total of 248 phase III trials were included, sponsored by the Cancer and Leukemia Group B (CALGB), Eastern Cooperative Oncology Group (ECOG), North Central Cancer Treatment Group (NCCTG), National Surgical Adjuvant Breast and Bowel Project (NSABP), or Southwest Oncology Group (SWOG). Participation was limited to U.S. CTCGs offering therapeutic trials for adult cancer patients to reduce variability in accrual experiences. Of the 8 applicable CTCGs, one (American College of Surgeons Oncology Group) was not open for the entire study period and two (Gynecology Oncology Group and Radiation Therapy Oncology Group) were not strongly pursued since less specialized groups were ideally sought. The remaining five CTCGs agreed to participate and offered a heterogeneous mix in disease sites and treatment modalities to meet the goals of this study. Protocols were reviewed to identify the study chair and lead statistician for each trial. For intergroup trials, the study chair and lead statistician were derived from the originating CTCG. Cooperative group input was sought to determine the most appropriate persons as study chair and lead statistician for trials where the person filling this role was unclear. NSABP substituted the protocol officer for the study chair in their trials; within the organizational structure of NSABP, the protocol officer assumes responsibilities commonly performed by the study chair. Since this substitution only pertains to a small number of trials (n=18), the responses from protocol officers and study chairs are reported together. Updated contact information and email addresses were confirmed for each survey recipient prior to this study. An introductory email notifying recipients of the upcoming survey was also used to help verify active email addresses. 
The final survey recipient population consisted of 179 unique study chairs and 49 unique statisticians since an individual could be associated with more than one trial.
A self-administered, web-based survey consisting of 28 questions for lead statisticians and 29 questions for study chairs was developed using published guidelines for question writing and survey construction.10 The survey questions were evaluated by oncologists, statisticians, and a survey methodologist from the University of Virginia (WFC) affiliated with this project. The survey instrument then underwent a pilot test among a convenience sample of five medical, surgical, and radiation oncologists with experience as clinical trial principal investigators as well as a biostatistician with cooperative group experience. Revisions were made based on comments received from expert review and the pilot test.
Each survey was unique to a specific trial with questions adapted to the study chair or lead statistician role. Questions focused on perceptions about the following: accrual prediction influences, feasibility of predicted accrual, presence of clinical equipoise, and factors contributing to accrual success for an individual trial. Additional questions queried perceptions about control arm selection and appropriateness of eligibility criteria. General questions examined prior clinical trial experience, academic rank, medical specialty, and gender. A 5-point Likert scale quantified respondents’ perceptions of influences on accrual predictions and accrual success. The scale used 1 to indicate a factor had strong influence, 3 to indicate some influence, and 5 to indicate no influence. An option to indicate that a factor was not applicable or its influence was unknown was provided.
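As a minimal sketch of how responses on such a scale might be summarized, the following Python snippet computes a median rating and the proportion of respondents who actually rated a factor. The function name, the use of `None` for "not applicable or unknown", and the example values are illustrative assumptions, not the study's data or code.

```python
from statistics import median

# Assumed coding: 1 = strong influence, 3 = some influence, 5 = no influence.
# None stands for "not applicable" or "influence unknown" (an assumption).
def summarize_likert(responses):
    """Return (median rating, proportion of respondents who rated the factor)."""
    rated = [r for r in responses if r is not None]
    if not rated:
        return None, 0.0
    return median(rated), len(rated) / len(responses)

# Illustrative values only, not survey data.
example = [1, 2, 2, 3, None, 5, None, 1]
med, prop = summarize_likert(example)
print(f"median rating: {med}, proportion rating the factor: {prop:.0%}")
```

Treating the ratings as ordinal, the median is reported rather than the mean, and the proportion shows how many respondents the median actually rests on.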
After the introductory email, each recipient received a separate invitation to participate in the survey. The survey included a sequential roll-out among the five CTCGs between April 8 and May 5, 2008. The survey was closed on June 25, 2008. Each recipient received three reminder emails spaced two weeks apart. The email provided a link to a web-based survey for each trial assigned to a recipient. The study included a novel deferral process, allowing a recipient to defer a survey for any assigned trial and recommend another individual who could better represent the trial. This process was instituted to ensure that each survey was sent to the most appropriate person and to maximize response rates. The survey website also featured endorsement letters from the five CTCG chairmen as well as assurance regarding confidentiality of responses.
Accrual sufficiency categorization was based on the reason for trial termination documented by the CTCG, not on survey responses. Target and actual accrual data were available for each trial. Sufficient accrual was defined as any of the following: (1) meeting target accrual, (2) CTCG documentation stating the trial had closed with complete or adequate accrual, (3) closure at interim analysis with conclusive results, or (4) closure due to toxicity. Target accrual comprised either the original sample size as documented in the initial protocol or a revised sample size if the trial underwent a major revision during its course that affected the statistical considerations. Insufficient accrual was defined as either of the following: (1) CTCG documentation indicating closure due to poor accrual, or (2) closure due to factors external to the trial rendering it likely unable to address the primary endpoint, such as discontinuation of a test agent or loss of equipoise resulting from new data. Five trials remained open to accrual at the time of the survey. At analysis, two of these had closed with sufficient accrual and the other three remained open with 71%, 76%, and 90% of the target accrual met. Given these good accrual rates, all five trials were included in the analysis as trials with sufficient accrual. Eligible survey responses required 10 or more answered questions, excluding questions about respondent gender or medical specialty. This cutoff was chosen based on distinct patterns in the number of questions answered by respondents.
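The classification rules above can be sketched as a small Python function. The closure-reason labels, the `TrialClosure` record, and the order in which the rules are checked are assumptions made for illustration; the text does not say how a trial matching more than one rule was handled.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the documented closure reason (not CTCG terminology).
SUFFICIENT_REASONS = {"complete_accrual", "interim_conclusive", "toxicity"}
INSUFFICIENT_REASONS = {"poor_accrual", "external_factors"}

@dataclass
class TrialClosure:
    target_accrual: int            # original or revised sample size
    actual_accrual: int
    closure_reason: Optional[str]  # None if still open or undocumented

def classify_accrual(trial: TrialClosure) -> str:
    """Apply the sufficiency rules; documented reason is checked first (assumed)."""
    if trial.closure_reason in INSUFFICIENT_REASONS:
        return "insufficient"
    if trial.actual_accrual >= trial.target_accrual:
        return "sufficient"
    if trial.closure_reason in SUFFICIENT_REASONS:
        return "sufficient"
    # Trials still open need a case-by-case decision, as in the study.
    return "unclassified"

print(classify_accrual(TrialClosure(500, 300, "toxicity")))      # sufficient
print(classify_accrual(TrialClosure(500, 200, "poor_accrual")))  # insufficient
```

Trials still open at analysis fall through to "unclassified" here, since their inclusion as sufficient in the study was a judgment based on accrual to date rather than a fixed rule.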
Prior clinical trial experience of the study chair or lead statistician was categorized as 0, 1, 2-10, 11-20, or 21 or more trials in which the respondent had had a leading role. This allowed consistent trial experience categorization to be used for the study chairs and lead statisticians, while distinguishing limited trial experience from moderate or high levels of prior trial experience. Responses for therapeutic versus non-therapeutic trials were compared due to inherent differences between these trial types and due to a propensity for unique respondents representing the greatest number of trials to be involved with non-therapeutic trials. Therapeutic trials were defined as testing treatments specifically for cancer. Non-therapeutic trials included cancer prevention trials, behavior modification trials, quality of life trials, and trials testing treatments for cancer-related symptoms or cancer treatment-related symptoms. A measure of recalled equipoise was obtained by asking about the perceived value of experimental treatment(s) versus control treatment(s) when the trial opened and closed to accrual. This perceived value was reported in six categories: (1) control treatment highly preferred to experimental treatment, (2) control treatment preferred to experimental treatment, (3) about equal with experimental treatment expected to be disappointing, (4) about equal with experimental treatment expected to be successful, (5) experimental treatment preferred to control treatment, and (6) experimental treatment highly preferred to control treatment. Analysis of these responses was conducted both for all six categories and by reduction to three categories in which responses 1 and 2 indicated lack of equipoise favoring the control arm, 3 and 4 indicated equipoise, and 5 and 6 indicated lack of equipoise favoring the experimental arm.
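The reduction from six equipoise categories to three can be written as a simple lookup. This is a sketch only: the group labels and the example responses are illustrative assumptions, while the 1-2, 3-4, 5-6 grouping follows the description above.

```python
from collections import Counter

# Six-category response -> three-category grouping described in the text.
EQUIPOISE_GROUPS = {
    1: "favors control",
    2: "favors control",
    3: "equipoise",
    4: "equipoise",
    5: "favors experimental",
    6: "favors experimental",
}

def collapse_equipoise(category: int) -> str:
    """Map a six-category equipoise response to its three-category group."""
    if category not in EQUIPOISE_GROUPS:
        raise ValueError(f"expected a category from 1 to 6, got {category!r}")
    return EQUIPOISE_GROUPS[category]

# Illustrative responses only, not survey data.
responses = [4, 5, 3, 6, 2, 4]
counts = Counter(collapse_equipoise(r) for r in responses)
print(dict(counts))
```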
Analysis of each questionnaire item was performed by respondent role (lead statistician vs. study chair), trial type (therapeutic vs. non-therapeutic), and accrual sufficiency status (sufficient vs. insufficient). The Pearson's chi-square test was used for categorical data and the Student t-test or one-way analysis of variance for continuous data. Likert scale data were viewed as ordinal with median values reported. Since respondents could indicate that a factor was not used or applicable, the proportion of respondents actually citing a factor on the Likert scale is reported with each response category. A test for paired binomials was used to evaluate perceptions of equipoise at trial opening and closing for the same respondents. Results are reported only in aggregate form. Analytic tests were performed with SAS 9.0 (SAS, Cary, NC). Although this survey was conducted with CTCG cooperation, the survey design and conduct, data analysis, and results interpretation were performed independently of the CTCGs by study personnel at the University of Virginia. This study was approved by the University of Virginia Institutional Review Board (IRB-HSR # 12582) and granted waiver of consent.
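For readers unfamiliar with the categorical test named above, here is a minimal standard-library sketch of a Pearson chi-square test on a 2x2 table, such as factor cited or not cited by sufficient or insufficient accrual. It is not the SAS code used in the study, it applies no continuity correction, and the counts are hypothetical.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows = trial type, columns = factor cited / not cited.
stat, p = chi_square_2x2(30, 20, 10, 40)
print(f"chi-square = {stat:.2f}, p = {p:.5f}")
```

The closed-form p-value holds only for one degree of freedom; larger tables would need a general chi-square distribution function from a statistics library.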
Of 496 total surveys sent out, responses were received for 335 (68%). Twenty-six responses (8%) were ineligible due to fewer than 10 questions being answered. Of the remaining 309 responses (63% overall eligible response rate), 199 came from lead statisticians (81% eligible response rate) and 110 from study chairs (45% eligible response rate). 223 (90%) of 248 trials were represented by at least one response with matched pair responses received from both the study chair and lead statistician for 86 trials (35%). Of the 25 trials for which no survey response was received, sixteen (64%) were classified as having sufficient accrual, which is similar to the rate of accrual sufficiency seen overall in this trial cohort.
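The percentages whose numerators and denominators are both stated above can be rechecked with a few lines of arithmetic. This sketch assumes rounding to the nearest whole percent; the eligible response rates (overall and by role) are left out because the text does not state the denominators used for them.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

surveys_sent = 496
responses_received = 335
ineligible = 26
eligible = responses_received - ineligible

trials_total = 248
trials_with_response = 223
trials_with_matched_pair = 86
trials_without_response = trials_total - trials_with_response
no_response_sufficient = 16

rates = {
    "responses received": pct(responses_received, surveys_sent),
    "ineligible among responses": pct(ineligible, responses_received),
    "trials with at least one response": pct(trials_with_response, trials_total),
    "trials with matched pair": pct(trials_with_matched_pair, trials_total),
    "sufficient accrual among non-responding trials": pct(
        no_response_sufficient, trials_without_response
    ),
}
for label, value in rates.items():
    print(f"{label}: {value}%")
```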
Respondent gender and medical specialty are reported by unique respondent rather than by response. Study chairs most commonly were medical oncologists (75%) with the remaining study chairs representing surgical oncology (5%), radiation oncology (4%), or other (16%). Study chairs were less often female (19%) than were lead statisticians (56%). Eligible responses were received from 77 unique study chairs, representing a median of 1 trial per respondent (range 1-12), and from 34 unique lead statisticians, representing a median of 4 trials per respondent (range 1-44). Of note, only 4 lead statistician respondents represented nine or more trials. Considerable variability in responses for trial-specific questions was noted in respondents representing high numbers of trials.
Among the 223 trials for which survey responses were received, 140 (63%) were classified as sufficient accrual and 83 (37%) as insufficient accrual. Only three trials categorized as insufficient accrual were closed due to external factors. Among these 223 trials, 163 (73%) were classified as therapeutic. These classifications of accrual sufficiency and trial type were obtained from CTCG documentation and were not dependent on survey responses. There was no significant difference between therapeutic and non-therapeutic trials in proportions categorized as having sufficient accrual.
The remaining results are presented by response because these characteristics may have changed for an individual respondent representing multiple trials. Academic rank, prior trial leadership experience, and continuity of involvement in a trial's course from design to closure for study chairs and lead statisticians are depicted in Table 1. Clear and expected differences exist between study chairs and lead statisticians overall. However, no association between these respondent characteristics and trial accrual sufficiency status was identified within these two respondent groups. Specifically, seniority was not associated with greater likelihood of attaining sufficient accrual. Among study chairs who were assistant professors at the start of their trial, 74% led trials with sufficient accrual. For study chairs who were associate professors or professors, 69% and 63% respectively completed their trials with sufficient accrual. Similarly, prior trial leadership experience was not associated with greater likelihood of accrual success.
Trial design factors as reported by the study chairs are listed in Table 2, including use of a placebo or observation arm, time elapsed from trial concept to activation, and appropriateness of eligibility criteria. No significant associations were found between these reported trial design factors and accrual sufficiency. In particular, study chairs were no more likely to recall overly restrictive eligibility criteria with trials having insufficient as compared to sufficient accrual.
When given the option of marking all choices that apply, study chairs indicated that the following factors most commonly influenced selection of the control intervention: (1) systematic literature review (57%), (2) expert opinion within the CTCG (57%), and (3) standard of care within one's community based on personal experience (45%). Clinician surveys regarding the standard of care (22%), best judgment of the study chair (19%), meta-analysis of relevant trials (8%), and other (7%) were selected much less often. The reported influences in control arm selection were nearly identical between study chairs of trials with sufficient and insufficient accrual. Control arm selection in therapeutic trials was more commonly influenced by expert opinion within the CTCG than in non-therapeutic trials (65% vs. 29% respectively, p=0.002). Non-therapeutic trials were more often influenced by study chair experiences than therapeutic trials. The standard of care within one's community based on personal experience of the study chair was reported as influential in 63% of non-therapeutic compared to 41% of therapeutic trials (p=0.06). The best judgment of the study chair was cited as influential in 50% of non-therapeutic compared to 11% of therapeutic trials (p<0.001). Lastly, only 9% of respondents thought that a redesign or contingency plan had been prepared during the initial trial design for use in the event of poor accrual. Disparity in this response was noted, with 15 of 67 matched pairs disagreeing on the presence of a contingency plan. Study chairs were more likely to think a plan existed than were lead statisticians.
Both study chairs and lead statisticians were queried about perceived influences on accrual predictions during a trial's design. A CTCG's accrual experience in a particular disease, disease stage, or intervention was viewed as the strongest influence on accrual predictions by both study chairs and lead statisticians (Table 3). CTCG experience was reported equally as the top influence on patient accrual predictions for trials with sufficient and insufficient accrual. Direct input from prospective participating clinicians or patients was reported as not having a role in informing accrual predictions (Table 3).
Both study chairs and lead statisticians were asked to select one of six statements describing their recalled, perceived relative value of the experimental treatment(s) versus the control treatment(s) at opening and closing of their trial. These responses, shown in Table 4, reflect that study chairs were more prone to optimism about the experimental treatments than were lead statisticians. This relative optimism of the study chairs persisted when comparing study chair and lead statistician responses for the 75 matched pairs (data not shown). Furthermore, over 40% of study chairs report having preferred one arm over another before trial opening. Perceived equipoise was largely maintained from opening to closing with no major shifts in preference recalled for one treatment arm over another during the course of a trial. There appeared to be a greater shift in perceptions among the lead statisticians when evaluating perceptions at trial opening and closing for the same respondent, although still only in the minority of respondents. Among lead statisticians, 16% more respondents had reported about equal relative value at trial closing (p<.001). Among study chairs, 7.5% more respondents had reported the relative value of the experimental and control arms as about equal at trial closing than at opening (p=.19). The change was more often noted in respondents becoming less enthusiastic about the experimental arm over the trial course. No statistically significant difference was identified in perceptions of equipoise between trials with sufficient or insufficient accrual for both lead statisticians and study chairs (data not shown). This was noted when responses were evaluated both in the original six and reduced three categories. Similarly no significant differences were found in perceptions of equipoise between therapeutic and non-therapeutic trials (data not shown).
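Comparisons of the same respondents at opening and closing depend only on the respondents whose answer changed. The text does not name the specific paired-binomial test used, so the sketch below assumes an exact McNemar test as one common choice; the discordant-pair counts are hypothetical and are not taken from Table 4.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: respondents reporting equipoise at opening but not at closing
    c: respondents reporting equipoise at closing but not at opening
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts for illustration only.
p_paired = mcnemar_exact(2, 12)
print(f"exact McNemar p = {p_paired:.4f}")
```

Respondents who gave the same answer at both time points do not enter the calculation, which is why a modest net shift can still be statistically significant when few respondents move in the opposite direction.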
Overall, 41% of respondents indicated that their trial experienced significant accrual difficulties. No significant differences were seen between therapeutic (44%) and non-therapeutic trials (35%) in reporting accrual difficulties. Positive accrual experience was credited to three factors: perceived clinical relevance of the study question, lack of competing trials, and protocol designed to parallel normal practice (Table 5). In contrast, a negative accrual experience was not strongly attributed to any of sixteen specific factors offered as choices (Table 5). Although respondents were offered the opportunity to record other reasons to explain poor accrual, no persistent themes emerged in these responses.
Phase III trial experiences from the vantage point of study chairs and lead statisticians offer unusual insight into accrual prediction practices and perceived accrual influences. Our study showed considerable unanimity, particularly among study chairs, in attributing accrual success to (1) perceived clinical relevance of the study question, (2) lack of competing trials, and (3) a protocol designed to parallel normal practice. However, the trial leaders did not identify consistent factors to explain accrual difficulties. This suggests that reasons for poor accrual are not well understood, are variable and complex, or are not necessarily consistent with commonly accepted accrual barriers. Commonly described barriers such as inadequate recruitment resources, excessive expense to patients or institutions, inadequate clinician incentives, significant deviation of the protocol from usual practice, restrictive entry criteria, and an uninteresting research question, were offered as survey response options but were not viewed by trial leaders to have contributed significantly to accrual difficulties. Interestingly, the barriers noted above are typically described in studies of recruiting clinicians active in clinical trial research.3-5 In our study of senior trial leadership, the perceived reasons for low patient participation may not be reflective of factors affecting accrual on a local level. Alternately, reasons for low accrual may be multifactorial or may be actually different when viewed from the global perspective of trial leadership. These survey results will lead to additional study of this group of trials to establish trial-level factors associated with accrual success that are identifiable prior to trial activation.
Trial-level factors linked to accrual success are reflective of effective trial design, prioritization, and accrual prediction practices. Gauging clinical trial interest among physicians and patients in a relevant timeframe may be particularly challenging. Our findings on equipoise, although limited in that these represented recalled perceptions, suggested that perceptions of equipoise by trial leadership were not necessarily good indicators of accrual success. However, perceived equipoise did appear to be largely maintained over the course of the trials, which is important to ethical trial conduct.11 Influences in selecting the control intervention were no different between trials with sufficient and insufficient accrual. The importance credited to the protocol mirroring normal practice and the clinical relevance of the study question, however, supports the importance of appropriate control arm selection in framing a question of clinically relevant uncertainty.12 Ability to appreciate clinical relevance of a study question or appeal of a study design may hypothetically increase with seniority or experience. However, our study showed no association of quantity of trial leadership experience or academic seniority with accrual success. Prior studies have shown that physicians overestimate their ability to accrue patients by a factor of six.13 This again speaks to the challenges of measuring the clinical relevance and acceptability of a trial among its target patients and enrolling physicians. Time elapsed from development of a trial concept to trial activation may serve as one proxy measure for clinical relevance.14 However, measuring clinical relevance is likely more complex. Studying means to reliably measure interest in a trial concept during its planning phases would prove highly valuable both in trial prioritization and in accrual predictions. Development of such methodologies could find applications among numerous stakeholders in the trial design and prioritization processes.
In our study, trials with sufficient as well as insufficient accrual were reported as having relied primarily on cooperative group experience to inform accrual predictions. Alternate strategies for accrual prediction could prove useful and should be explored. Our results further demonstrate that the perceived role of prospective clinicians and patients in the accrual prediction process appears very limited. This may offer an important opportunity for process improvement. Issued in 2005, the Report of the Clinical Trials Working Group (CTWG) of the National Cancer Advisory Board entitled “Restructuring the National Cancer Clinical Trials Enterprise,” outlines recommendations and implementation strategies to take effect over the subsequent five years in restructuring the national oncology clinical trials effort.15 Among these recommendations, the CTWG introduced an initiative to increase community physician and patient advocate representation on Scientific Steering Committees tasked with promoting development of clinical trials that address relevant and feasible study questions in the community setting.15 The three factors to which accrual success was most strongly attributed in this survey are surely considered by the Scientific Steering Committees. However, the subjectivity of clinical relevance and similarities to normal practice limits formal incorporation of these three criteria into the trial prioritization process. The actual impact of these Scientific Steering Committees in effective trial concept prioritization will need to be assessed in more current trials. As another potential opportunity for process improvement, a contingency plan in the event of poor accrual could be determined prior to trial opening. A greater emphasis on accrual strategy as part of the trial design process would allow better allocation of resources in prioritizing trials and quicker reallocation of resources if accrual still proves insufficient.16
Our study has several limitations. Most notably, perceptions of study chairs and lead statisticians may not accurately reflect accrual barriers encountered by enrolling physicians or patients. However, these clinical trial leaders were the most appropriate people to report influences on accrual predictions and trial design, a primary focus of this work. A second limitation relates to possible recall bias particularly among older trials. The study chairs and lead statisticians work so intently with a trial over its entire course that it seems that many aspects of the trial would be well-remembered. Nonetheless, questions reflecting recalled perceptions, such as those related to perceived equipoise, may be particularly sensitive to alterations over time. This survey is part of a larger study systematically examining the accrual experience of these 248 trials. To help address the degree of discordance between perception and “actuality,” a comparison of these select survey results with data abstracted about each trial from CTCG documentation may be useful. Third, the OCTAS definition of accrual sufficiency based on ability to address the primary endpoint grouped trials closed due to external factors with trials closed due to inadequate accrual. These external factors, such as discontinuation of a study agent or new data from another study, may be entirely separate from accrual. However, since closure due to external factors only related to three survey responses, these results were not reported separately. Fourth, this study's response rate for study chairs was lower than that for lead statisticians. The overall response rate was consistent with that reported for this survey genre but the study chairs’ eligible response rate fell slightly below that of the mean reported for physician surveys of 54%.17
This limited the ability to perform matched pair analyses. Importantly, the response rate did not appear biased by the trial's accrual sufficiency. Study chair and lead statistician responses also varied in some important ways and therefore were not reported in aggregate. For instance, in questions using a Likert scale, lead statisticians were more likely to cluster responses in the middle of the scale, whereas study chairs were more likely to utilize the scale's extremes. Lastly, wording of survey questions can profoundly affect the results. In our study, it is plausible that consistent factors associated with accrual difficulties may have been found if respondents had been asked to rank the provided factors in order of perceived influence rather than rate the factors’ strength of influence. However, we maintain that our study provides a more telling result by not masking whether respondents consider the factor's influence unimportant overall.
The low rate of patient participation in cancer clinical trials and the common problem of trial closure due to inadequate accrual underscore the need for critical appraisal of accrual prediction practices, trial design choices, and trial prioritization. Alternate strategies for accrual prediction that augment cooperative group experience and improved measures of clinical relevance that accurately reflect the interests of participating physicians and potential patients will be instrumental in effectively testing new advances in cancer care.
This work was supported by a grant from the National Institutes of Health [R01 CA118232] to Dr. Schroen.
This work was supported in part by: Public Health Service Grants U10CA-12027, U10CA-69974, U10CA-37377, U10CA-69651, and U24-CA-114732 from the National Cancer Institute, Department of Health and Human Services.