PRISMM failed to detect an intervention effect but demonstrated significant secular trends in care quality, with performance changes in both experimental and control hospitals. Our results illustrate the potential fallacy of using historical controls to evaluate QI interventions, as is current practice.[16,17] Without contemporaneous controls, changes in performance, however substantial, cannot be causally attributed to the intervention.
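The contrast between a historical-control (pre/post) comparison and a comparison against contemporaneous controls can be illustrated with simple arithmetic. The sketch below is only an illustration: the adherence proportions are hypothetical and are not PRISMM data, the function names are invented for this example, and it ignores clustering, which a real analysis of a cluster-randomized trial would need to model.

```python
def pre_post_change(pre: float, post: float) -> float:
    """Change within a single arm (the historical-control estimate)."""
    return post - pre


def difference_in_differences(exp_pre: float, exp_post: float,
                              ctl_pre: float, ctl_post: float) -> float:
    """Change in the experimental arm minus change in the control arm."""
    return pre_post_change(exp_pre, exp_post) - pre_post_change(ctl_pre, ctl_post)


# HYPOTHETICAL adherence proportions, chosen for illustration only.
exp_pre, exp_post = 0.50, 0.70   # experimental hospitals
ctl_pre, ctl_post = 0.50, 0.68   # control hospitals

naive_effect = pre_post_change(exp_pre, exp_post)
controlled_effect = difference_in_differences(exp_pre, exp_post, ctl_pre, ctl_post)

print(f"Pre/post change, experimental arm only: {naive_effect:.2f}")
print(f"Change net of the control-arm trend:    {controlled_effect:.2f}")
```

With these made-up numbers, the experimental arm alone suggests a large improvement, while nearly all of it disappears once the control-arm trend is subtracted.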
The lack of a demonstrable intervention effect in the PRISMM trial could indicate either that this specific intervention was ineffective or that it would have worked had its implementation been complete. Process evaluation data (table e-2) support the latter hypothesis. While stroke care order sets, protocols, and patient education materials were made available to all intervention hospitals, not all hospitals adopted them. Furthermore, the use of order sets in actual patient care was poor. Many control hospitals either had order sets at baseline or developed them on their own. Notably, actual order set usage appeared poor across all hospitals, so future interventions should address order set use in addition to ensuring availability. Another possible explanation for the lack of an intervention effect is that it was swamped by large secular trends. There was a nationwide focus on the quality of acute stroke care coincident with PRISMM, including the Brain Attack Coalition stroke center recommendations published in 2000 [18] and the Primary Stroke Center (PSC) certification program launched nationwide in 2003-2004.[19] PRISMM quality indicators were similar to those proposed by these programs. Many hospitals, aware of these developments during the PRISMM trial, were preparing for PSC certification by 2003, when postintervention data were being collected. Four control hospitals vs 1 experimental hospital attained PSC certification in 2004, the first year disease-specific certification was offered for stroke (table e-3). This imbalance, unforeseen at the time of trial design, persists to this day. These events likely contributed to a dilution of the intervention effect and to the observed secular trends.
Strengths of our study design include the substantial number of participating hospitals. One weakness is that an intervention effect for the acute care bundle could have escaped detection because of loss of statistical power from hospital attrition (from 24 to 19 hospitals). Recruitment and retention of hospitals, schools, and organizations is a major difficulty in cluster-randomized trials. We were nevertheless able to detect significant secular trends in performance. A related issue is the balance of the study design after the postrandomization loss of 2 experimental hospitals. We addressed this by comparing the ITT and as-treated results and found no evidence of bias due to hospital dropouts. Another limitation is that, at the time of study design, no ICC estimates were available for acute care studies, so our power calculations were based on estimates reported in the primary care literature.[20] Consequently, one contribution of our study is the reported ICC estimates for acute, in-hospital, and discharge care, which will be useful for future studies.
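To illustrate why the ICC and the number of clusters matter for power, the following minimal sketch applies the standard design-effect formula for equal cluster sizes, DEFF = 1 + (m - 1) × ICC. The cluster size and ICC used here are hypothetical placeholders, not values from PRISMM, and the function names are invented for this example; only the hospital counts (24 and 19) come from the text above.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# HYPOTHETICAL inputs for illustration; not PRISMM estimates.
patients_per_hospital = 50
assumed_icc = 0.05

for hospitals in (24, 19):  # hospitals randomized vs hospitals retained
    ess = effective_sample_size(hospitals, patients_per_hospital, assumed_icc)
    print(f"{hospitals} hospitals: effective sample size of about {ess:.0f}")
```

Under these assumptions, each lost hospital removes a fixed share of the effective sample size. A larger ICC than the one assumed at the design stage would inflate the design effect and reduce power further, which is why setting-specific ICC estimates are valuable.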
Cluster-randomized trials are uncommon in the neurologic literature despite being used in other fields of medicine, such as primary care, and in education, where schools are randomized to different curricula.[15] Few randomized controlled trials (RCTs) have examined the efficacy of interventions to improve the quality of stroke care, particularly complex interventions targeting care providers and hospital organizations.[21] PRISMM addresses this gap by systematically assessing a complex intervention based on a theory of behavior change [8,10,11] to improve care quality. Are QI interventions really necessary, given that secular trends apparently lead to improved care quality over time? Our postintervention data showed continued suboptimal performance on many measures. Hence, secular trends alone cannot be relied on to ensure optimal care, and interventions to improve care are needed.
An ongoing debate in the quality improvement field is whether RCTs such as PRISMM have a role in evaluating QI interventions, given the difficulty and expense of such studies.[22] While RCTs are the gold standard for evaluating simple therapeutic interventions, they have drawbacks when used to evaluate complex interventions such as PRISMM. Complex interventions have multiple interconnected parts, are difficult to implement, and aim to achieve outcomes that may be difficult to influence.[21] The main drawback (in addition to the expense and effort needed for cluster-randomized studies) is that RCTs seek generalizable results; hence the goal is to strip away the local context in which the intervention is deployed. QI interventions, on the other hand, aim to bring about social change and depend heavily on the local context and interpersonal dynamics to do so.[22,23] Hence RCTs may miss contexts, mechanisms, and factors that affect the intervention outcome and therefore fail to identify factors that may influence generalizability.[22,23] In counterargument, RCTs are designed to guard against confounders. In PRISMM, had we used the experimental arm alone without control hospitals, we could have incorrectly concluded that there was a significant intervention effect for the acute and in-hospital care bundles, when the changes were in fact due to secular trends. We believe that a wide range of methodologies should be used to evaluate QI interventions. RCTs have a special role in that they guard against confounders and prevent fallacious conclusions about intervention efficacy. However, other evaluation paradigms that model the local context and mechanisms underlying process change are also important in identifying why certain QI interventions seem to work while others fail.[22,23]