Studies finding weak or nonexistent relationships between hospital performance on providing recommended care and hospital-level clinical outcomes raise questions about the value and validity of process of care performance measures. Such findings may cause clinicians to question the effectiveness of the care process presumably captured by the performance measure. However, one cannot infer from hospital-level results whether patients who received the specified care had comparable, worse or superior outcomes relative to patients not receiving that care. To make such an inference has been labeled the “ecological fallacy,” an error that is well known among epidemiologists and sociologists, but less so among health care researchers and policy makers. We discuss such inappropriate inferences in the health care performance measurement field and illustrate how and why process measure-outcome relationships can differ at the patient and hospital levels. We also offer recommendations for appropriate multilevel analyses to evaluate process measure-outcome relationships at the patient and hospital levels and for a more effective role for performance measure bodies and research funding organizations in encouraging such multilevel analyses.
Parast et al.1 noted that one would expect improved outcomes to flow from incorporating evidence from randomized controlled trials in process of care performance measures. They observed, however, that multiple studies have found that hospitals providing specified evidence-based processes of care to higher percentages of their patients had only slightly better, or no better, outcomes than hospitals providing those processes to proportionately fewer of their patients.2–6 Such findings have “left providers and health policymakers wondering whether the focus on processes of care is misplaced” (1, p. 359).
For example, Nicholas et al.7 examined hospital performance on Surgical Care Improvement Project (SCIP) practices for six different surgeries. Hospitals in the highest and lowest quintiles of performance did not differ significantly from hospitals with medium performance in case mix-adjusted rates of patient mortality, venous thromboembolism or surgical site infections (SSIs). Commenting on these findings, Mabry8 provocatively suggested:
These findings, if true, call into serious question the increased time, labor, and effort currently expended by hospitals and surgeons across the United States to comply with the SCIP program process measures. How can it be that the National Quality Forum, CMS [U.S. Centers for Medicare and Medicaid Services], and others got it wrong? (p. 1004)
Mabry acknowledged later in his commentary that despair was perhaps premature because early incomplete SCIP data may have accounted for the absence of expected relationships, but a study with more recent data also found only 1 of 16 relationships between SCIP measures and SSIs to be significant at the hospital level.9 Should findings from hospital-level analyses cause clinicians to question whether their evidence-based care of individual patients actually results in improved patient outcomes?
Providing examples mainly from hospital-level analyses, Parast and colleagues1 discussed several issues that should be considered before clinicians and health care policy makers conclude that negative or weak relationships with outcomes indicate that care processes are ineffective: whether the outcomes assessed were similar to those in the RCTs (and, ideally, effectiveness studies) that provided supporting evidence for the process performance measure (PM); whether outcomes were assessed at temporally distant points after the provision of care; whether such high percentages of patients across hospitals received the recommended care that little variation was left to link to outcomes; whether omitted confounding variables may have biased the findings; and whether the specification of the process PM had changed over time.
To this list, we would add one more issue that we believe is critical: the “ecological fallacy.” More than 6 decades ago, Robinson10 demonstrated that individual-level relationships cannot be inferred from relationships found with aggregated data. Drawing such erroneous inferences became known as the “ecological fallacy.”11 Thus, it is an ecological fallacy to assume that any relationship (positive, negative or null) between hospital-level performance on process of care measures and hospital-level outcomes indicates whether individual patients receiving the PM-specified care have better, worse or comparable outcomes relative to patients not receiving that care.
That relationships can differ at different levels of analysis is counterintuitive and therefore can be difficult to grasp. Accordingly, we present a hypothetical example in Table 1 to illustrate that relationships between a process of care PM and a clinical outcome can vary dramatically at the patient and hospital levels. In our example, 80%, 85% and 90% of patients with a specific condition at each of three hypothetical hospitals received the PM-recommended care for their condition. However, at all three hospitals, the same percentage of patients (20%) experienced a negative outcome (which could be mortality, an SSI, readmission, etc., depending on the condition and PM). Thus, there is no relationship between PM performance and outcome at the hospital level.
However, within each hospital we provide data on 100 hypothetical patients that conform to the hospital performance rates and outcome rate yielding the zero relationship above. The patient-level data within each hospital exhibit a negative relationship between receipt of PM-recommended care and poor outcome. Whereas only 13%, 13% and 11% of patients who received the PM-recommended care in Hospitals A, B and C, respectively, had a poor outcome, 80%, 60% and 55% of the patients not receiving the recommended care experienced a poor outcome across the three hospitals, yielding risk ratios of 0.17, 0.22 and 0.20, respectively. One could plug in different data at the patient level and show positive relationships between receipt of the recommended care and poor outcome.
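The arithmetic behind this kind of divergence is easy to reproduce. The short script below uses illustrative cell counts (chosen so that every hospital has a 20% poor-outcome rate overall; they are not the exact cell counts of Table 1) to show a flat hospital-level relationship coexisting with a within-hospital patient-level benefit:

```python
# Illustrative counts (hypothetical, not the exact cells of Table 1): three
# hospitals, 100 patients each. Care rates vary (80%, 85%, 90%) but every
# hospital has the same 20% poor-outcome rate, so the hospital-level
# relationship is flat; within each hospital, treated patients fare better.
# Format: (n_care, poor_with_care, n_no_care, poor_without_care)
hospitals = {
    "A": (80, 10, 20, 10),
    "B": (85, 11, 15, 9),
    "C": (90, 14, 10, 6),
}

# Hospital-level view: care rate varies, outcome rate does not -> no association.
for name, (nc, pc, nn, pn) in hospitals.items():
    care_rate = nc / (nc + nn)
    poor_rate = (pc + pn) / (nc + nn)
    print(f"Hospital {name}: care rate {care_rate:.0%}, poor-outcome rate {poor_rate:.0%}")

# Patient-level view: within every hospital, patients receiving the recommended
# care have a lower risk of a poor outcome (risk ratio < 1).
for name, (nc, pc, nn, pn) in hospitals.items():
    rr = (pc / nc) / (pn / nn)  # risk ratio for poor outcome, care vs. no care
    print(f"Hospital {name}: {pc / nc:.0%} poor with care vs {pn / nn:.0%} without (RR {rr:.2f})")
```

Aggregating to the hospital level discards exactly the within-hospital contrast (treated versus untreated patients) that carries the patient-level relationship.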
The hypothetical data in Table 1 illustrate what Morgenstern12 noted over 4 decades ago: higher level (e.g., hospital-level) relationships provide no information about lower level (e.g., patient-level) relationships. Thus, in deciding on the value of a process of care PM for patients, it is not a matter of weighing hospital-level findings against the individual-level findings from RCTs, as Parast et al.1 suggested. Rather, in deciding on appropriate care processes for individual patients and on appropriate process PMs, hospital-level analyses should carry no weight at all. Positive findings with aggregated outcomes (e.g., Cataife et al.13) should not be viewed as supporting the implementation or continuation of a process PM to guide patient care, and negative hospital-level findings should not be interpreted as supporting the de-implementation of a process PM with respect to patient care. What matters most in evaluating the validity and utility of a process of care PM for guiding patient care is its relationship with outcomes at the patient level, the same level of analysis as in the RCTs and effectiveness studies that typically provided supporting evidence for the process PM.
A difference in confounding variables is the most readily grasped and likely most influential factor that can account for different relationships at different levels of analysis. A confounding variable is related to both performance on the process measure and outcomes. For example, at the patient level, those who are more severely ill may be more likely to receive a recommended medical practice (“confounding by indication”) and also be more likely to experience a poor outcome than less severely ill individuals. Different confounders may come into play at the hospital level. For example, hospitals’ performance on a particular care process may be a proxy for (confounded with) a host of other factors, such as the quality of patient safety measures within the hospital, the quality of affiliated extended care facilities or the quality of follow-up care in the community. These other factors, in addition to the focal care processes, can affect clinical outcomes. Thus, relationships between hospital-level process PMs and hospital-level (aggregated) outcomes are likely to reflect a variety of confounding factors that are not as influential on process measure-outcome relationships at the patient level.
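Confounding by indication can be made concrete with a small numerical sketch. All probabilities below are hypothetical: severely ill patients are both more likely to receive the recommended care and more likely to do poorly, so an unadjusted patient-level comparison makes the care look harmful even though it lowers risk within every severity stratum:

```python
# Hypothetical probabilities illustrating confounding by indication.
# Severity is the confounder: it raises both P(receiving care) and P(poor outcome).
p_severe = 0.5                                   # P(severely ill)
p_care = {"severe": 0.95, "mild": 0.30}          # P(care | severity)
p_poor = {("severe", True): 0.50, ("severe", False): 0.60,
          ("mild", True): 0.10, ("mild", False): 0.20}   # P(poor | severity, care)

def marginal_poor(care: bool) -> float:
    """P(poor outcome | care status), marginalizing over severity."""
    w_severe = p_severe * (p_care["severe"] if care else 1 - p_care["severe"])
    w_mild = (1 - p_severe) * (p_care["mild"] if care else 1 - p_care["mild"])
    total = w_severe + w_mild
    return (w_severe * p_poor[("severe", care)]
            + w_mild * p_poor[("mild", care)]) / total

# Within each severity stratum care helps (0.50 < 0.60; 0.10 < 0.20), yet the
# unadjusted comparison reverses, because the treated group is sicker:
print(f"P(poor | care)    = {marginal_poor(True):.3f}")   # 0.404
print(f"P(poor | no care) = {marginal_poor(False):.3f}")  # 0.227
```

This is the same reversal-by-aggregation logic as the ecological fallacy, operating here across severity strata rather than across hospitals, which is why stratification or propensity-score adjustment is needed in patient-level analyses.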
What a process PM actually assesses—its “construct validity”14—is indicated by its relationships (positive, negative and null) with measures of other constructs. That a process PM at the hospital level may be confounded with different factors than a process PM at the patient level is another way of saying that the process PM at the hospital level may be assessing something different than the process PM at the patient level. That the “same” measure may be assessing different constructs at different levels of aggregation/analysis is an observation that has been made by many analysts (e.g., Firebaugh15).
Findings from hospital-level analyses may be assumed to apply at the individual level, at least in part, because of the notion that higher quality in one area (e.g., processes of care for myocardial infarction) is likely to be linked to (confounded with) higher quality of care in other areas (e.g., infection control). If patients receive high quality care processes for their condition, it may be assumed that they are in hospitals in which the quality of care in other domains (e.g., patient safety) also is high, making those hospitals’ patients less likely to experience a negative outcome (e.g., mortality). However, the extent and magnitude of associations among quality indicators at the hospital level are empirical questions. For example, Isaac and Jha16 found “inconsistent and usually poor associations” between various hospital-level safety indicators and other quality-of-care indicators. Thus, some hospitals providing excellent PM-specified care to high percentages of patients with a specific condition may have poor quality in other domains that affect their patients’ outcomes, leading to null or weak relationships with outcomes in hospital-level analyses.
Although hospital-level analyses are inappropriate for gauging patient-level relationships between process PMs and outcomes, appropriate “multilevel” or “mixed-effects” analyses17,18 can partition process measure-outcome relationships into their within-hospital (patient-level) and between-hospital components. In work evaluating the predictive validity of substance use disorder (SUD) treatment quality measures, we have used propensity score, mixed-effects regression models of the following form to address differences in outcomes between patients who do and do not meet a process PM’s criteria:
Outcome_ij = μ_0j + β_PM × PM_ij + β_PMrate × PMrate_j + ε_ij

(for patient i treated in hospital j, including hospital ID as a random effect and an exchangeable covariance structure), where μ_0j is the mean covariate-adjusted outcome in hospital j, β_PM is the coefficient on the (0,1) indicator PM_ij of whether patient i met the PM’s criteria, β_PMrate is the coefficient on PMrate_j, the PM rate for patients at hospital j, and ε_ij is the error term for patient i treated in hospital j.
This model answers the main question of interest with respect to the validity of process of care performance measures: Do patients meeting the PM’s criteria have better subsequent outcomes than patients who do not meet the criteria, controlling for hospital average performance and other possibly confounding hospital and patient characteristics via propensity scores (not shown in the formula)? The hospital-level component, the mean or percentage of hospital patients receiving the recommended care (PMrate), is a proxy for some of the other systemic factors (confounders) that may affect outcomes for better or worse.18 Simply controlling for the clustering or similarity of patient outcomes within hospitals through a random intercept in a mixed-effects analysis is not enough.18
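A minimal sketch of this model’s logic, using synthetic noise-free data, follows. Because the toy data contain no random error, the random-intercept model reduces to ordinary least squares on the patient-level PM indicator plus the hospital PM rate; the propensity-score adjustment and mixed-effects machinery of a real analysis are omitted. The numbers (effect sizes, hospital counts) are invented for illustration:

```python
# Sketch of the within/between decomposition behind the model above, on
# synthetic data. Hospitals with higher PM rates have worse baseline outcomes
# (a hospital-level confounder), yet meeting the PM improves each patient's
# outcome by +0.3 on a continuous recovery score.

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [row[:] + [v] for row, v in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

# Build patient records: 10 hospitals, 40 patients each.
rows = []  # (intercept, pm_met, hospital_pm_rate, outcome)
for j in range(10):
    n, n_met = 40, 20 + j
    rate = n_met / n                      # hospital PM rate: 0.500 .. 0.725
    baseline = 0.5 - 0.6 * rate           # confounded hospital intercept
    for i in range(n):
        pm = 1.0 if i < n_met else 0.0    # did patient i meet the PM's criteria?
        rows.append((1.0, pm, rate, baseline + 0.3 * pm))

# Patient-level model: normal equations X'Xb = X'y for
# outcome ~ intercept + pm_met + pm_rate (the Mundlak-style specification).
XtX = [[sum(r[a] * r[b] for r in rows) for b in range(3)] for a in range(3)]
Xty = [sum(r[a] * r[3] for r in rows) for a in range(3)]
b0, b_pm, b_rate = solve3(XtX, Xty)
print(f"patient-level PM effect: {b_pm:+.2f}")   # +0.30: care helps patients
print(f"hospital-rate term:      {b_rate:+.2f}")  # -0.60: confounder proxy

# Aggregated analysis: regress hospital mean outcome on hospital PM rate.
from statistics import mean
per_hosp = {}
for r in rows:
    per_hosp.setdefault(r[2], []).append(r[3])
rates = sorted(per_hosp)
means = [mean(per_hosp[r]) for r in rates]
mr, mm = mean(rates), mean(means)
slope = (sum((r - mr) * (m - mm) for r, m in zip(rates, means))
         / sum((r - mr) ** 2 for r in rates))
print(f"hospital-level slope:    {slope:+.2f}")   # -0.30: sign flips
```

The hospital-level slope is negative while the patient-level effect is positive, mirroring the continuing care example discussed below: only the model that separates the patient-level indicator from the hospital-level rate answers the question of whether patients meeting the PM’s criteria fare better.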
In an earlier article,19 we provided an example of this approach using Veterans Health Administration (VHA) data to evaluate a continuing care measure for SUD patients by linking the measure to abstinence at a 6-month follow-up. No covariates were included, as this example was intended simply to show that different relationships could occur at different levels of analysis. In a hospital-level analysis, there was no relationship between continuing care and outcome. However, in a mixed-effects regression analysis, the relationship at the patient level was significantly positive, but the hospital-level continuing care rate had a negative (but not quite significant) relationship to abstinence. Regarding the latter finding, one speculation was that hospitals with higher continuing care rates perhaps had less time and fewer resources to provide higher quality continuing care, but we acknowledged that any of a number of factors could have been confounded with hospital-level continuing care rates and patient abstinence.
A recent Institute of Medicine report20 proposed a minimum set of core metrics, including some tapping evidence-based care, that could be applied at multiple levels of aggregation in the health care system. However, the level of analysis issue discussed here was not addressed in the report. As we have shown, hospital-level analyses conflate hospital- and patient-level “effects” and do not indicate whether patients who receive PM-specified care processes have better outcomes than patients who do not. Thus, hospital-level relationships of process measures with outcomes should not be used to evaluate the validity of process of care PMs. Funding entities, such as the National Institutes of Health, and hospital performance measurement bodies, such as the National Quality Forum and the Measures Application Partnership, should encourage multilevel analyses that simultaneously identify hospital- and patient-level predictors, including care processes, of patient-level outcomes. Such analyses would yield important new information on the relationships of processes of care with patient outcomes and reduce the potential for misinterpretation of hospital-level findings. Unless shown to be invalid or nonproductive in multiple analyses that actually examine patient-level relationships, evidence-based process performance measures should be used to guide the care of individual patients.
The authors’ work has been supported by the US Department of Veterans Affairs Office of Research and Development Health Services Research and Development Service (HSR&D) Grants SUS 99-015 (Finney), IIR-07-092-1 (Harris), RCS 04-141 (Humphreys) and RCS-13-232 (Harris). The views expressed are those of the authors and do not necessarily reflect positions or policies of the Department of Veterans Affairs or of any other entity of the US government.
Some of the material in this article was included in a VHA Health Economics and Resource Center Cyberseminar by John W. Finney entitled: “Are Misinterpreted, Hospital-level Relationships Between Process Performance Measures and Outcomes Undermining Evidence-based Patient Care?” presented on May 22, 2013.
None of the authors has a conflict of interest.