Standards for establishing the amount of change over time needed in an HRQoL measure for that change to be considered important or relevant can be determined empirically by within-person change studies. However, such anchor-based methods incorporate only the patients' perspectives of important change and do not reflect an informed clinical evaluation of HRQoL change. As stated in 1998 by Carolyn Clancy and John Eisenberg in Science, “Additional work to enhance the interpretability of outcome measures, particularly in terms of clinical significance, is needed to increase the usefulness of these tools. Clinicians are unlikely to use patient-reported outcome measures routinely unless the reports are as familiar to them as blood pressure and other physiological measures.” (p. 256)[38] As a first step in developing standards for “clinically important differences” (CIDs) for the disease-specific CRQ and the generic SF-36, Version 2.0, when used in patients with COPD, we assembled a 9-person expert panel of North American physicians familiar with the use of at least 1 of these HRQoL measures among patients with COPD. Using 2 rounds of the Delphi process, 1 in-person meeting, and an iterative process in which the respective panel chairman circulated and corrected the final report of the panel meeting, we established expert-panel CIDs for each HRQoL measure.
Our panel's levels for detecting CIDs on the CRQ were on average slightly higher than previously investigated change levels for the CRQ based on patient-perceived differences. In the 1989 study by Jaeschke et al., an average change per item of 0.5 was considered a minimal (small) important difference in the dyspnea, fatigue, and emotional function dimensions of the CRQ.[21] When multiplied by the number of items in each dimension, this equates to small CID thresholds of 2.5, 2.0, and 3.5, respectively, in dyspnea, fatigue, and emotional function. In contrast, this expert panel reached consensus on small CID thresholds of 3, 2, and 5, respectively, which are attainable integer change scores for individuals. As in the Jaeschke et al. results, however, the panel multiplied these small CIDs by 2 and 3 to set the moderate and large CID standards.[21] The panel also set CID levels for the mastery dimension, which had been excluded from the Jaeschke et al. investigation because of the lack of a similar domain in the simultaneously studied CHQ. When we applied these new standards to the CRQ dimensional change scores of 393 outpatients with COPD in a previous HRQoL study[26] and compared the classifications (no change, small, moderate, or large) with the cutpoints reported by Jaeschke et al., 51 (13%) of these outpatients were classified differently in the dyspnea dimension, while 107 (27%) had different classifications in the emotional function dimension.
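The threshold arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study: the item counts (5, 4, and 7) are inferred from the per-dimension thresholds quoted above, the function name `classify_change` is hypothetical, and treating a change exactly equal to a cutpoint as reaching that category is an assumption.

```python
# Small CID thresholds agreed on by the expert panel (from the text above).
PANEL_SMALL = {"dyspnea": 3, "fatigue": 2, "emotional_function": 5}

# Jaeschke et al.: 0.5 per item. Item counts are inferred from the
# thresholds 2.5, 2.0, and 3.5 quoted in the text (assumption).
PER_ITEM_SMALL = 0.5
ITEMS = {"dyspnea": 5, "fatigue": 4, "emotional_function": 7}
JAESCHKE_SMALL = {dim: PER_ITEM_SMALL * n for dim, n in ITEMS.items()}


def classify_change(change, small):
    """Classify a dimension change score by magnitude.

    Moderate and large cutpoints are 2x and 3x the small threshold.
    Hypothetical helper: the sign of the change is ignored, and a change
    equal to a cutpoint is assumed to reach that category.
    """
    magnitude = abs(change)
    if magnitude >= 3 * small:
        return "large"
    if magnitude >= 2 * small:
        return "moderate"
    if magnitude >= small:
        return "small"
    return "no change"


if __name__ == "__main__":
    # A 5-point dyspnea change falls in different categories
    # under the two sets of cutpoints.
    for label, cutpoints in (("panel", PANEL_SMALL), ("Jaeschke", JAESCHKE_SMALL)):
        print(label, classify_change(5, cutpoints["dyspnea"]))
```

Under these assumptions, a 5-point change in dyspnea is small by the panel's cutpoints (3, 6, 9) but moderate by the Jaeschke et al. cutpoints (2.5, 5.0, 7.5), which is the kind of reclassification counted in the comparison above.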
Determining CIDs for the SF-36 scales proved more challenging for the panel, a task complicated by the instrument's differing scale increments (i.e., the numeric value of a state change). Nonetheless, the panel adopted an informed and practical approach by considering the possible score changes that can result on each SF-36 scale (noted in the state change column of ). They then compared these state changes to their own experiences with COPD patients, as well as to distribution-based methods that have been used to interpret HRQoL change, and agreed on the reported levels. The magnitude of these change levels (≥8.3 points) should shed light on the widely held but poorly substantiated belief that a 3- to 5-point change on an SF-36 scale indicates a clinically important intra-individual difference. This belief stems from a 1989 report of SF-20[39] cross-sectional data collected from a large study of patients with and without chronic health conditions.[32] In those data, the predicted average on the health perceptions scale (mean = 69.1) for the hypertension group (N = 2,708) was 3.5 points lower than, and statistically different from (P < .001), the respective scale average (mean = 72.6) for those participants with no self-reported chronic conditions (N = 2,595), after controlling for age, gender, education, and income. The “significance” of this cross-sectional difference, however, was driven by very large sample sizes, and was not determined using data that assessed patient change over time.
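The dependence of statistical significance on sample size can be illustrated with a rough calculation. The sketch below is not a reanalysis of the 1989 data: the common standard deviation of 20 points is a hypothetical value chosen only for illustration, the covariate adjustment is ignored, and a simple two-sample z-test is substituted for whatever model the original report used.

```python
import math


def two_sided_z_test(diff, sd, n1, n2):
    """Return (z, p) for a difference in means under a common SD.

    Simplified two-sample z-test; ignores covariate adjustment.
    """
    standard_error = sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    z = diff / standard_error
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


ASSUMED_SD = 20.0  # hypothetical; not reported in the text above

# Same 3.5-point difference, group sizes as reported in the text.
z_large, p_large = two_sided_z_test(3.5, ASSUMED_SD, 2708, 2595)

# Same difference with 50 participants per group (hypothetical).
z_small, p_small = two_sided_z_test(3.5, ASSUMED_SD, 50, 50)

if __name__ == "__main__":
    print(f"large samples: z = {z_large:.2f}, p = {p_large:.2g}")
    print(f"small samples: z = {z_small:.2f}, p = {p_small:.2g}")
```

Under the assumed standard deviation, the 3.5-point difference is highly significant with the reported group sizes but far from significant with 50 participants per group, so the P value by itself says little about whether a 3.5-point difference is clinically important.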
It is important to note that we did not provide definitions of clinically important differences or the magnitudes of change (small, moderate, and large) to the panelists. Instead, we hoped that the physician panelists, each experienced in the use of these measures, would come to their own understanding of these terms. Although prior work by Jaeschke et al. attempted to define a minimal clinically important difference, that definition has some shortcomings.[21] First, it speaks directly to score differences that are perceived as beneficial to the patient, but does not address detrimental changes. Even if one considers an appropriate interpretation of this definition to include avoiding deterioration as a benefit, the patient with no perceived change is not properly recognized or classified. Second, even when the clinician and the patient agree that an important improvement or decline in HRQoL has occurred, this may not necessarily “mandate a change in the patient's management,” especially if such a change would be detrimental to the care of the patient,[24] or if all clinical options have been exhausted. We were pleased that our panelists, who are experienced in providing clinical care for patients with COPD, in performing research among them, and in using these HRQoL measures, came to their own interpretations of CIDs and the magnitude of those differences. Moreover, their interpretations converged when the panelists quantified these changes in the CRQ and SF-36 domains.
Despite the extensive efforts we undertook to obtain reliable and accurate results in this study, potential shortcomings remain and warrant mention. First, we employed only a single panel of physicians. Studies by RAND with several physician panels demonstrated substantial heterogeneity in their assessments of the appropriateness of different medical procedures.[40] However, the panel employed in the present study comprised only physicians who had substantial clinical and research experience with the task at hand, which was not the case in the RAND studies. Second, it is possible that panelists were biased by the results of prior research and might have arrived at different conclusions had they been unfamiliar with that literature. Because we used experienced physicians, such exposure to earlier studies was unavoidable. Moreover, the widely varying approaches employed by different panelists at the start of the process strongly suggest that few were biased in this fashion. Third, when there are limited clinical data, clinicians may require a greater change in the data that are available before judging that a clinical change has occurred. That is, when clinical information is minimal, the recognition of clinical change will be conservative. Finally, although we made every effort to maintain the integrity of the consensus process, the possibility remains that some members of the panel succumbed to well-known influences of group process.[41] In addition, there was no measure of the reliability of the panelists' judgments about clinically significant change. While the expert panel process is designed to add validity to such judgments, we cannot report a measure of actual agreement.
Establishing clinical change standards for HRQoL measures requires both clinical insight into the etiology, symptoms, and progression of the disease and patient insight related to living with the chronic disease. Derived using the RAND consensus method for integrating the expert physicians' opinions with the available evidence, our results provide clinicians with usable standards for assessing change in the HRQoL of COPD patients seen in clinical practice. In doing so, they can enhance clinicians' estimation of a COPD patient's disease severity and assessment of the impact on patient care and, ultimately, improve patient outcomes. These results also provide COPD researchers with standards for evaluating the results of HRQoL research investigations among patients with COPD. However, these panel results reflect the judgment of only 1 group of physician experts. We must now compare the expert physicians' estimates of important change to the change estimates of patients with COPD. Moreover, additional insight on CIDs can be obtained from the physicians who routinely treat these COPD patients and who can directly observe their HRQoL changes.[42] Contrasting these physicians' assessments of HRQoL changes with those of their COPD patients will further elucidate how clinically important differences and patient-perceived differences in HRQoL compare.