To determine the minimal clinically important differences (MCID) of validated measures of SLE disease activity in childhood-onset systemic lupus erythematosus (cSLE).
cSLE patients (n=98) were followed every 3 months for up to 7 visits (total number of visits 623). Disease activity measures (ECLAM, SLEDAI, SLAM, BILAG, RIFLE) were completed at the time of each visit. Physician-rated changes in the disease course (clinically relevant improvement, no change, clinically relevant worsening) between visits served as the criterion standard.
MCID defined by mean change scores with improvement and worsening, or those based on the standard error of measurement with stable disease were both small and did not discriminate well between disease courses (detection rates for improvement or worsening were all < 55%). MCID based on discriminant and classification analyses yielded similar results. Alternative MCID, defined by a 70% predicted probability of improvement or worsening as per discrimination analysis, were larger but underestimated the proportion of patients with change. The RIFLE only correctly identified 26% and 8% episodes of clinically important worsening and improvement of cSLE, respectively.
The MCID of cSLE disease activity measures are often small but similar to those reported for adults with SLE. Thus even small changes in disease activity scores can be clinically relevant. Low correct detection rates of these MCID thresholds for changes in disease course support the notion that worsening and improvement with cSLE, or its response to therapy, is unlikely to be captured adequately by validated measures of disease activity alone.
Systemic lupus erythematosus (SLE) is a complex, chronic multi-system autoimmune inflammatory disease that targets young women and men (1, 2). The up to 20% of SLE patients who are diagnosed during childhood, i.e. prior to the age of 16 years (cSLE), tend to experience a more severe disease course than those with disease onset later on in life (3–5). Various measures of global disease activity have been developed for SLE in adults and subsequent validation confirmed these indices have concurrent validity for measuring disease activity with cSLE (10–12). They are the SLE Disease Activity Index (SLEDAI) (6), the Systemic Lupus Activity Measure (SLAM) (7), the European Consensus Lupus Activity Measurement (ECLAM) (8), and the British Isles Lupus Activity Group Index (BILAG) (9). Conversely, the Responder Index for Lupus Erythematosus (RIFLE) was developed specifically to define treatment response, i.e. clinically meaningful change in SLE over time (13), but has not been used in cSLE.
The concept of a minimal clinically important difference (MCID) was introduced by Jaeschke et al, who defined the MCID as “the smallest difference in a score of a disease measure of interest that patients perceive as beneficial and that would mandate, in the absence of side-effects, a change in the patient’s management” (14). Since then, the OMERACT (Outcome Measures for Rheumatoid Arthritis Clinical Trials) group has explored the concept of MCID in depth (15, 16). There are different types of MCID, depending on whether improvement or worsening is considered and what external standard is employed. MCID constitute threshold values for clinically relevant change, i.e. they are special features of the responsiveness to change of a disease activity index. Any amount of change greater than the MCID threshold can be considered clinically meaningful or important, while any change score smaller than the MCID, irrespective of statistical significance, is clinically irrelevant.
Various statistical approaches have been suggested to calculate the MCID for an outcome measure (17). Besides determining the MCID itself, it is also of interest to assess how well an index can discriminate patients in whom a clinically important change has occurred from others.
The objective of this study was to determine the MCID of validated measures of disease activity when used in cSLE from a physician’s and parent’s perspective, using previously proposed statistical methods. We also wished to assess the ability of the disease activity indices to identify patients in whom a clinically relevant change of cSLE has occurred and to study the usefulness of the RIFLE for capturing clinically relevant change in cSLE.
Children (n=98) fulfilling American College of Rheumatology (ACR) Classification Criteria for SLE (2) were recruited at seven pediatric rheumatology centers and studied every 3 months for up to 18 months. Disease activity and change in the course of cSLE were measured at each study visit.
Besides the SLAM (7), SLEDAI (6), and ECLAM (8), the BILAG (9, 18) was completed. To convert the alphabetical domain scores of the BILAG to numerical cSLE activity scores, three alternative schemes were considered, as suggested by Gladman et al (19) (BILAGGladman: A= 4; B= 3; C= 2; D= 1; E= 0), Liang et al (7) (BILAGLiang: A= 10; B= 6.7; C= 3.3; D= 0; E= 0), and Stoll et al (20) (BILAGStoll: A= 9; B= 3; C= 1; D= 0; E= 0). None of these schemes has been well validated in cSLE or SLE. For all the above-mentioned disease activity indices, a score of ‘0’ reflects inactive disease.
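The three conversion schemes can be illustrated with a minimal Python sketch. The letter-to-number weights are taken from the text above; the function name `bilag_numeric_score`, the example grades, and the assumption that the total score is the plain sum of the converted domain scores are illustrative and not taken from the study.

```python
# Letter-to-number weights as listed in the text for the three schemes.
BILAG_SCHEMES = {
    "Gladman": {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0},
    "Liang": {"A": 10, "B": 6.7, "C": 3.3, "D": 0, "E": 0},
    "Stoll": {"A": 9, "B": 3, "C": 1, "D": 0, "E": 0},
}


def bilag_numeric_score(domain_grades, scheme):
    """Convert BILAG letter grades to one numerical score.

    Assumption (hypothetical): the total is the sum of the converted
    domain scores; the source does not spell out the aggregation.
    """
    weights = BILAG_SCHEMES[scheme]
    return sum(weights[grade] for grade in domain_grades)


# Hypothetical patient with four graded domains.
grades = ["A", "C", "E", "D"]
for name in BILAG_SCHEMES:
    print(name, bilag_numeric_score(grades, name))
```

Under any of the three schemes, a patient graded ‘E’ in every domain scores 0, which matches the statement that a score of ‘0’ reflects inactive disease.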
The RIFLE is a 60-item questionnaire to measure change of disease activity using a 5-point Likert scale: not present, partial response, complete response, present or worsening (13, 21). It has been suggested that clinically important worsening of SLE is present if there are at least three RIFLE items with ‘worsening’ conditions; similarly, that clinically important improvement occurs when there are at least four RIFLE items with ‘partial response’ and/or ‘resolution’ conditions. This tool has not yet been validated for use in cSLE.
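The suggested RIFLE thresholds can be written as a short Python sketch. The cut-offs (at least three ‘worsening’ items; at least four ‘partial response’ and/or ‘resolution’ items) come from the text above. The function name `rifle_flags`, the item labels, and the treatment of ‘complete response’ as equivalent to ‘resolution’ are assumptions made for illustration.

```python
def rifle_flags(item_ratings):
    """Return (worsened, improved) for one completed RIFLE form.

    item_ratings: iterable of per-item labels. Assumption (hypothetical):
    'complete response' is counted as 'resolution'.
    """
    ratings = list(item_ratings)
    n_worsening = sum(r == "worsening" for r in ratings)
    n_response = sum(r in ("partial response", "complete response") for r in ratings)
    worsened = n_worsening >= 3   # at least three 'worsening' items
    improved = n_response >= 4    # at least four responding items
    return worsened, improved


# Hypothetical form: 3 worsening items, 1 partial response, rest not present.
form = ["worsening"] * 3 + ["partial response"] + ["not present"] * 56
print(rifle_flags(form))
```

The two flags are returned separately because the text does not say how a form meeting both criteria at once should be classified.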
The managing pediatric rheumatology professional completed a visual analog scale (VASMD; 0= inactive, 10= very active) presented with the sentence stem: ‘Considering the findings at today’s visit, the overall disease activity of the patient is (Please circle the number that appears most appropriate)’.
All raters underwent detailed and repeated training in completing the above mentioned disease activity measures.
In response to the sentence stem, ‘Compared to the last study visit three months ago and the patient’s overall disease, the patient experienced a’, the managing pediatric rheumatology professional rated the change in the disease course between consecutive visits on a 5-point Likert scale as follows: major flare of disease; minor flare of disease; no change in disease; minor improvement of disease; or major improvement of disease. All pediatric rheumatology professionals who provided the above ratings for the course of cSLE, i.e. information about the external standard used for this validation exercise, were board-certified or board-eligible; on average, they see 20 patients with cSLE per week in their academic centers and have 10 years of experience in treating cSLE.
In secondary analysis, we assessed how the scores of the disease activity indices changed with the course of cSLE as reflected by the family’s perspective. Thus, the parent rated the change of their child’s disease on a 5-point Likert scale (much worse; somewhat worse; unchanged; somewhat improved; or much improved) that was presented with the sentence stem ‘Compared to the last study visit three months ago, and when considering medications, school, work, life at home, doctor visits, pains and feelings the overall well-being is’.
Various approaches to assessing a measure’s MCID have been proposed (22, 23). After review of the medical literature, we considered the statistics that appeared to be most commonly employed to measure MCID: (A) mixed models were used in longitudinal analyses that accounted for each patient having up to seven study visits; the mean score change with improvement of disease activity was determined by assessing the changes in scores of the disease indices in patients who were rated as showing either ‘minor improvement’ or ‘major improvement’. Similarly, the mean score change with worsening of disease activity considered changes of the scores of the disease indices in patients who were rated as having a ‘minor flare’ or ‘major flare’. When assessing for significant differences in change scores between groups (here: clinically relevant improvement, no change in disease, clinically relevant worsening), p-values were corrected, under the mixed model framework, using a Tukey procedure. A random effect for patients was introduced in the mixed effect model to account for the within-patient correlation caused by the repeated measurements; (B) alternatively, the MCID can be defined based on the standard error of measurement (SEM) of changes in the disease activity scores with stable disease, i.e. of patients rated as having ‘no change in disease’ between consecutive visits; the MCID is then based on the so-called one standard error of measurement (1-SEM) criterion proposed by Wyrwich et al (24, 25). As done in the past, the SEM was defined as the square root of the within-episodes mean square error variance, calculated from an ANOVA model using both episodes (nested in patients) and visits as fixed effects and accounting for within-patient (or between-episodes) correlation using a generalized estimating equation (GEE) method in computation (27).
Besides the traditional MCID thresholds at ± 1 SEM (or, equivalently, the 63% confidence interval [CI]), we explored alternative MCID thresholds at ± 1.645 SEM (or, equivalently, the 90% CI). Furthermore, to assess the diagnostic accuracy of the MCID with clinically important improvement, we calculated detection rates, i.e. the proportion of correctly identified episodes of improvement among the total episodes of improvement (as per the criterion standard) for each disease index. The detection rates with clinically important worsening and with stable disease courses were calculated accordingly; (C) in a third approach to determining the MCID, each disease activity measure was assessed for its ability to discriminate between the disease courses (improved vs. no change vs. worsening) using classification and discrimination modeling (26). Linearized discrimination functions were used to calculate predicted probabilities for clinically important improvement, no change in disease, and clinically important worsening of disease for a given change score of the disease activity measure, under the classification and discrimination model framework; (D) similar to what has been suggested by the ACR Ad-Hoc Committee on SLE Response Criteria (27), clinically relevant changes of indices may be defined as the change score of disease activity measures with a 70%, 80%, or 90% predicted probability of an event having occurred (here: clinically relevant improvement or clinically relevant worsening), i.e. each observation is assigned a probability of belonging to a given group based on the distance of its discriminant function from that of each class mean.
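Approaches (C) and (D) can be sketched in Python for a single change score. This is a simplified illustration, not the SAS procedure used in the study: it assumes Gaussian change scores with a common standard deviation in each group (the usual linear discriminant setting), and the class means, pooled standard deviation, priors, and function names below are all hypothetical.

```python
import math


def posterior_probabilities(change, means, pooled_sd, priors):
    """Posterior probability of each disease course for one change score.

    Assumes equal-variance Gaussian groups (linear discriminant analysis).
    """
    weights = {
        label: priors[label] * math.exp(-0.5 * ((change - mu) / pooled_sd) ** 2)
        for label, mu in means.items()
    }
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def smallest_change_reaching(target, label, means, pooled_sd, priors,
                             step=0.1, limit=50.0):
    """Scan change scores away from 0 until P(label) >= target; None if never."""
    direction = 1 if means[label] > 0 else -1
    for i in range(int(limit / step) + 1):
        change = direction * i * step
        if posterior_probabilities(change, means, pooled_sd, priors)[label] >= target:
            return round(change, 10)
    return None


# Hypothetical inputs: group means and pooled SD of the change scores, and
# priors proportional to the number of episodes in each group.
means = {"worsening": 2.0, "no change": 0.0, "improvement": -2.0}
priors = {"worsening": 89 / 526, "no change": 348 / 526, "improvement": 89 / 526}
for target in (0.7, 0.8, 0.9):
    print(target, smallest_change_reaching(target, "worsening", means, 3.0, priors))
```

Because stable episodes dominate the priors in this sketch, the change score needed to reach a high predicted probability lies far from the group means, which is the behavior approach (D) is designed to expose.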
We also calculated intraclass correlation coefficients (ICC) to assess chance corrected agreement of the activity measures’ change scores with stable courses (28), using a similar approach as detailed in section (B) above.
For the RIFLE, kappa statistics were calculated to assess its agreement with the criterion standards. Like kappa coefficients, values of ICC can be interpreted as follows: poor agreement: ICC < 0.4; fair to good agreement: ICC from 0.4 to < 0.75; excellent agreement: ICC ≥ 0.75 (28).
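As a hedged illustration of the agreement statistics, the sketch below computes an unweighted Cohen's kappa between two sets of course labels and maps a coefficient onto the interpretation bands given above. Whether the study used a weighted or unweighted kappa is not stated, so the unweighted form is an assumption, and the function names and example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance from the marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def agreement_band(coefficient):
    """Interpretation bands as given in the text (poor / fair to good / excellent)."""
    if coefficient < 0.4:
        return "poor"
    if coefficient < 0.75:
        return "fair to good"
    return "excellent"


# Hypothetical example: tool-based course vs. criterion standard.
tool = ["worsening", "no change", "no change", "improvement", "no change"]
standard = ["worsening", "no change", "improvement", "improvement", "worsening"]
kappa = cohen_kappa(tool, standard)
print(round(kappa, 2), agreement_band(kappa))
```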
In secondary analysis, the above detailed analysis was repeated, using the parent ratings of patients’ change in well-being between visits (instead of the physician assessment of cSLE disease course) as the criterion standard for determining the MCID of the SLE disease activity measures. All analyses were done using SAS 9.2 (SAS, Cary, NC) software. P-values < 0.05 were considered statistically significant.
The study was approved by the institutional review boards of the participating pediatric rheumatology centers. Informed consent was obtained from all parents, and, as appropriate, assent was given by the participants, prior to the study procedures.
The demographics and disease features of the cSLE patients are summarized in Table 1. Data from a total of 623 visits (or 526 between-visit intervals) of 98 children were available for analysis. There were 39 patients with biopsy-proven lupus nephritis. As per the managing pediatric rheumatology professionals, the courses of cSLE between consecutive study visits were: 89 episodes of clinically relevant worsening (12 major worsening, 77 minor worsening); 348 episodes of stable disease between visits, and 89 episodes of improvement (14 major improvements, 75 minor improvements).
From the families’ perspective, there were 59 episodes of worsening of well-being (9 major worsening, 50 minor worsening), 253 episodes of stable well-being, and 202 episodes of improved well-being (108 minor improvements, 94 major improvements). For 12 between-visit intervals no family ratings were available.
Between-visit changes of the VASMD and the scores of the disease indices are summarized in Table 2. Although statistically significant, the mean change scores were small for every index and close to the smallest possible difference in score, which is ‘1’ for each of these tools. Thus increases of the ECLAM scores as small as ‘1’ appeared to be clinically relevant, while for the BILAGLiang increases of ‘3’ could be considered as clinically important. With clinically relevant improvement of cSLE, decreases in the scores of the disease activity indices were somewhat larger.
Disease measures most often remained unchanged with disease courses rated as ‘no change in disease’ by the managing pediatric rheumatology professional (Table 2). Chance-corrected agreement of the change scores of the activity indices with stable disease course was ‘excellent’ for the VASMD (ICC= 0.76) and ‘good’ for all other disease indices (all ICC≥ 0.47) (Table 3).
Alternatively, as is shown in Table 3, the MCID can be based on the 1-SEM-Criterion, assuming that important improvement or worsening has occurred if the change score exceeds −1 SEM or +1 SEM, respectively. We also tested a more stringent MCID definition, i.e. the 90% CI around the mean change score or ± 1.645 SEM (Table 3). As is reflected by the respective detection rates, the wider the CI limit was set, the more accurately patients with stable disease course could be discriminated from those who experienced clinically relevant change (Figure 1), but this occurred at the expense of decreased rates of correctly identified patients with clinically relevant change in disease (Table 3).
For example, when setting the MCID value of the SLEDAI to the one SEM (63% CI) boundary, only 56% of the patients with stable disease course would be correctly classified, while at the 90% CI mark, 77% of the patients with ‘no change in disease’ would be correctly identified as having a stable disease course. However, the 90% CI mark for defining the MCID would have greatly underestimated the frequency of patients in whom change truly had occurred.
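The trade-off described here can be reproduced with a small Python sketch of the SEM-based thresholds and detection rates. The ±1 SEM and ±1.645 SEM multipliers and the definition of the detection rate follow the Methods; the SEM value, the change scores, and the function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def classify_change(change, sem, multiplier=1.0):
    """Classify one change score against the +/- multiplier x SEM thresholds."""
    threshold = multiplier * sem
    if change > threshold:
        return "worsening"      # score rose by more than the threshold
    if change < -threshold:
        return "improvement"    # score fell by more than the threshold
    return "no change"


def detection_rate(changes, rated_course, sem, multiplier=1.0):
    """Share of episodes with a given physician-rated course that the
    threshold classifies the same way."""
    if not changes:
        raise ValueError("no episodes supplied")
    hits = sum(classify_change(c, sem, multiplier) == rated_course for c in changes)
    return hits / len(changes)


# Hypothetical change scores, grouped by the physician-rated course.
stable = [0, 1, -1, 2, -2, 3, 0, -3]
worsened = [1, 3, 4, 2]
sem = 2.0  # hypothetical SEM of the index

for k in (1.0, 1.645):
    print(k,
          detection_rate(stable, "no change", sem, k),
          detection_rate(worsened, "worsening", sem, k))
```

With these made-up numbers, widening the threshold from 1 to 1.645 SEM raises the detection rate for stable episodes from 0.75 to 1.0 while lowering the detection rate for worsening from 0.5 to 0.25.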
Of note, detection rates using the 63% CI thresholds (1-SEM-Criterion) were again small and quite similar to those using the mean change scores as are presented in Table 2.
Discrimination and classification analysis provides cut-off values of disease change scores that best discriminate the three groups of patients (worsening, no change, improvement). Such cut-off values could be considered as alternative MCID thresholds. For physician-rated worsening and improvement, respectively, these MCID cut-off values [detection rates are presented in brackets] were for the VASMD at +0.6 [65%] and −0.5 [71%], the ECLAM at +0.6 [64%] and −0.5 [49%], the SLEDAI at +1.2 [48%] and −0.9 [58%], the SLAM at +0.9 [57%] and −0.9 [57%], the BILAGLiang at +2.6 [55%] and −1.3 [61%], the BILAGGladman at +0.7 [56%] and −0.6 [60%], and the BILAGStoll at +1.6 [60%] and −0.8 [52%]. Irrespective of the measure of disease activity considered, none of the MCID cut-offs determined by this statistical approach correctly classified > 64% of all episodes of the three disease courses. Figure 2 (upper panels) depicts representative results of these analyses for the VASMD and the SLEDAI.
Of note, MCID thresholds defined by discrimination and classification analysis were somewhat smaller but again similar to those defined by the 1-SEM-Criterion and comparable to those using the mean change scores for defining the MCID.
It has been suggested (27), that clinically relevant change in disease activity indices may be defined based on a certain desired probability to correctly detect patients with change of disease. The results of such analyses for predicted probabilities of 70% to 90% for improvement and worsening of cSLE, as would be predicted by discrimination analysis and the physician-rated change in cSLE as an external standard, are summarized in Table 4 (also Figure 2).
Clinically-relevant changes in disease indices defined by such high predicted probabilities were much larger than MCID defined by any of the other approaches to estimating the MCID.
When considering the ratings of the families (external standard: parent rating of change of patient well-being), the mean change scores of the disease measures were often even smaller than when physician ratings of change in cSLE were used as external standard.
MCID defined by the 1-SEM-Criterion were for the 63% CI as follows: VASMD at ±0.7, ECLAM at ±0.9, SLEDAI at ±1.9, SLAM at ±3.8, BILAGLiang at ±4.2, BILAGGladman at ±1.4, and BILAGStoll at ±3.3, respectively.
Using discrimination and classification analysis, the MCID cut-offs that discriminated best between groups of patients with different disease courses were as follows for worsening and improvement [detection rates are presented in brackets], respectively: for the VASMD at +0.2 [42%] and −0.1 [37%], for the ECLAM at +0.1 [55%] and −0.2 [44%], for the SLEDAI at +0.4 [40%] and −0.2 [42%], for the SLAM at +0.2 [16%] and −0.9 [63%], for the BILAGLiang at +1.1 [53%] and −0.6 [44%], for the BILAGGladman at +0.2 [58%] and −0.4 [44%], and for the BILAGStoll at +0.6 [55%] and −0.3 [45%]. This is illustrated for the VASMD and SLEDAI in Figure 2, lower panels. Families often rated the patients’ well-being as unchanged even when large changes in the VASMD and the SLEDAI had occurred.
Regardless of the activity measure considered, none of the MCID cut-offs using this statistical approach were able to correctly classify > 47% of all episodes of the three disease courses.
For reaching a 70% predicted probability of clinically important worsening of well-being to have occurred, the VASMD had to have increased by 5.2, the ECLAM by 6, the SLEDAI by 13, the SLAM by 4, the BILAGLiang by 20, the BILAGGladman by 6, and the BILAGStoll by 12, respectively.
Similarly, for achieving 70% predicted probabilities of patients whose well-being importantly improved, the respective MCID thresholds were for the VASMD at −8.2, the ECLAM at −9, the SLEDAI at −18, the SLAM at −5, the BILAGLiang at −32, the BILAGGladman at −10, and the BILAGStoll at −21, respectively.
The RIFLE correctly identified 26% and 8% of the episodes of disease worsening and disease improvement, respectively. The kappa coefficient ± standard error of the RIFLE was only 0.06 ± 0.02. Alternative criteria for defining improvement or worsening (instead of clinically relevant worsening: ‘worsening’ of at least three RIFLE items; clinically important improvement: ‘partial response’ and/or ‘resolution’ occurs in four or more RIFLE items) did not improve the accuracy of the RIFLE for capturing cSLE disease courses (data not shown).
The MCID, i.e. the smallest changes of measures that have clinical relevance, for disease activity indices in cSLE are much sought after but difficult to ascertain. Differences smaller than the MCID are regarded as clinically irrelevant, irrespective of whether the difference is statistically significant or not (23). In the world of statistics, a significant difference is simply a difference that is unlikely to have occurred by chance and has a mathematical basis for such a claim. In the realm of health care, a difference may be statistically significant based on a simple numerical value, yet may at the same time be of little or no importance to the health or quality of life of patients.
There is no single generally accepted mathematical approach to calculating the MCID. We present various alternative strategies to determining the MCID for disease activity measures, using previously proposed statistical approaches. Irrespective of the global disease activity index considered and the methodological approach chosen, and despite statistical significance, the MCID of the disease activity indices in cSLE were often very small, confirming previous observations in adults with SLE (30, 31). This means that, in cSLE, even small changes in disease activity may be clinically relevant. Such small MCID of disease activity measures are problematic as they bear the risk of erroneously classifying a patient as having improved or worsened by an important amount when, in fact, no such clinically relevant change has occurred. Because groups of patients with clinically relevant differences in cSLE disease courses cannot be well separated using the MCID thresholds, changes in disease activity indices alone appear unlikely to suffice for approximating cSLE courses correctly. Thus, similar to SLE in adults, clinically relevant improvement and worsening of cSLE are unlikely to be definable based on changes in the scores of the tested disease activity indices alone (32, 33).
More generous MCID thresholds set at the level of 70% predicted probability for detecting patients with clinically relevant change in cSLE (improved or worse) were similar to those proposed for adults with SLE (27).
Increasing MCID thresholds to these high predicted probability rates occurs at the expense of sensitivity, i.e. it will result in a large number of patients whose disease has truly worsened or improved being rated as having experienced no change in their disease course. This again supports the notion that clinically relevant changes in the course of cSLE may not be adequately captured solely by the disease activity indices assessed in this study.
When considering the families’ perspective of the course of cSLE, the above is also true. Using the MCID as cut-off values, disease activity indices appeared even less suited to discriminate patients whose well-being had improved from those in whom it had remained unchanged or even worsened. Of note, this was true irrespective of whether the disease index under consideration included items that account for patient symptoms (instead of only objectively measurable cSLE signs).
We do not think that errors in completing the disease activity tools were the basis for the small MCID values determined in this study, because all participating investigators were repeatedly trained to complete the disease activity indices.
It is noteworthy that the MCID thresholds based on the mean change scores, the 1-SEM-Criterion and the discriminant analysis were all quite similar, supporting the validity of our findings. Moreover, additional data (laboratory values, physical examination, patient symptom reports) were collected to allow for data-driven confirmation of the disease activity scores provided, and our findings were in line with those seen in adults with SLE (27, 31).
Based on our study, the RIFLE appears less useful for the assessment of cSLE than for SLE. Additional studies will be necessary to explore in more depth why the RIFLE did not perform as well in children as would have been expected based on previous studies in adults with SLE (34).
This study has to be interpreted in light of certain limitations. As has been done by others (27, 31), we used the physician and the parent, rather than the patient, assessment of change in cSLE as the criterion standards for defining the MCID. Because of the complexity of the underlying construct, expert opinion may differ widely as to whether important improvement or worsening of cSLE has occurred or not (34). We did not consider alternative criterion standards, such as change in therapy or prednisone dosages, because medication change reflects the physician’s perception of the patient’s change in disease activity. Prior research suggests that pediatric rheumatologists, similar to adult rheumatologists, differ widely in their treatments of cSLE and SLE (35, 36). Moreover, inclusion of episodes of major changes in disease, rather than only of minor improvement or worsening of cSLE, might have led to an overestimation of the MCID, which would not have changed the conclusions of this study.
In this study, using various statistical approaches, we found the MCID of the SLEDAI, SLAM, ECLAM and BILAG for clinically important improvement or worsening of cSLE to be small, suggesting that even small changes in their scores can have clinical relevance. Based on the commonly used SEM-approach to estimating MCID, increases or decreases of the ECLAM score by 1, the SLEDAI or BILAGStoll or Gladman by 2, the SLAM by 4, or the BILAGLiang by 5 can be clinically significant. More generous MCID thresholds based on predicted probabilities using discriminant and classification analysis bear the risk of both underestimating response to therapy and the occurrence of flares in children with cSLE.
Investigators (data collection):
CCHMC, Cincinnati, OH: Drs. Robert Colbert, T. Brent Graham, Murray Passo, Thomas Griffin, Alexi Grom, and Daniel Lovell
Nationwide Children’s Hospital, Columbus, OH: Dr. Robert Rennebohm
University of Chicago Comer Children's Hospital, Chicago, IL: Dr. Linda Wagner-Weiner
Texas Scottish Rite Hospital, Dallas, TX: Shirley Henry PNP
Medical College of Wisconsin, & Children's Research Institute, Milwaukee, WI: Drs. James Nocton and Calvin Williams; Elizabeth Roth-Wojicki, PNP
CCHMC, Cincinnati, OH: Shannen Nelson (study coordinating), Jamie Meyers-Eaton (site coordinator); Lukasz Itert (database management); Kristina Wiers (data collection); CCHMC Biomedical Informatics (Web-based data management application development).
Texas Scottish Rite Hospital, Dallas, TX: Shirley Henry, PNP
University of Chicago Comer Children's Hospital, Chicago, IL: Becky Puplava (site coordinator)
Children’s Memorial Hospital, Chicago, IL: Dina Blair (site coordinator)
Medical College of Wisconsin, & Children's Research Institute, Milwaukee, WI: Marsha Malloy (data collection & site coordinator), Jeremy Zimmermann, Joshua Kapfhamer and Noshaba Khan (data collection).
Alfred I. DuPont Hospital for Children, Wilmington, DE: Drs. AnneMarie Brescia and Carlos Rosé
The study was supported by NIAMS grants 5U01AR51868 and P60-AR047884.