The Systemic Lupus Collaborating Clinics (SLICC) revised and validated the American College of Rheumatology (ACR) SLE classification criteria in order to improve clinical relevance, meet stringent methodology requirements and incorporate new knowledge in SLE immunology.
The classification criteria were derived from a set of 702 expert-rated patient scenarios. Recursive partitioning was used to derive an initial rule that was simplified and refined based on SLICC physician consensus. SLICC validated the classification criteria in a new validation sample of 690 SLE patients and controls.
Seventeen criteria were identified. The SLICC criteria for SLE classification require: 1) Fulfillment of at least four criteria, with at least one clinical criterion AND one immunologic criterion OR 2) Lupus nephritis as the sole clinical criterion in the presence of ANA or anti-dsDNA antibodies. In the derivation set, the SLICC classification criteria resulted in fewer misclassifications than the current ACR classification criteria (49 versus 70, p=0.0082), had greater sensitivity (94% versus 86%, p<0.0001) and equal specificity (92% versus 93%, p=0.39). In the validation set, the SLICC classification criteria resulted in fewer misclassifications (62 versus 74, p=0.24), had greater sensitivity (97% versus 83%, p<0.0001) but less specificity (84% versus 96%, p<0.0001).
The new SLICC classification criteria performed well on a large set of patient scenarios rated by experts. They require that at least one clinical criterion and one immunologic criterion be present for a classification of SLE. Biopsy confirmed nephritis compatible with lupus (in the presence of SLE autoantibodies) is sufficient for classification.
Systemic lupus erythematosus (SLE) is a prototypic autoimmune disease, affecting more than 300,000 in the United States (1) and millions worldwide. To ensure that there is a consistent definition of SLE for the purposes of research and surveillance, classification criteria for SLE are needed. The most widely used classification criteria for SLE are those developed by the American College of Rheumatology (ACR). These classification criteria were published in 1982 (2) and revised by a committee in 1997 (3) to delete the LE cell criterion and to change the immunologic criterion to include anticardiolipin antibodies. The 1982 ACR criteria have been validated (4, 5), but not the 1997 revision.
Subsequently, multiple groups employed new statistical methodology to refine SLE classification criteria. The Cleveland Clinic weighted criteria used the application of Bayes’ theorem to develop a weighting system (6). Costenbader et al formulated the Boston Weighted Criteria, based on the Cleveland Clinic criteria, but including antiphospholipid antibodies and renal pathology (7). In addition, criteria set points were subtracted for elements that might negate the diagnosis, such as negative ANA. Some criteria definitions were revised, such as arthritis requiring objective synovitis (7). The weighted criteria were applied by Sanchez et al. and were found to be more sensitive, but less specific (8).
An alternative statistical methodology, recursive partitioning, was used by Edworthy et al (9). Recursive partitioning or “Classification and Regression Trees” (CART) is a computer-intensive method used to derive a classification rule based on multiple candidate predictor variables (10). The CART software package dichotomizes variables based on all possible cutpoints. The best discriminating cutoff is chosen for each variable. Edworthy et al used the same data set as the 1982 ACR criteria, but added two derived variables, a standardized ANA and a “composite” complement variable (9). In addition, analyses were done with the immunologic criterion divided into components (anti-dsDNA, anti-Sm, false positive test for syphilis), and the hematologic variable divided into hemolytic anemia, thrombocytopenia and leukopenia. Using the best discriminating criteria, this method allowed correct classification of a majority of cases and controls.
In his 1987 methodology paper, Fries reviewed the procedures that are critical when developing classification criteria for SLE in order to avoid circularity, that is, criteria that are molded to the test data and not necessarily generalizable. The critical procedures include use of a “gold” standard, which must be established by highly experienced clinicians. Consecutively treated patients and multiple institutions need to be used to minimize selection bias. Control populations should be chosen to represent a realistic spectrum of related diseases that replicate the diagnostic problems that arise in real life. The variables must be defined with precision, because a small change in the definition of a criterion could lead to a large change in sensitivity and specificity. Finally, the proposed criteria need to be validated on a new population (because criteria always work well in the population from which they were developed) (11).
The Systemic Lupus International Collaborating Clinics (SLICC) is an international group dedicated to SLE clinical research. This group produced tools that form the basis of outcome studies in SLE today, such as the SLICC-ACR Damage Index (12). In the current study SLICC undertook a revision of the SLE classification criteria to address multiple concerns that have arisen since the 1982 criteria were developed. The SLICC formal assessment of the important clinical manifestations of SLE and limitations of the 1982 ACR criteria is summarized in the journal Lupus (13). Concerns about the clinical criteria in the current ACR classification included: possible duplication of highly correlated cutaneous lupus terms (such as malar rash and photosensitivity) and the omission of many other lupus cutaneous manifestations; omission of many SLE neurologic manifestations; and the need to utilize new standards in the quantification of urine protein. Concerns about the immunologic criterion included the omission of low complement, and the need to include new knowledge on antiphospholipid antibodies. Most of all, there were concerns about patients without any immunologic criteria being classified as having SLE (an autoantibody-mediated disease). Indeed, clinical trials have had to add the requirement for the presence of an SLE autoantibody when recruiting patients to optimize the likelihood of response to immunosuppressive therapy (14). It was felt that important control groups, including chronic cutaneous lupus, needed to be included in a validation exercise. Therefore, we included a number of Dermatology sites. Finally, it was felt that biopsy confirmed nephritis compatible with SLE (in the presence of lupus autoantibodies) was so indisputably representative of the disease that it should be considered sufficient as a “stand alone” clinical criterion. The revision exercise was conducted in accordance with the methodology requirements summarized by Fries (11).
An initial set of precisely defined variables to be abstracted from medical records for each patient was determined at a SLICC meeting in Lund, Sweden, April 25-27, 2003. At this meeting, experts in each organ system affected by SLE gave a formal presentation, reviewing the current (1997) ACR classification criteria, and other classification approaches to that organ system. The list of variables was further refined at a meeting of SLICC in Orlando, Florida in October 2003, later published in the journal Lupus (13, 15-23).
Each participating center was asked to submit data on 10 to 12 consecutive patients with a clinical diagnosis of SLE and 12 to 15 controls. The controls were to consist of consecutively seen patients with one of the following diagnoses: rheumatoid arthritis, myositis, chronic cutaneous lupus, undifferentiated connective tissue disease, vasculitis, primary antiphospholipid antibody syndrome, scleroderma, fibromyalgia, Sjögren syndrome, rosacea, psoriasis, sarcoidosis and juvenile idiopathic arthritis. Because it was recognized that important control groups, including chronic cutaneous lupus, which had not necessarily been part of previous efforts, needed to be represented, cases were also contributed by a number of dermatologists.
The information regarding each patient was summarized in a standardized short narrative, and these were sent to 32 rheumatologists from the SLICC group. SLICC physicians classifying patients were unaware of the submitting physician's diagnosis. These clinicians then classified each patient as having, or not having, SLE. If 80% of the rheumatologists agreed on the classification, that diagnosis was considered the “consensus” diagnosis. Those scenarios that did not reach consensus in this way were later discussed by a panel of 5, and if 4/5 agreed on a classification, that diagnosis was also considered the “consensus” diagnosis.
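The two-stage consensus rule described above can be sketched in a few lines of Python. This is only an illustration of the 80% threshold as stated in the text; the function name, the boolean vote encoding and the use of `None` for "no consensus" are assumptions made for the example, not part of the study protocol.

```python
from typing import List, Optional


def consensus_diagnosis(votes: List[bool], threshold_pct: int = 80) -> Optional[bool]:
    """Return True (SLE), False (not SLE), or None when no consensus is reached.

    `votes` holds one entry per rater: True = SLE, False = not SLE.
    Integer arithmetic is used so that exactly 80% (e.g. 4 of 5) counts as consensus.
    """
    n = len(votes)
    if n == 0:
        return None
    n_sle = sum(votes)
    if n_sle * 100 >= threshold_pct * n:
        return True
    if (n - n_sle) * 100 >= threshold_pct * n:
        return False
    return None


# Hypothetical ratings: 26 of 32 raters say SLE (81.25%), so consensus is reached.
first_round = consensus_diagnosis([True] * 26 + [False] * 6)
# A five-member panel with 4 of 5 in agreement also meets the 80% threshold.
panel_round = consensus_diagnosis([True] * 4 + [False])
```

The same function covers both stages, because 4 of 5 panel members is exactly 80%.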
A SLICC subcommittee (GSA, PF, CG, JM and GM) examined dozens of variables. The subcommittee reviewed extensive logistic regression analyses and decision tree analyses in order to use a data-driven approach to the selection and combination of items, with about 40 different combinations of over 20 items. Although a few items (such as low complement) were kept in the final criteria because of their clinical importance, logistic regression analyses strongly influenced both the selection and the elimination of many items. Thus the final selection of items was data-driven but refined by consensus view. These variables were then considered the candidate predictor variables for the recursive partitioning analyses.
Using recursive partitioning (CART software package) patients were divided into two groups based on all candidate variables. The resulting partitions were evaluated. The partition that resulted in the best separation of SLE cases from non-cases was chosen for the first split of the tree. At subsequent steps, the procedure was repeated within the subgroups created by previous splits. The algorithm identified subgroups of patients defined by predictor variables which were relatively homogeneous with respect to SLE diagnosis. The resulting subgroups could then be identified using a relatively simple rule. This approach was applied by the SLICC subcommittee (Chair, Graciela S. Alarcón, MD, MPH) using the candidate set of variables to result in a preliminary data-driven classification rule, with the requirement of at least one clinical and one immunologic variable being necessary.
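The choice of the first split can be illustrated with a minimal Python sketch. It assumes binary (present/absent) predictors and uses weighted Gini impurity as the measure of "best separation"; the text does not state which splitting criterion the CART package applied, so that choice is an assumption. The variable names and the six toy records are invented for illustration and are not study data.

```python
from typing import Dict, List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a group of 0/1 labels (0.0 means perfectly homogeneous)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_first_split(rows: List[Dict[str, int]], label_key: str = "sle") -> Tuple[str, float]:
    """Return the binary variable whose split gives the lowest weighted impurity."""
    variables = [k for k in rows[0] if k != label_key]
    n = len(rows)
    best_var, best_score = "", float("inf")
    for var in variables:
        absent = [r[label_key] for r in rows if r[var] == 0]
        present = [r[label_key] for r in rows if r[var] == 1]
        score = (len(absent) * gini(absent) + len(present) * gini(present)) / n
        if score < best_score:
            best_var, best_score = var, score
    return best_var, best_score


# Toy records (hypothetical): in this invented sample "ana" separates cases perfectly.
toy_rows = [
    {"ana": 1, "arthritis": 1, "sle": 1},
    {"ana": 1, "arthritis": 0, "sle": 1},
    {"ana": 1, "arthritis": 1, "sle": 1},
    {"ana": 0, "arthritis": 1, "sle": 0},
    {"ana": 0, "arthritis": 0, "sle": 0},
    {"ana": 0, "arthritis": 1, "sle": 0},
]
first_split = best_first_split(toy_rows)
```

A full tree would repeat `best_first_split` within each resulting subgroup, as described above.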
The preliminary classification rule was discussed at three meetings of SLICC members in 2008. These small group meetings, organized with the help of Dr. Ian Bruce and Dr. David Isenberg, allowed intense discussions of the criteria deficiencies. Patients who were misclassified by the rule were used to stimulate discussions regarding how the rule or definitions of the variables could be changed to improve the classification rule. In the final SLICC meeting, discussion, followed by a vote, was used to ratify remaining items for which there was not unanimous agreement. As a result of this step we: 1) excluded anti-C1q as an immunologic criterion; 2) excluded a “constitutional” clinical criterion of fever and lymphadenopathy; and 3) included joint line tenderness with morning stiffness under the “arthritis” criterion.
To assess the performance of the new classification rule, we obtained detailed data regarding a new set of 690 additional patients. Sites were again asked to submit information on patients diagnosed with SLE, and on an approximately equal number of controls with the following diagnoses: rheumatoid arthritis, undifferentiated connective tissue disease, primary antiphospholipid antibody syndrome, vasculitis, chronic cutaneous lupus, scleroderma, Sjögren syndrome, myositis, psoriasis, fibromyalgia, alopecia areata and sarcoidosis. These data were collected on standardized case report forms and sent to the coordinating site. Information included a demographics summary, a clinical scenario, specification of ACR criteria that were met and not met, specification of SLICC criteria met and not met, auto-antibody titers and complement titers.
In addition, serum from each patient was sent to the coordinating site and analyzed at the Rheumatology Diagnostic Laboratory (Los Angeles, CA) for anti-dsDNA by ELISA, Crithidia and Farr assays, anti-Smith antibody and complement C3 and C4 levels. A second set of blood samples was tested for antiphospholipid antibodies (lupus anticoagulant, ELISA assay for IgG, IgM and IgA isotypes of anticardiolipin antibodies and anti-beta2 glycoprotein1 antibodies) at the laboratory of Joan Merrill, M.D. (Oklahoma Medical Research Foundation). Direct Coombs was done at each center's own laboratory or at Quest Diagnostics. A short description of each patient (“patient scenario”) was generated containing the submitted information and the updated auto-antibody and complement profiles.
These patient scenarios were submitted to participating SLICC members for rating as either SLE or not SLE. Twelve SLICC members rated all 690 scenarios, and three SLICC members rated some but not all scenarios. Those scenarios which did not reach 80% consensus in the initial rating process were edited for clarity, and re-rated by the larger group of 33 SLICC physicians. SLICC physicians classifying patients were unaware of the submitting physician's diagnosis. More than 80% consensus was achieved on 615 cases while 75 cases remained without a consensus diagnosis of SLE or not SLE. After this second round of ratings, the 75 nonconsensus scenarios were classified as either SLE or not SLE based on the majority opinion.
The Kappa statistic was used to quantify the chance-adjusted degree of agreement between the classification rules and the gold standard rating based on the majority opinions of the raters. McNemar's test was used to assess whether there was a significant difference between the current ACR Revised Classification criteria and the SLICC Classification criteria with respect to accuracy.
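Both statistics can be computed from simple counts, as the following Python sketch shows. It assumes Cohen's kappa on a 2x2 table of rule versus gold standard, and the exact (binomial) two-sided form of McNemar's test on the discordant pairs; the text does not say whether the exact or the chi-square version was used, so the exact form is an assumption. Function and argument names are illustrative.

```python
from math import comb


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Chance-adjusted agreement between a classification rule and the gold standard."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b = patients misclassified by rule A only, c = patients misclassified by rule B only.
    Patients on whom both rules are right, or both wrong, do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

McNemar's test compares the two rules on the same patients, so only the discordant counts matter. Those counts are not reported in the text, so any inputs to `mcnemar_exact_p` here would be hypothetical.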
In the derivation step, abstracted data from a total of 716 patients were submitted from 25 different sites. While most sites submitted data on more than 20 patients, two sites submitted fewer than 10 patients, and one site (Johns Hopkins) submitted data on 171 patients. The 716 scenarios that were contributed had the following diagnoses at their site: systemic lupus erythematosus 293, rheumatoid arthritis 119, myositis 55, chronic cutaneous lupus 50, undifferentiated connective tissue disease 44, vasculitis 37, primary antiphospholipid antibody syndrome 33, scleroderma 28, fibromyalgia 25, Sjögren syndrome 15, rosacea 8, psoriasis 7, sarcoidosis 1, and juvenile idiopathic arthritis 1.
Each submitted patient was reviewed by 26-32 clinicians. The results of the initial classification are summarized in Table 1. For 262 (36.6%) of the 716 patients, 80% or more of the physicians diagnosed the patient as having SLE. For 354 (49.4%) patients, 80% or more diagnosed the patient as not having SLE. Thus, there was 80% or more agreement for 616 (86%) of the scenarios (with respect to SLE status). These classifications agreed with the submitting diagnosis in 561 (91%) of these 616 scenarios.
For the remaining 100 (14%) patients, there was less agreement regarding the diagnosis (Table 1). These 100 patients then underwent further review and discussion by five-member panels. Eighty percent consensus was reached for 86 of these patients. Thus, ultimately a consensus diagnosis was achieved for 702 (98%) of the 716 patients submitted for the study. The consensus diagnosis agreed with the diagnosis of the submitting physician 95% of the time. The analyses described below are based on these 702 patients.
Eighteen criteria that were associated with SLE diagnoses were identified and were initially considered. These were divided into two groups: immunological and “clinical”, based on the judgment of the SLICC subcommittee. The degrees of association between each candidate criterion and the consensus diagnosis of SLE are shown in Table 2. Recursive partitioning was applied to this set of variables to arrive at our initial working rule. After discussion and examination of misclassified cases, some definitions were refined, and leukopenia and lymphopenia were combined. Table 3 provides the final list of criteria, and provides details regarding how each criterion was ultimately defined.
Criteria need not be present concurrently. The proposed classification rule is as follows:
Classify a patient as having SLE if either:
1) The patient satisfies at least four of the criteria listed in Table 3, including at least one clinical criterion and one immunologic criterion; OR
2) The patient has biopsy-proven nephritis compatible with SLE, in the presence of ANA or anti-dsDNA antibodies.
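The rule can be written as a short Python function. This is a sketch under stated assumptions: the criterion identifiers (`"ana"`, `"anti_dsdna"`, `"hemolytic_anemia"`, `"direct_coombs"` and the others used in the example) are shorthand labels chosen for illustration, and the authoritative definitions are those in Table 3. The sketch also applies the provision that direct Coombs is not counted when the clinical criterion of hemolytic anemia is present.

```python
from typing import Set

# Antibodies that make biopsy-proven nephritis sufficient on its own (per the rule).
NEPHRITIS_ANTIBODIES = {"ana", "anti_dsdna"}


def classify_sle(clinical: Set[str],
                 immunologic: Set[str],
                 biopsy_proven_nephritis: bool = False) -> bool:
    """Apply the proposed rule to the sets of criteria a patient has met.

    `clinical` and `immunologic` contain illustrative shorthand labels for the
    criteria met at any time (criteria need not be present concurrently).
    `biopsy_proven_nephritis` is passed separately and is used only for the
    stand-alone nephritis pathway.
    """
    immunologic = set(immunologic)  # copy, so the caller's set is not modified
    # Avoid double counting: direct Coombs is ignored if hemolytic anemia is present.
    if "hemolytic_anemia" in clinical:
        immunologic.discard("direct_coombs")

    total = len(clinical) + len(immunologic)
    four_criteria = total >= 4 and len(clinical) >= 1 and len(immunologic) >= 1
    nephritis_rule = biopsy_proven_nephritis and bool(immunologic & NEPHRITIS_ANTIBODIES)
    return four_criteria or nephritis_rule


# Hypothetical patient: three clinical criteria plus ANA meets the four-criteria pathway.
example = classify_sle({"synovitis", "oral_ulcers", "serositis"}, {"ana"})
```

Note that four clinical criteria with no immunologic criterion, or four immunologic criteria with no clinical criterion, do not satisfy the rule.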
Table 4 shows the performance of this classification rule in our derivation set of patients. In the derivation set the proposed rule had greater sensitivity (94% versus 86%, p<0.0001) and equal specificity (92% versus 93%, p=0.39). Using McNemar's test, we found that the proposed rule resulted in significantly fewer misclassifications than the current ACR classification criteria rule (p=0.0082).
To validate the proposed new rule we used data collected on 690 additional patients that were not used to derive this rule. These patients were submitted from 15 different sites. All sites submitted data on more than 20 patients and one site (Johns Hopkins) submitted data on 180 patients. The 690 validation patient scenarios that were contributed had the following diagnoses at the contributing site: systemic lupus erythematosus 337, rheumatoid arthritis 118, undifferentiated connective tissue disease 89, primary antiphospholipid antibody syndrome 30, vasculitis 29, chronic cutaneous lupus 24, scleroderma 20, Sjögren syndrome 15, myositis 14, psoriasis 8, fibromyalgia 4, alopecia areata 1, and sarcoidosis 1.
Eighty percent agreement was achieved for 590 (86%) of the patient scenarios during the first round of rating. The 100 scenarios that did not achieve 80% agreement during the first round of ratings were then sent to a larger group of SLICC members for the second round of rating. Table 5 shows the degree of agreement achieved for all 690 scenarios based on both rounds of rating. Note that 80% or more agreement was achieved on whether the case was SLE or not SLE for all but 75 (11%) of the scenarios. The majority rule rating agreed with the submitting diagnosis (with respect to SLE status) 93% of the time.
Table 6 shows the sensitivity and specificity of each classification rule relative to the classification made by the majority of raters in the validation patients. The SLICC rule was more sensitive (97% versus 83%, p<0.0001) than the current (1997) ACR rule, but less specific (84% versus 96%, p<0.0001). Overall the SLICC rule performed better than the ACR rule, misclassifying 12 fewer patients and having a higher Kappa. The difference between the rules, however, was not statistically significant (p=0.24).
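For readers who want to reproduce these summary measures, the following Python sketch shows the arithmetic. The overall accuracy figures use the misclassification counts reported for the validation set (62 and 74 of 690); the cell counts behind the sensitivity and specificity values are in Table 6 and are not repeated in the text, so `sensitivity_specificity` is shown as a helper only. The function names are illustrative.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def accuracy_from_misclassified(n_total: int, n_misclassified: int) -> float:
    """Overall proportion of patients classified in agreement with the gold standard."""
    return (n_total - n_misclassified) / n_total


# Validation set: 690 scenarios, 62 misclassified by SLICC versus 74 by ACR.
slicc_accuracy = accuracy_from_misclassified(690, 62)
acr_accuracy = accuracy_from_misclassified(690, 74)
fewer_misclassified = 74 - 62
```

Overall accuracy is only one summary; as the text notes, the two rules trade sensitivity against specificity in opposite directions.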
If we restrict the analysis to those 615 scenarios that achieved 80% or more agreement after the second round of rating, the sensitivity and specificity of the SLICC criteria were found to be 98% and 91% respectively. In contrast, in this subset of scenarios, the sensitivity and specificity of the ACR criteria were 88% and 98% respectively, and these reflect 9 more misclassifications.
The SLICC classification criteria for SLE represent an eight-year effort of clinical review, consensus and statistical analyses. The final criteria were derived using recursive partitioning (“tree-based” approach), but were reduced to a simple rule: “lupus nephritis” (in the presence of at least one of the immunologic variables indicated) as a “stand alone” criterion, OR four criteria (with one having to be a clinical criterion and one having to be an immunologic criterion). The requirement for at least ONE clinical and ONE immunologic criterion reflects the opinion of SLICC that neither clinical criteria alone nor positive serologic tests alone should be considered SLE, as SLE is ultimately an autoantibody-driven clinical disease.
The clinical criteria improve on the revised ACR classification criteria in several important ways. Malar rash and photosensitivity are not separate items, as they are largely overlapping. One cutaneous criterion includes both acute and subacute cutaneous lupus, whereas a separate cutaneous criterion now includes discoid rash and the many different types of chronic cutaneous lupus not included in the current ACR classification criteria. To employ these optimally, it is anticipated that some proposed SLE patients will require a dermatologic consultation, and sometimes a skin biopsy. Non-scarring alopecia is included, as it was in the original ARA criteria (24): although not specific for SLE, it performed well in the univariate and recursive partitioning analyses, and met the bar of clinical consensus.
The arthritis criterion has been substantially redefined. First, it does not require a radiograph: some SLE arthritis is, in fact, erosive (25). Second, joint line tenderness with 30 minutes of morning stiffness now qualifies for arthritis. Because of the overlap of fibromyalgia and SLE in some patients, it will be necessary to confirm that there is specifically joint line tenderness and not more diffuse allodynia. It is also essential to underscore that for all the SLICC criteria, the clinician must be able to determine that the cause is likely attributable to SLE and not due to another disease process or condition.
The renal criterion now includes measurement of proteinuria by the urine protein/creatinine ratio without the requirement of a time frame for collection. This reflects acceptance that the “spot” or random urine protein/creatinine ratio is easier to obtain than a 24 hour urine protein (26), and that a qualitative estimate of proteinuria from a dipstick is insufficient for clinical judgment, as it is an unreliable quantitative measure. The gold standard, however, remains the urine protein/creatinine ratio done on a 24-hour urine collection (27).
The neurologic criterion has been substantially re-written to include a greater number of SLE neurologic manifestations than the original ACR definition of seizures or psychosis. It does not include all the ACR neuropsychiatric case definitions (28), due to the absence of specificity of most of these for SLE (29).
The hematologic criteria have been split into three parts: hemolytic anemia, leukopenia/lymphopenia and thrombocytopenia. Statistical modeling showed that it made no difference whether “once” or “more than once” was required. Therefore, to simplify assessment, the SLICC criteria require only one abnormal assessment (of course, the result must be due to SLE and not other factors, such as prednisone [for lymphopenia], immunosuppressive drug use, infection, or other causes). We accept that the cut-off range for leukopenia may need to be amended for patients of certain ethnic groups (30).
The immunologic criterion reflects new knowledge about serologic tests in SLE and also the concern of SLICC about the wider use of ELISA and multiplex assays (31). The ANA criterion remains unchanged. In the old immunologic criterion, anti-dsDNA antibodies, anti-Sm antibodies, lupus anticoagulant, false-positive test for syphilis, and anticardiolipin antibodies were combined. The new SLICC classification criteria have split these features into separate criteria, so that each may contribute to classification. The new anti-dsDNA antibody criterion, however, requires a stricter cut-off for ELISA assays. Anti-Sm antibody is now an individual criterion. The new antiphospholipid antibody criterion now includes anti-β2 glycoprotein I antibodies. The anticardiolipin definition excludes non-specific “low” levels (which were included in the revised ACR criteria) (3). IgG, IgM, or IgA isotypes are allowed for anti-β2 glycoprotein I and anticardiolipin, reflecting new knowledge that IgA isotypes are important in SLE (32).
Upon SLICC consensus, even though it did not improve the statistical modeling, we included low complement, defined by C3, C4, or total hemolytic complement, reflecting the contribution of complement to disease pathogenesis.
We included the direct Coombs (anti-globulin) test. Direct Coombs did improve statistical modeling. To avoid “double counting”, however, it is not counted if the patient has the clinical criterion of hemolytic anemia.
The final important aspect of the new SLICC classification criteria is that biopsy confirmed nephritis compatible with SLE according to the International Society of Nephrology/Renal Pathology Society (ISN/RPS) 2003 Classification of Lupus Nephritis (33), in the presence of ANA or anti-dsDNA antibodies, is now sufficient for a classification of SLE. SLICC thought this was important both in clinical practice and for enrollment in clinical trials. It is acknowledged that the presence of anti-dsDNA antibodies in the absence of ANA is a rare phenomenon and may be due to laboratory error.
The SLICC classification criteria perform better than the revised ACR criteria in terms of sensitivity, but not specificity. These criteria are meant to be clinically more relevant, allowing the inclusion of more patients with clinically-defined lupus than the current ACR criteria. They will be important in clinical trials and in longitudinal observational studies.
The SLICC classification criteria were subjected to rigorous testing. The new patient sample used for validation consisted of 690 patients and included patients from multiple centers with multiple diagnoses that have clinical features that overlap with lupus. In the validation sample the SLICC classification criteria misclassified fewer cases and had higher sensitivity, although less specificity. The difference between the ACR classification criteria and SLICC classification criteria performance was not statistically significant. The SLICC classification criteria have better face and content validity as they overcome many concerns with the current criteria. In particular, the new criteria require the presence of both clinical and serologic criteria so that patients without autoantibodies or low complement, the hallmark of SLE, cannot be classified as having SLE. Clinical trials of lupus have had to add to their inclusion criteria the requirement for lupus autoantibodies to overcome this deficiency (14).
The SLICC validation exercise serves as the first validation of the SLICC classification criteria (and validates the revised ACR criteria as well) in studies involving the largest, multicenter population sample, since the initial conception of the ACR classification criteria for SLE. It is important to emphasize that the 1997 revision of the ACR criteria was never validated. The ACR criteria continue to perform well compared to the current physician diagnosis gold standard, but do not include the updated and more inclusive definitions of variables of the SLICC criteria. The SLICC Classification Criteria provide alternative classification criteria for use in SLE clinical care and research. The validated SLICC Classification Criteria have gained in face validity over the revised ACR criteria and are more in line with advancing concepts of SLE pathogenesis. It should be noted that, as with the original revised ACR criteria, they have not been tested for purposes of diagnosis. SLICC concludes that the new criteria retain the goal of simplicity of use, yet reflect current knowledge of SLE obtained in the 29 years since the initial ACR criteria.
The authors thank RDL laboratories (Los Angeles, CA) for performing the laboratory tests and INOVA (San Diego, CA) for donating assay kits.
Grant Support: Supported by NIAMS (R01AR043727) and Lupus Foundation of America. Also supported by an unrestricted Research Grant from Human Genome Sciences. Dr. Ana-Maria Orbai is supported by NIH grant T32 AR048522.
The study was approved by institutional review boards at all institutions involved and all patients provided written informed consent.