Clinical care and therapeutic trials in idiopathic inflammatory myopathies (IIM) require accurate and consistent assessment of cutaneous involvement. The Cutaneous Assessment Tool (CAT) was designed to measure skin activity and damage in IIM. We describe the development and inter-rater reliability of the CAT, and the frequency of lesions endorsed in a large population of juvenile IIM patients.
The CAT includes 10 activity, 4 damage and 7 combined lesions. Thirty-two photographic slides depicting IIM skin lesions were assessed by 11 raters. One hundred and twenty-three children were assessed by 11 pediatric rheumatologists at ten centers. Inter-rater reliability was assessed using simple agreements and intra-class correlation coefficients (ICC).
Simple agreements in recognizing lesions as present or absent were generally high (0.5 – 1.0). ICCs for CAT lesions were moderate (0.4 – 0.75) in both slides and real patients. ICCs for the CAT activity and damage scores were 0.71 and 0.81, respectively. CAT activity scores ranged from 0 – 44 (median 7, potential range 0 – 96) and CAT damage scores ranged from 0 – 13 (median 1, potential range 0 – 22). The most common cutaneous lesions endorsed were periungual capillary loop changes (63%), Gottron's papules/sign (53%), heliotrope rash (49%) and malar/facial erythema (49%).
Total CAT activity and damage scores have moderate to good reliability. Assessors generally agree on the presence of a variety of cutaneous lesions. The CAT is a promising, semi-quantitative tool to comprehensively assess skin disease activity and damage in IIM.
The idiopathic inflammatory myopathies (IIM) are a group of serious, multi-system autoimmune illnesses which cause muscle inflammation, leading to weakness, impaired endurance and physical function, and potentially permanent damage to muscle and other tissues. Adult and juvenile dermatomyositis (DM) also have a variety of cutaneous manifestations which are important components in the assessment of both ongoing disease activity and chronic lesions associated with damage. Cutaneous manifestations may also be important in polymyositis (PM) and in myositis associated with another autoimmune disease (overlap myositis) [2, 3].
We have previously shown that overall skin activity, as measured by 10 cm visual analogue scale (VAS), correlates moderately with measures of physical function in children with juvenile DM [4, 5]. It has also been suggested that ongoing skin disease activity in children with juvenile DM reflects an active vasculopathic process. Skin involvement and skin damage in particular can be an important source of morbidity for children with juvenile DM. Calcinosis, poikiloderma vasculare atrophicans or cutaneous scars may be associated with pain or disfigurement, and can negatively impact quality of life and physical function. These findings are reflected in the fact that skin disease activity is considered to be an integral component of the extra-muscular core set outcome for clinical trials in adult and juvenile DM in two independent consensus statements [8, 9]. Clearly, there is a need for a standardized, valid assessment of cutaneous disease, both activity and damage, in patients with myositis.
At present, there are a few options to assess cutaneous disease in these patients. A global 10 cm VAS has been used by some investigators [4, 5]. However, there are several disadvantages to this approach. When a global VAS is used, the assessor must simultaneously consider all of the skin lesions that a patient may have. It is not clear how individual skin lesions should be weighted. Assessors may differ in how they weight individual skin lesions, resulting in increased inter-assessor variability. Also, by combining the assessment of multiple skin lesions, considerable information is lost, including the ability to capture change in individual skin lesions. For example, one rash may improve dramatically, while a different cutaneous lesion may become much worse over time.
Quantitative assessment of skin disease has also been a challenge in other illnesses. For example, the Psoriasis Area and Severity Index (PASI) is widely used to assess psoriatic skin lesions both clinically and in research studies [10, 11]. Unfortunately, the experience gained in the development and use of the PASI is of limited utility in myositis. The PASI seeks to estimate the extent and severity of what are essentially the same psoriatic skin lesions, whereas adult and juvenile DM are characterized by a variety of different skin lesions which may have varying implications for both activity and damage. For this reason, the PASI was not felt to be an adequate model for the assessment of skin disease in patients with myositis.
More recently, the Disease Activity Score (DAS) has been described as an assessment tool for juvenile DM. It consists of a weakness/muscle function component and a cutaneous component. In the skin component, a limited number of characteristics are assessed: presence and severity of erythema, distribution of cutaneous involvement, presence of vasculitic lesions, and the presence and severity of Gottron's papules. Preliminary validation has suggested that overall the tool has good internal consistency and reliability. However, some items of the skin component appeared to have poor agreement between raters.
We developed the Cutaneous Assessment Tool (CAT) as a comprehensive tool to assess the cutaneous manifestations of adult and juvenile IIM that are associated with activity and damage, a need that was not met by those tools previously available. The specific goals of this report are to introduce the CAT and describe its development, to examine the inter-rater reliability and preliminary construct validation of the CAT in children with juvenile IIM, and to determine the endorsement rates of lesions included in the CAT in a large, prevalent cohort of children with juvenile IIM.
The original CAT was developed by an interdisciplinary group which included adult and pediatric rheumatologists and a dermatologist experienced in the assessment of myositis and other autoimmune disorders with cutaneous manifestations. The explicit intention was to include those lesions which the investigators, experts in the assessment of DM, considered to be important in the assessment of skin activity in DM. This original tool assessed 28 lesions, including 16 activity lesions representing the reversible manifestations of IIM, 5 damage lesions representing potentially irreversible residua of previously active disease or medications, and 7 lesions which represented a combination of both activity and damage. The lesions included were consistent with the classification of DM skin lesions described by Sontheimer. The tool was then critiqued by the larger collaborative group of investigators, and by a number of dermatologists experienced in cutaneous autoimmune disorders, resulting in the deletion of 5 lesions (purpura, Raynaud's phenomenon, urticaria, mucinous papules and acanthosis nigricans) and the combining of 4 cutaneous manifestations into 2 lesions (Gottron's papules with Gottron's sign, malar erythema with facial erythema). The final tool assessed in the present study consisted of 21 items, including 10 pure activity lesions, 4 damage lesions and 7 lesions which included both activity and damage. It takes about 15 minutes to complete, depending on the complexity of the patient being assessed and the familiarity of the assessor with using the tool. This tool is available on the journal website as supplementary data and can also be found on the IMACS website (https://dir-apps.niehs.nih.gov/imacs/index.cfm?action=home.main).
Each cutaneous lesion listed in the CAT was defined in content and scoring. Depending on the lesion, there were between 2 (absent/present) and 7 possible responses (e.g. different descriptors of erythema) corresponding to increasing levels of activity or damage. Lesions were weighted by the investigators by assigning a priori scores based on a Delphi consensus expert opinion on the relative importance of individual lesions and their degree of activity or damage. Individual item scores were added to give a total CAT activity score ranging from 0 – 96 and a total CAT damage score ranging from 0 – 20. Higher scores corresponded to greater degrees of activity and/or damage.
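The summation described above can be sketched in a few lines of Python. This is only an illustration: the lesion names are taken from the text, but the point values, the `item_scores` layout and the `cat_totals` helper are hypothetical and are not the CAT's actual weights or scoring sheet.

```python
# Hypothetical scores for one patient: lesion -> (activity points, damage points).
# The point values are invented for illustration; they are NOT the CAT's weights.
item_scores = {
    "heliotrope rash": (3, 1),
    "Gottron's papules/sign": (4, 1),
    "periungual capillary loop changes": (2, 0),
    "calcinosis": (0, 2),
}


def cat_totals(scores):
    """Add the item scores into a total activity score and a total damage score."""
    activity = sum(a for a, _ in scores.values())
    damage = sum(d for _, d in scores.values())
    return activity, damage


activity_total, damage_total = cat_totals(item_scores)
print(activity_total, damage_total)  # 9 4
```

Keeping the two totals separate mirrors the text: a lesion may contribute to activity, to damage, or to both, and higher totals correspond to more activity or damage.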
We also assessed global cutaneous activity and damage with 10 cm VAS, anchored by “no evidence of skin disease activity” or “no evidence of skin disease damage” and “extremely active skin disease” or “extreme skin disease damage”, as well as 0 – 4 point ordinal scales. Higher scores corresponded to higher degrees of skin disease activity or damage.
This project consisted of two phases. In the first phase, each study participant was provided with a training set of 35 mm Kodachrome slides depicting all lesions in the CAT, as well as two test sets of 16 slides each. Slides came from the collections of investigators, including those at the National Institutes of Health, and were chosen based on their ability to depict specific lesions. Slides of patients with both adult and juvenile IIM were used due to the limited availability of pediatric slides to demonstrate all relevant lesions. Reliability data for both sets of slides were initially analyzed separately, but when no differences were observed in the reliability ratings between sets, the results were combined (results not shown). Some slides showed multiple lesions (e.g. a slide of the face with both malar erythema and heliotrope rash), but no slide showed all possible lesions. Some lesions could not be depicted in some slides (e.g. slide showing hands could not depict an oral ulceration).
In the second phase of the project, 123 children with probable or definite juvenile IIM were examined using a standardized evaluation for muscle disease and other aspects of disease activity and damage; 113 had juvenile DM, 6 had juvenile PM and 4 had myositis associated with an underlying connective tissue disease [4, 5, 14]. These children were different from those in the slides used in the first part of the project. This group of children was enrolled consecutively at the participating centers at varied points in their disease courses. At the time of enrollment, they had a median disease duration of 18.5 months (range 0 – 137 months, 25th percentile 6 months, 75th percentile 33 months) and a median global disease activity and damage measured by 10 cm VAS of 2.1 cm (range 0 – 9.7 cm, 25th percentile 0.6 cm, 75th percentile 4.4 cm) and 1.2 cm (range 0 – 10 cm, 25th percentile 0 cm, 75th percentile 1.5 cm) respectively.
For the first phase of this work, the slides were initially independently scored by a group of 30 raters using the CAT (16 pediatric rheumatologists, 6 pediatric rheumatology trainees, 1 dermatologist, 6 dermatology trainees and 1 adult rheumatologist). Prior to using the CAT, a slide atlas defining the cutaneous manifestations of DM and juvenile DM of the CAT was distributed to all assessors. In addition, two lectures based on the CAT were given at meetings of the Juvenile Myositis Disease Activity Collaborative Study Group. Eleven assessors (10 pediatric rheumatologists and 1 dermatologist) attended at least one of these meetings. Assessors who did not attend at least one training session had lower reliability (data not shown). For this reason, results are reported only for those assessors who attended at least one of the training sessions and who also scored both slide sets. The ratings by the dermatologist who developed the CAT were used as the gold standard.
For the second phase of this work, children with juvenile IIM were assessed on one occasion by their usual pediatric rheumatologist using the CAT. There were eleven pediatric rheumatologists at 10 pediatric centers. There were 20 children at one center who were seen both by a pediatric rheumatologist and a dermatologist within 48 hours of each other. Only the pediatric rheumatologist's assessment was included in the analysis of the whole population.
All calculations were performed using the statistical program SAS (Release 8.02, SAS Institute Inc, Cary NC).
For the slide data, all calculations were done only for those slides where the lesion in question could possibly be present and scored. For example, only slides which included the face could show a heliotrope rash, and only slides which included the fingers could show periungual capillary loop changes. This was done to avoid the measurement characteristics being overestimated because the data were enriched by agreements on slides that could not possibly show the lesion in question. Calculations were only done for those lesions where the gold standard assessor (the dermatologist involved in the development of the CAT) identified the lesion in question as being present in at least one slide.
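The effect of restricting the calculation to assessable slides can be illustrated with a small Python sketch. The data are hypothetical (8 slides, one lesion, one rater), and `simple_agreement` is an illustrative helper, not the SAS code used in the study.

```python
def simple_agreement(reference, rater, assessable):
    """Proportion of assessable slides on which the rater matches the
    reference on presence (True) versus absence (False) of one lesion."""
    pairs = [(ref, r) for ref, r, ok in zip(reference, rater, assessable) if ok]
    if not pairs:
        raise ValueError("no assessable slides for this lesion")
    return sum(ref == r for ref, r in pairs) / len(pairs)


# Hypothetical ratings of heliotrope rash on 8 slides.
# Only the first 4 slides show the face; the others cannot show the lesion.
shows_face = [True, True, True, True, False, False, False, False]
reference = [True, True, False, True, False, False, False, False]
rater = [True, False, False, False, False, False, False, False]

restricted = simple_agreement(reference, rater, shows_face)
unrestricted = simple_agreement(reference, rater, [True] * 8)
print(restricted, unrestricted)  # 0.5 0.75
```

In this toy case the four slides that cannot show the lesion are trivial agreements, so including them raises the apparent agreement from 0.5 to 0.75.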
To assess the slide data, two assessments of inter-rater reliability were used. First, a simple agreement was calculated. This was represented as the proportion of assessors who agreed with the gold standard assessor as to whether the lesion in question was present or absent on each slide. Second, the intra-class correlation coefficient (ICC) (form (2,1)) was calculated using the actual scores that each assessor assigned to each lesion. ICCs were not calculated for damage items as these were rarely represented in the slides. Results on simple agreement were presented because this is an intuitive way of assessing reliability and more easily understood by clinicians. However, simple agreement has some disadvantages. Specifically, it is inflated by “chance agreement,” and therefore may overestimate the true agreement. It may be affected by the frequency with which a trait is observed (i.e. if a trait was absent in 90% of patients, simple agreement would be at least 90% by stating all patients were negative). Simple agreement is also unable to consider the extent of agreement or disagreement. For example, two assessors may agree that a lesion is present, but assign a different score for severity. Because of these issues, the ICC is a preferred method of assessing reliability. The ICC relates variation that is attributable to differences between patients to variation that is related to all sources (patients, assessors and error). An ICC of 0.5 would be interpreted as suggesting that 50% of the variation in scores was related to “true” variation between patients, while the remainder was related to differences in assessors and error. Simple agreement and the ICC cannot be directly compared, but simple agreement should suggest a greater degree of reliability than the ICC.
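For readers who want to see how form (2,1) of the ICC relates between-patient variation to total variation, the following Python sketch computes it from the two-way ANOVA mean squares, in the Shrout and Fleiss nomenclature (two-way random effects, absolute agreement, single rater). It is an illustration rather than the SAS procedure used in the study, and the ratings are hypothetical.

```python
def icc_2_1(scores):
    """ICC form (2,1). scores[i][j] is the score rater j gave to target i
    (a slide or a patient). Undefined if every score is identical."""
    n = len(scores)       # number of targets
    k = len(scores[0])    # number of raters
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_raters

    bms = ss_targets / (n - 1)             # between-target mean square
    jms = ss_raters / (k - 1)              # between-rater mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical severity scores for one lesion from two raters on six targets.
# Both raters agree on presence/absence everywhere, but rater B scores higher.
ratings = [[0, 0], [1, 2], [2, 3], [3, 4], [0, 0], [2, 3]]
presence_agreement = sum((a > 0) == (b > 0) for a, b in ratings) / len(ratings)
print(presence_agreement, round(icc_2_1(ratings), 2))
```

In this toy data set simple agreement on presence is perfect, yet the ICC falls below 1 because the raters differ on severity, which is the behaviour described in the paragraph above.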
For those children who had been seen by both a pediatric rheumatologist and a dermatologist, inter-rater reliability of the CAT was assessed using both simple agreements and form (2,1) of the ICC. For the simple agreements, the number of subjects in whom both assessors agreed a given lesion was present or absent was recorded separately. These values were then added and divided by 20 (the maximum number of agreements) to give a proportion of agreement. An ICC was calculated using the actual scores both assessors assigned to each lesion. The Wilcoxon Signed Rank test was used to examine whether the scores between the two assessors differed for each lesion as well as for the CAT activity and damage scores.
We considered ICC greater than 0.90 to demonstrate excellent reliability, 0.75−0.90 to demonstrate good reliability and 0.40−0.75 to demonstrate moderate reliability.
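These cut-offs can be written as a small lookup, sketched here in Python. The text does not say which band a value of exactly 0.75 or 0.90 belongs to, nor does it name the range below 0.40, so the boundary handling and the last label in this hypothetical `reliability_band` helper are assumptions.

```python
def reliability_band(icc):
    """Map an ICC to the interpretive bands used in the text.
    Assumption: boundary values (0.40, 0.75, 0.90) fall in the band they
    bound from below or within, and values under 0.40 get a neutral label."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.40:
        return "moderate"
    return "below moderate"


# The total CAT activity and damage ICCs reported in this paper.
print(reliability_band(0.71), reliability_band(0.81))  # moderate good
```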
For the data derived from actual patients, the number and proportion of children with each lesion described in the CAT were calculated. Descriptive statistics for the CAT activity score and CAT damage score were calculated. Spearman's correlation coefficient was calculated to describe the relationships between the CAT activity and damage scores and the global skin disease activity and damage VAS.
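Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. A minimal pure-Python sketch follows; the score pairs are hypothetical and the helper names are illustrative, not part of the study's SAS analysis.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # average of ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks. Undefined if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical CAT activity scores and global skin activity VAS (cm).
cat_activity = [0, 3, 7, 10, 21, 5]
skin_vas_cm = [0.0, 1.2, 2.0, 4.5, 8.0, 2.6]
print(round(spearman_rho(cat_activity, skin_vas_cm), 2))  # 0.94
```

A rank-based coefficient suits this comparison because the CAT totals and the VAS are on different scales and only a monotonic relationship between them is of interest.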
Inter-rater reliability data for the CAT based on slide data are summarized in Table 1. For those lesions which could be assessed, agreements on whether lesions were present or absent were generally high, with all greater than 0.84. Twelve of 13 ICCs for the active lesions were moderate or higher.
For those juvenile IIM patients seen by two assessors, most lesions had moderate or good ICC (Table 2). No significant differences were found between the two assessors in the scores for individual lesions or the total scores (by Wilcoxon Signed Rank test, P-values ranged from 0.06 – 1.0).
The number and proportion of children in the entire population exhibiting each skin lesion are summarized in Table 3. The cutaneous activity lesions that were most commonly endorsed were periungual capillary loop changes (63%), Gottron's papules/sign with erythematous changes (53%), heliotrope rash (49%) and malar/facial erythema (49%). The most frequently endorsed cutaneous damage lesions were atrophy or hypo/hyperpigmentation in the distribution of Gottron's papules/sign (25%), calcinosis (15%), atrophy or hypo/hyperpigmentation in the distribution of heliotrope rash (11%) and depressed skin scars (11%). The median CAT activity score was 7.0 (interquartile range 2.0 – 10, range 0 – 44) for the whole population. The median CAT damage score was 1.0 (interquartile range 0 – 1, range 0 – 13). The CAT activity score was highly correlated with the global skin disease activity VAS (rs = 0.81, P < 0.0001). The CAT damage score was moderately correlated with the global skin disease damage score VAS (rs = 0.50, P < 0.0001). [Figure 1a, b]
In this paper, we have introduced and described the development of the CAT, a new tool for the assessment of skin disease activity and damage in patients with IIM. This tool fills an important gap in the assessment of both children and adults with these diseases, providing a method to comprehensively and semi-quantitatively assess the cutaneous manifestations which are an important aspect of these illnesses. We have provided data concerning reliability of individual items in the CAT and the CAT activity and damage scores. We have also used the CAT to document the frequency with which various cutaneous manifestations are seen in a prevalent population of juvenile DM with varying degrees of disease activity and damage.
The scoring of the CAT and the relative weighting of individual items and their severities was determined through the consensus expert opinion of the investigators. It is not known if the scoring method used was the optimal one for scoring this tool. However, this issue is not likely to have had a large impact on our results. The same scoring system was applied to all CAT completed by participating investigators. As well, weighting of individual items has been shown to have relatively little effect on the performance of measurement tools. Future work will investigate the use of alternate scoring methods.
The CAT was developed to include not only common cutaneous lesions, but also less frequent manifestations that are important in the assessment of severity and outcome, such as cutaneous ulceration and panniculitis. Therefore, it was not surprising that there was a wide range in the frequency with which lesions were observed. It was expected that some lesions in this tool would be endorsed rarely, but they have been retained based on their clinical significance.
Children who participated in this study were at varying stages of their illness. This was important to capture aspects of both skin disease activity and damage, but may have affected the reported frequencies of individual lesions. It is likely that a cohort of children being studied at disease onset would have higher frequencies of some of the activity lesions, but even less damage than our population. Disagreements in the assessment of various skin lesions may have also resulted in the reported frequencies differing from the actual frequencies.
Our data show that in general, there was considerable agreement as to whether lesions were present or absent. However, the ICCs appeared to suggest that reliability was not quite as high, being generally in a moderate range. To some extent, this is to be expected because the ICCs were calculated based on the actual score assigned to each lesion by the assessor. Thus, assessors could agree on the presence of the lesion, but disagree on its relative severity, resulting in a depressed value for the ICC. Some lesions were also notable for having low ICCs (such as non-sun exposed erythema, subcutaneous edema and lipoatrophy). These lower ICCs were at least partially an unfortunate effect of using slides for the assessment of the skin lesions. Without being able to closely examine the skin lesions, it may have been difficult to make an accurate assessment regarding activity or damage. The ICCs may have also been impacted by the relative experience of the assessors, with the severity of rarer lesions being potentially under-estimated by less experienced assessors. Finally, it is possible that the definitions for these lesions need to be improved, or that the levels of severity were not distinctive enough to generate good reliability.
When patients were assessed by two assessors (a pediatric rheumatologist and a dermatologist), good agreement was generally observed. However, sometimes surprising disagreement was seen. It is possible that the time between assessments, which was up to 48 hours, may have resulted in some patients changing their cutaneous activity. It is also possible that the expertise of assessors in their specialty fields led to differences in perspective on the presence of various lesions and their scoring. For example, in retrospect, it became clear that one of the assessors recorded erythematous lesions on the palms as non-sun exposed erythema, while the other assessor did not.
When the total CAT activity and damage scores were considered, the ICC were 0.71 and 0.81 respectively, representing moderate and good inter-rater reliability respectively. The excellent correlation of the total CAT activity and damage scores with global skin activity and damage further confirmed the usefulness of the CAT activity and damage scores in the assessment of the cutaneous manifestations of DM. Unlike the VAS ratings, however, the CAT provides a consistent approach to weighting and a systematic evaluation of the activity and damage for a wide variety of cutaneous lesions. Completion of the VAS after completion of the tool may have acted to increase the correlations somewhat, as careful consideration of the lesions in the CAT may have influenced how the VAS were completed.
The CAT activity and damage scores appear to be more reliable than the individual items. This is similar to the experience with Manual Muscle Testing (MMT). Jain et al have documented that the total and summary muscle group sub-scores had higher reliability than individual muscle scores, and concluded that acceptable reliability existed only for the total and sub-scores. Despite this concern, MMT has been important in the assessment of muscle strength of myositis patients both in clinical and research contexts [8, 9]. Relatively lower individual item reliability does not invalidate the CAT, as the total CAT activity and damage scores have acceptable reliability.
Although we attempted to perform as careful a study as possible, this work has several limitations. First, the entire skin surface could not be shown in one slide, and so we were unable to adequately assess the performance of the total scores in the slide dataset. Some characteristics, such as induration or erythema, may also be difficult to assess in a slide. However, slides did allow many assessors to examine the exact same lesion, and useful information regarding the performance of this tool was obtained. Second, despite the relatively large number of children examined in this work, some lesions were still represented rarely or not at all. Thus, for some lesions, we did not have enough information to assess the reliability. It is possible that some of those lesions were so rare that they should be deleted from future versions of this tool.
It is important to consider what advantages the CAT may have over other available tools for the assessment of skin disease in the juvenile IIM, in particular the DAS. Like the CAT, the DAS has not been fully validated. We believe that the CAT has some potential advantages over the DAS. First, the CAT seeks to consider the full range of cutaneous features of the juvenile IIM, while the DAS assesses a more limited range of skin lesions. This results in the CAT being somewhat longer and more complex, but may be a reasonable trade-off for comprehensiveness. Second, the CAT assesses both skin disease activity and damage, while the DAS considers only skin disease activity. Further work will need to consider the relative performance and preference of any tools which may become available for the assessment of skin disease in juvenile IIM.
In conclusion, we have introduced a new tool for the assessment of skin disease activity and damage in the IIM. This will allow a systematic and semi-quantitative analysis of both acute and chronic cutaneous changes in the IIM. We have demonstrated that the CAT activity and damage scores have moderate to good reliability, and that assessors generally agree on their ratings of a variety of cutaneous lesions of DM. Future validation of this tool will consider other measurement characteristics, such as construct validity and responsiveness, allowing this tool to be used in both clinical and research contexts.
We thank Drs. Maria Turner, Richard Sontheimer, Joseph Jorizzo, and William James for helpful feedback during the development of the Cutaneous Assessment Tool. We thank Drs. Maria Turner and Edward Cowen for valuable feedback on the manuscript.
Sources of support for this work. This research was supported by the intramural research programs of the National Institute of Environmental Health Sciences and the National Institute of Arthritis and Skin and Musculoskeletal Diseases, National Institutes of Health, DHHS, Bethesda, MD.
Contributing members of the Juvenile Dermatomyositis Disease Activity Collaborative Group who participated in this work: Drs. Robert Colbert, Jaime DeInocencio, Thomas Griffin, Philip Hashkes, Raphael Hirsch, Deborah Kredich, Ronald Laxer, Joseph Levinson, Daniel Lovell, Nicola Ruperto, Earl Silverman, Robert Sundel, Scott Vogelgesang, and the dermatology residents of Walter Reed Army Medical Center.
Conflict of Interest Statement The authors of this work have no conflicts of interest to declare.
Key Messages 1. Cutaneous manifestations in the juvenile idiopathic inflammatory myopathies are an important component of disease activity and damage.
2. The Cutaneous Assessment Tool allows skin disease to be measured in a way that is systematic, semi-quantitative and reliable.