In this paper, we have introduced and described the development of the CAT, a new tool for the assessment of skin disease activity and damage in patients with IIM. This tool fills an important gap in the assessment of both children and adults with these diseases, providing a method to comprehensively and semi-quantitatively assess the cutaneous manifestations which are an important aspect of these illnesses. We have provided data concerning reliability of individual items in the CAT and the CAT activity and damage scores. We have also used the CAT to document the frequency with which various cutaneous manifestations are seen in a prevalent population of juvenile DM with varying degrees of disease activity and damage.
The scoring of the CAT and the relative weighting of individual items and their severities were determined through the consensus expert opinion of the investigators. It is not known whether the scoring method used was the optimal one for this tool. However, this issue is not likely to have had a large impact on our results, because the same scoring system was applied to all CATs completed by participating investigators. In addition, weighting of individual items has been shown to have relatively little effect on the performance of measurement tools [16]. Future work will investigate the use of alternate scoring methods.
The CAT was developed to include not only common cutaneous lesions but also less frequent manifestations that are important in the assessment of severity and outcome, such as cutaneous ulceration and panniculitis. Therefore, it was not surprising that there was a wide range in the frequency with which lesions were observed. It was expected that some lesions in this tool would be endorsed rarely, but they have been retained based on their clinical significance.
Children who participated in this study were at varying stages of their illness. This was important to capture aspects of both skin disease activity and damage, but may have affected the reported frequencies of individual lesions. It is likely that a cohort of children studied at disease onset would have higher frequencies of some of the activity lesions, and even less damage, than our population. Disagreements in the assessment of various skin lesions may also have caused the reported frequencies to differ from the actual frequencies.
Our data show that in general, there was considerable agreement as to whether lesions were present or absent. However, the ICC suggested that reliability was not quite as high, being generally in a moderate range. To some extent, this is to be expected because the ICC were calculated based on the actual score assigned to each lesion by the assessor. Thus, assessors could agree on the presence of the lesion, but disagree on its relative severity, resulting in a depressed value for the ICC. Some lesions were also notable for having low ICC (such as non-sun exposed erythema, subcutaneous edema and lipoatrophy). These lower ICC were at least partially an unfortunate effect of using slides for the assessment of the skin lesions. Without being able to closely examine the skin lesions, it may have been difficult to make an accurate assessment regarding activity or damage. The ICC may also have been affected by the relative experience of the assessors, with the severity of rarer lesions potentially being under-estimated by less experienced assessors. Finally, it is possible that the definitions for these lesions need to be improved, or that the levels of severity were not distinctive enough to generate good reliability.
When patients were assessed by two assessors (a pediatric rheumatologist and a dermatologist), good agreement was generally observed. However, some surprising disagreements were seen. It is possible that cutaneous activity changed in some patients during the interval between assessments, which was up to 48 hours. It is also possible that the expertise of assessors in their specialty fields led to differences in perspective on the presence of various lesions and their scoring. For example, in retrospect, it became clear that one of the assessors recorded erythematous lesions on the palms as non-sun exposed erythema, while the other assessor did not.
When the total CAT activity and damage scores were considered, the ICC were 0.71 and 0.81, respectively, representing moderate and good inter-rater reliability. The excellent correlation of the total CAT activity and damage scores with global skin activity and damage further confirmed the usefulness of the CAT activity and damage scores in the assessment of the cutaneous manifestations of DM. Unlike the VAS ratings, however, the CAT provides a consistent approach to weighting and a systematic evaluation of the activity and damage for a wide variety of cutaneous lesions. Completion of the VAS after completion of the tool may have increased the correlations somewhat, as careful consideration of the lesions in the CAT may have influenced how the VAS were completed.
The CAT activity and damage scores appear to be more reliable than the individual items. This is similar to the experience with Manual Muscle Testing (MMT). Jain et al have documented that the total and summary muscle group sub-scores had higher reliability than individual muscle scores, and concluded that acceptable reliability existed only for the total and sub-scores [17]. Despite this concern, MMT has been important in the assessment of muscle strength of myositis patients both in clinical and research contexts [8]. Relatively lower individual item reliability does not invalidate the CAT, as the total CAT activity and damage scores have acceptable reliability.
Although we attempted to perform as careful a study as possible, this work has several limitations. First, the entire skin surface could not be shown in one slide, and so we were unable to adequately assess the performance of the total scores in the slide dataset. Some characteristics, such as induration or erythema, may also be difficult to assess in a slide. However, slides did allow many assessors to examine the exact same lesion, and useful information regarding the performance of this tool was obtained. Second, despite the relatively large number of children examined in this work, some lesions were still represented rarely or not at all. Thus, for some lesions, we did not have enough information to assess the reliability. It is possible that some of those lesions were so rare that they should be deleted from future versions of this tool.
It is important to consider what advantages the CAT may have over other available tools for the assessment of skin disease in the juvenile IIM, in particular the DAS [12]. Like the CAT, the DAS has not been fully validated. We believe that the CAT has some potential advantages over the DAS. First, the CAT seeks to consider the full range of cutaneous features of the juvenile IIM, while the DAS assesses a more limited range of skin lesions. This results in the CAT being somewhat longer and more complex, but may be a reasonable trade-off for comprehensiveness. Second, the CAT assesses both skin disease activity and damage, while the DAS considers only skin disease activity. Further work will need to consider the relative performance and preference of any tools which may become available for the assessment of skin disease in juvenile IIM.
In conclusion, we have introduced a new tool for the assessment of skin disease activity and damage in the IIM. This will allow a systematic and semi-quantitative analysis of both acute and chronic cutaneous changes in the IIM. We have demonstrated that the CAT activity and damage scores have moderate to good reliability, and that assessors generally agree on their ratings of a variety of cutaneous lesions of DM. Future validation of this tool will consider other measurement characteristics, such as construct validity and responsiveness, allowing this tool to be used in both clinical and research contexts.