HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal

J Invest Dermatol. Author manuscript; available in PMC 2013 August 24.
Published in final edited form as:
PMCID: PMC3752293
NIHMSID: NIHMS474866

Evaluation of reliability, validity, and responsiveness of the CDASI and the CAT-BM

Abstract

Background

To properly evaluate therapies for cutaneous dermatomyositis (DM), it is essential to administer an outcome instrument that is reliable, valid, and responsive to clinical change, particularly when measuring disease activity.

Objective

The purpose of this study is to compare two skin-severity DM outcome measures, the Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI) and the Cutaneous Assessment Tool-Binary Method (CAT-BM), with the physician global assessment (PGA) as the ‘gold standard’.

Methods

Ten dermatologists evaluated fourteen patients with DM using the CDASI, CAT-BM, and PGA scales. Inter- and intra-rater reliability, validity, responsiveness, and completion time were compared for each outcome instrument. Responsiveness was assessed in a different study population, in which one physician evaluated 35 patients over 110 visits.

Results

The CDASI was found to have higher inter- and intra-rater reliability than the CAT-BM. Regarding construct validity, both the CDASI and the CAT-BM were significant predictors of the PGA scales. The CDASI had the best responsiveness among the three outcome instruments examined. The CDASI had a statistically significantly longer completion time than the CAT-BM, by about 1.5 minutes.

Limitations

The small patient population may limit the external validity of the findings observed.

Conclusions

The CDASI is a better clinical tool to assess skin severity in DM.

Keywords: Dermatomyositis, CDASI, CAT-BM, Autoimmune Disease, Outcome Instrument, Reliability, Validity, Responsiveness

Introduction

Dermatomyositis (DM) is a chronic systemic autoimmune disease categorized among the idiopathic inflammatory myopathies (Dugan et al., 2009). DM is often associated with extramuscular and extracutaneous pathology, with involvement of the joints, heart (cardiomyopathy and conduction defects), and lungs (Lorizzo et al., 2008). The most widely accepted classification criteria for DM have traditionally emphasized the importance of clinical, laboratory, histopathologic, or electrophysiological evidence of muscle inflammation for making the diagnosis (Bohan et al., 1975). Subtypes of dermatomyositis, amyopathic and hypomyopathic dermatomyositis, have been described for patients with no or minor muscle findings, respectively (Gerami et al., 2006).

Characteristic inflammatory skin changes are seen in a large majority of individuals with DM (Callen et al., 2006). Nevertheless, the cutaneous manifestations of DM are among the least systematically studied aspects of the disease. This has resulted in part from the lack of validated tools to reliably determine the activity of the cutaneous manifestations of DM, especially relative to other dermatologic diseases such as psoriasis and atopic dermatitis, where disease-specific skin severity outcome instruments have been used extensively (Gaines et al., 2008; Feldman, 2005; Kunz et al., 1997; Mrowietz, 2006). The FDA has developed guidelines for researchers on how to measure clinical response through measuring disease activity, disease-induced damage, the response as determined by the patient, and health-related quality of life (Gaines et al., 2008; Concept paper, pamphlet, 2003). From these guidelines, researchers must develop an outcome instrument that will capture appropriate elements of the disease to determine clinical response. Currently, effective treatments for the cutaneous manifestations of dermatomyositis are limited. There are a number of new biological therapies that may be beneficial for patients with DM (Lorizzo et al., 2008). There is a critical need to develop optimal validated instruments to quantify organ-specific disease activity, so that the efficacy of medications can be methodically and quantitatively evaluated.

We have previously validated a cutaneous severity outcome instrument, the Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI), and have shown that it may be a more effective and reliable tool compared to other outcome measures, namely the Dermatomyositis Skin Severity Index (DSSI), and the Cutaneous Assessment Tool (CAT) (Klein et al., 2008). In order to further simplify the CDASI, we have revised the original CDASI and have shown that the modified version correlates almost perfectly with the original CDASI (Yassaee, in press). The CAT was originally developed with similar goals to the CDASI and was found to have appropriate reliability, construct validity, and responsiveness in the juvenile dermatomyositis population (Huber et al., 2008 and 2007). Recently, the CAT has also been simplified, and has been validated in the juvenile population (Huber, Lachenbruch, et al., 2008). The modified versions of the CAT, named CAT-Binary Method (CAT-BM) and CAT-Maximum Method (CAT-MM), stem from an alternative scoring method of the CAT. The CAT-BM has been shown to correlate almost perfectly to the original CAT (Huber et al., 2008). There have yet to be any studies comparing the modified CDASI and the CAT-BM for use in longitudinal clinical research.

The current study evaluates and compares the modified tools, with a goal to provide partial validation of each tool for use in the adult DM population and to determine the optimal effective research tool for measuring the severity of cutaneous disease in adult DM. The goal is to establish an appropriate tool for evaluating DM within and between studies to evaluate therapeutic responses most effectively.

Results

Distribution of Scores

CDASI Total and CAT-BM Total scores had a normal distribution with scores ranging from 1-72 and 1-20, respectively (CDASI Total: Mean 24.25 +/- 14.67; CAT-BM Total: Mean 9.24 +/- 4.17).

Inter-rater Reliability

Inter-rater reliability was assessed by determining the agreement among the ten physician raters' scores on the CDASI and on the CAT-BM. The CDASI was found to have good inter-rater reliability for activity and total scores and moderate inter-rater reliability for damage scores, meaning that physicians' scores were in good accordance with one another for activity and total scores and in moderate accordance for damage scores. In contrast, the CAT-BM was found to have moderate inter-rater reliability for activity scores and poor inter-rater reliability for damage and total scores. The CDASI had the best inter-rater reliability overall when compared to the CAT-BM and PGA scales (Activity: CDASI 0.748, CAT-BM 0.563, PGA Activity 0.721, PGA Activity Likert 0.653; Damage: CDASI 0.563, CAT-BM 0.340, PGA Damage 0.506, PGA Damage Likert 0.542; Total: CDASI 0.726, CAT-BM 0.432, PGA Overall 0.632, PGA Overall Likert 0.694) (Table 1).

Table 1
Assessment of Inter-rater reliability

Intra-rater Reliability

Intra-rater reliability measures the degree of agreement of multiple outcome scores performed by a single physician. It was assessed by determining the agreement between initial and repeat scores, using the ICC, for each outcome instrument, as well as by determining the significance of the difference between mean initial scores and mean repeat scores for each outcome instrument. The CDASI was found to have almost perfect intra-rater reliability for activity and total scores and good intra-rater reliability for damage scores (ICC: Activity 0.868; Damage 0.800; Total 0.903). No significant difference between mean initial and mean repeat activity, damage, and total scores was found (Mean difference: Activity 0.00, p=1.00; Damage 0.40, p=0.728; Total -0.40, p=0.541). The CAT-BM was found to have good intra-rater reliability for activity, damage, and total scores (ICC: Activity 0.714; Damage 0.792; Total 0.800). No significant difference between mean initial and mean repeat activity, damage, and total scores was found (Mean difference: Activity 0.2, p=0.713; Damage 0.35, p=0.496; Total -0.15, p=0.634). PGA scales were found to have almost perfect intra-rater reliability in all assessments except for PGA Activity Likert and PGA Damage Likert (ICC 0.737 and 0.708, respectively). There was also a significant difference between initial and repeat mean scores for PGA Overall and PGA Activity Likert (Mean difference: PGA Overall 0.63, p=0.019; PGA Activity Likert -0.24, p=0.021) (Table 2).

Table 2
Intra-rater reliability – ICC and mean differences of initial and repeat scores

Construct Validity

Validity was assessed for the CDASI and the CAT-BM by using a linear mixed model. Both the CDASI and the CAT-BM were found to be significant predictors of the ‘gold standard’ PGA scales, using both the VAS and the Likert scale (all p≤0.001 among total, activity, and damage scores) (Table 3).

Table 3
Assessment of construct validity between the CDASI and the CAT-BM

As another means to assess construct validity and linearity, CDASI and CAT-BM scores were grouped by Likert scores. All CDASI and CAT-BM mean scores (Total, Activity, and Damage) showed statistically significant differences when grouped by Likert scores (all p values ≤ 0.001) (Table 4), reaffirming that both tools are good predictors of the Likert PGA scales. Furthermore, both the CDASI and the CAT-BM showed a significant, near-perfect fit for linearity, with all coefficient of determination (r²) values ≥ 0.947 (highest p=0.026).

Table 4
Determination of differences among CDASI and CAT-BM Scores when grouped by Likert score with Linear Trend of Means
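
The grouping analysis summarized in Table 4 can be illustrated with a small sketch. The Python below computes a one-way ANOVA F statistic across Likert groups and the r² of a straight-line fit through the group means. The scores are hypothetical values invented for illustration, not study data, and the function names are assumptions. P-values are omitted because they require an F-distribution routine from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; `groups` is a list of lists of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def r_squared_of_means(levels, group_means):
    """Coefficient of determination for a least-squares line through the means."""
    mx, my = mean(levels), mean(group_means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(levels, group_means))
    sxx = sum((x - mx) ** 2 for x in levels)
    syy = sum((y - my) ** 2 for y in group_means)
    return sxy ** 2 / (sxx * syy)


# Hypothetical instrument scores grouped by Likert PGA score (0-4).
scores_by_likert = {
    0: [2, 4, 3],
    1: [10, 12, 9],
    2: [20, 24, 22],
    3: [33, 30, 35],
    4: [45, 41, 44],
}

levels = sorted(scores_by_likert)
groups = [scores_by_likert[level] for level in levels]
f_stat = one_way_anova_f(groups)
r2 = r_squared_of_means(levels, [mean(g) for g in groups])
print(f"F = {f_stat:.2f}, r^2 of group means = {r2:.3f}")
```

Fitting the line to group means rather than to individual scores mirrors the "Linear Trend of Means" wording of the table caption; an r² computed on individual scores would generally be lower.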

Content Validity

All physicians felt that the CDASI was complete, though one physician noted that it may be useful to have a mechanism to capture lipoatrophy from panniculitis in patients. 9/10 physicians felt that the CAT-BM was complete. One physician felt that the CAT-BM did not adequately assess the scalp.

Responsiveness

Responsiveness was measured by using the SRM, defined as the ratio of the mean of the differences (i.e. CDASI and CAT-BM scores before and after a clinical change was noted) between two time points to the standard deviation of the differences. The CDASI had the highest SRM among outcome instruments (SRM: CDASI 1.25; CAT-BM 0.93; PGA Activity 1.03; PGA Activity Likert 0.61). Of the two skin-severity instruments, only the CDASI had an SRM > 1, indicating that its mean change between visits was greater than the standard deviation of the change between visits. As mentioned above, the CDASI had the highest intra-rater reliability among all compared outcome instruments (Table 2).

Completion Time

The CDASI had a statistically significantly longer completion time than the CAT-BM (Completion Time: CDASI 4.76 minutes; CAT-BM 3.19 minutes; p<0.001), with a mean time difference of 1.58 minutes (95% Confidence Interval: 1.18 minutes – 1.97 minutes).

Physician Exit Questionnaire

6/10 physicians felt that the CDASI would be more easily incorporated in a clinical setting than the CAT-BM. Those who preferred the CDASI mentioned the likelihood that it would be a more effective instrument to assess responsiveness, as well as the order in which the anatomical locations were organized. In contrast, those who preferred the CAT-BM stated that it was a quicker instrument to complete. 6/10 physicians felt that the CAT-BM was less difficult to use. Those who preferred the CAT-BM mentioned it was quicker to complete, whereas those who preferred the CDASI stated that the CAT-BM was “poorly organized” and that they would need to “jump around” while completing it. 10/10 physicians felt that the CDASI was a better instrument to grade skin severity and improvement over time. Physicians commented that the CDASI measures the “degree of intensity of an eruption” whereas a “binary [method] won't be helpful in estimating response to treatment” and would “need to have complete resolution to capture change.” Furthermore, one physician commented that the CAT-BM included livedo reticularis in its scoring, which “would not be expected to improve with most therapy.”

Discussion

Validated outcome measures play an important role in standardizing patient care and in developing reliable clinical trials by objectively measuring the severity of disease. The scientific method states the importance of attaining reproducible results. An outcome measure, therefore, must also be reproducible in order to adequately function in future clinical trials. Reliability, which measures reproducibility, is therefore clearly important and is necessary for attaining validity (Klein et al., 2008; Downing, 2004). ICC values were compared via the method described by Steel et al. (1997). Though post-hoc power analysis showed that the difference in ICC scores did not reach statistical significance, there is a trend that the CDASI has good inter-rater reliability with regard to its Activity and Total measurements while the CAT-BM has moderate and poor inter-rater reliability for its Activity and Total measurements, respectively (Table 1). Likely, the nature of the instruments lends the CDASI to having a higher inter-rater reliability even though the CAT-BM is a binary instrument. For example, one item on the CAT-BM that was seen to have a large standard deviation among raters was the item scoring the presence of non-sun exposed erythema. Since the CDASI has five to six items that would qualify as non-sun exposed erythema, in addition to a larger number of items contributing to the activity score, it lends itself to having an intrinsically high inter-rater reliability, since one disagreement among physicians would have less of an impact on the overall reliability than in the CAT-BM. Additionally, it is also possible that since the CDASI specifically goes through all anatomical parts, it gives more “pressure” to the rater to look through all the parts more efficiently than in the CAT-BM. Thirdly, the ambiguity of certain question items in the CAT-BM may have contributed to a lower reliability. For example, the items scoring the presence of cuticular overgrowth or subcutaneous edema were seen to have a large standard deviation among raters. Although the CDASI may not be a binary system, the measures of activity that it scores (erythema, scale, and erosions) are defined more clearly among physicians than certain measures of activity in the CAT-BM. Notably, the initial study exploring the CAT-BM (Huber, Lachenbruch, et al., 2008) reports an inter-rater reliability ICC of 0.6 (95% CI 0.06-0.83) among activity scores, in contrast to our reported value of 0.34. Although our value of 0.34 lies within the 95% CI, making statistical variability the most likely cause of the difference, the differing patient populations between the studies (adult vs. juvenile) may have also played a role.

Interestingly, inter-rater reliability of damage measurements was lower in the CDASI, the CAT-BM, and the PGA scales (Table 1, ICC: CDASI Damage 0.563; CAT-BM 0.340; PGA Damage 0.506, PGA Damage Likert 0.542). This is consistent with other outcome instruments that contain a damage subscore, such as the CAT and the previous version of the CDASI, suggesting that physicians have difficulty agreeing with one another in their assessment of damage. It was noted in the physician training session that the concept of poikiloderma varied among physicians. Additionally, in a previous study, agreement on physicians' perception of poikiloderma was poor as well (Klein et al., 2008). Poikiloderma accounts for almost half, less than 10%, and theoretically 100% of the maximum damage score in the CDASI, the CAT-BM, and the PGA Damage scales, respectively. This suggests that there is another factor, perhaps an inherent limitation of the outcome measure, explaining the poor, and lower, inter-rater reliability of the CAT-BM when compared to the CDASI or PGA Damage scales.

The intra-rater reliability of the CDASI was almost perfect in activity and total scores and good across damage scores. The CAT-BM had a lower intra-rater reliability across activity, damage, and total scores, with good intra-rater reliability in all realms (Table 2). Although this shows a trend that the CDASI has a better intra-rater reliability, post-hoc power analysis showed that the difference did not reach statistical significance.

Although an outcome instrument may be reliable, if it does not have adequate construct validity, or the ability to measure what it has been designed to measure effectively, then its usefulness is limited. Both the CDASI and the CAT-BM were shown to be significant predictors of the PGA scales, which are the ‘gold standard’, and thus to have good construct validity. While both the CDASI and the CAT-BM were found to have good content validity as stated above, a physician noted that the CAT-BM did not sufficiently assess scalp disease, which can be very troublesome for patients and is found in over 80% of the DM population (Tilstra et al., 2009; Kasteler, 1994).

It is also important for an outcome instrument to be able to capture the disease state of patients at the extremes of disease. This is particularly important in patients with extreme disease activity. In this study, the maximum CDASI Activity and CAT-BM Activity scores reached were 61 (61% of the maximum activity score of 100) and 14 (82% of the maximum activity score of 17), respectively. This suggests that the CAT-BM may be more prone to reach its maximum limit faster than the CDASI and therefore may not be able to capture differences in disease activity in more severe patients.

To implement an outcome instrument for use in clinical trials, it is essential that it be able to measure change in disease severity. The CDASI had the best responsiveness when compared to the CAT-BM and PGA scales. Furthermore, all physicians anticipated that the CDASI would be a more effective response tool than the CAT-BM. This was not a surprising result: many of the physician rater comments predicted that the CAT-BM would have this limitation, as it only documents the presence or absence of a certain measure whereas the CDASI documents the degree of severity of a certain measure.

Another important factor when comparing outcome instruments is completion time. Even a tool that is reliable and valid would not be practical in a clinical research setting if it takes too long to complete. Although the CAT-BM took significantly less time to complete than the CDASI (Mean Completion Time: CAT-BM 3.19 minutes; CDASI 4.76 minutes; p<0.001), the mean difference in completion time was about 90 seconds and may not be practically relevant.

There were limitations to the study. Firstly, as the patient population was relatively small, the external validity of our findings may be limited. Secondly, the relatively small patient population may have allowed the physician raters to recall how they evaluated a patient when completing their repeat evaluation. This could potentially raise the intra-rater reliability above its true value. To minimize this impact, physicians were asked to perform their repeat evaluation on a patient they had evaluated during the morning session, thus reducing the likelihood of recall. Thirdly, as the study session lasted about 7 hours, it is possible that the physicians experienced fatigue that may have impacted their patient evaluations. This was minimized by offering snacks and lunch during the day and allowing physicians to rate patients at their own pace. Fourthly, five of the ten physician participants had used both the CAT and the original version of the CDASI previously, which may have falsely elevated the reliability and validity scores of both instruments, since these physicians had increased familiarity with both instruments. Regardless of the limitations above, we can conclude that the CDASI appears to be a more effective tool than the CAT-BM in evaluating cutaneous severity in DM.

Methods

This study was approved by the local IRB. The Declaration of Helsinki protocols were adhered to, and physician and patient participants gave their written, informed consent prior to study initiation.

Physician Participants

Ten dermatology-boarded physicians were invited to participate in the one-day study at the Hospital of the University of Pennsylvania. Physicians were given the CDASI and the CAT-BM as well as corresponding literature prior to the study session day so that they could better familiarize themselves with the tools. On the study session day, prior to initiating the study, the physicians were given a training session with visual examples in order to score all study instruments correctly. Adequate time was given to the physicians to address any questions and/or clarifications they may have had regarding the outcome instruments.

Patient Participants

Fourteen patients with clinical and/or pathological evidence of DM were invited to participate in the study at the Hospital of the University of Pennsylvania. Patients represented a wide spectrum of disease. The patient population consisted of fourteen Caucasians, 3 males and 11 females, with varying degrees of muscle and cutaneous involvement (noted to have PGA Activity scores ranging from 0 to 9.3 with a mean of 3.2 +/- 2.8, PGA Damage scores ranging from 0 to 9.4 with a mean of 2.8 +/- 2.6, and PGA Overall scores ranging from 0.2 to 9.2 with a mean of 3.4 +/- 2.5). Average age of participants was 53 +/- 16 years. Average duration of disease among patients was not recorded.

Study Design

The study day was divided into Session 1 and Session 2. Each physician was given a randomized number from 1-10 and consequently a folder corresponding to their number. Based on the assigned number, physicians were divided into two groups of five physicians, Group 1Ph and Group 2Ph. One physician group received folders with packets of each outcome instrument in the order of CDASI, CAT-BM, and PGA scales for Session 1 and in the order of CAT-BM, CDASI, and PGA scales for Session 2. The remaining physician group received folders with the reverse order of packets (i.e. CAT-BM, CDASI, and PGA scales for Session 1). All folders from both physician groups also contained two packets of each outcome instrument for re-rates. All physicians evaluated fourteen patients. All physicians also re-evaluated two patients. At the end of the study session, physicians were given an exit questionnaire consisting of seven questions, each of which included a short-answer part and four of which also included a multiple-choice part. Patients were randomized and divided into two groups, Group 1P, consisting of 8 patients, and Group 2P, consisting of 6 patients. During Session 1, Group 1Ph evaluated Group 1P and Group 2Ph evaluated Group 2P. During Session 2, Group 1Ph evaluated Group 2P and Group 2Ph evaluated Group 1P. No more than one physician was permitted per patient encounter at any time.
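
The crossover allocation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the assigned numbers mapped to Group 1Ph and Group 2Ph, nor which group received which instrument order, so the split and the order assignment below are hypothetical, as are the function and variable names.

```python
import random

# Instrument orders named in the text; which physician group received which
# order in Session 1 is an assumption made for this sketch.
ORDER_A = ["CDASI", "CAT-BM", "PGA"]
ORDER_B = ["CAT-BM", "CDASI", "PGA"]


def build_schedule(seed=0):
    """Return a per-physician plan for Session 1 and Session 2."""
    rng = random.Random(seed)
    physicians = list(range(1, 11))   # ten physician raters
    patients = list(range(1, 15))     # fourteen patients
    rng.shuffle(physicians)
    rng.shuffle(patients)

    group_1ph = physicians[:5]                        # remaining five form Group 2Ph
    group_1p, group_2p = patients[:8], patients[8:]   # 8 and 6 patients

    schedule = {}
    for physician in physicians:
        if physician in group_1ph:
            first, second = group_1p, group_2p
            order_1, order_2 = ORDER_A, ORDER_B
        else:
            first, second = group_2p, group_1p
            order_1, order_2 = ORDER_B, ORDER_A
        schedule[physician] = {
            "session_1": {"patients": list(first), "order": order_1},
            "session_2": {"patients": list(second), "order": order_2},
        }
    return schedule


schedule = build_schedule(seed=1)
for physician in sorted(schedule):
    plan = schedule[physician]
    print(physician, plan["session_1"]["order"], len(plan["session_1"]["patients"]))
```

Swapping the patient groups between sessions means every physician sees all fourteen patients, and reversing the instrument order between sessions counterbalances any order effect between the CDASI and the CAT-BM.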

CDASI

The CDASI is a one-page, partially validated outcome instrument used to determine the severity of cutaneous disease specific to DM. Total scores range from 0-132. Scores are divided into activity and damage, with scores ranging from 0-100 and 0-32, respectively. Neither activity nor damage is scored by percentage of body surface area involvement. Disease activity is assessed by the degree of erythema, scale, and the presence of erosions or ulceration in 15 different anatomical locations. Disease damage is assessed by the presence of poikiloderma or calcinosis in the 15 different anatomical locations. Periungual changes are scored from 0-2, with zero indicating no periungual changes, one indicating periungual erythema, and two indicating visible telangiectasias. Alopecia scores range from 0-1, with zero indicating no alopecia in the last 30 days and one indicating presence of alopecia in the last 30 days. Gottron's sign on the knuckles is assessed similarly to the erythema scale used in other anatomical locations. When Gottron's papules are present, the erythema score obtained on the knuckles is doubled.

CAT-BM

The CAT-BM is a one-page, validated outcome instrument with normally distributed scores, derived from an alternative scoring method of the CAT, that is used to determine the severity of cutaneous disease in DM. Total scores range from 0-28, with 0-17 for activity and 0-11 for damage. Neither activity nor damage is scored by percentage of body surface area involvement. Activity scores are based on the presence of erythema in 7 different anatomic areas and the presence of other characteristic DM lesions. Secondary changes such as scale, erosions, or necrosis are not captured. Disease damage is scored by the presence of atrophy or dyspigmentation without erythema in the same seven anatomic areas, as well as the presence of poikiloderma, calcinosis, lipoatrophy, or a depressed scar anywhere on the body.

Assessment of inter- and intra-rater reliability

To assess intra-rater reliability, after a physician participant had completed all patient encounters, they were asked to re-evaluate two patients whom they had seen during the morning session (to minimize physician recollection of scoring). All physicians re-rated two patients. Though physicians arbitrarily decided which patient to re-rate based on patient availability, it was ensured that no patient would be re-rated more than twice. Inter-rater reliability was used to assess accordance of scores among physicians and was determined from the ten physicians who evaluated all fourteen patients. Physicians also recorded the time to complete each instrument for each patient encounter.

Validation Measures

In order to assess and compare validity among different outcome instruments, three validation measures were used: 1) the Overall Skin-Physician Global Assessment (PGA Overall), 2) the Skin Activity-Physician Global Assessment (PGA Activity), and 3) the Skin Damage-Physician Global Assessment (PGA Damage). Scores were captured using visual analogue scales (VAS) and Likert scales. The VAS is a continuous scale ranging from 0-10 where 10 represents extremely active disease. The Likert Scale ranges from 0-4, where 4 represents extremely severe disease.

Assessment of Validity

Specifically, convergent construct validity was determined by comparing the Skin Activity-PGA to the activity subscore of the outcome instruments, comparing the Skin Damage-PGA to the damage subscore of the outcome instruments, and comparing the Overall Skin-PGA to the overall score of the outcome instruments. Convergent construct validity refers to the degree to which one measure (i.e. the CDASI or the CAT-BM) correlates with another measure (i.e. the corresponding PGA) that it theoretically should correlate with. The PGAs were also used to determine if either of the outcome instruments was skewed in any direction, which could potentially limit its usefulness in longitudinal studies. Content validity was determined by administering the Physician Exit Questionnaire, which includes the question, “Was there any information missing from any of the measures that you feel should be added?”

Responsiveness

Responsiveness was assessed from prospective visit data collected separately from the inter-rater and intra-rater validation study. This included assessments of the CDASI, CAT-BM, and PGA scale scores, as well as an overall evaluation from the physician as to whether the patient had improved, worsened, or had no change from their previous research visit. 35 patients with a cumulative 110 visits were obtained from this data source. There were 27 visits in which a clinical change was noted. The largest clinical change per patient, defined as the largest difference in the PGA-Activity score between two consecutive visits, was included in the analysis. The standardized response mean (SRM) was used to determine responsiveness for the CDASI and the CAT-BM. The SRM measures the ratio of the mean of the differences (i.e. CDASI and CAT-BM scores before and after a clinical change was noted) between two time points to the standard deviation of the differences. The absolute mean change was used between visits to account for improvement and worsening of disease. This approach has been used in the past (Ruperto et al., 2010; Beaton et al., 1997).
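
As a minimal sketch of the SRM calculation, the Python below divides the mean of the paired score changes by their sample standard deviation. The phrase "absolute mean change" can be read in more than one way; this sketch assumes that the absolute value of each patient's change is taken before averaging, and exposes a flag for the alternative reading (absolute value of the mean signed change). The function name, the flag, and the scores are hypothetical and are not study data.

```python
from statistics import mean, stdev


def standardized_response_mean(before, after, absolute_changes=True):
    """SRM = mean of paired changes / sample SD of those changes.

    absolute_changes=True  : use |after - before| per patient (assumed reading).
    absolute_changes=False : use signed changes, then take |mean| / SD.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need at least two paired observations")
    changes = [a - b for b, a in zip(before, after)]
    if absolute_changes:
        changes = [abs(c) for c in changes]
    sd = stdev(changes)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("SRM is undefined when all changes are identical")
    return abs(mean(changes)) / sd


# Hypothetical instrument scores at the two visits bracketing each patient's
# largest clinical change.
before = [30, 22, 18, 40, 25]
after = [20, 15, 20, 28, 19]

print(round(standardized_response_mean(before, after), 2))
print(round(standardized_response_mean(before, after, absolute_changes=False), 2))
```

Because the SRM is unitless, it allows instruments with different score ranges, such as the CDASI (0-132) and the CAT-BM (0-28), to be compared on the same footing.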

Statistical Methods

Statistical analyses were performed using the statistical programs STATA and SPSS. Inter-rater reliability was determined by the intraclass correlation coefficient (ICC), type ICC (2,1) via the Shrout and Fleiss convention (Shrout, 1979). Previous research has considered an ICC between 0.5 and 0.7 to be moderate, between 0.70 and 0.81 to be good, and an ICC ≥ 0.81 to be almost perfect (Landis, 1977; Klein et al., 2008). Intra-rater reliability was determined by ICC (2,1) and a paired, two-tailed t-test comparing mean initial and repeat scores for each instrument. Construct validity was assessed by testing the association between each outcome measure (CDASI or CAT-BM) and the corresponding validation measure. Because each patient and each physician had repeated measures, we used a linear mixed model for this test, adjusting for within-patient and within-physician variations. Other covariates, such as age and gender, were not seen to have an influence. Physician subject # and patient subject # were entered as random effect factors, while PGA scores were entered as a fixed effect covariate. Likert scores were also used as an additional means to assess construct validity. Differences in CDASI and CAT-BM scores when grouped by corresponding Likert scores were evaluated using one-way ANOVA. Linear regression was also used on the mean CDASI and CAT-BM scores of each Likert group to determine linearity.
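
For readers unfamiliar with ICC (2,1), the following is a minimal pure-Python sketch of the two-way random-effects, single-rater formula from the Shrout and Fleiss convention, together with the interpretation bands quoted above. It is not the STATA or SPSS routine used in the study. The label "poor" for values below 0.5 and the handling of band boundaries are assumptions inferred from how the results are described, and the small matrix is the widely reproduced worked example from the Shrout and Fleiss paper (six targets, four judges), not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): rows are rated subjects (patients), columns are raters."""
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n, k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                  # between-subjects mean square
    jms = ss_cols / (k - 1)                  # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def classify_icc(icc):
    """Map an ICC to the bands used in the text (boundaries are assumptions)."""
    if icc >= 0.81:
        return "almost perfect"
    if icc >= 0.70:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Worked example data from Shrout and Fleiss (six targets, four judges).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
value = icc_2_1(example)
print(round(value, 2), classify_icc(value))
```

ICC (2,1) treats both patients and raters as random samples and penalizes systematic differences between raters, which is why it suits a design in which every physician rates every patient.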

Acknowledgements

This material is based upon work supported by Celgene Corporation.

Abbreviation List

DM
Dermatomyositis
FDA
Food and Drug Administration
CDASI
Cutaneous Dermatomyositis Disease Area and Severity Index
CAT-BM
Cutaneous Assessment Tool-Binary Method
CAT-MM
Cutaneous Assessment Tool-Maximum Method
CAT
Cutaneous Assessment Tool
DSSI
Dermatomyositis Skin Severity Index
PGA
Physician Global Assessment
VAS
Visual Analogue Scale
SRM
Standardized Response Mean
ICC
Intraclass Correlation Coefficient

Footnotes

Conflicts of Interest

The authors state no conflicts of interest.

References

  • Beaton DE, Hogg-Johnson S, Bombardier C. Evaluating changes in health status: Reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol. 1997;50(1):79–93.
  • Bohan A, Peter JB. Polymyositis and dermatomyositis (first of two parts). New England Journal of Medicine. 1975;292(7):344–347.
  • Bohan A, Peter JB. Polymyositis and dermatomyositis (second of two parts). New England Journal of Medicine. 1975;292(8):403–407.
  • Callen JP, Wortmann RL. Dermatomyositis. Clinics in Dermatology. 2006 Sep-Oct;24(5):363–373.
  • Concept paper systemic lupus erythematosus 09/16/2003. 1-4- 1901 (Pamphlet).
  • Downing SM. Reliability: On the reproducibility of assessment data. Med Educ. 2004;38(9):1006–12.
  • Dugan EM, Huber AM, Miller FW, et al. International Myositis Assessment and Clinical Studies Group. Review of the classification and assessment of the cutaneous manifestations of idiopathic inflammatory myopathies. Dermatol Online J. 2009 Feb 15;15(2):2.
  • Feldman SR, Krueger GG. Psoriasis assessment tools in clinical trials. Ann Rheum Dis. 2005 Mar;64(Suppl 2):65–8. discussion 69-73.
  • Gaines E, Werth VP. Development of outcome measures for autoimmune dermatoses. Archives of Dermatological Research. 2008;300(1):3–9.
  • Gerami P, Schope JM, McDonald L, et al. A systematic review of adult-onset clinically amyopathic dermatomyositis (dermatomyositis sine myositis): A missing link within the spectrum of the idiopathic inflammatory myopathies. J Am Acad Dermatol. 2006;54:597–613.
  • Huber AM, Lachenbruch PA, Dugan EM, et al. Juvenile Dermatomyositis Disease Activity Collaborative Study Group. Alternative scoring of the cutaneous assessment tool in juvenile dermatomyositis: Results using abbreviated formats. Arthritis and Rheumatism. 2008;59(3):352–6.
  • Huber AM, Dugan EM, Lachenbruch PA, et al. Juvenile Dermatomyositis Disease Activity Collaborative Study Group. Preliminary validation and clinical meaning of the cutaneous assessment tool in juvenile dermatomyositis. Arthritis Rheum. 2008 Feb 15;59(2):214–21.
  • Huber AM, Dugan EM, Lachenbruch PA, et al. Juvenile Dermatomyositis Disease Activity Collaborative Study Group. The cutaneous assessment tool: Development and reliability in juvenile idiopathic inflammatory myopathy. Rheumatology (Oxford). 2007 Oct;46(10):1606–11.
  • Kasteler JS, Callen JP. Scalp involvement in dermatomyositis. Often overlooked or misdiagnosed. JAMA. 1994 Dec 28;272(24):1939–41.
  • Klein RQ, Bangert CA, Costner M, et al. Comparison of the reliability and validity of outcome instruments for cutaneous dermatomyositis. British Journal of Dermatology. 2008;159(4):887–894.
  • Kunz B, Oranje AP, Labrèze L, et al. Clinical validation and guidelines for the SCORAD index: Consensus report of the European Task Force on Atopic Dermatitis. Dermatology. 1997;195(1):10–19.
  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159–74.
  • Lorizzo LJ, III, Lorizzo JL. The treatment and prognosis of dermatomyositis: An updated review. Journal of the American Academy of Dermatology. 2008 Jul;59(1):99–112.
  • Mrowietz U, Elder JT, Barker J. The importance of disease associations and concomitant therapy for the long-term management of psoriasis patients. Arch Dermatol Res. 2006 Dec;298(7):309–19.
  • Ruperto N, Bazso A, Pistorio A, et al. Agreement between multi-dimensional and renal-specific response criteria in patients with juvenile systemic lupus erythematosus and renal disease. Clin Exp Rheumatol. 2010 May 25; [Epub ahead of print]
  • Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin. 1979 Mar;86(2):420–8.
  • Steel RD, Torrie JH, Dickey TA. Principles and Practice of Statistics: A Biomedical Approach. McGraw Hill; USA: 1997. pp. 297–299. ISBN 0-07-061028-2.
  • Tilstra JS, Prevost N, Khera P, English JC 3rd. Scalp dermatomyositis revisited. Arch Dermatol. 2009 Sep;145(9):1062–3.
  • Meng X-L, Rosenthal R, Rubin DB. Comparing correlated correlation coefficients. Psychological Bulletin. 1992;111(1):172–175.
  • Yassaee M, Fiorentino D, Taylor L, et al. Modification of the cutaneous dermatomyositis disease area and severity index, an outcome measure instrument. Br J Dermatol. in press.