Dr. R. Aggarwal (Pittsburgh, PA, USA),
Dr. Y. Allenbach (Paris, France),
Prof. O. Benveniste (Paris, France),
Prof. J. L. De Bleecker (Ghent, Belgium),
Ms. I. de Groot (Member of the Myositis Workgroup of Spierziekten Nederland, The Netherlands),
Dr. H. Devilliers (Dijon, France),
Dr. D. Hilton-Jones (Oxford, UK),
Dr. J-Y. Hogrel (Paris, France),
Prof. I.E. Lundberg (Stockholm, Sweden),
Dr. A.L. Mammen (Bethesda, MD, USA),
Mr. and Mrs. Oakley (Myositis Support group Southampton, UK),
Dr. C. Oddis (Pittsburgh, PA, USA),
Prof. G. Padberg (research director ENMC),
Mr. D. Ponce (AFMTELETHON, manager of Group of Interest Inflammatory Myositis & myositis patient, France),
Dr. L. G. Rider (Bethesda, MD, USA),
Dr. M.R. Rose (London, UK),
Dr. H. Sanner (Oslo, Norway),
Dr. A. Selva-O’Callaghan (Barcelona, Spain),
Prof. M. de Visser (Amsterdam, The Netherlands),
Prof. A. Wells (London, UK),
Dr. V. P. Werth (Philadelphia, PA, USA).
The 213th ENMC International Workshop, outcome measures and clinical trial readiness in idiopathic inflammatory myopathies (IIM), took place in Heemskerk, The Netherlands on September 18-20, 2015 and was attended by 18 experts in IIM from 7 specialties (dermatology, internal medicine, neurology, pediatrics, physiotherapy, pulmonology, and rheumatology) and 3 patient representatives from 8 countries (Belgium, France, Norway, The Netherlands, Spain, Sweden, the United Kingdom, and the United States of America). The goal of the workshop was to review the different approaches in the assessment of patients with IIM, discuss areas of consensus, and determine where improvements are needed.
IIM form a heterogeneous group of acquired myopathies whose global phenotypes, in terms of intensity and distribution of weakness or extra-muscular organ involvement, vary greatly among subgroups, which also results in differing prognoses. Based on clinical and muscle biopsy pathological criteria, five subgroups are classically described: polymyositis (PM), dermatomyositis (DM), immune-mediated necrotizing myopathies (IMNM), inclusion body myositis (IBM), and non-specific myositis. In some clinics, PM appears to be relatively rare [3,4], with many patients being reclassified as having overlap syndrome with myositis, IBM or IMNM. Furthermore, more than 15 different myositis-specific autoantibodies (MSA) are now described and measurable, defining even more homogeneous groups of patients, and the IIM classification criteria are changing [9,10].
In parallel, consensus guidelines for conducting randomized controlled trials (RCT) in myositis have been developed, and consensus on the clinical outcomes used to measure the impact of treatment appears fundamental. Three actors are pivotal: the patients who aim to have a better quality of life, the clinicians who would like objective measures to assess treatment responses, and the regulatory agencies (such as the U.S. Food and Drug Administration and the European Medicines Agency) who have emphasized a preference for functional outcome measures and patient-reported outcomes [12–14].
Lisa Rider and Olivier Benveniste opened the workshop by reminding the participants of the metaphor of Neurologists coming from Mars and Rheumatologists from Venus “in the way that they may speak similar, yet different languages when describing the same myositis patients” for IIM classification, but certainly also for clinical outcome assessment. In the field of neuromuscular disorders, and in particular for the evaluation of muscle weakness, neurologists are experienced in the assessment of muscle strength and have developed tools, such as myometry, in this domain. In the field of connective tissue disorders, rheumatologists have developed many disease activity indices. For IIM in particular, the Myositis Disease Activity Assessment Tool (MDAAT), the Myositis Damage Index (MDI), the myositis core set measures and the Preliminary Definitions of Improvement were developed by the International Myositis Assessment and Clinical Studies Group (IMACS), a group of collaborating rheumatologists, neurologists, dermatologists, physical medicine specialists and others with interest and expertise in myositis [16–18]. Nonetheless, few discussions between these two approaches have occurred to date, and the need persists to build international consensus on how to evaluate responses to therapeutic interventions in IIM.
Lisa Rider reviewed the development of the IMACS approaches to the assessment of patients with myositis. IMACS developed and later validated a candidate core set of disease activity and damage measures for adult and juvenile IIM, which could be utilized internationally (Table 1). This includes physician and patient global activity, muscle strength assessed by manual muscle testing, physical function, muscle enzymes and extramuscular activity. IMACS also established the degree of change in these core set measures indicative of minimal clinical improvement, and developed consensus preliminary definitions of improvement for adult and juvenile DM and PM, in which improvement of at least 20% in at least 3 of 6 measures, with no more than 2 measures worsening (which cannot include muscle strength), constitutes a clinical response for use in clinical trials and studies. Chester Oddis reviewed the different clinical outcomes used in the clinical trials for IIM, discussed consensus achieved by IMACS in the design and conduct of clinical trials, and reviewed the performance of the IMACS core set measures and response criteria in a large-scale prospective randomized clinical trial: the Rituximab in Myositis (RIM) study.
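The preliminary definition of improvement described above is essentially a counting rule, and a minimal sketch may make it concrete. The measure labels, the function name and the sign convention (positive means better) are illustrative assumptions, not IMACS terminology. The text does not state how large a change must be to count as worsening, so the `worsen_at` threshold is exposed as a parameter with an assumed default of 25%.

```python
# Hypothetical labels for the six IMACS core set measures named in the text.
CORE_SET = (
    "physician_global",
    "patient_global",
    "muscle_strength",    # assessed by manual muscle testing
    "physical_function",
    "muscle_enzymes",
    "extramuscular",
)


def meets_preliminary_improvement(pct_change, improve_at=20.0, worsen_at=25.0):
    """Apply the 3-of-6 counting rule to one patient.

    pct_change maps each core set measure to its percentage change from
    baseline, signed so that positive = better and negative = worse.
    worsen_at is an assumption: the source text gives no worsening threshold.
    """
    missing = set(CORE_SET) - set(pct_change)
    if missing:
        raise ValueError(f"missing core set measures: {sorted(missing)}")

    improved = [m for m in CORE_SET if pct_change[m] >= improve_at]
    worsened = [m for m in CORE_SET if pct_change[m] <= -worsen_at]

    return (
        len(improved) >= 3                      # at least 3 of 6 improve
        and len(worsened) <= 2                  # no more than 2 worsen
        and "muscle_strength" not in worsened   # strength may not worsen
    )
```

Passing already-signed percentage changes keeps the rule itself separate from the question of which direction is "better" for each measure (higher for muscle strength, lower for enzymes and activity scores), which would otherwise have to be encoded per measure.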
Mr. and Mrs. Oakley (founders of Myositis UK), Mr. Ponce (founder of the Myositis Group at the Association Française contre les Myopathies (AFMTelethon), France) and Ms. De Groot (member of the Myositis Workgroup of Spierziekten Nederland, and a myositis patient) reported, through their own testimony and surveys they undertook in their respective associations, what myositis patients expect regarding outcome assessment, with such questions as: what are outcome measures, and why and for whom are they important (for doctors or for patients)? The patients agreed that the information gathered by tests and questionnaires is important, but that it is also important for doctors to focus on aspects that reflect patients’ needs and problems, e.g. fatigue, behavior, sexuality, treatment side effects and handicap. The lay group published a report of the workshop on the ENMC website.
Jan de Bleecker opened this session by presenting an update from the two recent ENMC workshops on muscle pathology in IIM [22,23]. These aimed to define the minimum core set of parameters necessary for examining a muscle biopsy for suspected IIM and/or to exclude a dystrophy. They also outlined the limits of current pathological classifications (e.g. PM, DM, IMNM, IBM, and non-specific myositis), since many patients cannot be assigned to one of those subgroups, primarily because of interpretation discrepancies between pathologists. Jan de Bleecker noted that pathological criteria can be revisited in more homogeneous groups of patients defined by their MSA, as has been published recently for myositis with anti-synthetase autoantibodies [24–26]. He proposed to continue developing and validating a set of identically assessed pathological abnormalities that can be used across different pathology labs. A standardized assessment and description of pathological features should enable the determination of which pathological hallmarks correspond with various clinical features, therapeutic responses, serologic findings etc., and may eliminate or re-define existing entities (e.g. PM) and possibly help to define new ones. Determining a specific entity based on the conclusions of biopsy readings has led, and continues to lead, to definitions of some subgroups that are too broad or too narrow. For example, excluding some inflammation as a pathological feature of immune-mediated necrotizing myopathies (IMNM) would exclude about a quarter of all these patients from this subgroup.
Yves Allenbach illustrated this approach with 3 examples: DM with anti-MDA5 autoantibodies (as compared to classic DM), anti-synthetase syndrome, and IMNM. Patients with DM and anti-MDA5 autoantibodies often have focal sites of inflammation clustered around vessels on their muscle biopsies, without the perifascicular atrophy and significant vasculopathy often observed in classical DM. Furthermore, these patients also present a unique morphologic hallmark, which is NOS2 staining of muscle fibers. Patients with anti-MDA5 autoantibodies often do not meet the pathological ENMC criteria for DM, although they exhibit a specific muscle pathology.
Pathological analysis of muscle biopsies from patients with anti-Jo-1 autoantibodies also shows muscle damage occurring in perifascicular regions. However, these patients also have prominent muscle fiber necrosis and HLA-II expression in perifascicular regions [24–26]; that is, they have a perifascicular necrotizing myositis. Furthermore, these patients also frequently have a unique morphologic hallmark of myonuclear actin filament inclusions.
Analysis of biopsies from patients with anti-signal recognition particle (anti-SRP) and anti-3-hydroxy-3-methylglutaryl-coenzyme A reductase (anti-HMGCR) autoantibodies frequently demonstrates muscle fiber necrosis, which is diffusely distributed. Most of these patients meet the ENMC criteria for IMNM, but more than a quarter of them harbor significant inflammation, resulting in the pathological diagnosis of non-specific myositis according to the ENMC criteria (personal data) and raising the issue of where the demarcation between these entities truly lies or whether there is a continuum among them.
These examples illustrate that the muscle patterns defined by the ENMC criteria are not fully representative of the whole spectrum of muscle pathology that occurs in the IIM. In addition, MSA are associated with characteristic muscle pathologic patterns.
Andrew Mammen discussed the role of MSA in the classification of IIM and as inclusion criteria in future RCT. Today more than 15 MSA can be measured. He reviewed the clinical phenotypes and prognoses of the different subgroups of patients defined by their MSA. He also showed data indicating that various MSA are associated with different biopsy features, including more frequent mitochondrial dysfunction in patients with anti-TIF-1γ autoantibodies, an absence of primary inflammation in the biopsies of patients with anti-NXP2 autoantibodies, but frequent primary inflammation in the biopsies of patients with anti-Mi-2 or PM-Scl autoantibodies. He discussed differences in response to therapy, with better outcomes in older patients with anti-HMGCR autoantibodies, who are frequently statin exposed, compared to younger patients with the same autoantibody, who do not improve as well with intravenous immune globulin (IVIg) treatment and are not often exposed to statins. However, both groups have similar biopsy features, the same HLA allelic association and the same autoantibody target. His proposal is to define myositis subgroups based on MSA and to focus trials on subgroups defined by autoantibody status.
After these three presentations, the general discussion included the following points: 1) The experts felt that MSA will be a strong factor in the future classification of myositis. The level of evidence is sufficient today for the categorization of myositis with anti-synthetase autoantibodies, in that anti-Jo1 and other tRNA synthetase autoantibodies identify patients with a distinct disorder: the anti-synthetase syndrome [24–26]. For the other MSA, data suggest that these may also represent stable and mutually exclusive syndromes [30,31], and ongoing analyses of large cohorts of IIM patients may refine these findings. 2) Despite attempts to carefully phenotype IIM patients, there remained discord among the participating specialists in recognizing the same groups of patients. Some in the group felt that PM as defined by Bohan and Peter should not be used since, although the criteria require first eliminating all other diagnoses, the process to do this is not clearly delineated, and many patients with PM actually have other non-inflammatory myopathies. Some felt that if all current methods to define the myositis phenotypes are used, many PM patients could be reclassified as having IBM, anti-synthetase syndrome, anti-SRP or anti-HMGCR IMNM, non-specific myositis, DM without a rash (with typical pathological features of DM in the muscle) or genetic muscle diseases. In contrast, others in the group felt that a proportion of IIM patients in their practice are represented by a distinct PM subgroup. 3) The pathophysiology of the different IIM subsets may vary. For example, there is now clear evidence that the spectrum of DM may belong to the group of interferonopathies [33,34]. As a consequence, and because many of the therapeutic strategies being planned will use highly targeted therapies (e.g. ongoing anti-interferon strategies), the recommendation of the group is to study groups of patients that are as homogeneous as possible in future clinical trials.
Olivier Benveniste and others from the ENMC Myositis Outcomes Study Group proposed that, based on their personal experience, these groups might include: 1) IBM; 2) DM; 3) anti-synthetase syndrome; and 4) IMNM (with anti-SRP or anti-HMGCR). Furthermore, Dr. Benveniste proposed, without the endorsement of all workshop participants, that among adults with DM, patients with anti-MDA5 and anti-TIF-1γ autoantibodies would not be included in clinical trials, because anti-MDA5 is frequently associated with a dramatic skin-lung syndrome and anti-TIF-1γ (notably after age 50 years) is frequently associated with cancer, leading to a life-threatening prognosis. Regarding juvenile DM (JDM), a large (n=139) randomized trial including solely these patients has been published recently, demonstrating the feasibility of such focused inclusion criteria for a single subgroup of patients. In contrast, a number of participants of the ENMC Myositis Outcomes Study Group, particularly those involved with the conduct of clinical trials, voiced concerns about restricting enrollment in future IIM clinical trials for several reasons, including that there are limited data on how MSA or biopsy findings predict prognosis or responses to therapies, that therapeutic trials are often in need of larger numbers of patients and therefore more inclusive criteria, and that currently there is no standardized methodology for the detection of MSA. The schema proposed above was also not uniformly accepted, in that some patients may fall into more than one of the proposed groups, and many DM patients would be excluded from trials if the anti-MDA5 and anti-TIF-1γ subsets were excluded.
The group then reviewed three main targets of IIM (muscle weakness, interstitial lung disease and skin involvement) and the way to evaluate these complications during RCT.
Hervé Devilliers reported a study of 51 IIM patients (DM, IMNM or anti-synthetase syndrome) naïve to any immunosuppressant who were assessed just before starting treatment and 6 months later. This study aimed to compare responsiveness to treatment by calculating the standardized response mean (SRM) and the effect size (ES). The study included patients with selected IIMs defined by the ENMC criteria, sufficiently severe to necessitate initial use of prednisone (starting at 1 mg/kg/d) in combination with one immunosuppressant (methotrexate, azathioprine or cyclophosphamide) and IVIg. The patients were evaluated at the time of IVIg infusion. Manual muscle testing (MMT) evaluating shoulder abduction, elbow flexion, neck flexion, hip abduction, knee flexion and extension, and hip flexion was performed using the Medical Research Council (MRC) 0-5 scoring. Hervé Devilliers observed that the muscle groups most sensitive to change (SRM > 1.1, ES > 0.9) are shoulder abductors and hip flexors, followed by elbow flexors, neck flexors and knee flexors (SRM > 0.8, ES > 0.8). In this study, hip abductors and knee extensors were less sensitive to change (SRM < 0.7 and ES < 0.6), either because these muscles remain very strong or because they are difficult to test manually. A Rasch analysis investigated physicians’ ability to discriminate among the MRC 0-5 grades in more than 1000 patients with different neuromuscular disorders and with various degrees of weakness. It showed that physicians are not able to discriminate between most MRC 0-5 categories, and also that a grade 4 on the MRC scale lacks sensitivity and represents a large variation in strength.
As a result, the authors propose a modification of the MRC 0-5 grading system to four response categories (0, paralysis; 1, severe weakness; 2, slight weakness; and 3, normal strength) that may enhance the ability of clinicians to differentiate degrees of weakness with greater accuracy. MMT with more categories (such as the Kendall scale with 10 categories) was the subject of debate among the group, with some members of the ENMC Myositis Outcomes Study Group recognizing that categorizations such as Kendall scores 7, 8 and 9 (holds test position against slight to moderate pressure (7), moderate pressure (8), and moderate to strong pressure (9)) are too subjective. The proposal of Hervé Devilliers is to use an MMT5 score of 5 muscle groups, with 0-3 scoring of shoulder abductors (deltoid), hip flexors (iliopsoas), elbow flexors (biceps brachii), neck flexors and knee flexors (hamstrings) for the evaluation of IIM patients.
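The two responsiveness statistics used in the study by Hervé Devilliers can be illustrated with a short sketch. It uses the conventional definitions (SRM: mean change divided by the standard deviation of the change; ES: mean change divided by the standard deviation at baseline). The report does not give the formulas the authors used, so this is an assumption, as are the function name and the sample scores.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (SRM, ES) for paired scores from the same patients.

    SRM = mean(change) / SD(change)
    ES  = mean(change) / SD(baseline)
    These are the conventional definitions, assumed here rather than
    taken from the study itself.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need at least two paired observations")

    change = [after - before for before, after in zip(baseline, follow_up)]
    # stdev() is the sample standard deviation; a zero SD (no variation)
    # raises ZeroDivisionError because the ratio is then undefined.
    srm = mean(change) / stdev(change)
    es = mean(change) / stdev(baseline)
    return srm, es


# Invented MRC grades for one muscle group in five patients,
# before treatment and 6 months later (illustration only).
before = [2, 3, 3, 4, 2]
after = [4, 4, 5, 5, 3]
srm, es = responsiveness(before, after)
```

Because both statistics share the same numerator, a muscle group can score well on SRM yet poorly on ES (or the reverse) purely through differences in how variable the patients are at baseline versus how variable their change is, which is one reason the two are usually reported together.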
Jean-Yves Hogrel reviewed the various methods used to evaluate muscle strength in different neuromuscular disorders. These methods can be subdivided into 5 categories: MMT, hand-held dynamometry (HHD), fixed dynamometry (generally referred to as quantitative muscle testing or “QMT”), isokinetic dynamometry (such as with a Biodex) and specific dynamometry (à la carte designed devices). He illustrated the correlations obtained between MMT, QMT and Biodex measurements in healthy subjects and in patients with Duchenne muscular dystrophy or spinal muscular atrophy [41,42]. He illustrated the relationship between strength and motor abilities (evaluated by functional tests such as the 6 minute walking distance (6MWD)), leading to ceiling effects (increasing strength on QMT with a constant distance traveled on the 6MWD) or floor effects (an undetectable effect, such as for patients incapable of walking who achieve a 6MWD of 0 despite a measurable generated force on QMT). He illustrated the accuracy and resolution of different devices available for strength measurement, and the complexity of the process for their validation.
In the field of IIM, Jean-Yves Hogrel also presented the natural history study regarding muscle weakness progression in patients with untreated IBM [43,44]. In this slowly progressive IIM, the only significant measurable change over 9 months in 22 IBM patients was knee extension (quadriceps strength) measured by a Biodex (p=0.02). MMT was the least sensitive to change over time in patients with IBM, compared to other measures of strength and function, including muscle strength tested by myometry (with Biodex or different quantitative dynamometric devices), functional tests (such as 6MWD), or different functional scales, such as the IBM weakness composite index or the IBM functional rating scale.
Finally, Yves Allenbach and Jean-Yves Hogrel presented unpublished data from a small, pilot, prospective study of 5 IIM patients (DM, IMNM and anti-synthetase syndrome), followed since the time of treatment initiation. The outcomes evaluated over 6 months were those proposed by IMACS, including manual muscle testing of 8 muscle groups [MMT8], creatine kinase [CK] level, Health Assessment Questionnaire [HAQ], patient and physician global disease activity, and the global disease activity (MDAAT) index. In parallel, the patients wore wrist accelerometers continuously for 15 consecutive days per month. This device recorded the global movement of the patients throughout the day. The values obtained were compared to values from age- and gender-matched healthy subjects. The preliminary results were encouraging, showing a better sensitivity to change than any of the other measures, although there were good correlations with HAQ, as well as patient and physician global activity. This device provided an objective measure of daily movement, which did not necessitate expertise or training. It may also interest regulatory agencies, in that it reflects muscle movement in real life situations. Of course, these results have to be validated in larger, prospective cohorts of patients, as well as in randomized controlled trials.
Lisa Rider presented the IMACS Core Set Measures for muscle weakness and functional disability. One of the IMACS core set measures of disease activity in myositis is muscle strength, and IMACS selected manual muscle testing to assess strength for several reasons: it has been widely used historically in IIM clinical trials, it can test proximal, distal and axial muscles, it can be used internationally in clinics, it is accepted by neurologists and rheumatologists, and it can be used in adults and children. IMACS worked with several rehabilitation medicine and physical therapy specialists to standardize the performance of MMT, including standardizing commands and positioning of patients, the order of muscle groups tested, use of summary scores which have better reliability than individual muscle groups, and providing training materials on the IMACS website. The MMT8 was developed as a shorter version of a total MMT score with similar validity and reliability as 24 proximal, distal and axial muscle groups, but with improved sensitivity to change. Examination of data from two small recent clinical trials, Etanercept in DM and the NIH cohort of the RIM trial, also demonstrated comparable or better responsiveness of MMT compared to quantitative muscle strength testing [50,51]. She also reviewed functional assessment tools, including the data on the reliability, construct validity and responsiveness of the HAQ [52–54], and an observational functional scale, the Childhood Myositis Assessment Scale (CMAS), that has been extensively validated in patients with juvenile DM [54–56]. Both the HAQ and CMAS have had moderate to excellent responsiveness in the two aforementioned clinical trials [50,51]. Apart from responsiveness and limited construct validity with MMT and quality of life measures, including subscales of the Short Form 36 (SF-36) and Myositis Activities Profile (MAP), the HAQ has not been fully validated for adults with myositis.
Ingrid Lundberg presented the Myositis Functional Index-2 (FI-2), an observational assessment of muscle function and endurance that has undergone validation in patients with PM and DM. In data from their clinics, MMT-8 has more of a ceiling effect, in that 85-95% of patients achieve a maximal score, compared to only 25% of patients with FI-2 (Alexanderson and Lundberg, unpublished). One advantage of MMT-8 is its brevity, with a performance time of 5 minutes, compared with the FI-2, which takes 20 minutes to complete. They are currently working on validating a shortened version of the FI-2, including three of the original seven muscle groups: shoulder flexion, neck flexion and hip flexion, which were the most discriminative tasks. Preliminary data suggest high intra- and inter-rater reliability of the shorter FI-3. However, additional data collection is currently ongoing.
Michael Rose reminded the participants of his recent review of rating scales to assess function in muscle diseases. He found 119 scales reported, among which 19 were muscle disease-specific. Among these 19 activity rating scales designed specifically for muscle diseases, the HAQ performs well, but none are sufficiently comprehensive in their measurement or feasibility or sufficiently validated in their performance, as would be required by some regulatory authorities, representing an urgent unmet need.
The ENMC Myositis Outcomes Study Group participants discussed the following issues related to muscle weakness and functional evaluation: IMACS core set measures and definitions of improvement have been validated and should be recommended in future RCTs (Table 2). The evaluation of the MMT8 on the 0-10 point scale was accepted by the group as a well-validated approach for current use in clinical trials (Table 2). The evaluation by MMT may be further abbreviated by moving from the MMT8 0-10 categories (Kendall) to a shortened version of fewer muscle groups scored on a 0-3 scale; however, this requires validation. Regulatory agencies may prefer functional, real life measurements as clinical trial endpoints. Of currently available functional measures, the Health Assessment Questionnaire and Childhood Myositis Assessment Scale for juvenile DM already have sufficient validation data for use in RCTs (Table 2). Functional tests or scales are limited by their intrinsic ceiling and floor effects, but several should be examined in future studies, including the Motor Functional Measure, Functional Index, the Childhood Myositis Assessment Scale, 6 minute walk time, and accelerometry as a measure of motor movement, all of which are observational functional tools. The limitations of functional tests should be taken into account in the design of future RCT by adapting these tests to the subgroup of patients with particular phenotypes (highlighting again the interest of having groups that are as homogeneous as possible) and to their severity. New approaches with devices capable of monitoring motion during the everyday activities of patients, such as accelerometers (others are also under development), are certainly promising, but should be validated before any recommendation can be made.
Lisa Rider presented the MDAAT and MDI Assessment (Table 1). The MDAAT assesses disease activity in six extra-muscular systems: constitutional, cutaneous, skeletal, gastrointestinal, pulmonary and cardiovascular. It combines assessment of key individual items using the Myositis Intention to Treat Activity Index (MITAX), with a visual analog severity scale for each system. An extramuscular global score and muscle system are also included in the tool. The tool has been validated, with good rater reliability, as well as good content, construct and discriminant validity in adult and juvenile DM and PM patients [16,54,56,61,62]. In the NIH cohort of the RIM trial, several systems had excellent responsiveness, including the extramuscular global score, and in the Etanercept in DM trial, the overall score was moderately responsive, but the muscle component was more responsive than the cutaneous domain. The MDI is a validated tool to assess damage related to disease, as well as comorbid conditions and long-term medication toxicities. It is comprised of 11 organ systems in which individual items are scored to obtain the extent of damage, as well as a series of visual analog scales for each system to indicate the damage severity. The tool not only has good content, rater and construct validity, but also excellent predictive validity, in that patients who died have a higher rate of change in damage, as well as higher baseline pulmonary and cardiac damage scores in adult DM and PM patients, and those with a chronic illness course also have higher damage scores [16,54,63,64].
Rohit Aggarwal reviewed the development of new response criteria for myositis (Table 2). The definitions of improvement that were previously developed are considered preliminary, in that they define only minimal clinical improvement, lack clinical trial validation, and provide equal weight to all core set measures; currently there are more sensitive methodologic approaches as well. A large multi-center interdisciplinary project to develop new response criteria for myositis used large datasets from natural history studies and clinical trials, a large number of myositis experts to develop consensus ratings of patient profiles, and several statistical methodologies to develop and test a large number of new definitions of clinical response. One of the methods included conjoint analysis, using the 1000Minds survey of experts, which enabled the determination of differential weights to the core set measures, and the development of hybrid model definitions of improvement. The final consensus response criteria for DM and PM, which performed very well in both patient profiles and in the RIM trial, are a conjoint analysis definition. These response criteria use absolute percentage change in the core set measures, differentially weight the core set measures with muscle strength and physician global weighted more heavily, and involve a hybrid definition in which thresholds of minimal, moderate and major improvement have been defined, but also a total improvement score (a continuous outcome on a 0-100 scale) may be used to test differences between treatment arms in a clinical trial setting. The same response criteria will also be used for juvenile DM, but with different thresholds for improvement.
In the general discussion, the workshop participants accepted the IMACS core set measures and response criteria for use in future therapeutic trials (Table 2). However, there was concern whether these response criteria would be suitable for all subgroups of patients. In particular, they were thought likely not to work well for groups of patients in whom these criteria have not been validated, including those IMNM or IBM patients who have only weakness without extramuscular involvement, or in hypomyopathic DM patients who have only skin disease without weakness. Some trials may need subset-specific outcome measures or different primary endpoints.
Athol Wells reviewed work in idiopathic pulmonary fibrosis (IPF), including recent major clinical trials for two new approved therapies, and in scleroderma, both of which have much relevance to interstitial lung disease (ILD) associated with IIM. First, in these recent trials, the cardinal goal was prevention of, or reduction in, progression, rather than reversal of disease, and this has determined the choice of endpoints [67–69]. An important dilemma in these settings has been to enroll patients who have a high probability of disease progression. The United Kingdom Raynaud's and Scleroderma Association (UKRSA) staging system relies on a combination of high resolution chest computed tomography (HRCT) extent, with a threshold of > 20% involvement, and forced vital capacity (FVC) of < 70% to select patients with extensive disease, who have also had a higher likelihood of disease progression. In scleroderma, not only is more severe disease more likely to progress, but progression is also more likely within three years of onset of systemic disease and more likely if recent progression has been observed. Mortality is not a realistic endpoint for clinical trials, but a >10% decline in FVC best predicts mortality and is a preferred clinical trial endpoint. In IPF, a highly progressive disease, the preferred primary end-point is decline in FVC, analyzed as a continuous variable. However, in scleroderma and in IIM, lung disease is, on average, less progressive, and analyses of continuous FVC change are likely to be confounded by measurement variation. Categorical change in FVC (a 10% decline), a key secondary end-point in IPF trials, appears more suited to trials in IIM.
Currently there is ongoing discussion of ways to improve the sensitivity of this end-point: one approach which is increasingly favored is to accept lesser decline in FVC (5-10%) as indicative of disease progression, provided that there is evidence of decline as judged by a second variable (which might, in principle, be diffusion capacity of the lung for carbon monoxide [DLCO], 6MWD, serial HRCT, or a change in the level of exertional dyspnea).
Ingrid Lundberg reviewed the performance of the pulmonary domain of the MDAAT and MDI in assessing ILD associated with IIM. The pulmonary domain of the MDAAT has had good to excellent rater reliability, was very responsive to change in the RIM trial and correlates well with Jo1 ELISA results in longitudinal assessment [51,61,62]. In the MDI, pulmonary damage, including impaired lung function, pulmonary fibrosis and longstanding dysphonia, was frequently observed in DM/PM patients, and pulmonary damage severity was increased compared to juvenile DM patients. The Outcome Measures in Rheumatology (OMERACT) connective tissue disease related ILD interest group identified not only FVC as an important endpoint, but also other measures of lung function, including DLCO and supplemental oxygen requirement, as primary endpoints for clinical trials. Six-minute walk time, dyspnea, health-related quality of life (HRQoL), and lung imaging, including the extent of ILD on HRCT as well as the extent of honeycombing and ground glass opacities, are other important domains recommended to be measured in clinical trials of ILD.
Rohit Aggarwal further discussed the OMERACT measures for connective tissue disease-associated ILD for use in clinical trials. Progression-free survival (or time to progression) was recommended as a primary composite endpoint for clinical trials and observational studies of connective tissue disease-associated ILD. Progression, then, was defined as >10% decline in predicted FVC, or 5-10% decline in predicted FVC in combination with ≥15% decline in DLCO, or death. This endpoint needs validation in clinical cohorts and trials, and Dr. Aggarwal presented data from the Pittsburgh cohort demonstrating that a ≥10% decline in FVC predicts mortality in IIM.
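The composite progression rule described above can be written out as a short decision function. The following Python sketch is illustrative only and is not taken from the OMERACT publications: the function and parameter names are hypothetical, the declines are assumed to be relative changes from baseline in percent-predicted values (the text does not state whether relative or absolute change is meant), and the 5-10% band is assumed to include both boundaries.

```python
def relative_decline_pct(baseline: float, follow_up: float) -> float:
    """Percent decline relative to baseline (positive = worsening).

    Assumption: decline is computed relative to the baseline value.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - follow_up) * 100.0 / baseline


def has_progressed(fvc_base: float, fvc_follow: float,
                   dlco_base: float, dlco_follow: float,
                   died: bool = False) -> bool:
    """Composite progression endpoint as summarized in the text:
    death, OR >10% FVC decline, OR 5-10% FVC decline with >=15% DLCO decline.
    All inputs are percent-predicted values (hypothetical interface).
    """
    if died:
        return True
    fvc_drop = relative_decline_pct(fvc_base, fvc_follow)
    if fvc_drop > 10.0:
        return True
    dlco_drop = relative_decline_pct(dlco_base, dlco_follow)
    # Assumption: the 5-10% band is inclusive at both ends.
    return 5.0 <= fvc_drop <= 10.0 and dlco_drop >= 15.0


if __name__ == "__main__":
    # FVC 80 -> 74 (7.5% decline) with DLCO 60 -> 50 (about 16.7% decline)
    print(has_progressed(80, 74, 60, 50))
```

In a time-to-progression analysis, such a function would be evaluated at each follow-up visit and the first visit meeting the rule taken as the event time; how to handle missing DLCO measurements is left open here, as the text does not address it.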
The general discussion accepted that the experience of the idiopathic pulmonary fibrosis trials, as well as the consensus from OMERACT and the tools from IMACS, may be appropriately applied to clinical trials in certain IIM subgroups (Table 2). Discussion centered on the fact that many IIM patients with ILD also have weakness, including chest wall weakness, and so FEV1 or FVC may be decreased for both reasons. In the idiopathic pulmonary fibrosis trials, the 6 minute walk test had a large standard deviation of change, and in IIM, problems with multisystem disease, fatigue, arthritis, muscle weakness and decreased endurance would be factors contributing to poor scores. A composite endpoint that captures change in FVC along with change in quality of life may work well, but needs to be tested. The chest CT findings of organizing pneumonia or isolated ground glass changes without fibrosis are potentially reversible; however, CT imaging changes cannot be related to clinically meaningful endpoints, such as mortality.
Victoria Werth reviewed the assessment tools for cutaneous disease in IIM. The most developed of the available tools is the Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI), which assesses skin disease activity, including erythema, scale and ulceration, across 15 body areas, as well as Gottron’s papules, periungual capillary changes and alopecia. In addition, cutaneous damage, which includes poikiloderma and calcinosis, is similarly assessed across the 15 body areas and the hands . The CDASI has good to excellent inter-rater reliability, good construct validity, including good correlation with change in HRQoL, and superior responsiveness compared to other cutaneous assessment tools [75,76]. The clinical interpretability of CDASI scores has also been defined, with CDASI activity ≥ 14 indicative of moderate to severe activity and a 5 point change defined as clinically relevant . Another cutaneous assessment tool available is the Cutaneous Assessment Tool (CAT), which assesses 17 activity and 11 cutaneous damage items, and is more clinically-usable in binary mode [78,79]. Rater reliability and construct validity are comparable to the CDASI, but it may be less sensitive to change than the CDASI [54,75] (Table 2).
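The CDASI interpretation thresholds quoted above (activity score ≥ 14 for moderate to severe activity, and a 5 point change as clinically relevant) can be applied as in the minimal Python sketch below. The function names are hypothetical, the label used for scores below 14 is an assumption (the text only characterizes scores at or above the threshold), and the change rule is assumed to apply to the magnitude of change in either direction.

```python
MODERATE_SEVERE_THRESHOLD = 14  # CDASI activity score cut-off stated in the text
RELEVANT_CHANGE_POINTS = 5      # clinically relevant change stated in the text


def cdasi_activity_category(activity_score: float) -> str:
    """Classify a CDASI activity score against the >= 14 threshold.

    Assumption: scores below the threshold are labeled
    'below moderate-to-severe threshold'; the text does not name this group.
    """
    if activity_score < 0:
        raise ValueError("CDASI activity score cannot be negative")
    if activity_score >= MODERATE_SEVERE_THRESHOLD:
        return "moderate to severe"
    return "below moderate-to-severe threshold"


def is_clinically_relevant_change(baseline: float, follow_up: float) -> bool:
    """True if the score changed by at least 5 points.

    Assumption: improvement and worsening are treated symmetrically.
    """
    return abs(follow_up - baseline) >= RELEVANT_CHANGE_POINTS


if __name__ == "__main__":
    print(cdasi_activity_category(18))
    print(is_clinically_relevant_change(18, 12))
```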
Helga Sanner discussed issues specific to juvenile DM, including the frequent presence of calcinosis, lipodystrophy and vasculopathic changes, the less frequent manifestations of ILD and malignancy, and the difference in the distribution of MSA phenotypes in juvenile DM patients [80,81]. Helga Sanner has led a long-term follow-up study of 60 juvenile DM patients and 59 controls from an inception cohort in Norway, with a number of important observations: 90% had cumulative organ damage assessed by the MDI, adult patients had lower HRQoL, and 51-73% had measurable disease activity by the MDAAT or PRINTO criteria [82–84]. More than half the patients had muscle damage as assessed by magnetic resonance imaging, and up to 40% had documented weakness at last follow-up evaluation. Patients frequently had subclinical pulmonary changes, with low TLC in 26%, low DLCO in 49% and HRCT findings in 37%, including ILD in 14%, which is higher than the reported rate of ILD in JDM. A number of patients also had evidence of subclinical cardiac diastolic and systolic dysfunction [87,88]. Helga Sanner discussed some limitations in current assessment tools, including the lack of attribution of damage vs. activity in the Disease Activity Score, the absence of skin disease activity in the PRINTO criteria for inactive disease, and the fact that the assessment of damage, if defined as permanent and cumulative rather than persistent, may not capture reversible long-term sequelae that may be present in patients with juvenile DM.
Albert Selva O’Callaghan opened this session. He reminded the group that over the previous decades, several cross-sectional studies have investigated health-related quality of life (HRQoL) in patients with myositis [53,89–93]. What we have learned from these efforts is that HRQoL is decreased in patients with myositis, with no published differences between the clinical, pathological or even immunological subsets of patients, likely because of the small samples studied. Moreover, based on the published limited literature, neither the disease duration nor the clinical course seems to be implicated in HRQoL. Data on the ability to detect changes in HRQoL over time in longitudinal studies are lacking, which is an important drawback for the use of this outcome in clinical trials.
Among the available instruments to evaluate HRQoL, the most widely used in patients with myositis is the SF-36, which has good psychometric properties and is recommended by the IMACS group. Nonetheless, it is costly to use. In a global scenario, in which patients from different parts of the world would participate in a study, this is not a minor factor. Other such tools have also been successfully used for HRQoL study in myositis, including the Nottingham Health Profile, the Sickness Impact Profile, and, recently, the World Health Organization-endorsed test in its brief version (WHOQOL-BREF), one of the few free-access instruments. These generic health profiles, which generally have good reliability and construct validity, each have advantages and disadvantages (reviewed in ) (Table 2).
It is uncertain whether a disease-specific or generic HRQoL instrument will better address particular situations, including identification of the most affected domains, such as physical health and the environment, or detecting issues in patients within disease subsets such as the anti-synthetase syndrome or cancer-associated myositis, because of the heterogeneity, complexity and multi-systemic involvement of inflammatory muscle diseases. What we need now is a consensus agreement regarding which instruments are the best to use in our patients, based on experience, previous recommendations and cost, and the development of intervention studies targeting previously identified highly-affected specific domains in order to improve the HRQoL of our patients.
Michael Rose reminded the audience that in 2007 he developed the Individualized Neuromuscular Quality of Life (INQoL) questionnaire, a muscle disease-specific QoL questionnaire for patients with muscle disorders. INQoL has been translated and validated for populations in the UK, USA, Italy, Serbia, and the Netherlands [95–99]. Nevertheless, although the INQoL has been tested in IBM patients, it has not been tested in other IIM subgroups. In contrast to many muscle diseases, treatment is available for IIM, but with uncertain immediate and long-term response and with the threat of relapse; this might mean that specific QoL measures for IIM will be needed to capture the impact of these uncertainties on QoL.
Victoria Werth reviewed quality of life measures in the assessment of cutaneous disease associated with DM and other IIM, presenting data to support that the assessment of skin QoL is important in patients with DM (Table 2). In terms of skin-specific QoL tools, the Dermatology Life Quality Index (DLQI) is a 10 item tool that is used most frequently. The Skindex-29 comprises three domains (symptoms, emotions and functioning), and the Skindex-16 has two domains, including psychosocial and symptoms. Both DLQI and Skindex have been validated in other cutaneous diseases and there is experience also with the Skindex-29 in DM, with the function domain correlating best with the DLQI, but other domains also showing moderate correlation. An Itch Global visual analog scale (VAS) score also correlates best with the Skindex-Symptoms domain, compared to other domains of the Skindex or to the DLQI. DM patients with change in their CDASI score also had significant improvement in their Skindex-29 scores, as well as Itch or Pain VAS scores.
Ingrid Lundberg updated the group on the work of the OMERACT Myositis Special Interest Group on Patient Reported Outcomes. While the SF-36 has good construct and criterion validity and good feasibility in IIM , many domains improve over 18 months, but at 3 year follow-up, still remain below normal . Discriminant validity of some domains (physical function, vitality) also appears better than others (general health, mental health), in the setting of a 12-week exercise intervention study in DM/PM patients . The OMERACT group led IIM patient focus groups in qualitative research to identify the way myositis affects their lives. High-level themes included pain, physical stiffness or loss of flexibility, fatigue, emotional impact, the effect of the disease on relationships and treatment-related side effects . The Myositis Activities Profile (MAP), a 31 item disease-specific questionnaire with 4 subscales of movement, social, self-care, domestic and leisure activities was examined by the OMERACT group and also taken back to the patients in individual interviews using the “think aloud” technique, leading to a revised MAP that will be tested by Rasch methodology to optimize subscale and scoring and eliminate redundancy  (Table 2).
The ENMC Myositis Outcomes Study Group concluded that the IMACS measures of disease activity and the new response criteria for DM and PM are globally well-designed and validated tools for IIM evaluation (Table 2). Strengths of the core set measures include the comprehensiveness of the measures, prior validation studies, practicality for use in clinics, and applicability to both adults and children with myositis. The new response criteria are also a composite endpoint that can measure minimal, moderate and major improvement on a continuous scale, provide differential weights to each of the core set measures, and do not require large degrees of improvement in all of the measures to meet criteria for clinical improvement. The group also identified validated measures of pulmonary and cutaneous outcomes to be incorporated in future trials, especially those focused on patients with ILD or with DM cutaneous activity. This meeting also pointed out some limitations of the current tools. These include the methods to evaluate muscle weakness and the absence of well-validated performance-based observational functional scales. Furthermore, the development of applications permitting evaluation of patients’ real life movements is a very interesting approach that urgently needs validation. HRQoL measures were also considered to be important, but in need of refinement to ensure content validity and responsiveness in trial settings.
The organizers conclude this workshop by proposing a new study to re-examine the Core Set Outcome Measures of IMACS and to develop new measures:
To validate these approaches, at least 100 patients in each of the subgroups (for example: DM, anti-synthetase syndrome and IMNM) are needed with a range of disease activity and clinical manifestations. These patients would be seen twice at the appointed hospitals: once at the start of the illness or at the onset of a disease flare and once after 6 months, to examine sensitivity to change.
Going forward, the ENMC Myositis Outcomes Study Group agreed that neurologists, rheumatologists, dermatologists, pulmonologists and other specialists caring for myositis patients will benefit from working together in the design and conduct of future myositis clinical trials and studies.
This workshop was made possible with the financial support of the European Neuromuscular Centre (ENMC) and its main sponsors: Association Française contre les Myopathies (France), Deutsche Gesellschaft für Muskelkranke (Germany), Muscular Dystrophy Campaign (UK), Muskelsvindfonden (Denmark), Prinses Beatrix Fonds (The Netherlands), Schweizerische Stiftung für die Erforschung der Muskelkrankheiten (Switzerland), Telethon Foundation (Italy), Spierziekten Nederland (The Netherlands) and Associated members, and the Finnish Neuromuscular Association (Finland). The workshop received supplementary funding from Myositis UK, The Myositis Association, AstraZeneca, aTyr Pharmaceuticals, LFB Group, and MedImmune. This work was supported in part by the intramural research program of the National Institutes of Health, National Institute of Environmental Health Sciences. The views expressed are those of the authors and not necessarily those of the US government, the (UK) National Health Service (NHS), the NIHR, or the (UK) Department of Health. We thank Drs. James Katz and Lisa Christopher-Stine for critical reading of the manuscript.