We previously described a composite MRI scale combining T1-lesions, T2-lesions and whole brain atrophy in multiple sclerosis (MS): the Magnetic Resonance Disease Severity Scale (MRDSS).
To test the strength of the MRDSS versus individual MRI measures for sensitivity to longitudinal change.
We studied 84 MS patients over a 3.2±0.3 year follow-up. Baseline and follow-up T2-lesion volume (T2LV), T1-hypointense lesion volume (T1LV), and brain parenchymal fraction (BPF) were measured. MRDSS was the combination of standardized T2LV, T1/T2 ratio and BPF.
Patients had higher MRDSS at follow-up versus baseline (p<0.001). BPF decreased (p<0.001), T1/T2 increased (p<0.001), and T2LV was unchanged (p>0.5). Change in MRDSS was larger than the change in MRI subcomponents. While MRDSS showed significant change in relapsing-remitting (RR) (p<0.001) and secondary progressive (SP) phenotypes (p<0.05), BPF and T1/T2 ratio changed only in RRMS (p<0.001). Longitudinal change in MRDSS was significantly different between RRMS and SPMS (p=0.0027); however, change in the individual MRI components did not differ. Evaluation with respect to predicting on-study clinical worsening as measured by EDSS revealed a significant association only for T2LV (p=0.038).
Results suggest improved sensitivity of MRDSS to longitudinal change versus individual MRI measures. MRDSS has particularly high sensitivity in RRMS.
Conventional magnetic resonance imaging (MRI) measures, namely brain atrophy and T1 hypointense and T2 hyperintense lesion volumes, provide a qualitative estimate of damage that has accumulated in patients with multiple sclerosis (MS). However, there is often a dissociation between clinical and MRI findings, whereby conventional MRI findings show a weak relationship to clinical status, such as disability as measured by the Expanded Disability Status Scale (EDSS). Furthermore, ideal treatment planning and risk stratification would require accurate and reliable surrogates that predict disease course. While current MRI measures capture clinically relevant disease activity and severity to some extent, there remains a need for improved biomarkers that are sensitive and comprehensive in their predictive ability.
Composite MRI measurements offer a new approach to assessing the various aspects of disease involvement reflected in the many MRI findings that demonstrate both inflammatory and neurodegenerative changes [2-6]. Previously, we developed a composite scale to define the severity of cerebral damage in MS, known as the Magnetic Resonance Disease Severity Scale (MRDSS), which combines three continuous measures: T2 hyperintense lesions, the ratio of T1 hypointense to T2 lesion volume, and whole brain atrophy (normalized whole brain volume). MRDSS showed a larger effect size than any of the individual MRI components in distinguishing among clinical phenotype groups and was associated with disability progression occurring during the subsequent three years. However, the longitudinal sensitivity of MRDSS is not known.
The purpose of this study was to longitudinally assess the sensitivity and validity of the MRDSS over three years in a cohort of MS patients representing a wide range of disability and disease duration and all three major clinical phenotype groups. We focused on the sensitivity to change in MRDSS and its relationship to clinical disability in comparison to the change in the individual components of MRDSS assessed independently.
Demographic and clinical characteristics of the subjects are summarized in Table 1. We retrospectively identified 84 patients with MS from a consecutive sample being prospectively enrolled and monitored as part of the Comprehensive Longitudinal Investigation of MS at the Brigham and Women’s Hospital and Partners MS Center (CLIMB). CLIMB is an ongoing prospective observational cohort study that began following patients in 2000. Inclusion into the current study was based on the following criteria: 1) age 18-60 at baseline; 2) brain MRI at baseline and a follow-up scan 2.5 to 3.5 years later, both performed at the Brigham and Women’s Hospital on the 1.5T unit dedicated for MS care using the scanning protocol established for the CLIMB study; 3) baseline and follow-up examination with EDSS scoring performed by an MS specialist neurologist at the Partners MS Center; 4) baseline and follow-up EDSS testing performed within three months of the MRI; 5) established MS diagnosis at baseline of either relapsing-remitting (RR), secondary progressive (SP) or primary progressive (PP) by established criteria. These patients were part of our initial study, which focused on the relationship between baseline MRDSS only and clinical status; more details on the study population, study design, medication use, etc. have been previously published. The current sample represents a subset of patients from the previous study in whom follow-up MRI was available. The current study was approved by our institutional review board.
Progression of neurologic disability at follow-up was defined as a 1-point progression on EDSS if the baseline score was less than 6, or a 0.5-point progression if the baseline score was 6 or higher. EDSS worsening at follow-up had to be sustained for at least three months to be considered progression. Patients were thus classified as stable or progressed on EDSS at the three-year follow-up. These patients were followed for (mean ± SD) 3.2 ± 0.3 years (see Table 1).
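The progression rule above can be written as a short function. This is only a minimal sketch: the function name `edss_progressed` and its parameters are hypothetical, and it assumes that "1-point progression" means an increase of at least 1 point (and likewise at least 0.5 point for baseline scores of 6 or higher).

```python
def edss_progressed(baseline, follow_up, months_sustained):
    """Classify a patient as progressed (True) or stable (False) on EDSS.

    Assumption: the required increase is a minimum, so larger increases
    also count as progression.
    """
    # Required worsening depends on the baseline score.
    required_increase = 1.0 if baseline < 6 else 0.5
    worsened = (follow_up - baseline) >= required_increase
    # Worsening must be sustained for at least three months.
    return worsened and months_sustained >= 3
```

For example, under these assumptions a patient moving from EDSS 5.5 to 6.0 would remain classified as stable, because the baseline score is below 6 and the 1-point threshold applies.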
All patients underwent baseline and follow-up brain MRI on the same scanner using the same scanning protocol. MRI was obtained on a 1.5-T unit (GE Signa, General Electric, Milwaukee, WI) using a quadrature head coil. Axial brain imaging included T1-weighted spin-echo (TR/TE: 725/20) and dual-echo spin-echo T2-weighted (3000/80/30) images with 256 × 256 × 54 voxels and a nominal voxel size of 0.9375 × 0.9375 × 3 mm, without inter-slice gaps. After infusion of intravenous gadolinium (0.1 mmol/kg) and a five-minute delay, axial T1-weighted spin-echo imaging was repeated.
Using automated template-driven segmentation (TDS+) of the dual-echo images, we determined T2 hyperintense lesion volume (T2LV) and normalized whole brain volume (brain parenchymal fraction, BPF); the latter served as an estimate of whole brain atrophy.
Analysis of hypointense lesions (black holes) on T1-weighted images was performed using a computer-based semiautomated edge-finding tool. A black hole was defined as a lesion appearing visibly hypointense to the surrounding white matter and showing at least partial hyperintensity on dual-echo T2 images, but no enhancement on post-gadolinium studies (to reduce the likelihood of including transient black holes). The presence of T1 hypointense lesions (black holes) was determined by consensus of two trained observers as part of a reading panel.
To describe the destructive potential of lesions, a ratio of T1 hypointense to T2 hyperintense lesion volume was created for each patient.
Template-driven segmentation with partial volume correction achieves an intraclass correlation of 0.994, interscan coefficient of variation of 4.98%, and mean ± SD volume bias of 0.01 ± 0.68 mL. The T1LV measurement showed intraobserver and interobserver coefficients of variation of 1.7% and 4.5%, respectively.
A detailed description of the calculation of the original MRDSS was provided in a previous manuscript; a brief summary of the calculation and changes to the model is provided here. All MRI data were rounded to 2 decimal places prior to analysis. Because the distributions of T2LV and T1:T2 were skewed, a log (T2LV) or logistic (T1:T2) transformation was applied. The T2LV, BPF, and T1:T2 were then standardized (z) by subtracting the mean and dividing by the standard deviation of our sample at baseline. Patients with zero T1:T2 were not included in the normalization but were assigned a value more extreme (-2) than the most extreme nonzero standardized T1:T2 (-1.92), similar to the MS Functional Composite scale. In the original paper, a more extreme value of -2.5 was used, but we found that -2 led to better performance when investigating changes over time in the MRDSS. Results using an arcsine-square root transformation for the T1:T2 were similar and did not alter conclusions (data not shown). The individual standardized scores were equally weighted and summed for each patient: zMRDSS = zT2LV + z[T1LV:T2LV] − zBPF, where z = (raw score − mean)/standard deviation. Each subject’s composite value was then transformed to a continuous 0 to 10 MRDSS score (zero is lowest severity) based on the highest and lowest zMRDSS values.
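The calculation summarized above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the "logistic" transformation is read as the logit log(p/(1 − p)), natural logarithms and the sample standard deviation are assumed, and the 0 to 10 rescaling is assumed to be a linear min-max mapping. T2LV must be positive and T1:T2 must lie below 1 for the transforms to be defined.

```python
import math
import statistics

ZERO_RATIO_Z = -2.0  # z-score the text assigns to patients with a zero T1:T2 ratio


def logit(p):
    """Logit transform; an assumed reading of the 'logistic' transformation."""
    return math.log(p / (1.0 - p))


def fit_baseline_reference(t2lv, t1_t2, bpf):
    """Return (mean, SD) of each transformed measure in the baseline sample."""
    def mean_sd(values):
        return statistics.mean(values), statistics.stdev(values)

    return {
        "t2lv": mean_sd([math.log(v) for v in t2lv]),
        # Patients with a zero ratio are left out of the normalization.
        "t1_t2": mean_sd([logit(r) for r in t1_t2 if r > 0]),
        "bpf": mean_sd(bpf),
    }


def z_mrdss(t2lv, t1_t2, bpf, ref):
    """zMRDSS = zT2LV + z[T1:T2] - zBPF for one patient at one time point."""
    def z(value, key):
        mean, sd = ref[key]
        return (value - mean) / sd

    z_ratio = ZERO_RATIO_Z if t1_t2 == 0 else z(logit(t1_t2), "t1_t2")
    return z(math.log(t2lv), "t2lv") + z_ratio - z(bpf, "bpf")


def rescale_0_10(z_values):
    """Map zMRDSS values linearly so the lowest is 0 and the highest is 10."""
    lo, hi = min(z_values), max(z_values)
    return [10.0 * ((z - lo) / (hi - lo)) for z in z_values]
```

Because the reference means and standard deviations come from the baseline sample, the same `ref` would be reused when scoring follow-up scans, so that change over time is expressed in baseline units.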
A Wilcoxon signed rank test was used to investigate the longitudinal change in the MRDSS and each individual MRI marker in the entire sample and in the RRMS and SPMS patients separately. The difference in the longitudinal change between the RRMS and SPMS patients was investigated using a Wilcoxon rank sum test. The association between baseline EDSS and MRI measures and the change in the MRDSS was assessed using Spearman’s (EDSS vs. change in MRDSS) and Pearson’s correlation coefficient (MRI factors vs. change in MRDSS). Logistic regression was used to estimate the effect of baseline MRI variables and change in MRI variables on the probability of progression. Baseline EDSS was included as a categorical covariate in all prediction models with categories (0, 1-1.5, 2-2.5, 3-3.5, 4-5.5 and ≥ 6) for the entire group, categories (0, 1-1.5, 2-2.5, 3-3.5 and ≥ 4) for RRMS, and categories (0-3.5, 4-5.5, ≥ 6) for progressive patients. EDSS was also treated as a continuous covariate in secondary analyses, and the results were largely unchanged. Models controlling for length of follow-up and baseline age were also fit. Finally, the MRI characteristics of RRMS and SPMS patients at last visit were compared using a Wilcoxon rank sum test for each measure. A p-value less than 0.05 was considered significant; a p value less than 0.1 but greater than 0.05 was considered a trend.
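The baseline EDSS categories used as a covariate in the prediction models can be expressed as a simple lookup. The sketch below is illustrative only: the names `EDSS_SCHEMES` and `edss_category` and the scheme keys are hypothetical, and scores that fall in none of the listed categories (for example 0.5 in the whole-group scheme) raise an error rather than being assigned to a neighbouring category.

```python
# Each scheme lists (lowest score, highest score, label) per category.
# The upper bound of 10 is the maximum of the EDSS scale.
EDSS_SCHEMES = {
    "all": [(0, 0, "0"), (1, 1.5, "1-1.5"), (2, 2.5, "2-2.5"),
            (3, 3.5, "3-3.5"), (4, 5.5, "4-5.5"), (6, 10, ">=6")],
    "rrms": [(0, 0, "0"), (1, 1.5, "1-1.5"), (2, 2.5, "2-2.5"),
             (3, 3.5, "3-3.5"), (4, 10, ">=4")],
    "progressive": [(0, 3.5, "0-3.5"), (4, 5.5, "4-5.5"), (6, 10, ">=6")],
}


def edss_category(score, scheme="all"):
    """Return the category label for a baseline EDSS score."""
    for low, high, label in EDSS_SCHEMES[scheme]:
        if low <= score <= high:
            return label
    raise ValueError(f"EDSS {score} is not covered by scheme '{scheme}'")
```

The resulting labels would then be entered into a logistic regression as a categorical covariate in whichever statistics package is used.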
The clinical data are summarized in Table 1. The EDSS scores for the group during the observation period remained stable (from 3.2 ± 2.0 to 3.4 ± 2.2), although 21 (25%) patients developed progression of physical disability during the 3 year follow up period.
Table 2 shows the change in the MRDSS as well as the individual MRI measures over the three-year follow-up in the whole cohort and within the clinical phenotype groups. For the BPF, T1/T2 and MRDSS, a statistically significant worsening of disease severity was observed on each (decreasing BPF and increasing T1/T2 or MRDSS). The MRDSS demonstrated the largest magnitude of change in terms of effect size over time in comparison to individual variables and lowest p value in the whole patient cohort (estimated mean change ± SD, 0.64 ± 0.80; p = 1.5 × 10⁻¹⁰). Within the MS clinical phenotypes, a larger magnitude of worsening in the MRDSS was observed in the RR compared to the SP patients (0.74 ± 0.60 vs. 0.27 ± 0.48; p = 0.0027), but the change over time in the MRDSS was statistically significant in both groups (RRMS: p = 1.2 × 10⁻⁹; SPMS: p = 0.021). In contrast, BPF and T1/T2 worsened significantly in the RR but not the SP group. Furthermore, when comparing the MRI change over time between the RR and SP groups, only the MRDSS showed a significant difference between groups among all MRI variables. Throughout every aspect of this analysis, T2 lesion volume did not significantly change during the observation period in the entire population or in either subgroup. Analysis of individual raw vs. standardized MRI measures had little effect on these results (Table 2). In terms of the change relative to the baseline value, the distribution of the change in the standardized BPF and T2 lesion volume was approximately equal for all baseline values. Conversely, patients with low baseline values showed greater change in the T1/T2 and MRDSS compared to patients with high baseline values (data not shown).
We tested the relationship between baseline data (both clinical and MRI measures) and subsequent on-study change in MRDSS (Table 3). This was performed in the whole cohort and then in specific MS clinical subgroups. Significant associations were observed in the whole cohort, and a lower baseline MRDSS, lower baseline T2LV, and lower T1/T2 were associated with larger worsening on MRDSS. However, these correlations remained in the weak to moderate range. It appeared that clinical phenotype group influenced these associations. When examining whether the RR or SP phenotype groups drove these significant correlations, the baseline MRDSS showed a significant but weak association with the change in MRDSS in the RR group only. The baseline MRDSS had similar correlation in the PP/SP group although the correlation was not statistically significant due to smaller sample sizes. Baseline T1/T2 was significantly associated with the change in both the RR and the combined PP/SP groups. Furthermore, a lower baseline BPF was associated with a higher risk of subsequent progression on MRDSS but only in the SP group. Thus, these results were mixed in that for the RR group less MRI-defined disease at baseline, whether measured by the MRDSS or its subcomponents, predicts a higher risk of on-study worsening of MRDSS. Conversely, for SP patients, an opposite effect was seen for baseline BPF in that worse disease (lower BPF) at baseline was congruent with the subsequent direction of change in MRDSS.
Although these relationships were previously reported from a larger cohort, we again tested the relationship between baseline MRI data (MRDSS and all individual subcomponent MRI measures) and subsequent on-study risk for developing sustained progression in disability on the EDSS scale in this smaller cohort. The current cohort was smaller than the previously published cohort because inclusion in the present study required that patients have both baseline and three-year follow-up MRI scans available from the same scanning platform and protocol. This analysis was performed in the whole cohort and then on an exploratory basis among various clinical phenotype subgroups. Table 4 shows the results comparing baseline MRDSS and individual standardized MRI measures as candidate predictors of on-study clinical worsening. When examining the whole cohort or subgroups of patients, the only significant association was between a lower baseline T2LV and a higher risk of progressing on EDSS. However, this relationship was in the weak to moderate range. It appeared that clinical phenotype influenced this association, as it was driven by patients with SP and PP MS (particularly PP). In agreement with these findings, the pooled PP/SP group also showed trends towards a lower baseline MRI disease severity (for MRDSS and other individual MRI subcomponents) predicting a higher risk of progressing on EDSS. Overall, these results were in agreement with models predicting change in MRDSS in that, for particular clinical phenotypes, less cerebral MRI-defined disease at baseline predicted a higher risk of on-study disease progression on EDSS.
Table 5 demonstrates the association between the change in the MRI measures and the probability of developing sustained progression of physical disability during the observation period. There were no significant findings or trends in the whole cohort (Table 5) or when examining just RR or SP patients (all p > 0.1; data not shown). All results were similar after controlling for age and length of follow-up and when considering non-standardized (raw) individual measures (data not shown).
Data are summarized in Table 6. There were statistically significant differences between RRMS and SPMS with regard to MRDSS (p = 0.0069), zBPF (p = 0.038), and zT1:T2 ratio (p = 0.0059), all in the expected direction: SP patients had higher severity on these MRI measures than RR patients. However, there were no statistically significant differences in the zT2 lesion volume between groups (p = 0.13). All results were similar when considering non-standardized (raw) individual measures (data not shown). The significant differences between the groups on the MRI measures confirmed the findings in our previous article, in which similar significant MRI-clinical relationships were observed at baseline, thus demonstrating internal consistency.
In the present study, we tested the strength of the MRDSS against individual MRI measures for sensitivity to longitudinal change in a relatively small group of patients with MS. The major finding of our study was that MRDSS showed improved sensitivity to longitudinal change vs. individual MRI measures of lesions and atrophy. MRDSS was better able to differentiate RR and SP patients than individual MRI components longitudinally and was the only variable that significantly changed in SP patients over time. Our data further support the potential utility of our neuroimaging-based composite scale to comprehensively evaluate disease severity in MS.
Our study directly compared the longitudinal performance of the MRDSS with two brain MRI measures of lesions (T1 hypointense, T2 hyperintense) and whole brain atrophy. While the individual MRI measures also showed some degree of longitudinal change, the MRDSS showed significant change in the entire patient cohort and within the patient subgroups to a better extent than the individual MRI measures, suggesting its sensitivity in both the inflammatory and neurodegenerative stages of the disease. Unlike the individual MRI measures, MRDSS was the only variable that significantly worsened in both RR and SP groups. While BPF and T1 hypointense lesions also showed a statistically significant change over time, there was no significant difference in the amount of change between the RR and SP phenotypes. Furthermore, T2 lesions did not significantly change over time in the entire patient cohort nor within the patient subgroups. Similar results regarding lesions (T1 hypointense, T2 hyperintense) and whole brain atrophy have been reported in prior studies [17-19].
With regard to the relationship between baseline data (both clinical and MRI measures) and subsequent on-study change in MRDSS, there was a tendency towards larger worsening on MRDSS with a lower baseline MRDSS, lower baseline T2LV, and lower T1/T2. This result was driven in part by the upper limit imposed by the scale; however, even when the zMRDSS was used rather than the 0-10 scale, a similar correlation was observed. The patients with the largest changes in the MRDSS were patients with no T1 lesion volume at baseline. To address this limitation, we adjusted the z-score for these patients compared to our original paper, but the impact of these patients remained. When the alternative transformation for the T1/T2 (arc-sine square root) was investigated, the correlation between baseline MRDSS and change in MRDSS was reduced, but it remained statistically significant. Future work should involve considering other statistical transformations of the T1 lesion volume that may further reduce the impact of patients who change from no T1 lesion volume.
We also evaluated the relationship between clinical and MRI data. The pattern of MRDSS-clinical relationships at exit in the current study is consistent with our previously reported baseline data, suggesting that MRDSS has good internal consistency and concurrent validity. On the other hand, we found some unexpected results in terms of its predictive validity for disability progression. Our previous study found that baseline MRDSS was associated with the risk of developing sustained progression of physical disability three years later. However, this association was not statistically significant in our reduced sample. In addition, the on-study change in MRDSS was not related to clinical progression. Thus, it seems that our scale in its current form does not reliably predict clinical change in terms of disability and needs further refinement. Similar to MRDSS, individual MRI measures also did not reliably predict clinical change. This is in agreement with prior studies demonstrating that these measures show relatively weak correlations with clinical progression [1, 17-19]. Of note, we found an inverse association between baseline T2LV and the risk of progressing on EDSS, driven largely by the SPMS/PPMS patients. One potential explanation for this observation is that SPMS/PPMS patients with low T2 lesion volume in our sample may have lesions in clinically eloquent areas such that, even though the total volume is low, the impact on clinical features may be high. Of note, there is a well-known dissociation between T2 lesion volume and clinical findings [1, 18-21], which is not entirely surprising given that T2 lesion volume is not completely representative of all features of disease progression, such as neurodegeneration. Nonetheless, given the lack of association with change in T2 lesion volume, our results must be considered preliminary and require validation in larger samples.
The tendency for SP and PP patients with lower MRI disease burden to progress was also seen with the MRDSS and T1/T2 measures.
Several other limitations were also evident in our study. A more even balance of primary and secondary progressive patients with a larger sample size would afford the opportunity to continue to evaluate the MRDSS across a wide spectrum of disease states. This would strengthen any conclusions reached regarding its utility in the general MS population, as well as enabling it to be used as a potential marker in clinical trials with smaller sample sizes. Additional measurements detailing diffuse damage in the normal-appearing brain tissue (such as magnetization transfer, diffusion imaging, or MR spectroscopy) [18,19,21] would help refine the scale and potentially allow for greater predictability even earlier in the disease course, prior to accumulation of atrophy. The inclusion of gray matter atrophy and spinal cord damage may also be a very informative addition to our scale, given that involvement of these areas of the CNS is common and clinically relevant in MS [22,23]. Furthermore, the use of 3T MRI may also increase the sensitivity to structural changes and may strengthen the validity of the composite scale. We are currently collecting data to refine the MRDSS scale along these lines, which we will present in future publications.
In further considering directions for future studies, investigating correlation of the MRDSS with other disability and quality of life variables, which are not heavily represented in the EDSS score, such as cognition and fatigue, would help to better understand the validity of the scale. Unfortunately, quality of life measures, including cognitive and fatigue scores, were not available for these cases. Assessment of the MRDSS over shorter (less than 2 years) and longer (greater than 6 years) periods of time would be useful to determine the dynamic longitudinal range of the scale. It would be particularly interesting to test if the MRDSS can predict the conversion from RR to SP MS in individual cases.
In conclusion, after defining the MRDSS scale in our original work and obtaining some information on its validity, we have now shown the longitudinal characteristics of the scale, including its sensitivity to change over time. Our study suggests that the composite MRDSS scale may be able to capture the destructive aspects of the disease with more longitudinal sensitivity than that obtained from conventional MRI lesion and whole brain atrophy measurements. Further studies are warranted to confirm and extend our findings regarding the potential utility of the MRDSS.
This study was supported by the National Institutes of Health (1R01NS055083-01) and National Multiple Sclerosis Society (RG3705A1; RG3798A2). We thank Ms. Sophie Tamm for assistance with manuscript preparation.
These findings were presented in preliminary form at the 1st joint meeting of ACTRIMS (the Americas Committee on Treatment and Research in Multiple Sclerosis) and its counterparts in Europe and Latin America: ECTRIMS and LACTRIMS, Montreal, September 17-20, 2008 and the 62nd annual meeting of the American Academy of Neurology, Toronto, Canada, April 10-17, 2010.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.