To combine MRI measures of disease severity into a multiple sclerosis (MS) composite score.
Retrospective analysis of prospectively collected data.
Community-based and referral subspecialty clinic in an academic hospital
103 patients with MS [age (mean ± SD) 42.7 ± 9.1 years, disease duration 14.1 ± 9.2 years, EDSS score 3.3 ± 2.2, 60% (n=62) relapsing-remitting (RR), 32% (n=33) secondary progressive (SP), and 8% (n=8) primary progressive].
Brain MRI measures included baseline T2 hyperintense (T2LV) and T1 hypointense (T1LV) lesion volume, and brain parenchymal fraction (BPF), a marker of global atrophy. The ratio of T1LV to T2LV assessed lesion severity. A Magnetic Resonance Disease Severity Scale (MRDSS) score, on a continuous scale from 0 to 10, was derived for each patient using T2LV, BPF, and T1/T2 ratio.
MRDSS score averaged 5.1 ± 2.6. Baseline MRI and EDSS correlations were moderate for BPF, T1/T2, and MRDSS and weak for T2LV. MRDSS showed a larger effect size than any of the individual MRI components in distinguishing RR from SP patients. Models containing either T2LV or MRDSS were significantly associated with EDSS disability progression during the 3.2 ± 0.3 year observation period, when adjusting for baseline EDSS score.
Combining brain MRI lesion and atrophy measures can predict clinical progression in patients with MS and provides the basis to develop an MRI-based continuous scale as a marker of MS disease severity.
Conventional MRI-based brain atrophy and lesion measures serve as markers of the damage occurring in multiple sclerosis (MS) but show weak relationships to clinical status or progression.1 Conventional MRI lesion measures do not capture diffuse white matter pathology,2 and plateau with advancing disease.3 Measures of atrophy rely on capturing downstream destructive effects, rather than early focal changes.4 Individual MRI measures are also limited by a lack of specificity.5
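Composite MRI measurements offer a new approach to link MRI with clinical or therapeutic outcomes.6–9 Our goal was to describe the severity of brain MRI involvement by a novel combination of three measures: T2 hyperintense lesion volume (T2LV), the ratio of T1 hypointense to T2 hyperintense lesion volume (T1/T2), and brain parenchymal fraction (BPF).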
We chose these measures for five reasons. First, they have relatively low collinearity, which ensures that separate disease components are included. Second, they capture three relevant aspects of the disease: overt lesions regardless of underlying severity (T2 lesions), the destructive potential of lesions (T1 to T2 ratio), and diffuse neurodegeneration (BPF). Third, the combination addresses the clinical-MRI paradox that has been noted in MS studies. Fourth, we aimed to develop a scale measuring disease severity, not activity; thus, gadolinium-enhancing lesions were not included. This is analogous to the Expanded Disability Status Scale (EDSS), which measures current severity rather than relapses or acute activity.10 Fifth, we chose a priori to weight the components equally to ensure that the scale was descriptive of the cerebral state and could be used for many different potential comparisons. We refer to this composite as the Magnetic Resonance Disease Severity Scale (MRDSS). We tested the validity of MRDSS vs. clinical status in a three-year longitudinal study of patients representing a wide range of disability, disease duration, and clinical phenotype.
We retrospectively identified 103 patients (Table 1) from the Comprehensive Longitudinal Investigation of MS at Brigham study.11 Inclusion criteria: 1) age 18–60; 2) Brain MRI performed using our MS-designated protocol; 3) baseline EDSS scoring10 by an MS-neurologist within three months of MRI; 4) follow-up EDSS scoring 2.5 to 3.5 years later; 5) Established MS12 at baseline [relapsing-remitting (RR), secondary progressive (SP) or primary progressive (PP)].13 Clinically isolated syndromes were not included, as many of them will not develop MS.14 All patients except 12 were treated with immunotherapy during the observation period (Table 1). Baseline characteristics (Table 1) were comparable to a large population-based MS cohort.15 This study was approved by our institutional review board.
Ninety-six patients (93%) had clinical follow-up 3.2 ± 0.3 years later. Follow-up disability was classified as stable or progressed [progression was a 1-point increase on follow-up EDSS if the baseline score was less than 6 (or a 0.5-point increase if baseline was 6 or higher), sustained for three months] (Table 1).
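The progression rule above can be illustrated with a minimal sketch. The function names are hypothetical and not part of the study's software; the three-month confirmation requirement needs a separate confirmatory visit and is deliberately not modeled here.

```python
def progression_threshold(baseline_edss: float) -> float:
    """Minimum EDSS increase counted as progression, per the rule in the text."""
    # 1 point if baseline EDSS is below 6, otherwise 0.5 point.
    return 1.0 if baseline_edss < 6.0 else 0.5


def has_progressed(baseline_edss: float, followup_edss: float) -> bool:
    """Apply the threshold to a single baseline/follow-up pair.

    ASSUMPTION: only the size of the change is checked; whether the
    worsening was sustained for three months is not modeled.
    """
    change = followup_edss - baseline_edss
    return change >= progression_threshold(baseline_edss)
```

For example, `has_progressed(3.0, 4.0)` is true, whereas `has_progressed(3.0, 3.5)` is false because the change is below the 1-point threshold that applies to baseline scores under 6.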
Patients underwent 1.5T MRI with axial T1-weighted spin-echo (TR/TE: 725/20) and dual-echo T2-weighted (3000/80/30) images [voxel size 0.9375 × 0.9375 × 3 mm]. T1-imaging was repeated five minutes after 0.1 mmol/kg intravenous gadolinium. Our hypothesis that a composite measure would be more informative was tested without prior knowledge of the component MRI measures in the sample.
TDS+16 determined whole brain T2 hyperintense lesion volume (T2LV) and brain parenchymal fraction (BPF, an estimate of whole brain atrophy)17 from the dual-echo T2-weighted images using an automated template-driven approach with partial volume correction.16 Although FLAIR images would likely have yielded a higher lesion load than dual-echo T2 images, we have not yet developed an automated FLAIR segmentation method and thus relied on our established method. T1 hypointense lesions (black holes) were the consensus of two trained observers,18 and showed at least partial hyperintensity on T2 images, but no gadolinium enhancement (to reduce the likelihood of including transient lesions).19
TDS+ achieves an intra-class correlation of 0.994, inter-scan coefficient of variation (COV) of 4.98%, and volume bias of 0.01±0.68 mL.16 Our T1 hypointense lesion volume measurement shows intraobserver and interobserver COVs of 1.7% and 4.5%.18 In the current cohort, MRDSS achieved intraobserver and interobserver COVs of 2.3% and 4.4%.
All calculations necessary to derive the MRDSS score from the MRI data were scaled using the cohort studied, not from external data. MRI data were rounded to two decimal places. When non-rounded data were used instead, the results did not change (data not shown). We did not use absolute T1 hypointense lesion volume for the MRDSS because of high collinearity with T2LV (r=0.79). Because the distributions of T2LV and T1/T2 were skewed, log (T2LV) or logistic (T1/T2) transformation was performed. T2LV, BPF and T1/T2 were then standardized (z*) by subtracting the mean and dividing by the standard deviation. The individual components of the MRDSS had only moderate inter-correlation with one another, both before and after transformations (r=0.53–0.58). Patients with zero T1 volume were not included in the normalization, but were assigned a value more extreme (−2.5) than the most extreme non-zero standardized T1/T2 (−1.92), similar to the MS Functional Composite.20 We selected the more extreme value because patients with zero T1 hypointensities have meaningfully less severe disease than patients with a small amount. The value −2.5 approximated the magnitude of the difference between the two smallest values on the standardized scale. The individual standardized scores were equally weighted and summed for each patient.
Each subject’s composite value was then transformed to a continuous 0–10 MRDSS score (zero is lowest severity).
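The derivation described above can be sketched as follows. This is an illustrative reconstruction, not the study's code, and it rests on several assumptions that the text does not specify: the natural log is used for T2LV; the "logistic" transformation is taken to be the logit, ln(r/(1−r)); the sign of standardized BPF is inverted so that more atrophy raises the score; the sample standard deviation is used; and the 0–10 score is obtained by min-max rescaling within the cohort. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def zscores(values):
    """Standardize using the cohort mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def mrdss_sketch(t2lv, t1lv, bpf, zero_t1_z=-2.5):
    """Return a 0-10 severity score per patient from three MRI measures.

    Requires T2LV > 0 and T1LV < T2LV for every patient.
    """
    # Log transform for the skewed T2 lesion volume (ASSUMPTION: natural log).
    z_t2 = zscores([math.log(v) for v in t2lv])

    # ASSUMPTION: flip the sign so that lower BPF (more atrophy) adds severity.
    z_bpf = [-z for z in zscores(bpf)]

    # T1/T2 ratio; zero-T1 patients are left out of the normalization and
    # assigned the fixed extreme value given in the text (-2.5).
    ratios = [t1 / t2 for t1, t2 in zip(t1lv, t2lv)]
    nonzero_idx = [i for i, r in enumerate(ratios) if r > 0]
    # ASSUMPTION: "logistic transformation" is read as the logit.
    logits = [math.log(ratios[i] / (1.0 - ratios[i])) for i in nonzero_idx]
    z_ratio = [zero_t1_z] * len(ratios)
    for i, z in zip(nonzero_idx, zscores(logits)):
        z_ratio[i] = z

    # Equal weighting: a plain sum of the three standardized components.
    composite = [a + b + c for a, b, c in zip(z_t2, z_ratio, z_bpf)]

    # ASSUMPTION: min-max rescaling to 0-10 within the cohort.
    lo, hi = min(composite), max(composite)
    return [10.0 * ((c - lo) / (hi - lo)) for c in composite]
```

Because every step is scaled to the cohort at hand, the same patient would receive a different score in a different cohort, which is the generalizability caveat the 0–10 scaling carries.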
Spearman correlation or the Wilcoxon test assessed the association between baseline MRDSS score and other variables. Univariate logistic regression tested the association between baseline MRI (BPF, T2LV, T1/T2 and MRDSS) and 3-year clinical progression. Multivariate regression controlled for covariates. The area under the receiver operating characteristic (ROC) curve (AUC) assessed the predictive ability of each model. The 95% confidence interval for the AUC was taken from the percentiles of a bootstrap distribution.21,22 Comparing MRDSS to the raw MRI data demonstrated the value of the MRDSS over conventional approaches, while comparisons to standardized scores investigated the effect of combining measures into a composite scale. A p<0.05 was considered significant.
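A percentile bootstrap interval for the AUC can be sketched as below. This is a generic illustration rather than the study's procedure: the number of resamples, the seed, and the function names are assumptions, and `scores` stands in for whatever predicted value a fitted model produces (labels are 1 for progression, 0 for stable).

```python
import random


def auc(scores, labels):
    """AUC as the chance that a progressor outscores a non-progressor.

    Ties between a progressor and a non-progressor count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that lack one of the two classes (AUC undefined).
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With only a few dozen progressors, intervals obtained this way tend to be wide, so overlapping intervals between two models should not be read as evidence that the models perform equally.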
The distribution of MRDSS scores is shown in Figure 1 and Tables 1 and 2, and representative MRI scans are shown in Figure 2. During follow-up, 24 patients developed sustained progression of disability (Table 1); this included three patients of 12 (25%) not receiving disease-modifying therapy.
Baseline MRI-EDSS comparisons showed moderate correlations for BPF (r=−0.47, p<0.001), T1/T2 (r=−0.46, p<0.001), and MRDSS (r=−0.48, p<0.001) and weak correlation for T2LV (r=−0.25, p<0.05). MRI correlations with disease duration were either weakly significant (BPF), or non-significant (T2LV, T1/T2, MRDSS) (data not shown). All MRI measures differed between RR and SP groups, but more so for MRDSS (Table 2).
When adjusting for baseline EDSS score, only MRDSS and T2LV showed a significant association with clinical progression (Table 3). Standardized T2LV showed a closer association with clinical progression than did non-standardized T2LV, indicating that standardization contributed to the improvement in MRDSS vs. individual MRI measures. In the ROC curves for the ability of MRI to predict clinical progression, both T2LV and MRDSS showed higher AUCs than BPF or T1/T2 (Table 3). T2LV showed the highest AUC (best predictive ability), but the 95% confidence intervals for T2LV and MRDSS overlapped (Table 3). Adding disease duration as a covariate, or using the standardized vs. raw EDSS score, did not affect the models (data not shown).
The MRDSS encompasses three equally weighted whole brain MRI measures of lesions and atrophy. T1 hypointense lesions are expressed as a ratio to T2LV because only a subset of T2 hyperintense lesions are T1 hypointense (particularly destructive lesions).23 We have developed and tested this scale in a three-year longitudinal MS cohort with a wide range of disease duration and disease severity, including minimal, mild, moderate, and severe physical disability. MRDSS has concurrent validity when compared with clinical status. MRDSS showed the largest effect size in differentiating RR and SP MS groups compared to the individual MRI components. MRDSS predicts the risk of developing sustained progression of physical disability three years later. However, the MRDSS offers only a modest improvement compared with current metrics.
While the individual MRI measures also showed some association with disability, the MRDSS showed the unique combination of both concurrent and predictive validity. For example, while baseline MRDSS, BPF and T1/T2 were moderately correlated with baseline EDSS score, T2LV showed a weak correlation. Furthermore, BPF and T1/T2 were not significantly associated with clinical progression in the regression modeling. In the ROC analysis based on the regression models, all of the confidence intervals were wide, but the estimates were largest for T2LV, followed by MRDSS.
While these findings support the utility of the MRDSS, our sample is likely too small to assure generalizability; i.e., the 0–10 scaling of the MRDSS derived from the current cohort may not apply to a larger population. This was not a natural history study because most patients were receiving therapy during the observation period, which could affect some MRI measures differently and confound the results. Disability progression was defined by EDSS worsening, which is limited by nonlinearity, variability and heavy weighting towards ambulation.10 Future studies should test the MRDSS in a larger sample and assess whether it is related to other clinical manifestations such as cognition.
Previous MS studies combined brain MRI data into composite or multiparametric assessments. The z4 composite6,8,9 combines T2LV, T1 hypointense lesion volume, BPF, and gadolinium-enhancing lesions. In a follow-up study of patients with RR or SP MS, the z4 at 3 months predicted the change in physical disability (p<0.02).6 Correlations with EDSS were not reported, nor was it reported whether the z4 had a better association with clinical change than did the individual MRI measures. In a cross-sectional study of patients with RR MS, z4 differentiated patients who had been stable, worse, or improved over the previous two years; however, brain atrophy showed a stronger association with clinical status than did z4.8 There are notable differences between our MRDSS and the z4. First, we did not include gadolinium-enhancing lesions in the scale, both for the rationale presented in the Introduction and because gadolinium enhancement poorly predicts sustained changes in physical disability, whereas the other MRI measures assessed in our study have shown such predictive value.1 Second, T1 hypointense lesion volume is an absolute measure in the z4 but is divided by T2LV in the MRDSS. We chose the T1/T2 ratio in part because of the collinearity problems related to the high correlation between T1 hypointense lesion volume and T2LV.
Another group7 combined brain T2LV, T1 hypointense lesion volume, magnetization transfer, diffusion, and MR spectroscopy data in a cross-sectional study of 23 patients. The correlations between EDSS and many of the individual MRI metrics were weak. In contrast, composite MRI scores showed better associations with EDSS (p 0.004 to 0.0001). Unlike the present study, that work did not consider brain atrophy or the predictive relationship between MRI and longitudinal clinical change.
More validation work is necessary to evaluate the MRDSS. We are in the process of assessing longitudinal MRDSS change, which we shall report in subsequent publications. We are planning to add MRI measures of occult damage,2 such as magnetization transfer, diffusion imaging, or MR spectroscopy, or MRI measures of gray matter24 and spinal cord damage25 to the MRDSS. The inclusion of a spinal cord metric, for example, might increase the predictive value for disease progression, especially if the outcome is EDSS score.
This study was presented at the 59th annual meeting of the American Academy of Neurology and was supported by research grants from the National Institutes of Health (1R01NS055083-01) and National Multiple Sclerosis Society (RG3705A1; RG3798A2). The funders had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript. We thank Ms. Sophie Tamm for assistance with manuscript preparation. Dr. Bakshi had full access to all data and takes responsibility for the integrity of the data and accuracy of the analysis.
Disclosure: The authors report no conflicts of interest
Statistical analysis: Conducted by co-author Dr. Brian Healy.