The retrospective part of this study aimed to build and evaluate a model combining neurophysiologic and clinical evaluations to obtain a reliable prediction of the progression of disability in MS patients, with particular attention to the role of evoked potentials (EPs). A summary score considering abnormalities of latency as well as of morphology and amplitude symmetry of the principal EP components was used as input in the analyses to assess the prognostic value of EPs. Since no recent work has compared the different EP scoring systems proposed in the last decade, and consistent with previous work of our group, we chose the scoring system that preserves most of the EP information. This system allows a maximum of 6 points for each side and for each of 5 EP modalities, as opposed to only 4 points in Leocani et al. and 3 in Kallmann et al.
At FNE, our patients showed a correlation between EP score and EDSS that was lower than that reported by Invernizzi et al., Leocani et al., and Kallmann et al. (group 2), but greater than that reported by Jung et al. and Kallmann et al. (group 1). As shown in Figure
, this apparent inconsistency is likely due to two important factors. First, the correlation between EDSS and EP score depends on the severity of disability already present at FNE insofar as the correlation tends to increase as disability builds up (Figure
). As already suggested by Leocani et al., this pattern tends toward a ceiling effect that results from the inclusion of subjects with more severe disability and a progressive disease course.
Figure 6. Correlations between EDSS and EP scores in the literature of the last 6 years. The correlations between EDSS and EP scores reflect the researchers’ choice of patient selection criteria. 1: Kallmann et al. 2006, group 1, F_EDSS = 2.0, range (0–4).
Second, disease duration also impacts the degree of clinical disability and, consequently, the correlation between clinical and subclinical measures. This is clearly shown in Figure
(x-axis), where the correlation between EDSS and EP score tends to grow as disease duration increases, because the F_EDSS grows with it. The effect of disease duration emerged more clearly when we analyzed the correlation between F_EP score and F_EDSS after dividing our sample at the median value of F_EDSS (Table
). The correlation was statistically significant only in the higher F_EDSS subgroup. In contrast to Invernizzi et al., this finding suggests that the difference in disease duration between the 2 subgroups thus identified (3.7 vs 5.9 years; p=0.005) was decisive. Moreover, when we stratified by the time from the first symptom to FNE, only patients assessed more than 6 years after disease onset showed a moderate correlation (ρ=0.47) between the F_EP score and F_EDSS, while at the last follow-up the correlations had increased to approximately the same extent in the two remaining subgroups as well. These findings are in line with Kallmann et al., who found a significant correlation between F_EDSS and F_EP score in patients with a long disease duration at FNE (mean 9.6 years), while in the group with a shorter disease duration (mean 1.2 years) the correlation was not significant.
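The median-split analysis described above can be illustrated with a minimal Python sketch. It assumes that the correlation is Spearman's ρ (as the symbol suggests) and that patients with an F_EDSS equal to the median fall in the lower subgroup; neither detail is specified in the text. The function names are hypothetical and the values used to exercise them are invented, not patient data.

```python
from statistics import median


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


def rho_by_median_split(f_edss, f_ep):
    """Correlate F_EP score with F_EDSS in the lower and higher F_EDSS subgroups.

    Assumption: values equal to the median are assigned to the lower subgroup.
    """
    cut = median(f_edss)
    low = [(e, p) for e, p in zip(f_edss, f_ep) if e <= cut]
    high = [(e, p) for e, p in zip(f_edss, f_ep) if e > cut]
    return {
        "low": spearman_rho(*zip(*low)),
        "high": spearman_rho(*zip(*high)),
    }


# Invented example values, for illustration only.
example = rho_by_median_split(
    [1.0, 1.0, 1.5, 2.0, 2.5, 3.0],
    [5, 2, 3, 4, 6, 9],
)
```

Because EDSS takes few distinct values at low disability, ties are frequent in the lower subgroup, which is why the sketch uses average ranks rather than the tie-free shortcut formula.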
Moreover, Hughes et al. recently confirmed that predictions at 5 and 10 years based on the EDSS are more accurate when applied to patients with 4–5 years of disease duration. Findings from other fields of research, such as MRI, are also affected by similar choices: Onu et al. acknowledged that significant correlations between MRI and EDSS depended on the inclusion of patients with high EDSS (0–5.5) and a long disease duration (mean = 9.3 years), whereas Metwalli et al., who studied patients with lower EDSS (0–3) and shorter disease durations (mean = 1.2 years), found no significant correlations. Although sampling the entire range of the EDSS can certainly lead to better coefficients whatever the correlate, the appropriateness of such a choice is questionable, as the most critical medical decisions are those made in the early phases of MS.
An early MS diagnosis, in addition to being preferred by MS patients, generally also implies low F_EDSS scores. Indeed, as recently reviewed by C. Renoux, a shorter time from onset to moderate disability (DSS 3.0 or 4.0) has been associated with a faster rate of progression to severe disability; moreover, worsening in MS becomes more common after EDSS 4.0, which further underlines the importance of including only patients with low F_EDSS if useful clinical predictions are to be made. Accordingly, we could confirm that the variable TT2 represents the early progression of the disease and can predict further worsening. Although it was set to zero for part of our patient population (35% already with EDSS 2.0 at FNE), the variable TT2 was significant in both bivariate and multivariate analyses, and its significant correlation with L_EDSS (−0.55; p < 0.0001) was rather stable. It is worth recalling here that a threshold of EDSS 2.0 was also recently applied to the definition of BMS (Benign Multiple Sclerosis), together with a disease duration equal to or longer than 10 years.
Consequently, a logistic regression model including the F_EP score as well as the TT2 variable was applied to a sample of 143 RRMS patients with a mean disease duration of 4.5 years and a mean F_EDSS of 1.3. The aim of the model was to predict the progression of disability, defined as the risk of reaching the threshold of EDSS 3.5. Likelihood ratio tests confirmed the relevance of the overall model (likelihood ratio χ² = 45.84, p<0.0001) and of the TT2 and EP score variables (likelihood ratio χ² = 22.69, p<0.0001 and likelihood ratio χ² = 14.49, p=0.0001, respectively). The multivariate approach was also supported by a significant improvement of the AUC compared with that previously obtained with single variables (AUC: 0.8135, p<0.02). A first consideration is that the EP score has a strong prognostic value when EDSS 2.0 is reached in 3–5 years from FNE (Figure
), since the difference in the probabilities of further worsening between the two subgroups (i.e. above or below the median value of the EP score) is nearly 30%. This means that when TT2 is short, the probability of worsening rises from 30% to over 60% for an EP score of, for example, 8 points rather than 4.
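As a rough check of the likelihood ratio tests reported above, the p-values can be recomputed from the χ² statistics with the Python standard library alone. This is a sketch under stated assumptions: the degrees of freedom are not given in the text, so 2 are assumed for the overall model (two predictors) and 1 for each single-variable test. The helper name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square statistic, for df = 1 or 2 only."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal.
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # With 2 df the chi-square distribution is exponential with mean 2.
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# Likelihood ratio statistics as reported in the text.
# The degrees of freedom are an assumption, not taken from the source.
reported = {
    "overall model": (45.84, 2),
    "TT2": (22.69, 1),
    "EP score": (14.49, 1),
}

p_values = {name: chi2_sf(stat, df) for name, (stat, df) in reported.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.2e}")
```

Under these assumptions the overall model and TT2 both give p-values well below 0.0001, and the EP score gives a value that rounds to 0.0001, in agreement with the figures quoted in the text.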
On the other hand, high EP scores (over 20–25 points) or a long time to reach EDSS 2.0 (over 10–15 years) were not associated with very different probabilities of worsening among the subgroups obtained by dividing the whole sample by the median value of TT2 (Figure
) or of the EP score (Figure
). First, this indicates that data from patients with severe subclinical damage (i.e. high EP scores) and from patients with progressive MS courses should not be considered for predictive models if the aim is an early identification of different patterns of progression; indeed, patients with these characteristics are well known to be candidates for clinical worsening and no prediction is needed. Second, patients with a long disease duration at FNE have to be excluded if the long disease duration is not associated with clinical stability. We have shown (Figure
) that, since mean disease duration is related to mean EDSS, worsening is to be expected; by the same token, if a long disease duration is associated with clinical stability, as in BMS, further worsening becomes unlikely and again no prediction is really needed.
The EP score and TT2 have the greatest utility when their values are able to show different patterns of worsening. By dividing our sample into 4 groups, namely (a) high F_EP score + short TT2; (b) high F_EP score + long TT2; (c) low F_EP score + short TT2; and (d) low F_EP score + long TT2, we showed that our model can identify separate patterns. Groups (a) and (b) have in common a high subclinical impairment and are therefore candidates for clinical worsening whatever the conversion time to clinical disability (as shown by the overlapping solid and dotted lines in approximately the last 30 values of the x-axis in Figure
). On the other hand, in groups (c) and (d), characterized by a low EP score, the difference in TT2 gives group (c) a higher probability of worsening than group (d) (solid and dotted lines in the first 5 values of the x-axis in Figure
). A high subclinical impairment (i.e. high EP score) is likely to reflect a massive attack of MS on the central nervous system and the probability that the adaptive brain responses will not be sufficient to compensate for the damage; thus the disability will likely worsen whatever its speed of progression. On the other hand, a low subclinical impairment could enable compensation processes. In this case, even small differences in EP score and speed of disability progression (TT2) may identify patients in whom compensatory processes can be successfully enabled, as in benign MS. Accordingly, Figure
shows that as the time from FNE to EDSS 2.0 increases, the chance of further worsening decreases along one of two probability curves, depending on the F_EP score: when the EP score is above the median value, it takes about 15 years without clinical progression for the probability to become negligible (<10%). On the other hand, when the EP score is below the median value, the probability of further worsening approaches 0 after only 4–5 years without clinical progression. This last finding gives the EPs a possible role in the debate concerning the definition of BMS. According to an earlier EDSS-based definition of BMS, a patient with a final EDSS ≤ 3.0 was declared “benign” after 15 years from onset. Recently, the diagnostic criteria for BMS have been redefined as an EDSS score ≤ 2.0 after a disease duration of at least 10 years. However, the debate has been reopened by recent reports that cognitive impairment was detected in 45% of a large group of patients fulfilling traditional criteria for BMS, and by the suggestion that neuropsychological tests can contribute to a more accurate identification of “true” BMS. In this study we provide evidence that the EP score may be an interesting covariate for the definition of BMS. Indeed, the risk of worsening is almost null in patients with an F_EP score lower than 5 points and a time to EDSS 2.0 of 4 or more years (dotted curve in Figure
, representing 22% of the patient population). As shown in Figure
, our results are in line with both the earlier and the latest, more stringent definition of BMS, because the probability falls below about 10% after 10 years and below 5% after 15 years; furthermore, the dotted curve indicates that for patients with a low F_EP score, the risk becomes less than 10% after about 4–5 years if the EDSS remains below 2.0, and already tends to zero between 5 and 10 years. This finding suggests that the information drawn from the EPs could substantially improve the sensitivity and decrease the time needed to make a diagnosis of BMS according to the latest criteria. We acknowledge, however, that the failure to include motor evoked potentials (which were not available for the entire patient population and have been shown to correlate significantly with the EDSS score) and other potentially useful covariates, such as neuropsychological tests, could have reduced the accuracy of our prediction. It is our belief that a protocol combining all the variables of interest for the prediction of disability should be encouraged.
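The two EDSS-based definitions of BMS discussed above, and the low-risk pattern observed in this study, can be written side by side as simple rules. The following Python sketch is illustrative only: the function names are hypothetical, the thresholds are those quoted in the text, and the third rule is a description of our finding rather than a validated diagnostic criterion.

```python
def benign_by_edss(edss, years_since_onset, max_edss, min_years):
    """Generic EDSS-based check: disability at or below a cap after enough years."""
    return years_since_onset >= min_years and edss <= max_edss


def benign_earlier_definition(edss, years_since_onset):
    # Earlier definition quoted in the text: EDSS <= 3.0 after 15 years from onset.
    return benign_by_edss(edss, years_since_onset, max_edss=3.0, min_years=15)


def benign_recent_definition(edss, years_since_onset):
    # More recent definition quoted in the text: EDSS <= 2.0 after at least 10 years.
    return benign_by_edss(edss, years_since_onset, max_edss=2.0, min_years=10)


def low_risk_by_ep_and_tt2(f_ep_score, tt2_years):
    # Pattern reported in this study: F_EP score lower than 5 points and
    # a time to EDSS 2.0 of 4 or more years. Illustrative rule, not a validated criterion.
    return f_ep_score < 5 and tt2_years >= 4
```

The contrast between the rules is the point: the EDSS-based definitions require 10 to 15 years of observation, whereas the EP-based pattern can be evaluated after about 4 years.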
As recently underlined by Schlaeger et al., predictions based on EPs do not seem to be influenced by immunomodulatory treatments. S. Vucic suggested that this may imply an element of disease irreversibility already at the time of initial assessment. If this were true, it would be advisable to reconsider the therapeutic and monitoring approach to BMS in the light of early predictions based on appropriate multivariate models. This also appears to support the need for further studies employing sensory and motor EPs together with neuropsychological tests to provide a more reliable prediction of BMS.
In the prospective part of this work we evaluated the risk of progression to EDSS 3.5 by applying our model to data partially obtained during the period 2009–2011. The outcome was correctly predicted by the model in 16 of the 17 patients who completed the two-year follow-up; the subject who was misclassified had received a predicted probability close to 0.5. To improve the usefulness of the model and reduce false negatives, we are paying special attention to patients with a predicted probability between 0.4 and 0.6. Four of the 33 patients who were assessed only once during 2009–2011 fell within this range and are now being closely monitored.
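The monitoring rule described above can be sketched in a few lines of Python. Two points are assumptions rather than statements from the text: the classification cutoff of 0.5 (suggested by the remark that the misclassified subject had a prediction close to 0.5) and the inclusion of the boundary values 0.4 and 0.6 in the monitored range. The function name and labels are hypothetical.

```python
def triage(probability, low=0.4, high=0.6, cutoff=0.5):
    """Classify a predicted probability of reaching EDSS 3.5.

    Probabilities inside [low, high] are flagged for close monitoring
    instead of being treated as a firm prediction.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if low <= probability <= high:
        return "monitor closely"
    if probability > cutoff:
        return "progression expected"
    return "progression not expected"


# Proportion of correct predictions reported for the prospective sample:
# 16 of the 17 patients who completed follow-up.
accuracy = 16 / 17
```

Treating the 0.4 to 0.6 band as a separate category reflects the fact that the only misclassification occurred near the cutoff, where the model is least informative.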