Multiple Sclerosis (MS) is a chronic, progressive, and disabling disease of the central nervous system with dramatic variations in the combination and severity of symptoms it can produce. The lack of reliable disease-specific health-related quality of life (HRQL) measures for use in clinical trials prompted the development of the Neurology Quality of Life (Neuro-QOL) instrument, which includes 13 scales that assess physical, emotional, cognitive, and social domains, for use in a variety of neurological illnesses.
Initial assessment of the reliability and validation of the Neuro-QOL short forms (SFs) in MS.
We assessed reliability, concurrent validity, known groups validity, and responsiveness using cross-sectional and longitudinal data from 161 recruited MS subjects.
Internal consistency was high for all measures (α = 0.81 - 0.95) and ICCs were within acceptable range (0.76 - 0.91). Concurrent and known groups validity were highest with the Global HRQL question. Longitudinal assessment was limited by the lack of disease progression in the group.
The Neuro-QOL SFs demonstrate good internal consistency, test-retest reliability, and concurrent and known groups validity in this MS population, supporting the validity of Neuro-QOL in adults with MS.
Multiple sclerosis (MS) is a progressive, degenerative disease of the central nervous system1, affecting between 250,000 and 350,000 people in the United States with about 10,400 new diagnoses yearly2. MS typically manifests in young adulthood and follows an unpredictable, widely varying clinical course. Relapses and progression throughout the disease course result in accumulating disability, a profound impact on health-related quality of life (HRQL), and diminished physical, social, and cognitive functioning compared to other chronic disorders3, 4.
The impact of MS on HRQL has prompted the development and implementation of many disease-specific measures in clinical trials, though with limited success.5 Variability in the interventions studied and measures employed, together with inconsistent use of psychometric methods to develop and implement measures, makes synthesizing results and determining the robustness and validity of available measures challenging. The NIH, FDA, and other federal agencies are interested in evaluating specific aspects of function that are comparable across interventions and diseases. As such, there is a push for implementing measures that span physical, emotional, and social functioning, especially those that were developed using modern psychometric techniques such as item response theory (IRT) and that can be administered in a variety of formats.
The goal of the NINDS-funded Neuro-QoL project was to develop a core set of universally applicable HRQL questions for patients with chronic neurological conditions supplemented with disease-specific questions. The project underwent multiple phases leading up to the final clinical validation of the Neuro-QoL item banks and associated short forms (SFs), which are brief, fixed-length forms of 6-10 items each. The adult conditions included in Neuro-QoL were Amyotrophic Lateral Sclerosis (ALS), Adult Epilepsy, MS, Parkinson's Disease, and Stroke. Two pediatric conditions were also included: pediatric epilepsy and muscular dystrophy. The developers ensured clinical and psychometric validity of these tools by identifying the needs of the clinical research community6, 7, ensuring clinical and patient-driven evidence of importance and relevance of the selected QoL domains, and an expert-based consensus selection of the priority conditions8. Input from experts, caregivers, and patients determined the QoL domains included in the Neuro-QoL9. Item banks for each of the 13 domains were constructed and calibrated using Item Response Theory (IRT) in a sample of adults and children from the General Population (GP) and Clinical Sample (CS) of those suffering from neurological conditions as previously described9-11 with scoring and interpretation details available online via Neuro-QoL.12 In this paper we report the multi-site validation of the Neuro-QoL SFs with a clinical sample of adult MS patients.
We compared cross-sectional and longitudinal data from the Neuro-QoL SFs, MS-specific and generic legacy measures, and the PROMIS global health scale. Patients were recruited as part of a large multi-center study to validate Neuro-QoL measures across five adult and two pediatric neurological diseases. The five MS sites (Cleveland Clinic, Dartmouth-Hitchcock Medical Center, NorthShore University Health System, University of Chicago Hospital, and the University of Texas Health Science Center at San Antonio) all had Institutional Review Board (IRB) approvals. Study inclusion criteria for adults were: age 18 or older, English-speaking, community resident, and having sufficient cognitive ability to complete the informed consent process for each participating site. The MS-specific inclusion criterion was a clinician-confirmed diagnosis of MS. The subjects included were a convenience sample of consecutive clinic attendees.
Data were collected at 3 time points – Baseline, Day-7 and Month-6. Baseline and Month-6 data were collected at the clinical sites and Day-7 data were collected by phone. There was a 5-9 day window for the test-retest assessment (Day-7) and a 5-7 month window for the responsiveness assessment (Month-6). Baseline and Month-6 evaluations included the Neuro-QoL instruments, concurrent validity measures, and socio-demographic and clinical data forms that were self-administered by computer or conducted by study personnel. Clinician ratings and chart reviews were also conducted as part of these two visits. The 30-minute Day-7 visit was conducted to assess test-retest performance of the Neuro-QoL instruments administered to subjects over the phone by study personnel. All data were submitted to, and managed by, the coordinating center at Northwestern University.
Neuro-QoL SFs were validated in relation to generic and MS-specific measures of physical, mental and social functional status, and disease severity. These data were obtained by subject self-rated and clinical assessments (Table 1).
The 13 Neuro-QoL SFs (Figure 1) were self-administered at baseline and Month-6 and administered at Day-7 via phone. T-Scores were calculated, with T=50 indicating average function relative to a reference population and a standard deviation (SD) of 10. GP T-Scores were the reference values for Neuro-QoL Positive Affect and Well-Being, Applied Cognition-General Concerns, Applied Cognition-Executive Function, Lower Extremity (Mobility), Upper Extremity (Fine Motor, ADL), Ability to Participate in Social Roles and Activities, Satisfaction with Social Roles and Activities, Depression, and Anxiety. CS T-Scores were the reference values for Stigma, Fatigue, Sleep Disturbance, and Emotional and Behavioral Dyscontrol.
Karnofsky Performance Status Scale (KPS)13 rates functional impairment and diagnosis-independent breakdown of activity level across patients.
Symbol Digit Modalities Test (SDMT)14 tests information processing speed, visual acuity, and figural memory. The oral version was administered.
PROMIS 10-Item Global Health Scale (GHS)17 items include global ratings of the five primary PROMIS domains (physical function, fatigue, pain, emotional distress, and social health) and general health perceptions that cut across domains. It can be scored into a Global Physical Health component and Global Mental Health component.
Global HRQL Question (GHQ)18, a single item from the Functional Assessment of Chronic Illness Therapy (FACIT), “I am content with the quality of my life right now,” was used as a global measure of quality of life and assessment of convergent validity. It has five response options, ranging from “not at all” to “very much.”
Global Rating of Change (GRC)19 is an assessment of patients’ subjective evaluation of the amount of change they experienced over the six-month period of the study. We have previously simplified the original 15-level response option to a 7-level option, now ranging from −3 = “very much worse” to +3 = “very much better”20. Individual GRC questions were developed for 6 life domains: Physical, Emotional, Cognitive, Social/Family, and Symptomatic Well-being, and overall quality of life.
KPS13 was assessed as described above.
The Multiple Sclerosis Functional Composite (MSFC)21 consists of three objective quantitative tests of neurological functioning: the 9-Hole Peg Test (9-HPT; upper extremity function), Timed-25-Foot Walk (T-25FW; mobility) and the Symbol Digit Modalities Test-oral version (SDMT; cognition). Raw scores for each test were converted into Z-scores for each component; Z-scores were averaged to create an overall composite score, per instructions from the measure developers.
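The composite scoring described above can be sketched in Python. This is only an illustration under stated assumptions: the reference means and SDs below are hypothetical placeholders, and the sign flip for the timed tests (so that a higher Z-score always means better function) is an assumption, since the text states only that component Z-scores were averaged according to the developers' instructions.

```python
from statistics import mean

# Hypothetical reference values, for illustration only. A real analysis
# would use the reference means and SDs given by the measure developers.
REFERENCE = {
    "9hpt": {"mean": 25.0, "sd": 5.0, "higher_is_better": False},   # seconds
    "t25fw": {"mean": 6.0, "sd": 2.0, "higher_is_better": False},   # seconds
    "sdmt": {"mean": 50.0, "sd": 10.0, "higher_is_better": True},   # correct responses
}


def z_score(test, raw):
    """Standardize one raw score against its (hypothetical) reference values."""
    ref = REFERENCE[test]
    z = (raw - ref["mean"]) / ref["sd"]
    # Assumption: timed tests are sign-flipped so that higher means better.
    return z if ref["higher_is_better"] else -z


def msfc_composite(raw_scores):
    """Average the component Z-scores into a single composite score."""
    return mean(z_score(test, raw) for test, raw in raw_scores.items())


# Made-up patient: slower than reference on the 9-HPT, at the reference on
# the T-25FW, and better than reference on the SDMT.
patient = {"9hpt": 30.0, "t25fw": 6.0, "sdmt": 60.0}
composite = msfc_composite(patient)
```

In this made-up case the component Z-scores are −1, 0, and +1, so the composite is 0, which shows how averaging can mask opposing deficits and strengths across components.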
Functional Assessment of Multiple Sclerosis (FAMS)22 is an MS-specific measure, used here to assess convergent validity in this population, that includes 44 items summarized into six subscales: mobility, symptoms, emotional well-being (depression), general contentment, thinking/fatigue, and family/social well-being.
Descriptive statistics were calculated for the Neuro-QoL, external validation measures, and socio-demographic and clinical variables at the baseline assessment and follow-up visits. When comparing MS patients’ Neuro-QoL T-scores with the GP and CS reference groups, score differences of less than 0.5 SD units (i.e., 5 points, range 45-55) were considered to be within the range of the reference groups’ average.
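The 0.5 SD rule can be expressed as a small Python check. This is a minimal sketch assuming a reference mean of 50 and SD of 10 for the T-score metric; treating the endpoints 45 and 55 as inside the range is an assumption, and the function names are illustrative.

```python
REFERENCE_MEAN = 50.0   # T-score mean of the reference population
REFERENCE_SD = 10.0     # T-score SD of the reference population
THRESHOLD_SD = 0.5      # 0.5 SD units, i.e. 5 T-score points


def sd_units_from_reference(t_score):
    """Distance of a T-score from the reference mean, in SD units."""
    return (t_score - REFERENCE_MEAN) / REFERENCE_SD


def within_reference_average(t_score):
    """True if the T-score lies in the 45-55 band around the reference mean.

    Assumption: the endpoints 45 and 55 are counted as within range.
    """
    return abs(sd_units_from_reference(t_score)) <= THRESHOLD_SD


# A mean T-score of 42.7 lies 0.73 SD below the reference mean and is
# therefore outside the band, whereas 48.0 is inside it.
outside = within_reference_average(42.7)
inside = within_reference_average(48.0)
```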
Internal consistency was calculated for Neuro-QoL SF baseline scores using Cronbach's alpha coefficient, with coefficient scores of .70 or higher considered acceptable. Test-retest reliability of Neuro-QoL SFs at baseline and Day-7 was assessed with Intraclass Correlation Coefficients (ICC) and corresponding 95% confidence intervals, with coefficient scores of .70 or higher considered acceptable.
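For readers unfamiliar with Cronbach's alpha, the following Python sketch shows the standard formula, alpha = k/(k−1) × (1 − sum of item variances / variance of total scores), applied to a small made-up response matrix. The data and the function name are hypothetical, and the sketch does not reproduce the software or the ICC model used in the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up data: 5 respondents answering a 3-item short form on a 1-5 scale.
toy_responses = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 3, 4],
    [4, 5, 4],
    [5, 5, 5],
]
alpha = cronbach_alpha(toy_responses)
acceptable = alpha >= 0.70   # the acceptability criterion stated above
```

Because the three toy items rise and fall together across respondents, alpha is high here; items that vary independently of one another would drive it toward zero.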
Concurrent validity was assessed at baseline by calculating the Spearman rho correlation coefficients between Neuro-QoL SF scores and the generic and disease specific legacy measures. Interpretation guidelines for these correlations were: <0.30 = nominal; 0.30-0.49 = small; 0.5-0.69 = moderate; ≥0.70 = large. The strength of these correlations was hypothesized a priori to the analysis and results are based on those predictions, with correlations >0.50 considered acceptable. Known groups validity was assessed at baseline using analysis of variance comparing baseline mean Neuro-QoL SF scores between MS patients grouped by MS severity using the MSFC and self-reported GHQ.
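The interpretation bands above, and the effect size for the known groups comparisons, can be sketched as follows. This is an illustration under assumptions: correlations are banded on their absolute value, and eta squared is computed with its standard definition (between-group sum of squares divided by total sum of squares), since the text does not specify which variant was used. The function names and toy scores are hypothetical.

```python
from statistics import mean


def correlation_strength(rho):
    """Label a correlation coefficient using the bands given in the text.

    Assumption: the sign is ignored, so -0.60 and 0.60 get the same label.
    """
    r = abs(rho)
    if r >= 0.70:
        return "large"
    if r >= 0.50:
        return "moderate"
    if r >= 0.30:
        return "small"
    return "nominal"


def eta_squared(groups):
    """Eta squared for a one-way layout: SS_between / SS_total."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((value - grand_mean) ** 2 for value in all_values)
    return ss_between / ss_total


# Made-up T-scores for three severity groups (e.g. tertiles).
toy_groups = [[40, 42, 44], [50, 52, 54], [60, 62, 64]]
effect = eta_squared(toy_groups)
label = correlation_strength(-0.591)
```

In the toy data nearly all of the variance lies between groups, so eta squared is close to 1; overlapping groups would give values near 0.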
Neuro-QoL sensitivity to change was assessed by evaluating general linear models using each patient's change score between Month-6 and baseline relative to change in the generic GRC. A correlation of ≥0.30 was set as the criterion for responsiveness, with p≤0.05 considered moderately significant and p=0.001 highly significant.
No imputation of missing data was done for patients who failed to participate at the sensitivity to change assessment; however, we prospectively monitored the reasons for missing data (e.g., refusal, disease progression, death) and compared characteristics of patients who did and did not participate.
A total of 161 MS participants completed the baseline assessment, with 132 also completing the Month-6 assessments. Baseline demographics indicated that subjects were predominantly female (86%), white (88%), and non-Hispanic (93%). Their average age was 49.8 years (SD=10.5), 58.4% were married, and 90% had some college education or a degree. Thirty-seven percent were on disability and 34% were fully employed. MSFC scores ranged from −2.90 to 1.7, with a mean of 0.0 (SD=0.69). The most common disease courses were relapsing remitting (62.9%) and secondary progressive (28.9%). No systematic differences were found between participants who did or did not complete all assessments or in these scores across the 5 study sites (data not shown).
MS patients were within 5 T-score units of the GP mean for Positive Affect and Well-Being, Applied Cognition-Executive Function, Ability to Participate in Social Roles and Activities, Satisfaction with Social Roles and Activities, Depression, and Anxiety. MS patients scored worse than the GP mean by more than 0.5 SD units for Applied Cognition-General Concerns (M=42.7), Lower Extremity (M=43.5), and Upper Extremity (M=44.3). When compared to the CS reference group, they were within 0.5 SD for Stigma, Fatigue, Sleep Disturbance, and Emotional and Behavioral Dyscontrol. Internal consistency was high for all measures (α=0.81-0.95) and ICCs were within acceptable range for all SFs (0.76-0.91) (Table 2).
The KPS correlated highly with Lower Extremity and moderately with Upper Extremity, Satisfaction with Roles and Activities, and Stigma (Table 3). Two correlations for Depression and Fatigue did not meet the criterion (r>0.50). The EQ-5D moderately correlated with 6 of the 13 Neuro-QoL measures and demonstrated low correlations with the remaining 7. The Global Quality of Life question met the high correlation criterion for Ability to Participate in Social Roles and Positive Affect and Well-Being, and correlated moderately with the other scales, except for the Upper and Lower Extremity Functioning and Applied Cognition.
Six of the Neuro-QoL measures strongly correlated with the FAMS Total Score and the remaining 7 only moderately correlated, with the highest being Satisfaction with Social Roles (r=0.830) and lowest being Upper Extremity (Fine Motor; r=0.578) (Table 4a).
Five Neuro-QoL measures demonstrated strong correlations with the FAMS subscales; the strongest being between FAMS General Contentment and Neuro-QoL Positive Affect and Well-Being (r=0.862) and between FAMS Mobility and Neuro-QoL Lower Extremity (r=0.862) with the remaining subscales demonstrating moderate correlations (Table 4a).
The MSFC total moderately correlated with Lower Extremity (r=0.546) and Upper Extremity (r=−0.591) but not with cognition or depression Neuro-QoL measures. The MSFC T-25-FW strongly correlated with Mobility (r=0.81), the MSFC 9-HPT moderately correlated with Upper Extremity (r=0.631) and the MSFC-SDMT had only low correlations with the Neuro-QoL scales.
Comparing the MS subjects grouped by MSFC tertile, the Neuro-QoL Physical Constructs Upper Extremity (eta2=0.30) showed strong discriminative ability and Lower Extremity (eta2=0.19) showed moderate discrimination. The Emotional, Cognitive, and Social Constructs did not discriminate among the MSFC groups (Table 5).
When grouped by response to the single-item GHQ, the Physical Constructs Fatigue (eta2=0.40) and Sleep (eta2=0.32) showed strong discriminative ability. All 5 Emotional Constructs, Depression (eta2=0.44), Anxiety (eta2=0.27), Stigma (eta2=0.30), Positive Affect and Well-Being (eta2=0.65), and Emotional and Behavioral Dyscontrol (eta2=0.18), showed strong discriminative ability, as did the Social Constructs, Ability to Participate in Social Roles and Activities (eta2=0.44) and Satisfaction with Social Roles and Activities (eta2=0.44). The Physical Constructs Lower Extremity (eta2=0.15) and Upper Extremity (eta2=0.14) showed moderate discriminative ability, as did the Cognitive Constructs, Applied Cognition-General Concerns (eta2=0.18) and Applied Cognition-Executive Function (eta2=0.19) (Table 5).
The responsiveness of each Neuro-QoL SF from baseline to Month-6 was evaluated against the responses to the 6 GRC scores (Table 6). All reported measures met the correlation criterion of ≥0.30 and were strongly significant (p<0.001). For the Physical GRC (41.6% remained about the same) there was no reported change in the Neuro-QoL Physical Construct measures. The Social/Family GRC (55.3% remained about the same) strongly correlated with the Positive Affect and Well-Being measure. The Emotional GRC (44.1% remained about the same) strongly correlated with Depression and Positive Affect and Well-Being. The Cognitive GRC (54.7% remained about the same) strongly correlated with Fatigue and Positive Affect and Well-Being. The Symptomatic GRC (39.8% remained about the same) strongly correlated with Positive Affect and Well-Being. The Overall QOL GRC (45.3% remained about the same) strongly correlated with Depression and Positive Affect and Well-Being. Generally, these GRCs were good anchors, as demonstrated by their correlations with Neuro-QoL change scores. While there is some support for Neuro-QoL responsiveness, the power of the analysis was sharply limited because many patients did not experience significant change during the six months they were in the study.
We evaluated the reliability and validity of the thirteen Neuro-QoL short forms in a sample of persons with MS living in the community who were recruited from five study sites across the United States.
Not surprisingly, compared to the GP, this MS sample demonstrated scores worse than 0.5 SD on cognitive function-general concerns, worse upper and lower extremity physical function, and poorer satisfaction with social roles and activities. That the MS subjects showed Positive Affect and Well-Being, Anxiety, and Depression scores comparable to those of the GP sample is likely to be a result of our sample.
Additional testing with more significantly affected MS patients is ongoing and will increase our understanding of this finding. MS patients are comparable to the CS for Stigma, Fatigue, Sleep Disturbance, and Emotional and Behavioral Dyscontrol, highlighting the relevance of stigma and behavioral assessment in the MS population. These findings demonstrate that Neuro-QoL is a valuable tool for understanding the relative impact of different neurological conditions and different sub-groups within a disease group. All of the Neuro-QoL measures demonstrated high internal consistency.
The correlation findings indicate that the Neuro-QoL measures offer unique insight into the experience of persons with MS compared to the other measures. The moderate correlations between each of the generic measures and the Neuro-QoL Stigma measure are of interest, as none of the generic measures specifically address stigma, which is not commonly assessed as a negative consequence of MS.
The MSFC is a clinical assessment of MS disability that includes a total score and a score for each of its components. As expected, the total MSFC score showed moderate correlations with the Lower Extremity and Upper Extremity measures, but the overall MSFC did not correlate with the Cognitive, Social, or Emotional domains. The lack of correlation between the SDMT and the two Neuro-QoL cognition measures is consistent with reports of poor correlation between self-reported cognition and neuropsychological tests.23-25
Since the only significant correlations the MSFC had were with the Neuro-QoL Lower and Upper Extremity scales, it is not surprising that only those two measures demonstrated known groups discrimination. In contrast, when grouped by the GHQ, all of the domains showed between-group discrimination.
This sample of MS patients showed very limited change in their status over the 6-month period of this observational study. This is expected given that most studies of MS disease-modifying therapies require a 24-month period to distinguish between treatment arms. It is not surprising that there was limited responsiveness to change in the Neuro-QoL measures. There were similarly limited changes in FAMS scores over the same study period (data not shown).
Individuals with MS included in this sample experienced a rather limited level of disability because they needed sufficient walking and hand function to complete the MSFC. While a T-score is interpreted relative to a reference population, improved understanding of the “meaning” of a score in the context of individuals living with a given disease will lend clinical meaningfulness to these measures. Work is underway to improve such interpretation guidelines.26, 27
These data provide initial validation for the Neuro-QoL SF measures in a sample of persons with MS. The measures demonstrate strong internal consistency and test-retest reliability. Given their focus on quality of life in neurological conditions, their stronger correlations with other disease-specific PROs than with generic PROs, and with PROs than with clinical measures of MS severity, are expected. The Neuro-QoL assesses several domains of well-being not typically assessed using traditional MS-specific PROs. Those additional domains include Positive Affect and Well-Being and Emotional and Behavioral Dyscontrol. The relevance of Stigma as a component of quality of life clearly emerged from these data. The availability of one PRO measure that assesses physical, cognitive, and emotional domains of well-being and has been evaluated using a unified validation approach in five adult and two pediatric neurological conditions represents a major advancement in the ability to assess the impact of different interventions within one disease group and across individuals living with several neurological diseases. We believe Neuro-QoL provides an excellent opportunity for researchers and clinicians alike to explore aspects of MS patients’ experiences that have not been previously studied, and that it advances opportunities to study the impact of different diseases across neurological conditions.