We aimed to develop, validate, and evaluate a disease-specific outcome measure for SBMA: the Spinal and Bulbar Muscular Atrophy Functional Rating Scale (SBMAFRS). We examined the Japanese version (SBMAFRS-J) in 80 Japanese SBMA subjects to evaluate its validity and reliability. We then assessed this scale longitudinally in 41 additional SBMA subjects. The English version (SBMAFRS-E) was also tested in 15 US subjects. The total score of the SBMAFRS-J was distributed normally without an extreme ceiling or floor effect. For the SBMAFRS-J, high intra- and inter-rater agreement was confirmed (intra-class correlation coefficients [ICCs] 0.910 and 0.797, respectively), and internal consistency was satisfactory (Cronbach’s alpha 0.700–0.822). In addition, the SBMAFRS-J demonstrated concurrent, convergent, and discriminant validity, except for the respiratory subscale. The inter-rater reliability and internal consistency of the SBMAFRS-E were also satisfactory. Longitudinally, the SBMAFRS-J showed a higher sensitivity to disease progression than the existing clinical measures. In conclusion, we developed and validated a disease-specific functional rating scale for SBMA in both Japanese and English versions, although it needs to be re-assessed in interventional studies with a larger sample size that includes English-speaking subjects.
Spinal and bulbar muscular atrophy (SBMA), also known as Kennedy’s disease, is an adult-onset, hereditary neuromuscular disease characterized by extremity muscle atrophy, weakness, contraction fasciculation, and bulbar involvement [1–3]. The progression of neurological deficits is usually slow in SBMA, with the average interval between the onset of symptoms and death approximately 20 years [4,5].
There is currently no effective treatment to slow the progression of SBMA. Although clinical trials of potential therapies have been conducted [6–9], efficacy has not been clearly demonstrated in randomized controlled trials. These results appear to be partly attributable to the absence of established outcome measures that are sensitive to the disease-specific symptoms, as well as to the limited number of patients, which may diminish statistical power. Specifically, because the progression of SBMA is slow, existing outcome measures such as those for amyotrophic lateral sclerosis (ALS) are not sensitive enough to detect the deterioration of motor function in these patients.
Thus, the present study aimed to develop and validate a disease-specific functional rating scale for SBMA (SBMAFRS) and to evaluate whether this scale is more advantageous than other existing outcome measures. In addition, for future use in global clinical studies, the English version of SBMAFRS was also evaluated for reliability.
The SBMAFRS was designed as a quantitative outcome measure of global disability in SBMA, especially for use in clinical trials. According to this concept, we developed the SBMAFRS using the following four stages: item generation and scale development; cross-cultural translation; scale validation; and scale evaluation.
At the item generation and scale development stage, we first listed all the items of the revised Amyotrophic Lateral Sclerosis Functional Rating Scale (ALSFRS-R) and those of the modified Norris Scale, and created novel items regarding bulbar and truncal function, which were not included in the existing scales. Seven SBMA experts then selected as many candidate items as possible, taking into consideration the score distributions in our previous studies [7–10]. From the candidates, the fourteen most suitable items were selected and, if necessary, modified to fit the symptoms and severity of SBMA. The correspondence between the items of the SBMAFRS and those of the existing scales is shown in Table SA-1. In determining item balance, we gave weight to disease-specific features and their clinical relevance to SBMA. In the SBMAFRS, 5 of the 14 items address bulbar-related symptoms because these are the primary symptoms of SBMA and are occasionally lethal. Although pulmonary function tends to be preserved in SBMA, one breathing-related item was adopted because serious respiratory dysfunction has been reported in certain subsets of patients and is associated with the prognosis [5,8]. This strategy may permit the identification of specific areas that respond to treatment or the sub-classification of SBMA into etiologically or therapeutically relevant subtypes.
We designed alternatives for each item so that the scale was sensitive enough to detect subtle neurological changes. For example, although moderate dysphagia that requires a change of dietary consistency is scored as 3 in the ALSFRS-R, we subdivided this status into the following 2 scores in the SBMAFRS: score 2, avoids some particular textures of food that may cause choking, etc.; and score 1, food textures always have to be modified to a soft or chopped consistency. In addition, we also ensured that the expressions used in the alternatives were as clear as possible to avoid ambiguity. For instance, a specified frequency was required to choose the alternative in the item for control of salivation: score 3, less than once a week; and score 2, not less than once a week.
As a result, a functional rating scale consisting of 14 items was generated. At the cross-cultural translation stage, we translated the original Japanese version (SBMAFRS-J) into English (SBMAFRS-E) so that the scale could be used in English-speaking countries. The translation procedure followed standard methods. The SBMAFRS-J was translated into English by a bilingual translator, taking cultural differences into account, and the translation was then submitted to an expert committee. This committee, consisting of C.G., G.S., and M.K., reviewed and revised the first English version for linguistic and conceptual equivalence with the original Japanese version. Next, this English version was back-translated into Japanese by a native Japanese speaker without a priori knowledge of the intent and concept of this scale. Finally, the expert committee reviewed all reports, confirmed the linguistic and conceptual equivalence of this scale, and completed the English version. At the scale validation stage, we assessed the SBMAFRS-J in 80 subjects with SBMA and evaluated its validity and reliability. We also assessed the reliability of the SBMAFRS-E in 15 subjects with SBMA in the USA. At the scale evaluation stage, we performed a 48-week longitudinal analysis of an additional 41 Japanese subjects with SBMA to evaluate the usefulness of the SBMAFRS-J as a potential endpoint in clinical trials. The evaluators of the SBMAFRS were physicians or research nurses who were familiar with the medical care of SBMA.
We recruited a total of 80 consecutive male Japanese subjects with SBMA as the SBMAFRS-J validation group. The inclusion criteria were as follows: 1) genetically confirmed male Japanese SBMA subjects with more than one of the following findings: muscle weakness, muscle atrophy, and bulbar dysfunction; 2) subjects who were 25–75 years old at the time of informed consent; and 3) subjects who had not undergone surgical castration.
To assess inter-rater reliability, a total of 80 Japanese participants were assessed twice by two independent evaluators. To assess intra-rater reliability (test–retest reproducibility), 26 of the 80 Japanese participants were assessed twice at an interval of 1–3 weeks. This 1–3-week period was considered to be short enough to avoid pathophysiological progression, but long enough to reduce the chance that the subjects would recollect their previous responses.
We independently recruited a total of 41 male Japanese subjects with SBMA as the SBMAFRS-J evaluation group and followed them for 48 weeks to assess the usefulness of this outcome measure to detect disease progression. The inclusion criteria were the same as those of the SBMAFRS-J validation group.
We also recruited a total of 15 male subjects with SBMA in the USA as the SBMAFRS-E validation group. The inclusion criteria were the same as those of the Japanese version. To assess inter-rater reliability, all participants were assessed twice by two independent evaluators.
We followed the Japanese subjects at Nagoya University Hospital, and collected their data between May 2010 and July 2014. The US subjects were followed at NIH, and their data were collected between September 2012 and May 2013.
This study also included 41 Japanese male healthy controls matched for age and with no significant medical history of neurological disorders. We informed the controls that their results would be compared to those of subjects with SBMA, and they were evaluated at Kojunkai Chuo Clinic, Tokai, Japan.
This study conformed to the ethics guidelines for human genome/gene analysis research and to those for epidemiological studies endorsed by the Japanese government. The ethical committee of Nagoya University Graduate School of Medicine approved this study, and all participants gave their written informed consent. All of the US subjects signed informed consent at the National Institutes of Health in Bethesda, Maryland, USA.
We used two other functional rating scales that are commonly applied to evaluate disease severity, i.e., the ALSFRS-R and the modified Norris score, to assess the external validity of the SBMAFRS.
The ALSFRS is a validated, questionnaire-based scale that measures the physical function of subjects with ALS in carrying out activities of daily living. The revised version of this scale, the ALSFRS-R, was generated to correct the disproportionate weighting of limb and bulbar function relative to respiratory dysfunction. Each of the 12 items is rated on a 5-point ordinal scale (0–4), and thus the best possible score is 48. The ALSFRS-R was translated into Japanese, and its validity has been confirmed.
The modified Norris Score is another rating scale for ALS that consists of two parts, i.e., the Limb Norris Score and the Norris Bulbar Score. The Limb Norris Score has 21 items to evaluate limb function, and the Norris Bulbar Score has 13 items to assess bulbar function. Each item is rated in 4 ordinal categories (0–3), and thus the best possible scores are 63 and 39, respectively. The original version was translated into Japanese, and its validity has been confirmed.
Statistical analyses were performed using SPSS Statistics 17.0 (SPSS Japan, Inc., Tokyo, Japan). We used descriptive variables such as the mean and standard deviation to summarize the quantitative measures. The normality of the score distribution was assessed by the Shapiro–Wilk test. The score distributions were also assessed by skewness and kurtosis. For multiple comparisons, p values were corrected using Dunnett’s test. We examined the scaling assumption (confirmatory factor analyses); acceptability (distribution of the subscales and total score); reliability (test–retest reliability: intra-class correlation coefficients [ICCs] with a one-way mixed effect model; inter-rater reliability: ICCs with a two-way mixed effect model); and validity (internal consistency: Cronbach’s alpha; inter-scale correlations; and convergent and discriminant construct validity). A value of 0.70 is often recommended as the minimum standard for ICCs to indicate reliability. A Cronbach’s alpha of 0.70–0.90 is considered to indicate good internal consistency. In the longitudinal analysis, a standardized response mean (SRM) was also calculated as an index of the effect size for a direct comparison among the outcome measures. The SRM was calculated as the ratio of the mean score change to the standard deviation of the score change. Values of 0.20, 0.50, and 0.80 were considered to represent small, moderate, and large changes over time, respectively.
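As an illustration of two of the indices defined above, the following minimal Python sketch computes Cronbach’s alpha and the SRM from raw scores. It is not the SPSS procedure used in this study; the function names and the example numbers are hypothetical and serve only to make the formulas concrete.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-subject rows (one score per item)."""
    k = len(item_scores[0])  # number of items
    # Sample variance of each item across subjects.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the per-subject total score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standardized_response_mean(baseline, follow_up):
    """Mean score change divided by the standard deviation of the change."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical total scores of four subjects at baseline and at follow-up.
baseline = [40, 38, 36, 30]
follow_up = [38, 37, 33, 28]
srm = standardized_response_mean(baseline, follow_up)
```

Because scores fall as the disease progresses, the SRM computed this way is negative; it is the absolute value that would be compared with the 0.20, 0.50, and 0.80 thresholds.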
The SBMAFRS consists of 14 items, each of which has 5 alternatives scored from 0 to 4 (Table 1). The 14 items are grouped into 5 subscales: bulbar-, upper limb-, trunk-, lower limb-, and breathing-related. The total score ranges from 0 (worst) to 56 (normal). The bulbar-related subscale contains 5 items, the upper limb-related subscale contains 2 items, the trunk-related subscale contains 4 items, the lower limb-related subscale contains 2 items, and the breathing-related subscale contains 1 item. To minimize differences in the questioning and examination styles among raters, a rating algorithm that specifies the exact procedures to be used was attached to this scale (Fig. SA-1).
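The structure described above can be written down as a small scoring routine. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and the item wording itself (Table 1) is not reproduced. It encodes only the item counts per subscale and the 0–4 rating range stated in the text.

```python
# Number of items per subscale, as described in the text (5 + 2 + 4 + 2 + 1 = 14).
SUBSCALE_ITEMS = {
    "bulbar": 5,
    "upper_limb": 2,
    "trunk": 4,
    "lower_limb": 2,
    "breathing": 1,
}
MAX_ITEM_SCORE = 4  # each item is rated from 0 (worst) to 4 (normal)


def score_sbmafrs(ratings):
    """Return (subscale subtotals, total score) for one subject.

    ratings: dict mapping each subscale name to its list of item ratings.
    """
    subtotals = {}
    for name, n_items in SUBSCALE_ITEMS.items():
        items = ratings[name]
        if len(items) != n_items:
            raise ValueError(f"{name}: expected {n_items} items, got {len(items)}")
        if any(r not in range(MAX_ITEM_SCORE + 1) for r in items):
            raise ValueError(f"{name}: ratings must be integers from 0 to 4")
        subtotals[name] = sum(items)
    return subtotals, sum(subtotals.values())


# A subject with normal function on every item reaches the maximum of 56.
normal = {name: [MAX_ITEM_SCORE] * n for name, n in SUBSCALE_ITEMS.items()}
subtotals, total = score_sbmafrs(normal)
```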
The background characteristics of the subjects included in the present study were similar to those of previous clinical studies [7,8,10]. There was no significant difference in demographics among the three study populations: Scale validation group (Japan), Scale evaluation group (Japan), and Scale validation group (USA) (Table 2).
The scores of the SBMAFRS-J and SBMAFRS-E were 36.5 ± 7.4 (range 20–54) and 37.3 ± 3.8 (range 31–43), respectively. The total score was normally distributed (Fig. 1A). The absolute values of skewness and kurtosis of the SBMAFRS total scores were smaller than those of the other functional scales, suggesting that the SBMAFRS scores were approximately normally distributed (Table SA-2). A comparison of the total score distributions of the SBMAFRS-J and ALSFRS-R showed that the SBMAFRS-J scores varied from low to high without a ceiling or floor effect, whereas the distribution of the ALSFRS-R scores was skewed to the right (Fig. 1B). The total score of the SBMAFRS-J for the 41 healthy controls was 55.9 ± 0.4 (range 54–56), which was close to the highest possible score.
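The distribution checks mentioned here can be sketched as follows. This is a simplified illustration, not the SPSS output reported in Table SA-2: it uses the plain moment-based definitions of skewness and excess kurtosis (SPSS applies small-sample corrections), and the function names and example scores are hypothetical.

```python
def skewness_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def fraction_at_bounds(scores, lowest, highest):
    """Share of subjects at the floor and at the ceiling of a scale."""
    n = len(scores)
    return scores.count(lowest) / n, scores.count(highest) / n


# Hypothetical scores on a 0-48 scale, clustered at the maximum.
floor_share, ceiling_share = fraction_at_bounds([48, 48, 47, 40], 0, 48)
```

A large share of subjects at the maximum score is one simple indicator of a ceiling effect, which limits how much further change a scale can register in mildly affected subjects.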
Intra-rater agreement based on the total score of the SBMAFRS-J obtained in 26 Japanese subjects with SBMA was excellent. Detailed analysis of the ICCs for each subscale demonstrated that all subscales, except for the breathing function subscale, had an excellent intra-rater agreement (Table 3). Inter-rater reliability was also high, although detailed analysis of the ICCs for each item indicated lower values than for intra-rater reliability. Internal consistency was satisfactory, with Cronbach’s alpha values ranging from 0.700 to 0.822 (Table 3). Inter-rater reliability and internal consistency of the SBMAFRS-E were similar to those of the Japanese version of this scale, SBMAFRS-J, although the ICC for the breathing-related subscale of the SBMAFRS-E was better than that of the SBMAFRS-J.
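For readers who want to see how such agreement coefficients are obtained, the sketch below computes two common single-measure ICC forms from the ANOVA mean squares: a one-way form, and a two-way consistency form that removes systematic differences between raters or sessions. Which exact form SPSS reported for this study is not stated in the text, so the choice here is an assumption, and the function name and example ratings are hypothetical.

```python
def icc_from_ratings(ratings):
    """Return (one-way ICC, two-way consistency ICC), single measures.

    ratings: list of rows, one per subject; each row holds k ratings
    (k raters, or k sessions of the same rater).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters

    ms_rows = ss_rows / (n - 1)
    ms_within = (ss_total - ss_rows) / (n * (k - 1))  # one-way residual
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # two-way residual

    icc_one_way = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    icc_two_way = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_one_way, icc_two_way


# Hypothetical example: the second rater always scores one point higher.
one_way, two_way = icc_from_ratings([[1, 2], [3, 4], [5, 6]])
```

In this example the constant one-point offset lowers the one-way coefficient but leaves the two-way consistency coefficient at 1, which illustrates why the choice of model matters when comparing intra- and inter-rater values.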
As a “gold standard” disease-specific functional parameter has not been established for SBMA, we could not evaluate the criterion validity. Instead, we assessed the correlation of the subscales of the SBMAFRS with those of existing functional scales to evaluate their convergent and discriminant construct validity (Table 4). The directions and patterns of correlations were consistent with what we assumed. For example, the bulbar-related subscale of the SBMAFRS correlated best with that of the ALSFRS-R and Norris Bulbar Score, and worst with the lower limb- or respiration-related subscale of the ALSFRS-R and Limb Norris Score. Conversely, the lower limb-related subscale of the SBMAFRS correlated best with that of the ALSFRS-R and Limb Norris Score, and worst with the bulbar-related subscale of the ALSFRS-R and Norris Bulbar Score.
As stated above, the SBMAFRS was developed with the assumption that the items can be categorized into the following 5 subscales: bulbar-, upper limb-, trunk-, lower limb-, and breathing-related. To verify this assumption, we applied factor analysis using oblique varimax rotation to the SBMAFRS-J (Table SA-3). This analysis extracted 3 factors that accounted for 60.0% of the total variance. Factor 1 mainly included the trunk- and lower limb-related subscales, while the bulbar- and upper limb-related subscales both contributed to factor 2. These results supported the factorial validity of the SBMAFRS, although the bulbar and upper limb subscales were not separated, and neither were the trunk and lower limb subscales. Thus, the motor function of the SBMA subjects assessed using the SBMAFRS appears to be classified into 3 domains: bulbar and upper limb; trunk and lower limb; and respiratory. These findings were consistent with our previous study showing that bulbar impairment is closely related to upper limb dysfunction, and that trunk dysfunction is associated with lower limb weakness in SBMA.
To evaluate the sensitivity of the SBMAFRS as an outcome measure, we prospectively analyzed longitudinal changes in the SBMAFRS-J and other functional parameters for 48 weeks. The results showed slow but steady disease progression in all subscales of the SBMAFRS-J, as well as in the other functional parameters (Table 5, Fig. 2, and Fig. SA-3). To facilitate a direct comparison of responsiveness among the various functional parameters, we calculated the SRM for each parameter as an index of the effect size, i.e., the longitudinal mean change divided by the standard deviation of the score change (Table 5). The SRM of the SBMAFRS was larger than that of the other scales, indicating that the SBMAFRS would require the smallest sample size in clinical trials and thus appears to be the most sensitive functional parameter [17,18]. The sample size estimated from this longitudinal analysis was the lowest for the SBMAFRS, followed by the ALSFRS-R and Norris Bulbar Score, suggesting that the SBMAFRS is a sensitive clinical measure for detecting disease progression over time (Fig. 3).
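The link between the SRM and the required sample size can be illustrated with a standard two-arm approximation. The formula used for Fig. 3 is not given in the text, so the sketch below is an assumption: it supposes a parallel-group trial, equal variance of the score change in both arms, a two-sided test, and a treatment that slows progression by a fixed fraction. The function name and parameter defaults are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(srm, slowing=0.5, alpha=0.05, power=0.8):
    """Approximate subjects per arm: n = 2 * (z_alpha + z_beta)^2 / d^2.

    d is the standardized between-arm difference, taken here as the
    assumed fraction of slowing multiplied by the absolute SRM.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)
    effect = slowing * abs(srm)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect ** 2)


# Hypothetical comparison of two scales with different responsiveness.
n_more_responsive = subjects_per_arm(0.8)
n_less_responsive = subjects_per_arm(0.4)
```

Because the required number scales with the inverse square of the SRM, halving the SRM roughly quadruples the sample size under these assumptions, which is why a more responsive scale can make a trial in a slowly progressive disease more feasible.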
Outcome measures that enable the rigorous quantification of disease severity or quality of life are essential to estimate the efficacy of potential therapies [19–24]. Disease-specific functional rating scales that measure global dysfunction, based on a structured psychometric validation procedure, are increasingly important in defining the primary endpoint in clinical trials, because the quality of the rating scales has the potential to determine the validity of such studies [25–27]. Nevertheless, to date, no disease-specific outcome measures for SBMA have been available. In the present study, we created a disease-specific functional rating scale for SBMA (SBMAFRS) and validated this scale in cross-sectional and longitudinal analyses. One major advantage of the SBMAFRS is that it was developed as a compound scale to capture the multiple aspects of SBMA. The results of the present study suggest that the SBMAFRS is a valid and reliable measure for the quantitative evaluation of SBMA. In the longitudinal evaluation, the change in the SBMAFRS-J over time was more readily detectable than that of the other functional parameters, although large sample sizes are still required to detect therapeutic efficacy. These results indicate that the SBMAFRS is a disease-specific outcome measure that can be used in clinical studies.
The SBMAFRS was designed to overcome several problems that arose when we applied functional measures for ALS or other neuromuscular disorders to SBMA. First, it tends to be difficult to detect chronological changes in the neurological symptoms of subjects with SBMA because this disease progresses much more slowly than ALS. In the process of scale development, each item was designed to detect subtle symptoms by subdividing the alternatives for the milder symptoms. Second, the vague descriptions of the alternatives in the existing clinical measures might result in different responses between raters. We developed each alternative of the SBMAFRS to ensure that the expressions used were as clear as possible. As a result, the scores of the SBMAFRS were distributed over a wider range than those of the ALSFRS-R, suggesting that the SBMAFRS measures the physical function of individual subjects with mild motor dysfunction more sensitively than the ALSFRS-R.
The confirmatory factor analysis of this scale indicated that bulbar function is strongly associated with upper limb function in SBMA, as we previously demonstrated. A similar relationship has also been shown in ALS: the degree of bulbar palsy is more strongly associated with upper limb weakness than with lower limb weakness. As a possible pathomechanism for this phenomenon, the spread of lesions from the bulbar region to the upper limb has been demonstrated in ALS. Given that the spread of neuropathology has also been suggested in polyglutamine diseases, a similar mechanism may underlie the pathophysiology of SBMA.
The SBMAFRS-J was also examined prospectively over 48 weeks to evaluate its sensitivity to changes over time. To this end, we calculated the SRM as the index of the effect size. Calculation of the SRM enables a direct comparison of the sensitivity of the scales to detect changes among outcome measures. Compared to the other measures, the SBMAFRS had a larger SRM, indicating that its use might reduce the number of subjects needed for clinical trials. The effect sizes of disease-modifying therapies in slowly progressive neurodegenerative disorders generally tend to be small because of smaller longitudinal changes in proportion to large symptomatic daily fluctuations [31–33]. Although a relatively small SRM is an inevitable feature of slowly progressive neurodegenerative diseases such as SBMA, effort should be made to further increase the sensitivity of this scale through the experience gained by its use in future clinical trials.
Although our results suggest that the SBMAFRS is a valid disease-specific score for SBMA, there are limitations that need to be addressed. First, the low reliability of the breathing-related item is a critical drawback of the SBMAFRS-J and is attributable to several factors. One of the most important issues was that the alternatives of the breathing-related item were categorized within a narrow range, because respiratory function is preserved in most patients with SBMA and few need ventilatory support. This over-categorization might lead to discrepancies among evaluations and result in low intra-rater or inter-rater agreement. In addition, even a minor disagreement led to a low ICC for the breathing-related item of the SBMAFRS-J, because the score distribution was extremely skewed: 58 out of 80 subjects were judged as having the same score in the inter-rater agreement assessment (Table SA-4A). Similar results were obtained for the intra-rater reproducibility (Table SA-4B). Furthermore, the number of raters was eight for the SBMAFRS-J but two for the SBMAFRS-E. This difference might also explain the differing reliability of the breathing-related subscale between the two versions, since only one item was included in this subscale. The small number of US subjects and the relatively short longitudinal follow-up period also limit the interpretation of our results. In future studies, a larger cohort needs to be followed for a longer time to confirm the utility of the SBMAFRS. Second, traditional clinimetrics that adopt items without standardization or weighting, as in the SBMAFRS, would be disadvantageous for use in clinical trials because of their limited linearity. In this regard, Rasch analysis and item response theory are important approaches for transforming ordinal scores into interval measures that are scale-independent and accurate for patient assessment.
In fact, this methodology has been applied to several functional scales for neurological disorders, such as the 9-item fatigue severity scale (FSS) for immune-mediated neuropathies and the modified Medical Research Council grading system. Furthermore, the concept of the minimal clinically important difference (MCID) should also be considered as a means of assessing treatment response in clinical trials.
In summary, we created a disease-specific functional rating scale, the SBMAFRS, for SBMA, a slowly progressive neurodegenerative disorder. Although our results indicate the validity and sensitivity of this scale, its feasibility as an outcome measure needs to be re-assessed in interventional studies with a larger sample size that includes English-speaking subjects.
Dr. Katsuno is supported by a Grant-in-Aid for Scientific Research on Innovative Areas “Foundation of Synapse and Neurocircuit Pathology” from MEXT, Japan (22110005); KAKENHI grants from MEXT/JSPS, Japan (No. 22110005, 26293206, 26670440, and 26670439); Core Research for Evolutional Science and Technology (CREST) from the Japan Science and Technology Agency (JST); and a grant from the Daiichi Sankyo Foundation of Life Science.
Dr. Sobue is supported by KAKENHI grants from MEXT/JSPS, Japan (No. 21229011); grants from the Ministry of Health, Labour and Welfare, Japan; and Core Research for Evolutional Science and Technology (CREST) from the Japan Science and Technology Agency (JST).