Classification of patients with low back pain (LBP) may be important for improving clinical outcomes and research efficiency. The purpose of this study was to examine the inter-tester reliability of two trained physical therapists to classify patients with LBP using the standardized Movement System Impairment (MSI) classification system. The five proposed MSI classifications are based on the most consistent patterns of movement and alignment observed throughout the examination that correlate with the patient’s symptom behavior.
Test-retest to assess reliability
Academic healthcare center outpatient facility
Thirty subjects with chronic, recurrent LBP (mean age 31.1 ± 12.9 years, 21 F:9 M) were examined independently by two experienced physical therapists.
Training consisted of self-study of a procedure manual, supervised practice of examination procedures and classification rules and discussion. Subjects were examined independently by each therapist using a test-retest design. Each therapist assigned a LBP classification upon completion of the examination. Both therapists were blinded to the other therapist’s findings.
Inter-tester reliability of therapists classifying the LBP problems was indexed by the percent agreement and kappa coefficient.
Overall percent agreement on the classification assigned was 83% with kappa = .75 (95% CI = .51 to .99; P < .0001).
Inter-tester reliability of classification of patients with LBP using a standardized clinical examination based on the MSI classification system is substantial.
Low back pain (LBP) is a common condition that is associated with significant economic burden.1–5 Despite advances in clinical assessment and imaging there is no consistent evidence supporting any one conservative treatment approach for effective management of LBP. Thus, conservative treatment continues to be a challenge. Authors have suggested that the lack of consistent evidence to support conservative treatment may be due to the use of heterogeneous study populations for comparison.6, 7 Several investigators have suggested that classification of individuals with LBP into homogeneous subgroups may result in more efficient treatment strategies and improved clinical outcomes.8–10
Classifying people with LBP into homogeneous subgroups may increase the likelihood that they respond to a specific treatment. Traditionally, classification of LBP has been based on pathoanatomy; however, pathoanatomy is purported to be identified in only 10% of patients with LBP.11 Thus, classification based on pathoanatomy may not be the most effective method to guide treatment. Classification based on information collected from the clinical examination may be useful in identifying subgroups and guiding treatment choices.
Although many impairment-based classification systems for LBP have been proposed,12–17 only 3 systems were designed to direct rehabilitation and have been studied to some degree. The 3 systems that meet these criteria are the McKenzie (MK) LBP Classification system,15 the Treatment-Based Classification (TBC) system,16 and the Movement-System Impairment (MSI) Classification system for LBP.17 Table 1 provides an overview of the major similarities and differences across the 3 systems.
The primary basis for classification in the MK system is the person’s symptom behavior with spinal movements and sustained postures performed within a clinical examination. Specifically, symptoms are assessed with a series of single and repeated spinal movements or prolonged postures in different directions. The purpose of the symptom testing is to identify the pattern of spinal movements and postures that worsen and improve the person’s symptoms. Generally, the repeated spinal movements or postures that improve the symptoms (for example repeated extension in standing) are then prescribed as the exercise to manage the LBP.15
The TBC system was developed in response to limitations noted in the MK system.16 Similar to the MK examination the TBC examination includes assessment of symptoms with single and repeated spinal movements and sustained postures. Together, symptoms during testing and physical examination signs provide a basis for the person’s LBP classification. Treatments based on the person’s classification may include stabilization exercise, passive mobilization and manipulation, repeated spinal movements and sustained positions or traction. Instruction in modification of functional activities is general and provided only for the flexion and extension syndromes of the “specific exercise” classification.
Similar to the MK and TBC systems, the MSI system includes descriptions of classifications of LBP based on impairments related to symptoms and mechanical factors identified during a standardized examination. The MSI examination used to classify is similar to the MK and TBC examinations in that it includes tests of single trunk movements and symptoms with these movements. There are, however, several differences. First, the MSI examination is focused on symptoms produced not only with overt trunk movements (e.g., forward bending) but also with limb movements (e.g., hip rotation in prone).18, 19 The initial movement and alignment tests in which the person uses his preferred strategy are referred to as primary tests. Second, as an alternative to performing tests of repeated spinal movements, in the MSI examination symptomatic tests are immediately followed by a secondary test in which the person’s preferred alignment or movement strategy is modified. The effect of the secondary test is assessed relative to the symptomatic primary test. Overall, the modifications involve changing a person’s strategy by either 1) positioning the lumbar region in a neutral alignment, or 2) restricting lumbar region movement and encouraging movement in other segments (e.g., thoracic region or hip) as the person performs the secondary test.20 Third, examiner judgments of the characteristics of movement focus on amount of motion as well as the relative timing of movements of the spine and proximal limb joints.
The MSI classification of the LBP problem is based on the most consistent pattern of movement and alignment that is observed throughout the examination and that correlates with the individual’s symptom behavior. The MSI classifications are named for the direction(s) of movements and alignments that appear to contribute to the LBP problem. The classifications include lumbar (1) extension, (2) flexion, (3) rotation, (4) extension with rotation, and (5) flexion with rotation.21 Studies to validate 3 of the 5 classifications22 and various test items have been reported.19, 23–27
In order for a classification system to be useful, examiners must be able to determine an individual’s classification reliably. Thus far, only one study has been published on the ability of examiners to classify LBP problems using the MSI system. Our research group reported on the inter-tester reliability of physical therapists using the standardized examination to classify LBP based on the MSI Model for LBP.28 We reported that the original group of therapists demonstrated moderate29 reliability in classifying the LBP problem.25 The prior investigation of inter-tester reliability, however, included only therapists who were involved in the development of the examination for classifying LBP and had worked together extensively before testing their reliability. In addition, in our previous study, only one examination was performed with each patient while both therapists were present, one examiner and one observer.
The purpose of this study, therefore, was to examine the inter-tester reliability of 2 trained physical therapists to classify individuals with LBP based on the MSI classification system using a test-retest design. One therapist was involved in the development of the standardized examination for classifying. The second therapist involved in the study was not involved in the development process. Examination of the reliability of a new cohort of examiners using a test-retest design is an important step in increasing the generalizability of our previously reported findings. We hypothesized that experienced physical therapists could reliably classify people with LBP into subgroups based on the MSI model.
This study was approved by the Human Research Protection Office of blinded. A standardized examination, based on the MSI classification system was used to examine and classify subjects with LBP.17, 28 The goal of the examination is to identify the direction-specific movement and alignment strategies that consistently reproduce or increase a subject’s symptoms. A description of the five MSI subgroups has been published previously.22
Two physical therapists participated in the study. Each therapist had greater than 10 years of experience treating people with musculoskeletal conditions. The first author (MHH) was a board certified clinical specialist in orthopedic physical therapy, who had received instruction in using the MSI system through continuing education and used the concepts of the system in her clinical practice for 7 years. The first author was not involved in the development of the examination. The second author (LVD) had primary responsibility for developing the examination28 used in the current study and has studied properties of the examination and related treatment extensively.18–20, 22–25, 28, 30–32 Neither author was examined for reliability in the previous inter-tester reliability study.25, 28
The first author was trained by the second author in the operational definitions for examination items and responses, examination procedures and rules for classification. Training consisted of self-study of an operations manual and practice in the examination procedures with people with and without LBP (N=10). The operations manual included 1) the operational definitions for test items and responses that might be demonstrated by the subject, 2) the construct that each item is proposed to represent, 3) procedures for performing the examination, and 4) explicit rules for classifying the LBP problem. The rules for classification are included in Appendix A. The second author was present during practice to ensure proper performance of the examination procedures and application of the classification rules. The first author practiced using an examination form to record results from the examination and classification decision. There was time allowed for discussion after each training session.
Thirty subjects with chronic, recurrent LBP (mean age 31.1 ± 12.9 years, 21 F:9 M) were examined independently by the 2 therapists. Subjects were recruited from the community through newspaper and television advertisements, flyers placed in physician and physical therapy clinics and our University’s volunteer registry. Potential subjects contacted the research coordinator in the laboratory where the study was conducted. A detailed description of the study was provided to the potential subject and then he was asked if he would like to participate. If the subject indicated interest, he was then screened more extensively during the telephone interview to determine his eligibility based on the inclusion and exclusion criteria. Subjects between the ages of 18 and 60 who had symptoms related to a LBP problem were eligible to participate. Low back pain symptoms could include pain and paresthesias in the region of the lower back, proximal lower extremity or distal lower extremity.7 Subjects were excluded in the case of pregnancy or if they had been previously diagnosed by a physician with one or more of the following conditions: severe kyphosis, scoliosis, spinal stenosis, spinal surgery in the prior 3 months, more than one surgical procedure of the spine, cancer, rheumatoid arthritis, ankylosing spondylitis, or neurological disease. Subjects were also excluded if they were pending spinal surgery or were unable to stand and walk without an assistive device. All subjects read and signed an informed consent statement approved by Blinded Human Research Protection Office before participating in the study.
Subjects were examined independently by each therapist using a test-retest design. Each subject was examined by both therapists on the same day with a 15 minute break between examinations. The order in which the therapists performed the examination was determined by convenience of the therapists’ schedules. Examinations were performed in an enclosed treatment room. The therapist waiting to perform the examination was not allowed in the treatment room or surrounding laboratory area.
Prior to the testing session, the subject completed a set of self-report forms that captured data regarding demographics, general health status (SF-36)33, LBP history and LBP-related functional limitations (Oswestry Low Back Pain Disability Index)34. The first therapist then obtained history information and performed the physical examination. A standardized clinical examination form was used to record findings and the LBP classification.
While the subject rested the first therapist reviewed the self-report and history information with the second therapist. The second therapist then independently performed the physical examination and recorded her findings and LBP classification. Both therapists were blinded to the other therapist’s findings and were not allowed to discuss examination procedures during the testing phase of the study. Data forms were collected by a research coordinator after each testing session. A research assistant, independent of the data collection process, entered the data into text files and Systat 10.2 data files.
Descriptive statistics were calculated for demographic, general health status and LBP history variables. Percentage of agreement and a kappa coefficient35 were used to examine the inter-tester reliability of the therapists to classify the subjects with LBP. The 5 possible response categories for classifying a subject’s LBP problem were the proposed MSI LBP classifications.
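The two indices named above can be sketched in a few lines of Python. This is a minimal illustration, assuming the kappa coefficient is the unweighted Cohen form, (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each rater's marginal totals. The function names and the ten labels are hypothetical and are not the study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of subjects given the same label by both raters."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of subjects")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: product of the two raters' marginal proportions, summed over categories.
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for 10 subjects (NOT the study data).
therapist_1 = ["rot"] * 4 + ["ext_rot"] * 3 + ["flex_rot"] * 3
therapist_2 = ["rot"] * 4 + ["ext_rot"] * 2 + ["rot"] + ["flex_rot"] * 3

print(f"percent agreement = {percent_agreement(therapist_1, therapist_2):.2f}")
print(f"kappa = {cohen_kappa(therapist_1, therapist_2):.2f}")
```

In this toy example the raters agree on 9 of 10 subjects, yet kappa is lower than the raw agreement because part of that agreement would be expected by chance alone.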
Summary information regarding subject characteristics is provided in Table 2. Thirty subjects (mean age 31.1 ± 12.9 years, 21 F: 9 M) with chronic, recurrent LBP were examined. Eighty percent of the subjects reported symptoms in the back region only;7 the remaining 20% reported symptoms in the low back and into the lower extremity. Subjects reported minimal LBP-related disability based on the Oswestry Disability Index (13.6 ± 7.5). On average, the subjects reported 3.8 ± 3.4 acute flare-ups36 in the previous 12 months.
The first author performed the examination first for 16 of the 30 (53%) examinations. The therapists’ responses are summarized in Table 3. Overall percent agreement for the LBP classification assigned was 83% with a kappa value of .75 (95% CI = .51 to .99; z = 6.17, P < .0001). One subject was unable to be classified by either therapist. The therapists disagreed on 5 subjects. One subject was unable to be classified by one of the therapists and was classified as lumbar rotation by the other therapist. Two subjects were classified as lumbar extension with rotation by one therapist and lumbar rotation by the other therapist. Two subjects were classified as lumbar flexion with rotation by one therapist and lumbar rotation by the other therapist.
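The reported figures can be cross-checked with simple arithmetic. The sketch below derives the percent agreement from the counts given above and backs out the standard error implied by the confidence interval. It assumes the interval is a symmetric, normal-based interval of the form kappa ± 1.96 × SE; the text does not state how the interval was computed, so the implied SE is illustrative only.

```python
# Values reported in the text above.
n_subjects = 30
n_disagreements = 5
kappa = 0.75
ci_low, ci_high = 0.51, 0.99

# Percent agreement follows directly from the counts.
agreement = (n_subjects - n_disagreements) / n_subjects

# Assumption: symmetric normal-based interval, kappa +/- 1.96 * SE.
Z_95 = 1.96
ci_midpoint = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2
implied_se = half_width / Z_95

print(f"agreement = {agreement:.1%}")
print(f"CI midpoint = {ci_midpoint:.2f}")
print(f"implied SE = {implied_se:.3f}")
```

Under that assumption, the midpoint of the interval coincides with the reported kappa, and 25 agreements out of 30 subjects matches the reported 83%.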
A prerequisite for testing the usefulness of a classification system for LBP is for examiners to be able to reliably classify the proposed LBP problems. In the current study, two experienced physical therapists trained in a standardized examination for classifying LBP based on the MSI model were able to determine the LBP classifications for a sample of people with non-specific LBP with substantial agreement.29 The current study extends our previous findings25, 28 by demonstrating that, with training, a physical therapist not involved in the development of the examination can reliably classify people with LBP. These findings also demonstrate that substantial reliability can be attained using a test-retest design instead of a simultaneous observation design as used previously.25 Such findings extend the generalizability of our initial reliability testing and suggest that the examination could be used reliably by other experienced physical therapists given the appropriate training. Training in this study included self-study of a procedure manual, supervised practice of examination procedures and application of the classification rules to 10 individuals, as well as discussion. We believe, however, that learning the specific classification rules was key to attaining a high level of reliability.
We demonstrated reliability values similar to those reported for application of the MK system to classify.37, 38 Two recent studies assessing the inter-tester reliability37, 38 of the MK system involved physical therapists who had received extensive training and reported a minimum of 5 years of clinical experience using the MK system. All therapists were credentialed in the examination procedures of the MK system. The first study, by Kilpikoski et al,38 used test-retest methods similar to our study and reported that 2 physical therapists obtained an overall percent agreement for LBP classification of 95% and a kappa value of 0.6. Razmjou et al37 used simultaneous testing to assess reliability of 2 physical therapists and reported agreement of 93% and a kappa value of 0.7.
The reliability reported in the current study is better than or similar to that reported for physical therapists using the TBC system. Direct comparison of the current results to those of the TBC system is not possible, however, due to differences in therapist characteristics.39, 40 Heis et al39 examined the reliability of four experienced therapists who were newly trained to apply the TBC classification system. Data from one therapist were not included in the final analysis due to low agreement with the other therapists in the study. The authors reported that the agreement of the three remaining therapists was 55% with a kappa value of 0.46. Fritz et al40 also used a test-retest design to examine the reliability of therapists classifying with the TBC system. The therapists had an average of 5.5 years (range, 6 months to 15 years) of experience using the TBC system. Agreement of the seven therapists in making a classification decision was 65% with a kappa value of 0.56. Therapist training was not described. Table 4 provides a summary of the results of studies of the reliability of different cohorts of therapists to classify using the three classification systems (MK, TBC, MSI).
We have now reported on the inter-tester reliability of physical therapists classifying LBP in two independent samples.25, 28 The methods in the current study were more rigorous than those of our prior work, and yet the agreement obtained in the current study is higher than that obtained in our previous study.25, 28 The improvement in agreement is likely due to the therapists having more explicit rules for classifying than those available for our original reliability study (Appendix A). The prior study was the first attempt to test the measurement properties of the test items used in the examination, and the primary goal of the study was to examine the ability of therapists to make reliable judgments about individual items from the examination. The rules provided for classifying during the original study were more general than our present rules, and during training less emphasis was placed on learning and applying the rules for assigning a LBP classification than on making judgments about individual test items.25, 28 Information obtained from our original reliability study25, 28 and subsequent studies18–20, 22, 23, 32 has allowed us to develop more specific guidelines for making judgments during individual test items, and to develop more detailed rules for classification. Clarification of criteria for judgments during the examination and development of more specific classification rules likely contributed to our improved therapist agreement in the current study.
Currently, to assign a classification with the MSI system, symptoms must be either produced or increased with some test items during the examination. One subject reported no change in symptoms during either examination. A second subject reported no change in symptoms during the first examination and reported one test as symptom-provoking during the second examination. Following the rules for classifying, neither therapist assigned a classification to the first subject. For the second subject, the first therapist did not assign a classification while the second therapist was able to assign a classification. Thus, a limitation of our current criteria for classification is that an examiner may not be able to classify subjects with a low level of symptom irritability during the examination. After analysis of reliability was completed, the charts for the two subjects described were examined. In the instances where a classification was not assigned, each therapist recorded what she believed the patient’s LBP classification would be based solely on judgments of signs with tests of movement and alignment across the examination. For both subjects, the therapists agreed upon the classification, even though few to no tests evoked symptoms. The criterion of symptom reproduction during the examination, therefore, may represent a limitation in the classification rules. Based on the example from the current study, it may be possible to modify the classification rules to permit classification based on the signs during tests of movement and alignment made across the examination in the absence of symptom production.
We did not achieve perfect agreement in classifying the LBP problems present in our sample. The therapists disagreed on the classification of five subjects. To determine the nature of our disagreements, we reviewed the data from the examination forms of the subjects for whom there was disagreement. The first disagreement is described in the previous paragraph. Two additional classification disagreements were due to the therapists’ interpretation of symptoms during individual examination items. Specifically, two subjects described “pressure” in their low back region with a number of the items. One therapist interpreted the “pressure” as the subjects’ symptoms; the other therapist did not. Thus, the classification disagreement in these two cases was a result of the therapists’ interpretation of symptom behavior. In one subject, the therapists did not agree on the patient’s symptom report on a number of test items. The differences could have been a result of subject variability or misinterpretation of symptoms by the examining therapist. On inspection of the examination data from the fifth subject, the therapists agreed on the responses to individual items across the examination. One therapist, however, chose a classification inconsistent with the rules. Thus, one therapist misapplied the rules to classify the subject’s LBP problem.
We consider the use of a test-retest design in this study to add to the strength of our findings. In our previous work,25, 28 both therapists were present during the assessment of each patient. One therapist performed the examination, while the second therapist observed. The simultaneous observation method used in the previous study was intended to remove any variability in patient status or in therapist methods that could affect the results of a test-retest study design. Since the prior study was our first attempt to examine any of the measurement properties of the examination and classification system, the primary question we asked was whether the therapists could make the same judgments when they saw and heard the same responses. The use of the simultaneous observation method, therefore, could have positively affected our prior inter-tester agreement.26 In the current study, each therapist performed the examination independently. Despite possible variability in patient status and variability in methods between the two therapists, our inter-tester reliability was substantial.29
The current study has limitations. First, the extent to which the current sample is representative of all individuals with LBP is not known. Our sample of subjects was recruited from the community through advertisements, flyers placed in physician and physical therapy clinics and a University web-based volunteer registry. The subjects in our sample, therefore, may not represent all individuals with LBP who would present to a medical facility for treatment. The subjects, however, had similar Oswestry scores and pain location and severity as patients who typically are referred to our clinical setting. In addition, our subjects had chronic, recurrent LBP and minimal disability as indexed by the scores on the Oswestry Disability Index. We do not know if we would have similar results in subjects with an acute onset of LBP or with higher levels of disability. Future studies are needed to assess the use of the examination in subjects with acute LBP or higher levels of LBP-related disability.
A second potential limitation is the truncated distribution of the LBP classifications identified in the study sample. There were no subjects classified as lumbar flexion or lumbar extension in the current sample. Such a finding might also suggest that our study population may not represent all patients with LBP. We do know based on prior data22, 41 as well as those of others42 that the prevalence of lumbar flexion and lumbar extension problems appears to be less than that of the other proposed classifications. Although the percent agreement was 83%, the skewed distribution of subjects across categories may have contributed to an attenuation of the kappa value.
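The attenuation described above can be illustrated with a small numerical sketch. It assumes unweighted Cohen's kappa and uses two hypothetical two-category agreement tables, which are not the study data. Both tables have 90% observed agreement, but the skewed table has far higher chance agreement and therefore a lower kappa.

```python
def kappa_from_table(table):
    """Cohen's kappa from a square agreement table (rows: rater 1, columns: rater 2)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement from the product of matching row and column totals.
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical tables for 100 subjects each, both with 90% observed agreement.
balanced = [[45, 5], [5, 45]]  # categories used about equally often
skewed = [[85, 5], [5, 5]]     # one category dominates

print(f"balanced kappa = {kappa_from_table(balanced):.2f}")
print(f"skewed kappa = {kappa_from_table(skewed):.2f}")
```

Because kappa corrects observed agreement for chance agreement, a sample concentrated in a few categories can yield a kappa well below what the raw percent agreement suggests.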
A third potential limitation is the fact that both examinations for each subject were performed within the same day. We chose to perform both examinations on the same day, however, to ensure stability of subject responses. Stability of the subjects’ behavior is an important assumption of a test-retest design so that any differences between test sessions are due to variability in therapist methods and not a result of true change in the subject over time.43 We also examined people within the same day to make participation in the study more feasible for subjects. A potential disadvantage of repeated testing in the same day is that subjects’ symptoms could be increased during the second examination compared to the first examination. Any differences in subjects between the two sessions, however, did not substantially affect our reliability as evidenced by the kappa value (k = .75) obtained.
Finally, the generalizability of our findings to other examiners may still be somewhat limited. Both therapists were experienced in treating musculoskeletal pain problems and had practiced applying the concepts of the MSI model for LBP to patients. The first author had used the examination and treatment principles in her clinical practice across 7 years. The second author had primary responsibility in developing the examination and used the procedures extensively in prior studies. We do not know if we would find similar reliability in examiners with less clinical experience or less experience applying the principles of the model that is the basis for the MSI classification system. Our primary purpose with the current study, however, was to examine what therapists’ reliability to classify would be when we used a more rigorous study design (test-retest design) and when someone who was not involved in the original development of the examination was tested. The current study suggests that the reliability to classify people with LBP under more stringent conditions is actually better than that attained in our earlier reliability study. An appropriate follow-up to the current work would be to examine the inter-tester reliability of novice, but trained examiners. Such work is currently underway. After a two day instructional course, 13 examiners with no experience to moderate experience with the MSI classification system classified written cases of data from people with LBP. Agreement among therapists was excellent with an overall kappa of 0.81 (CI: 0.78–0.83, p<0.01) (unpublished data).
Funded by the US National Institutes of Health (NIH), grant number: 52833.
The protocol used for the current study was approved by the Human Studies Committee of Washington University.
I affirm that I have no financial affiliation (including research funding) or involvement with any commercial organization that has a direct financial interest in any matter included in this manuscript, except as disclosed in an attachment and cited in the manuscript. Any other conflict of interest (ie, personal associations or involvement as a director, officer, or expert witness) is also disclosed in an attachment.
Marcie Harris-Hayes, Assistant Professor, Program in Physical Therapy, Washington University School of Medicine, Campus Box 8502, St. Louis, MO 63108.
Linda R. Van Dillen, Associate Professor, Program in Physical Therapy, Washington University School of Medicine, Campus Box 8502, St. Louis, MO 63108.