Classification systems for glenohumeral instability (GHI) are opinion based, not validated, and poorly defined. This study is designed to methodologically develop and test a GHI classification system.
A systematic literature review identified 18 systems for classifying GHI. The frequency with which each characteristic was used was recorded. Additionally, 31 members of the American Shoulder and Elbow Surgeons responded to a survey to identify features important for characterizing GHI. Frequency, Etiology, Direction, and Severity (FEDS) were found to be most important. Frequency was defined as solitary (one episode), occasional (2–5x/year), or frequent (>5x/year). Etiology was defined as traumatic or atraumatic. Direction referred to the primary direction of instability (anterior, posterior, or inferior). Severity was defined as either subluxation or dislocation.
Fifty GHI patients completed a questionnaire at their initial visit. One of six sports medicine fellowship trained physicians completed a similar questionnaire after examining the patient. Patients returned after two weeks and were examined by the original physician and two other physicians. Inter- and intra-rater agreement for the FEDS classification system was calculated.
Agreement between patients and physicians was lowest for frequency (39%; k=0.130) and highest for direction (82%; k=0.636). Physician intra-rater agreement was 84–97% for the individual FEDS characteristics (k=0.69 to 0.87). Physician inter-rater agreement ranged from 82–90% (k=0.44 to 0.76).
The FEDS system has content validity and is highly reliable for classifying GHI. Physical examination using provocative testing to determine the primary direction of instability produces very high levels of inter- and intra-rater agreement.
Level II, Development of Diagnostic Criteria with Consecutive Series of Patients, Diagnosis Study.
A number of authors have proposed different methods to classify glenohumeral joint instability.1,4–7,10,13,19,22,24,27–29,31,35,36,39,47 These classification systems are based on expert opinion and, to date, no method has been assessed for validity and reliability, nor gained widespread acceptance. As a result, diagnoses for certain forms of instability (e.g. multidirectional instability, bidirectional instability, subtle instability and voluntary instability) have multiple and sometimes discordant definitions.9,17,19,23 This lack of consensus has produced a potpourri of descriptive terms for this condition, which confuses clinicians3 and the literature.23,30 This is reflected by the most commonly used system to classify instability in the United States, the ICD-9 codes, which have been shown to have poor reliability.40
Without established, validated, well-defined diagnostic criteria for classifying glenohumeral joint instability, comparing studies and compiling data in systematic reviews or meta-analyses is difficult and prone to error. The purpose of this work is to (1) methodically develop a system of classifying glenohumeral joint instability with content validity, and (2) determine the reliability of this classification system.
Content validity exists when a classification system reflects characteristics that are important for the condition it classifies. Two methods were used to identify important content in developing the classification system: a systematic review of the literature, and a survey of the membership of the American Shoulder and Elbow Surgeons (ASES).
Using the PubMed search engine, with search terms “shoulder” and “instability” and “classif$” or “definition”, 142 manuscripts were identified. The titles and abstracts were reviewed seeking any manuscript that might include a classification system for instability. Inclusion criteria included clinical studies, manuscripts anatomically confined to the glenohumeral joint, and review manuscripts. We identified 25 manuscripts which were retrieved and reviewed. The cited literature from these manuscripts was also reviewed to identify any other potential manuscripts that might include classification systems. Additionally, eleven chapters from nine textbooks on the shoulder and shoulder instability were reviewed.1,5–7,9,14,28,32,45,47 This search identified 18 different proposed classification systems for glenohumeral instability. The individual characteristics for each system were compiled in table format. Those characteristics of the classification system that occurred most frequently (in >50% of classification systems) were considered to be the most important and would be included when developing a classification system.
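The tallying step described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `common_characteristics` and the toy systems are hypothetical and are not the 18 systems identified in the review. It assumes each classification system is reduced to a list of the characteristics it uses, and it keeps those characteristics that appear in more than 50% of systems.

```python
from collections import Counter


def common_characteristics(systems, threshold=0.5):
    """Return {characteristic: share of systems} for characteristics
    used in more than `threshold` of the classification systems."""
    counts = Counter()
    for characteristics in systems.values():
        # Use a set so a characteristic counts once per system.
        counts.update(set(characteristics))
    n = len(systems)
    return {name: c / n for name, c in counts.items() if c / n > threshold}


# Hypothetical toy data, NOT the systems from the literature review.
toy_systems = {
    "System A": ["etiology", "direction", "severity"],
    "System B": ["etiology", "direction", "frequency", "volition"],
    "System C": ["direction", "severity", "frequency"],
    "System D": ["etiology", "direction", "severity", "frequency"],
}

shares = common_characteristics(toy_systems)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0%}")
```

In this toy data, "volition" appears in only one of four systems and is therefore excluded, while the other four characteristics pass the 50% cutoff.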
To further assure content validity, the membership of the American Shoulder and Elbow Surgeons (ASES) was surveyed at the annual closed meeting in 2005. This survey was developed with the assistance of epidemiologists and a sample of ASES members (Appendix 1). The survey asked respondents to rank a number of features with regard to their importance in defining and diagnosing glenohumeral joint instability. A seven-point Likert scale was used with 1= not at all important, and 7= extremely important. The average values for responses from each question were determined. Items graded as extremely important (average score > 6.0) were used in developing the classification system for glenohumeral joint instability.
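As a rough sketch of the selection rule just described, the following Python snippet averages seven-point Likert responses per item and keeps only items whose average exceeds 6.0. The item names and scores are invented for illustration and are not the ASES survey data; `extremely_important` is a hypothetical helper name.

```python
from statistics import mean


def extremely_important(responses, cutoff=6.0):
    """Return {item: average score} for items whose average Likert
    score (1 = not at all important, 7 = extremely important)
    is strictly greater than `cutoff`."""
    averages = {item: mean(scores) for item, scores in responses.items()}
    return {item: avg for item, avg in averages.items() if avg > cutoff}


# Hypothetical responses, NOT the actual survey results.
toy_responses = {
    "history of trauma": [7, 6, 7, 7],
    "provocative tests": [6, 7, 7, 6],
    "MRI findings": [5, 6, 4, 6],
    "exam under anesthesia": [6, 6, 6, 6],  # averages exactly 6.0
}

selected = extremely_important(toy_responses)
print(selected)
```

Because the rule is "average score > 6.0", an item averaging exactly 6.0 is not selected in this sketch.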
The literature search identified 18 different proposed methods for describing glenohumeral joint instability.1,4–7,10,13,19,22,24,27–29,31,35,36,39,47 The individual and distinct characteristics for each classification system were extracted and listed in table format (Table 1). Of the many different features used to characterize glenohumeral joint instability, etiology, direction, severity and frequency were used most commonly, each appearing in more than 60% of classification systems. Other features, including voluntary instability, were used in less than 40% of the proposed classification systems. With these data, the authors developed a classification system for glenohumeral instability called the “FEDS” Classification System. “FEDS” is an acronym for Frequency, Etiology, Direction and Severity. These features were the most commonly cited characteristics in the previously proposed systems for classifying instability.
To further sub-classify the FEDS criteria, the membership of the ASES was surveyed during their 2005 Annual Closed Meeting. Of the 130 members who attended the meeting, 31 returned surveys (23.8%). Salient findings from this survey included that 90.3% of respondents believe instability is poorly defined in the literature (Appendix 1). The patient’s history and physical examination in the office were rated as extremely important features for assessing patients with glenohumeral joint instability. With regard to the history, a history of trauma and the patient demonstrating the position of the arm that reproduces the symptoms were most important. For the physical exam, finding a position of the arm that reproduces symptoms, provocative tests, and reproduction of symptoms during translation testing were extremely important features. In addition, the physical exam in the office was rated as extremely important in determining the direction of the instability.
None of the radiographic techniques scored as extremely important in identifying the type of glenohumeral joint instability. None of the examination under anesthesia findings was scored as extremely important in identifying the type of glenohumeral instability. Because the history and physical examination using provocative tests were considered extremely important, these elements were selected to be included in the classification system, whereas findings from imaging, examination under anesthesia, and surgery were not.
A panel of five shoulder experts (defined as fellowship-trained academic shoulder specialists with >10 years' experience) reviewed the results of the ASES survey and, during open discussion, came to consensus regarding methods to sub-classify the individual FEDS criteria (Figure 1). As the ASES survey rated history and physical examination as extremely important, the sub-classification details must rely on the history and physical examination. As imaging and findings at surgery were not rated as extremely important, they were not used. With regard to etiology, a history of trauma was rated as extremely important and as such was used as the distinguishing criterion. With regard to direction, physical examination in the office using provocative testing was extremely important, and was the criterion used. By consensus the group chose to describe a primary direction for instability, as the literature suggests that the concept of multidirectional instability may be difficult to define and has poor agreement.3,17,23,30 With regard to severity, requiring assistance to reduce the shoulder was the only historical information that the group identified to gauge severity of the instability. As such, any patient with an episode requiring assistance to reduce the shoulder would be placed in the dislocation class. With regard to frequency, the group agreed that a patient with a solitary episode is managed differently than someone with occasional episodes, who may be managed differently than someone with frequent episodes. Frequency was therefore divided into solitary (one episode), occasional (2–5x/year), and frequent (>5x/year). A denominator of one year was chosen due to the seasonal nature of athletics.
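The sub-classification rules above are simple enough to express as a small decision function. The Python sketch below is illustrative only and is not a clinical tool: the names `FEDS`, `classify_frequency`, and `classify` are hypothetical, and it assumes the history has already been reduced to four inputs (episodes per year, whether there was a history of trauma, the primary direction found on provocative testing, and whether the patient ever needed assistance to reduce the shoulder). Treating "one episode" as an episode count of 1 is a simplifying assumption.

```python
from dataclasses import dataclass

DIRECTIONS = ("anterior", "posterior", "inferior")


@dataclass(frozen=True)
class FEDS:
    frequency: str
    etiology: str
    direction: str
    severity: str


def classify_frequency(episodes_per_year):
    """Solitary (one episode), occasional (2-5/year), frequent (>5/year)."""
    if episodes_per_year < 1:
        raise ValueError("at least one episode is required")
    if episodes_per_year == 1:
        return "solitary"
    if episodes_per_year <= 5:
        return "occasional"
    return "frequent"


def classify(episodes_per_year, history_of_trauma, primary_direction,
             ever_needed_help_reducing):
    """Combine the four history and examination findings into a FEDS class."""
    if primary_direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}")
    return FEDS(
        frequency=classify_frequency(episodes_per_year),
        etiology="traumatic" if history_of_trauma else "atraumatic",
        direction=primary_direction,
        # Any episode requiring assistance places the patient in the
        # dislocation class; otherwise the class is subluxation.
        severity="dislocation" if ever_needed_help_reducing else "subluxation",
    )


example = classify(7, True, "anterior", True)
print(example)
```

For instance, a hypothetical patient with seven episodes per year after an injury, anterior symptoms on provocative testing, and at least one episode requiring assisted reduction would be labeled a frequent, traumatic, anterior dislocation.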
Institutional Review Board approval was obtained prior to initiating this effort. The sample size for reliability testing of the FEDS classification system was derived from Walter et al,44 whereby 40 subjects would be required in a study using 3 raters and a minimal level of reliability of 0.5 to achieve a significance level (alpha) of 0.05 and give the study 80% power (beta = 0.20). All patients who presented to our institution with a shoulder complaint were asked “Do you have or have you had the feeling that your shoulder is slipping, falling out, dislocating, or is loose?” If the patient answered yes to that question, they were considered to have a chief complaint of shoulder instability and were eligible for the study. From December 2005 through February 2007, 50 patients presenting to our institution with the chief complaint of shoulder instability were enrolled.
After informed consent was obtained, patients completed a simple survey (Figure 2), which allowed patients to use the FEDS classification system to classify their instability. One of six fellowship trained sports medicine physicians then completed a similar survey (Figure 3). Prior to initiating the study, physicians were instructed in how to determine the primary direction of instability by reviewing a video that demonstrated the following provocative physical exam tests: apprehension test, posterior jerk test, sulcus sign, as well as translation testing in the awake patient in the anterior, posterior and inferior directions with reproduction of symptoms as the outcome of interest. Physicians were instructed to ask the patients which of the tests reproduced their instability symptoms. Patients who had symptoms for more than one direction were asked to identify the one direction that was the best at reproducing their symptoms. This determined the primary direction of instability. Physicians classified patient instability using this and the rest of the FEDS system (Figure 3).
Both inter-rater and intra-rater agreement were determined by having patients return to the office after a period of 2–4 weeks, where they completed the survey again. The treating physician completed the physician survey at that visit. In addition, two other physicians examined the patient and completed surveys. Three reliability assessments were made: 1.) Patient vs. physician inter-rater agreement. As much of the FEDS classification derives from the history, patients and physicians should agree. We hypothesized that patients might have difficulty identifying the direction of the instability. 2.) Treating physician vs. him or herself agreement. Intra-rater reliability was obtained by comparing the treating physician survey from the initial visit to the survey at the second visit. 3.) Treating physician vs. other physicians. Inter-rater reliability was tested by comparing the treating physician’s second evaluation survey to the surveys completed by the two other examining physicians. Observed agreement, kappa statistics and strength of associations18 were calculated.
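For readers unfamiliar with these statistics, the following Python sketch shows how observed agreement and an unweighted Cohen's kappa can be computed for two raters, together with a strength-of-agreement label. Several assumptions are made here: the study's exact kappa variant (for example, how the three physician raters were combined) is not specified in the text, the strength bands are the commonly used Landis and Koch categories, and the ratings are invented rather than study data.

```python
from collections import Counter


def observed_agreement(a, b):
    """Proportion of cases on which two raters gave the same category."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the raters' marginal category frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


def strength(kappa):
    """Landis and Koch style label (assumed to match the cited scheme)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical direction ratings for ten patients, NOT study data.
rater_1 = ["anterior"] * 6 + ["posterior"] * 3 + ["inferior"]
rater_2 = ["anterior"] * 5 + ["posterior"] * 4 + ["anterior"]

p_o = observed_agreement(rater_1, rater_2)
kappa = cohen_kappa(rater_1, rater_2)
print(f"agreement={p_o:.0%}, kappa={kappa:.3f}, strength={strength(kappa)}")
```

The toy example illustrates why both numbers are reported: the raters agree on 8 of 10 patients, but because most ratings fall in one category, chance agreement is high (0.48) and kappa is correspondingly lower than the raw percentage.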
No eligible patient declined to participate; as such, this represents a consecutive series of patients. Of the 50 patients who enrolled in the study, 48 completed all study surveys (96%).
Because information from the FEDS classification is largely based on the history, patients may be capable of classifying their own instability. Inter-rater reliability in this circumstance ranged from slight for frequency (39% agreement, k=0.13) to substantial for direction (82% agreement, k=0.548) (Table 2).
Intra-rater agreement was very high for the FEDS classification with agreement ranging from 84–97%, and kappas ranging from 0.687 to 0.874 (Table 3). The highest agreement was seen for direction and etiology.
Different physicians showed high agreement using the FEDS classification. Agreement ranged from 82% to 90%, with kappas ranging from 0.437 to 0.764, indicating moderate to substantial strength of agreement (Table 4).
Currently there is no accepted classification system for glenohumeral joint instability, which leads to confusion in the literature. McFarland et al compared four different classification systems for patients with instability and found great variation, particularly with regard to multidirectional instability,23 leading the editors of the Journal of Bone and Joint Surgery to opine that McFarland’s article was a “…provocative call to action”, and “Until the criteria for diagnosis are clearly defined, investigators will be unable to contribute in a compelling way to understand the condition since they cannot know whether studies are comparing ‘apples and oranges’.”30 This was supported by work by Chahal et al,3 where physicians had poor agreement when asked to classify patient scenarios of glenohumeral joint instability.
This difficulty may stem from the fact that much of the historic literature in orthopaedics is treatment-based, in which all patients who received a particular treatment are reviewed retrospectively. This is exemplified by Neer’s classic paper on multidirectional instability.25 Neer included all patients who had an inferior capsular shift, yet his patient population was diverse with a variety of instability features (17% atraumatic, 73% traumatic; 73% with anterior symptoms, 73% with posterior symptoms; 5% dislocations, and 95% subluxations). These treatment-based studies with a mixed population of patients lead to confusion in defining and classifying diagnoses, and as a result lead to confusion regarding which treatments are effective. Ideally research should be condition-based, where a clearly defined group of patients is isolated and different treatments are compared. A valid and reliable classification system for glenohumeral joint instability will allow for this kind of condition-based research.
The FEDS system for classifying instability meets this challenge. It has content validity based on published literature and a survey of experts in the field. It has been shown to be very reliable as well. It is simple to use and does not require expensive diagnostic imaging or examinations under anesthesia. Interestingly, none of the individual components of the FEDS system (frequency, etiology, direction or severity) demonstrated higher agreement among physicians than the others. Also of interest was the unexpected finding that patients and physicians agreed on the direction of instability 82% of the time, with substantial kappa strength, suggesting that patients may be able to accurately describe the direction of their instability.
It could be argued that this classification system is limited as it does not include some of the accepted instability descriptors historically used to classify patients, namely voluntary instability, subtle instability, and multidirectional instability.
Voluntary instability is a concept best explored by Rowe in 1973.34 In this landmark study, Rowe collected a group of patients who were able to demonstrate their instability to the clinician. Rowe administered psychological profile testing to these patients and determined that those who scored poorly did not do as well with surgical intervention. These data suggest there are two populations of people who can demonstrate their shoulder instability. Some are reluctant, but can show their instability to the treating physician, typically with pain or discomfort, a group we call demonstrable instability. Others can demonstrate their instability for secondary gain or other issues, which we call volitional instability. However, because physicians do not perform psychological testing on their patients, this concept has led to a great amount of confusion, with a variety of other descriptors for this condition in the literature including “habitual instability”14 (which some authors have erroneously used to include both voluntary and involuntary instability11) and “involuntary positional instability”.38 Because these definitions are confused in the literature and it is difficult to distinguish which patients may have psychological issues without psychological testing,19,34 we believe the term “voluntary instability” is not particularly accurate in classifying glenohumeral joint instability.17 We would suggest using the FEDS classification system to describe these patients and using the terms “demonstrable” or “volitional” as subcategories only if psychological profile testing is used to distinguish these patients.
Carter Rowe also described the “Dead Arm” Syndrome in 1987.33 Many of his patients were aware of their arm slipping, others were not. He considered all to have instability and performed instability surgery to treat them. In the FEDS classification system, only those who feel as if their arm is slipping would be considered to have instability. The problem is, as Rowe noted, pain is not specific for instability. Many of Rowe’s patients had “signs and symptoms of bursitis, biceps tendonitis, nerve impingement, cervical spine referred pain, and thoracic outlet syndrome”.33 As such, it is not clear if these patients truly had instability. We cannot include these patients in a classification of instability without severely diluting the accuracy of the diagnosis. Similarly, Frank Jobe, in 1989 created a term for an athlete with shoulder pain called “subtle instability”12 (also known as “occult instability” as described by Garth et al8). In this condition the patient may not have symptoms of the shoulder subluxing or dislocating. Yet excessive laxity presumably leads to other pathologies and other symptoms like pain. Jobe used an instability operation to treat these patients and reported good success.12 We would argue that the term “subtle instability” is a poor choice, and that perhaps “presumptive excessive laxity” would have been better, as these patients have symptoms of pain and not a sensation of a loose, slipping, or dislocating shoulder.17 We believe that as our understanding of the pathomechanics of the thrower’s shoulder develops, a unique system for classifying different grades of pathology in the painful shoulder of the athlete will evolve.
In 1980, Neer described the condition of multidirectional instability, which gained widespread acceptance.25 We purposefully decided to avoid the concept of “multidirectional instability”, and instead focused on the primary direction of symptoms when describing the direction of the instability. We did this for the following reasons: 1.) The term “multidirectional instability” has been used by different authors to mean different things.23 As a result the literature is very confusing,3,13,20,40 and it is doubtful that a consensus for this term will ever be reached. 2.) Neer originally described the condition of multidirectional instability as having the sine qua non feature of an increased sulcus sign.25 His patients would not be neglected in the FEDS system, which would classify these patients in the primary direction inferior groups. In our opinion, the FEDS classification would provide better resolution, as the other important features would segregate these patients with consistently less variation. 3.) It could be argued that every form of shoulder instability could have excessive translations in multiple planes, as biomechanical research and clinical studies suggest that the capsule of the glenohumeral joint behaves as a circle and that injuries are unlikely to produce damage in only one part of the capsule.26,41,45–46,48 These points argue for the elimination of the concept of multidirectional instability and argue for the concept of a primary direction of the instability. Interestingly, provocative physical examination tests looking for a reproduction of the patient’s symptoms of instability, including the anterior apprehension test, the sulcus sign, and translation tests that reproduce symptoms, have been found to be sensitive, specific, and have high predictive values, with reasonable inter-examiner reliability.2,21,37,42,43,48 Therefore, these features are the best available to evaluate patients with shoulder instability. In the FEDS system, they are used in a comparative fashion to identify the primary direction of instability by finding which provocative test is most uncomfortable or most closely reproduces the patient’s symptoms.
One potential criticism is the timing of the second visit for intra-rater reliability. Patients returned for repeat evaluation between 2 and 4 weeks after their initial evaluation. This interval was chosen as it is likely narrow enough to prevent changes in the status of the instability (e.g. a second event which could be more severe), and wide enough to prevent patient or physician recall which could influence the outcome.15,16
The sub-classifications of the FEDS system were based on a consensus of a group of shoulder experts as there is little data available to provide guidance. For example the frequency of instability was divided into solitary, occasional (2–5/year), and frequent (>5/year) somewhat arbitrarily. When clinical data becomes available, if a clinically meaningful threshold exists between occasional and frequent episodes, the FEDS classification system can be modified. With regard to severity, we defined a dislocation as requiring assistance in reducing the shoulder. Any patient who has required assistance at any time would be classified as a dislocation. Anatomically a dislocation is a complete dissociation of the humeral head from the glenoid. This would require radiographs in all patients to confirm the severity of the injury. Using the requirement for assistance to reduce the shoulder is a surrogate definition, yet likely correlates well with the anatomic definition. Future studies that correlate radiographic, magnetic resonance imaging, and surgical pathology to the FEDS definitions for instability may be required to validate this concept.
Another criticism is that anatomic features are not part of the general FEDS classification system. For example, a Frequent, Traumatic, Anterior Dislocation with a large bony defect likely requires a different surgical procedure than a patient without that pathology. We would argue that many features (anatomical, generalized laxity, activity level, occupation) may influence outcome, but before we can study their influence on outcome, we must first clearly define the population under study. As such, a researcher would first identify a population of Frequent, Traumatic Anterior Dislocation patients and then study the influence of these features as subtypes and assess their impact on outcome.
Another limitation is that our survey of the American Shoulder and Elbow Surgeons had a relatively low response rate (23.8%), which may bias the result. We attempted to remedy this problem by using only those features that were rated as extremely important with a high average score (>6.0 of 7.0 points) on the Likert scale. Interestingly, the results of the literature search looking at the frequency of characteristics reflect the survey results: features that are derived from the history and physical examination are not only the most commonly used features in previous studies, but are also those considered extremely important by the respondents, suggesting there is some content validity to the survey results.
Finally another potential criticism of the FEDS classification may be that it is capable of classifying patients into too many groups. The FEDS system has 36 potential classes of shoulder instability (Table 5), and each class represents a distinctly defined diagnosis. While this seems excessive, it is important to note that 15 classes would be extremely uncommon (e.g. atraumatic dislocations). It is clear that the system does have enough breadth to include other commonly described types of instability (Table 5).
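The count of 36 classes follows directly from the number of options in each FEDS component (3 frequencies × 2 etiologies × 3 directions × 2 severities). A short Python sketch, using the category names defined earlier in this paper and otherwise hypothetical variable names, enumerates them:

```python
from itertools import product

FREQUENCY = ("solitary", "occasional", "frequent")
ETIOLOGY = ("traumatic", "atraumatic")
DIRECTION = ("anterior", "posterior", "inferior")
SEVERITY = ("subluxation", "dislocation")

# Every combination of one option per component is a distinct class.
classes = list(product(FREQUENCY, ETIOLOGY, DIRECTION, SEVERITY))
labels = [" ".join(combo) for combo in classes]

print(len(classes))  # 3 * 2 * 3 * 2 combinations
print(labels[0])
```

The enumeration treats all combinations as equally possible; it does not encode which classes are clinically uncommon.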
The FEDS classification system relies on the history, focusing on the patient’s perception of his/her disorder, a primary tenet of outcomes research. The physical exam, which uses provocative testing (which has high validity and reliability), is used to reliably identify the primary direction of the instability. Imaging and surgical information (which may not be obtained for many patients) is not required. This classification system has content validity from two distinct sources, and high inter- and intra-observer reliability. As such, we would recommend classifying glenohumeral joint instability using the FEDS system. A methodically developed and reliable system for classifying instability should reduce confusion in the literature, and provide a basis for condition-based research on this collection of disorders.
We gratefully acknowledge the assistance of Kurt P. Spindler, MD, Andrew Gregory, MD, Paul Rummo, MD, and Gene Hannah, MD who assisted in the examination of patients. The authors would also like to acknowledge William Mallon, MD, Peter MacDonald, MD, and Michael Pearl, MD for their assistance and insight.
Funding: This research was funded in part by a grant from the Mid-American Orthopaedic Association Resident Research Grant, the Vanderbilt Orthopaedic Arthur Brooks Fund for Resident Education and Research, and by Grant Number 5 K23 AR052392-03 from the National Institute of Arthritis and Musculoskeletal and Skin Diseases, and from a Pfizer Scholars Grant in Clinical Epidemiology. Funds were used for personnel support, printing of materials, and data collection.
IRB approval was obtained by the Vanderbilt University Medical Center Institutional Review Board, IRB #051044
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Disclaimer: None of the authors, immediate family members, nor any research foundation with which they are affiliated received financial payments or other benefits from any commercial entity related to the subject of this article.