Infiltration of the rotator cuff musculature with fatty tissue is a well-known feature of tears in the rotator cuff tendon. In a rabbit model, sectioning of the supraspinatus tendon resulted in increased adipose tissue in the muscle belly on histologic examination after 4 weeks. The following year, Goutallier et al. reported results of CT scans of the shoulder in patients undergoing repair of the rotator cuff. The injured shoulders showed greater fatty infiltration than normal, asymptomatic shoulders, and the authors classified the infiltration in a spectrum of five stages.
With the advent of MRI to evaluate the rotator cuff, Fuchs et al. compared CT and MRI assessment of fatty infiltration. Although interobserver agreement was good to excellent between musculoskeletal radiologists for CT and MRI evaluation individually, the correlation between CT and MRI grading of infiltration was only moderate. Fuchs et al. attributed this to the improved ability of MRI to distinguish between healthy muscle and fibrous tissue. Despite imperfect correlation with CT, the adaptation by Fuchs et al. for MRI is widely used [8, 10, 11] owing to the shift to MRI as the predominant imaging study for rotator cuff injuries.
The Goutallier classification serves primarily as a prognostic tool, which may help clinicians anticipate the potential benefits of various interventions. Fatty infiltration has been shown to be a poor prognostic factor for repair of the cuff tendons [4–6, 10, 14–17, 30]. Goutallier et al. reported functional (Constant and Murley score) and radiographic (MRI and/or CT scans) outcomes for 220 shoulders undergoing rotator cuff repair; the rate of recurrent tear was greater in patients with fatty infiltration of the muscle. A longer-term study by Goutallier et al. showed a strong correlation between the Constant and Murley score (a functional outcome score for the shoulder) at latest followup and preoperative fatty infiltration. In another study, preoperative fatty infiltration and muscle atrophy were found to be independent predictors for final American Shoulder and Elbow Surgeons and Constant and Murley scores after rotator cuff repair. Interestingly, fatty infiltration was a better predictor of outcome than either tear size or recurrence in that series. A meta-analysis investigating outcomes of rotator cuff repair showed a strong correlation between fatty infiltration and rate of retear, but available published data were too heterogeneous to determine whether clinical outcomes were affected.
Owing to its prognostic importance, surgeons have used the Goutallier classification to guide treatment options for rotator cuff tears. The classification also has been used to define conditions for inclusion or exclusion in studies. In addition, the simple numerical scale provides a means for clinicians to concisely communicate the severity of infiltration.
The original Goutallier classification had five stages, ranging from Stage 0 (normal muscle) to Stage 4 (more fat than muscle) (Table 1) [12, 13]. Fatty infiltration was characterized by areas of decreased radiodensity using noncontrast CT scans. Since the original description, the classification has been adapted for use with MRI (Fig. 1). The modification of Fuchs et al. used the same grading system as Goutallier et al., although a system using only three stages also was proposed (Table 1). Slabaugh et al. proposed a simplified system as well, but combined Stages 2 and 3 rather than 3 and 4 based on their statistical analyses and proposed clinical significance of each stage.
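The relationship between the five-stage scale and the two simplified three-stage systems can be sketched as a small lookup. This is only an illustration: the text above states the definitions of Stages 0 and 4 only, so the wording for Stages 1 to 3 is assumed from the commonly cited description of the scale, the grouping of Stages 0 and 1 in both simplified systems is likewise an assumption, and all names in the code are hypothetical.

```python
from enum import IntEnum


class GoutallierStage(IntEnum):
    """Five-stage scale. Stages 0 and 4 follow the text; 1-3 are assumed wording."""
    NORMAL_MUSCLE = 0          # stated in the text
    SOME_FATTY_STREAKS = 1     # assumption (commonly cited description)
    LESS_FAT_THAN_MUSCLE = 2   # assumption (commonly cited description)
    FAT_EQUAL_TO_MUSCLE = 3    # assumption (commonly cited description)
    MORE_FAT_THAN_MUSCLE = 4   # stated in the text


# Simplified grade (0, 1, or 2) assigned to each original stage.
# Merging Stages 3 and 4 (the three-stage variant proposed with the MRI adaptation).
THREE_STAGE_MERGING_3_AND_4 = {0: 0, 1: 0, 2: 1, 3: 2, 4: 2}
# Merging Stages 2 and 3 (the alternative simplification).
THREE_STAGE_MERGING_2_AND_3 = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2}


def simplify(stage, mapping):
    """Return the simplified grade for a five-stage value.

    Raises ValueError if `stage` is not a valid stage (0-4).
    """
    return mapping[GoutallierStage(stage)]


if __name__ == "__main__":
    for stage in GoutallierStage:
        print(
            stage.value,
            stage.name,
            simplify(stage, THREE_STAGE_MERGING_3_AND_4),
            simplify(stage, THREE_STAGE_MERGING_2_AND_3),
        )
```

Under these assumptions, a Stage 3 muscle falls in the most severe group of one simplified system but in the middle group of the other, so a simplified grade is ambiguous unless the system used is named.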
Goutallier et al. used axial CT cuts in a soft tissue window to grade each tendon. The supraspinatus was assessed at the section of the muscle with the greatest surface area between the scapular spine and the rest of the scapula. For the infraspinatus and subscapularis tendons, two sections were evaluated: a superior section at the level of the lateral attachment of the scapular spine and an inferior section at the lowest point of glenohumeral articulation.
For the MRI adaptation of the classification, there has been controversy regarding the ideal technique to use for grading. Fuchs et al. used a T1-weighted turbo spin-echo sequence for grading; all muscles were evaluated at the most lateral parasagittal image on which the scapular spine was in contact with the scapular body. The superior and inferior parts of the subscapularis and infraspinatus muscles were evaluated separately and a mean was calculated, as with the CT method. A similar method using MR arthrography was reported in another study using an oblique sagittal T1-weighted image selected at the level of the coracoid base. More recently, Schiefer et al. reported high intraobserver and interobserver agreement when investigators had full access to the examinations without specifying a definitive plane or image.
Fuchs et al. reported interobserver agreement between two raters, a musculoskeletal radiologist and a radiology fellow. Weighted kappa values for agreement ranged from 0.68 to 0.83, depending on the tendon selected; no intraobserver agreement was reported. Use of the proposed three-stage system did not result in meaningfully different results. Similar kappa values were reported in another study (intraobserver mean kappa, 0.69; interobserver mean kappa, 0.50). Lesage et al. reported agreement among five independent raters using the original CT classification of Goutallier et al. for 56 shoulders. The intraclass correlation coefficient was good for interobserver agreement (0.75), but a high level of intraobserver reliability was found only when limited to the three senior raters (0.78). This improved reliability highlights that the level of experience of the observer can affect the validity of a classification scheme.
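For readers unfamiliar with the statistic, a weighted kappa for two raters grading on an ordinal scale can be computed as in the minimal sketch below. It assumes linear disagreement weights and stages coded 0 to 4; the weighting scheme used in the studies cited above is not stated here, so this is an illustration of the general technique rather than a reproduction of any reported analysis. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories=5):
    """Cohen's weighted kappa with linear weights for two raters.

    Ratings must be integers in range(n_categories), e.g. stages 0-4.
    Disagreement weight for a pair (i, j) is |i - j| / (n_categories - 1).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same nonempty set of cases")
    n = len(rater_a)
    max_dist = n_categories - 1

    # Mean observed weighted disagreement across cases.
    observed = sum(abs(a - b) / max_dist for a, b in zip(rater_a, rater_b)) / n

    # Weighted disagreement expected by chance from each rater's marginal totals.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(
        (abs(i - j) / max_dist) * (count_a[i] / n) * (count_b[j] / n)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Made-up grades for ten shoulders, for illustration only.
    radiologist = [0, 1, 2, 2, 3, 4, 1, 0, 3, 2]
    fellow = [0, 1, 2, 3, 3, 4, 2, 0, 4, 2]
    print(round(weighted_kappa(radiologist, fellow), 2))
```

Linear weights penalize a two-stage disagreement twice as heavily as a one-stage disagreement; quadratic weights are a common alternative and generally yield different values from the same ratings.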
Fuchs et al. reported superior interobserver agreement for MRI grading on a five-point scale compared with CT grading (Table 2). Another group reported good intraobserver and interobserver agreement among six raters (three shoulder surgeons and three musculoskeletal radiologists) using the MRI adaptation of the Goutallier et al. scale. The surgeons showed higher intraobserver agreement and interobserver agreement compared with radiologists. However, other studies have shown only moderate or poor intraobserver agreement (Table 2) [22, 27, 31, 32]. The reasons for the high variation among studies reporting reliability of this scale are unclear, but differences in methodology and specificity of which sequences and sections were used for grading may have played a role. Studies that only used a single representative image for grading tended to have lower interobserver reliability [22, 31], as did studies with 10 or more raters [31, 32].
Despite attempts to standardize the grading system, there is a lack of quantitative data in the classification of Goutallier et al. Recent efforts to more precisely quantify the amount of fatty infiltration have shown improved reliability with measurement of Hounsfield units on CT images and T2-sequence mapping or two-point Dixon sequence measurements for MRI scans. In addition, the relatively wide range of five stages has been cited as a potential reason for low reliability and has resulted in simplified three-stage classifications. These simplified classifications have not met with widespread use and may cause confusion when communicating imaging findings. The radiation exposure of CT and cost considerations of MRI may limit the use of these modalities, particularly in a research setting. Ultrasound is a less-expensive modality that has comparable diagnostic performance to MRI for detecting fatty infiltration; however, results can be operator-dependent.
The physical cross-section of the examined muscle also may differ substantially depending on how far a torn tendon has retracted. Meyer et al. showed, on MRI studies, that the superficial portion of the supraspinatus tendon reacts differently to tendon retraction than the deep portion, suggesting architectural changes in muscle belly fibers. These findings were confirmed in an experimental sheep model in which retracted tears had a greater amount of fatty infiltration. A recent clinical study compared immediate postoperative MRI scans after rotator cuff repair with preoperative scans. Interestingly, postoperative imaging showed an immediate improvement in Goutallier scores and tendon atrophy scores, suggesting that repositioning the tendon had a substantial effect on both.
Subjectivity of the grading system is an additional limitation, which some researchers have attempted to address by using quantitative analysis of cross-sectional imaging. One study used quantitative MRI measurements of the fat fraction in rotator cuff tendons and compared these with the Goutallier score. Increasing fat fraction correlated well with a higher Goutallier score, aside from Grades 3 and 4, for which there was no difference. This method used manual outlining of rotator cuff muscle area on each MRI slice, which is time-consuming and may introduce some methodologic bias.
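As a rough illustration of what a quantitative fat fraction measurement involves, the sketch below averages the per-pixel fat fraction, fat / (water + fat), over a manually outlined region of interest on a single slice. It assumes separate water-only and fat-only images of the kind produced by a two-point Dixon acquisition; the study described above may have used a different acquisition or calculation. The function name, inputs, and example values are hypothetical.

```python
def mean_roi_fat_fraction(water_img, fat_img, roi_mask):
    """Mean fat fraction over the pixels selected by roi_mask.

    water_img, fat_img: 2D lists of nonnegative signal intensities (same shape).
    roi_mask: 2D list of booleans marking the outlined muscle area.
    Pixels with zero total signal are skipped to avoid division by zero.
    Returns None if no usable pixel lies inside the region.
    """
    fractions = []
    for water_row, fat_row, mask_row in zip(water_img, fat_img, roi_mask):
        for water, fat, inside in zip(water_row, fat_row, mask_row):
            total = water + fat
            if inside and total > 0:
                fractions.append(fat / total)
    if not fractions:
        return None
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    # Made-up 2 x 2 slice, for illustration only.
    water = [[80, 60], [100, 0]]
    fat = [[20, 60], [0, 0]]
    mask = [[True, True], [False, True]]
    print(mean_roi_fat_fraction(water, fat, mask))
```

The result is a continuous value between 0 and 1 per muscle and slice. No thresholds for converting it to a Goutallier stage are implied here.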
The classification of Goutallier et al. and its modifications have shown prognostic utility and often are used to guide treatment and define clinical study inclusion. A wide range of reliabilities for the classification has been reported, with most studies showing moderate or good agreement. Quantitative assessments, such as measurement of fat fraction volume, have the potential to improve the score’s clinical precision and use.
Each author certifies that he or she, or a member of his or her immediate family, has no funding or commercial associations (eg, consultancies, stock ownership, equity interest, patent/licensing arrangements, etc) that might pose a conflict of interest in connection with the submitted article.
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research ® editors and board members are on file with the publication and can be viewed on request.