To compare values for Kellgren-Lawrence (K-L) grade, joint space narrowing (JSN), and osteophytes (OST) in anteroposterior (AP) extended and fixed-flexion posteroanterior (PA-FF) radiographs obtained during a single clinic visit (the first follow-up of the Johnston County Osteoarthritis Project (JoCo OA)).
All films (n=1664 bilateral knees) were read by an experienced musculoskeletal radiologist (JBR). For each subject, AP and PA-FF films were read in one sitting. K-L grades (from 0–4), JSN and OST (from 0–3) were assessed using standard atlases. Descriptive statistics were calculated for demographic and clinical variables. AP and PA-FF results were compared by contingency table methods to obtain frequencies for K-L, JSN, and OST grades, using percentage agreement and kappa coefficients (κ). Results from the right and left knees were similar; data for the right knee are presented.
There was substantial agreement between AP and PA-FF reads for radiographic OA (rOA), defined as K-L grade ≥2 (89% agreement; κ=0.73 with 95% CI 0.69–0.76). Substantial agreement was also seen for tibial OST and medial JSN; slightly lower κ was observed for femoral OST and lateral JSN.
The requirements of large observational cohort studies differ from those of clinical trials, and sensitivity is less of an issue because of longer follow-up times. In cohort studies such as the JoCo OA, there is substantial agreement by K-L grade for AP and PA-FF radiographs, allowing incorporation of older films in longitudinal analyses.
Longitudinal cohort studies of osteoarthritis (OA) have been essential to our understanding of OA, providing insights regarding risk factors and modification, natural history of disease and population variations (1–3). As these studies are expensive and time-consuming to conduct, there is a great need for readily available, cost-effective technologies to identify and to follow OA over many years. In contrast, clinical trials require a very sensitive method to show a difference between drug and placebo in the shortest possible time. While most studies of radiographic techniques have focused on improving sensitivity for clinical trials, longitudinal studies provide challenges to researchers as well.
Conventional radiography, which is relatively inexpensive and widely available, remains the most widely used and accepted method of establishing the diagnosis of radiographic OA (rOA). Anteroposterior (AP) weight-bearing radiographs are commonly obtained in clinical practice and are often used as a criterion for study entry (4), while other techniques, often utilizing posteroanterior (PA) views and fluoroscopic positioning to maximize sensitivity to change, are used for follow-up (5–7). As fluoroscopy adds expense, time, and the need for trained technologists, several investigators have proposed non-fluoroscopy-based methods to achieve reproducible, reliable measures of rOA incidence and progression (8, 9). One such technique, utilizing “fixed-flexion,” has shown reproducibility comparable to fluoroscopically guided radiographs (8, 10).
The Johnston County Osteoarthritis Project (JoCo OA) is a large, population-based cohort study in rural North Carolina that has been collecting data since the early 1990s. Baseline radiographs, obtained from 1991 to 1997, included weight-bearing AP extended radiographs of the knees. Because of subsequent evidence suggesting superiority of PA views, the protocol was changed at the first follow-up time point (1999–2004), when both the AP view and a posteroanterior, fixed-flexion (PA-FF) view using the SynaFlexer™ positioning device were obtained. The aim of the current analysis was to compare the AP and PA-FF radiographs, obtained at the same clinic visit and read together, by Kellgren-Lawrence (K-L) grade, overall categorization of rOA, semiquantitative joint space narrowing (JSN), and osteophytes (OST). Our hypothesis was that the AP and PA-FF views would show strong agreement for K-L grading, with lesser agreement, as suggested in prior studies, for JSN and OST (5–7).
This cross-sectional sample includes individuals enrolled in the JoCo OA, an ongoing, biracial, population-based study in rural North Carolina. Details of this study have been reported elsewhere (11), and it has been approved by the Institutional Review Boards of the Centers for Disease Control and Prevention and the University of North Carolina. Briefly, this study involved civilian adults aged ≥ 45 years who resided in Johnston County, NC, who were recruited by probability sampling with oversampling of African Americans. The sample for the current analysis included individuals who participated in the first follow-up of the study, conducted between 1999 and 2004 (n = 1733); 69 individuals were excluded from the current analysis because of lack of complete radiographic data for both knees, leaving 1664 for the current analysis. There were 37 right knees and 21 left knees with total joint replacements, leaving 1627 right knees and 1643 left knees for analysis of K-L grades. As JSN and OST reads were added later, these measures were available on a subset of films (n=518). All participants underwent knee radiography using two methods:
All films were read by a single experienced musculoskeletal radiologist (JBR) previously shown to have high intra- and inter-rater reliability (κ=0.89 and 0.86, respectively) (2). Each participant’s AP and PA-FF films were read simultaneously in one sitting. K-L grades were assessed using a standard atlas as previously described (12). JSN was graded for both the medial and lateral tibiofemoral joint compartments using the Burnett atlas (0–3); OST were graded for medial and lateral compartments and for femoral and tibial aspects, from 0–3 in each site (13).
Descriptive statistics were calculated for demographic and clinical variables including race, gender, age, and body mass index (BMI). AP and PA-FF results were compared by contingency table methods to obtain frequencies for K-L, JSN, and OST grades. Crude percentage agreement was calculated, and kappa coefficients (κ) were determined to assess agreement beyond that expected by chance alone (14). When using the full range of K-L grades (0–4) or of JSN and OST grades (0–3), weighted κ was used to grant partial credit to pairs of reads that differed by one or more categories (simple κ gives no credit for any disagreement). Unweighted κ was used to compare dichotomous K-L grade (K-L <2 compared to K-L ≥ 2). Since the results were very similar for the right and left knee, only data from the right knee are presented.
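The κ statistics described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name `weighted_kappa`, the choice between linear and quadratic weights, and the 3×3 example table are assumptions made for illustration only, since the text does not state which weighting scheme was applied.

```python
def weighted_kappa(table, scheme="linear"):
    """Cohen's kappa from a square contingency table of counts.

    table[i][j] = number of knees graded i on one view and j on the other.
    scheme: "unweighted" (simple kappa), "linear", or "quadratic".
    The weighting scheme is an assumption; the paper does not specify it.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 on the diagonal; grows with the distance between grades
        if scheme == "unweighted":
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if scheme == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row_tot[i] * col_tot[j]
                   for i, j in cells) / n ** 2
    if expected == 0:
        raise ValueError("kappa is undefined when all reads fall in one grade")
    return 1.0 - observed / expected


# Hypothetical 3x3 table (grades 0-2) in which reads differ by one grade at most
example = [[20, 5, 0],
           [5, 20, 5],
           [0, 5, 20]]
print("simple kappa:  ", round(weighted_kappa(example, "unweighted"), 3))
print("weighted kappa:", round(weighted_kappa(example, "linear"), 3))
```

In this hypothetical table every disagreement is a one-grade difference, so the weighted κ comes out higher than the simple κ, which is the "partial credit" behavior described in the text. For a 2×2 table the two statistics coincide.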
The overall sample consisted of 65% women and 73% whites, with a mean BMI of 30.2 ± 6.2 kg/m² and a mean age of 66.0 ± 9.9 years. The smaller subsample for JSN and OST was slightly older (67.1 ± 9.4 years) with a greater proportion of whites (81.2%), but was otherwise similar to the overall sample.
Sixty-seven percent of right knees (1099/1627) were categorized as not having rOA (K-L grade <2) by both methods. K-L grade ≥ 2 was identified in 21.6% of knees (352/1627) by both methods. Overall agreement as to the presence or absence of rOA for the two methods was 89.2%, with κ=0.73 (95% CI 0.69–0.76), indicating substantial agreement for the two methods (Table). For the 10.8% of knees in which the two methods differed, 4.9% (79/1627) of knees were graded as <2 in the PA-FF view but ≥ 2 in the AP view, while 6.0% (97/1627) were graded as <2 in the AP view and ≥ 2 in the PA-FF view.
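The agreement figures in this paragraph can be checked from the four reported cell counts. The Python sketch below does so; the variable names are illustrative, and the confidence interval uses a simple large-sample standard error, which is an assumption because the text does not state how the interval was computed.

```python
import math

# Cell counts reported in the text for right knees
both_no_roa = 1099    # K-L <2 on both views
both_roa = 352        # K-L >=2 on both views
ap_only_roa = 79      # K-L >=2 on AP, <2 on PA-FF
paff_only_roa = 97    # K-L <2 on AP, >=2 on PA-FF

n = both_no_roa + both_roa + ap_only_roa + paff_only_roa

# Observed (crude) agreement: proportion of knees on the diagonal
p_obs = (both_no_roa + both_roa) / n

# Agreement expected by chance, from each view's marginal rOA frequency
ap_pos = (both_roa + ap_only_roa) / n
paff_pos = (both_roa + paff_only_roa) / n
p_exp = ap_pos * paff_pos + (1 - ap_pos) * (1 - paff_pos)

# Unweighted Cohen's kappa for the dichotomous rOA classification
kappa = (p_obs - p_exp) / (1 - p_exp)

# Assumed: simple large-sample standard error and a normal-theory 95% interval
se = math.sqrt(p_obs * (1 - p_obs) / (n * (1 - p_exp) ** 2))
ci_low, ci_high = kappa - 1.96 * se, kappa + 1.96 * se

print(f"n = {n}")
print(f"observed agreement = {100 * p_obs:.1f}%")
print(f"kappa = {kappa:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the counts give 89.2% observed agreement and κ of about 0.73, with an approximate interval close to the reported 0.69–0.76.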
For comparison by individual K-L grades between the two views, a slightly greater percentage of knees were given a K-L grade=0 on the AP view compared to the PA-FF view (41.5% and 39.5%, respectively, Figure). K-L grade=1 was assigned to a similar percentage of knees on both views (AP 32.0%; PA-FF 32.9%). More knees were given a K-L grade=2 on the AP (14.5%) compared to the PA-FF (11.4%) view, while more knees were graded as K-L=3 or 4 by the PA-FF view (10.5% and 5.7%, respectively) than by the AP view (8.7% and 3.3%, respectively). Weighted κ for individual K-L grades was 0.69 (95% CI 0.66–0.71, Table).
Agreement between AP and PA-FF views was substantial for medial and lateral tibial OST (Table). Medial tibial OST received the same grade on both views in 82.6% of knees, while another 17% of knees were assigned grades within one level on the two views (e.g. OST=0 on one view and OST=1 on the other view). A trivial number of knees (0.2%) differed by more than one grade between views (e.g. OST=0 on one view and OST=2 on the other view). Lower κ and wider confidence intervals were noted for the less frequent medial and lateral femoral OST. Agreement was high for medial JSN, with weighted κ=0.68; 77.8% of knees were assigned the same JSN grade on both AP and PA-FF views, and another 21.1% of knees were within one grade (e.g. JSN=0 on one view and JSN=1 on the other), while only 1.2% of knees were assigned grades that differed by more than one level (e.g. JSN=0 on one view and JSN=2 on the other). Lateral JSN was less frequent and had slightly lower kappa statistics with wider confidence intervals (Table).
In this analysis of more than 1,600 AP and PA-FF knee radiographs taken at a single clinic visit and read together by a single, experienced musculoskeletal radiologist, we found substantial agreement between the two views for categorizing rOA (K-L < 2 versus K-L ≥ 2), for individual K-L grades, and for the most commonly identified radiographic features, tibial OST and medial JSN. This level of agreement by κ rivals that of inter- and intra-reader reliability of the K-L grade itself in other large studies (1, 3). Therefore, for the purposes of categorizing joints by affected status (rOA versus not rOA), the results from these views are comparable. This result suggests that for broad comparisons of rOA prevalence in a longitudinal cohort study such as the JoCo OA, both AP and PA-FF views can be incorporated into analyses spanning multiple time points. However, for specific analyses of individual radiographic features, the baseline AP films are not as readily comparable to PA-FF views.
We were unable to identify any prior large head-to-head studies of AP compared to PA views of the knee for K-L grade, as the focus has generally been on joint space width. There are, however, studies showing the superiority of weight-bearing compared to non-weight-bearing views (15), and of flexion views compared to extended views in either projection (5, 6) for the detection of JSN. In the current analysis, the AP and PA-FF films were read simultaneously by a single experienced reader, which, although a strength of our study, may limit the generalizability of our findings to other studies. Because the films were read in this manner rather than independently, it is not possible to determine whether one view is superior to the other in the current analysis. However, a few observations can be made. There was substantial agreement between the views for OST, despite concerns that uncontrolled rotation in AP views may obscure OST. Most of the observed differences in grades for K-L and JSN were in the most severe categories, in which PA-FF views tended to be graded higher than AP views. This suggests that AP views underestimate JSN, which is in agreement with the literature (5–7).
Other radiographic techniques may improve the sensitivity and reproducibility of measurements by improving the alignment of the tibial plateau. Compared to AP extended views, fluoroscopically positioned semi-flexed PA knee radiographs produce higher semiquantitative JSN scores and smaller mean joint space width measurements, and have a greater sensitivity to change over one year (6, 7). While the use of fluoroscopically positioned films has demonstrated the greatest sensitivity to change (6, 9), this technique requires costly equipment and trained technologists, making it logistically challenging for large, longitudinal OA studies (8). Therefore, it is likely that standardized single views such as the PA-FF discussed herein will continue to be utilized for both clinical and research purposes.
Analyses of AP extended weight-bearing knee radiographs, commonly used in clinical practice and screening for trial inclusion, agree substantially with PA-FF knee radiographs for semiquantitative grades (K-L, OST, and JSN) in this cross-sectional study.
This work was funded in part by: Nelson: American College of Rheumatology Research and Education Foundation Clinical Investigator Fellowship Award, NIH Loan Repayment 1 L30 AR056604; Renner/Jordan: Centers for Disease Control and Prevention/Association of Schools of Public Health S043 and S3486; Renner/Schwartz/Jordan: NIH/NIAMS Multipurpose Arthritis and Musculoskeletal Diseases Center grant 5P60AR49465; Shreffler: NIH/NIAMS 5R01AR053989.