Using a sample of knees in which the BLOKS and WORMS scales were scored separately by the same readers, we compared the validity of the two scales and found modest differences in their performance. The clearest difference was that WORMS BML scoring, which is based solely on the size or volume of all BMLs in a given subregion, was consistently superior to the BLOKS approach to BML scoring. First, WORMS BML scores predicted cartilage loss better than the size variable in the BLOKS scale. Further, the BLOKS scale for BMLs includes measures such as adjacency and cyst percentage, which were tedious to score and did not appear to add predictive value with respect to cartilage loss. Scoring the change in size of individual BMLs was especially challenging when, as often occurred, BMLs split or merged in follow-up images, a problem not encountered in WORMS scoring. It should be noted that cyst assessment using WORMS, which is separate from the BML score, was not part of this analysis.
The other main finding was that the meniscal scoring of BLOKS appeared consistently superior to that of WORMS in several ways. First, signal abnormality, which is scored on the BLOKS scale but has no equivalent score in WORMS, predicted cartilage loss in the small amount of data we collected. This finding would be missed, and many menisci with this abnormality would be characterized as completely normal, if the WORMS scale for meniscal scoring were used. Second, meniscal scores in BLOKS predicted cartilage loss better than WORMS scores. Third, specific types of tears were differentiated in BLOKS but not in WORMS. One deficiency of the BLOKS meniscal scale is that it does not differentiate between partial and full maceration (or partial vs. full meniscal loss). Another is that BLOKS defines subregions for cartilage and BML scores that are not concordant with the location of meniscal damage, making it difficult to match a particular meniscal lesion geographically with a lesion in cartilage or bone. The latter deficiency can be rectified by using the WORMS regions to score cartilage and bone, and the former by adding a partial maceration (partial meniscal loss) category to the BLOKS meniscal scale.
We should note that our comparisons are based on small numbers. For malalignment and cartilage loss (where we suggested that BLOKS had an edge), a change in the reading of 4 knees (of over 100) would have negated our findings. A similarly small difference was seen for BMLs and cartilage loss, where we suggested that WORMS was preferable to BLOKS. Because BLOKS and WORMS yield very similar readings (see Part I paper), and even though we read over 100 knees, a major differentiation of BLOKS and WORMS would probably require reading 500 knees using the same approach as ours, an effort that would be highly expensive and time-consuming. We note that we combine the data analysis with reader insights, which may be as valuable as the analysis.
One other study has compared BLOKS and WORMS BML scoring (4) and reported that the BLOKS BML size score correlated better with pain than did the WORMS BML score. However, that study measured WORMS on non-fat-suppressed sagittal MRIs and BLOKS on fat-suppressed coronal MRIs from the same study (BOKS), which may have invalidated the comparison.
It is less clear which of the two scales is preferable for scoring cartilage loss. On the one hand, the BLOKS scale clearly performed better in detecting the effects of malalignment. On the other hand, the WORMS cartilage scale was slightly superior in agreeing with joint space loss. We note that the WORMS scale for cartilage scoring is clearly not linear. For example, going from 2 to 3 on the WORMS scale (we did not use 1, as it does not denote a loss in cartilage substance) is not at all the same as going from 5 to 6, whereas BLOKS scores are closer to a linear format. When we looked at cartilage loss as a consequence of meniscal lesions () or BMLs (), we found that the WORMS scale better detected the effects of these risk factors, one piece of evidence that WORMS, at least for cartilage loss, performed better than BLOKS.
We used x-ray joint space loss as the standard against which to assess MRI cartilage loss mostly because this was the standard for progression that was available to us. Since x-rays are insensitive to cartilage loss relative to MRI and JSN may reflect other factors besides cartilage loss, it could be argued that this was a poor choice (6). The poor sensitivity of cartilage loss scales, however, probably does not speak to the insensitivity of MRI in detecting cartilage loss, but rather reflects issues in the assessment of progressive joint space narrowing using x-rays. Changes in knee positioning relative to the central ray of the x-ray beam between serial radiographs can change the appearance of the joint space such that, even with adjudication, films adjudicated as showing joint space loss may not have such loss. This has been shown recently for joint space width assessed on serial knee radiographs from OAI that, despite the use of standardized positioning techniques, had inconsistent positioning over time (9). Further, it has been shown that joint space loss can be due to meniscal change, such as extrusion, and not cartilage loss (10). In addition, x-rays acquired under weight-bearing conditions and MRI acquired in the supine, non-weight-bearing position may give different results in terms of loss over time. Since MRI is more sensitive to cartilage loss than x-ray, the imperfect specificity of MRI may not represent false positives but rather true positives missed by the x-ray measure. Finally, even though JSN is an imperfect standard against which to compare MRI, the performance of BLOKS and WORMS cartilage scores was assessed in relation to the same JSN measures.
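The sensitivity and specificity referred to above can be made concrete with a minimal sketch. It assumes that each knee carries two yes/no readings: MRI cartilage loss (from either scale) and radiographic joint space loss as the reference. The function name, the argument names, and the example readings are hypothetical illustrations and are not data from this study.

```python
def sensitivity_specificity(mri_loss, xray_loss):
    """Compare binary MRI cartilage-loss readings with x-ray joint space
    loss used as the (imperfect) reference standard.

    mri_loss, xray_loss: equal-length sequences of booleans, one per knee.
    Returns (sensitivity, specificity); a value is None when undefined.
    """
    if len(mri_loss) != len(xray_loss):
        raise ValueError("readings must be paired knee by knee")

    tp = fp = tn = fn = 0
    for mri, xray in zip(mri_loss, xray_loss):
        if mri and xray:
            tp += 1   # both show loss
        elif mri and not xray:
            fp += 1   # MRI loss without x-ray loss (possibly a true loss missed by x-ray)
        elif not mri and xray:
            fn += 1   # x-ray loss without MRI loss (possibly positioning or meniscal change)
        else:
            tn += 1   # neither shows loss

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical readings for six knees (illustration only).
mri = [True, True, False, True, False, False]
xray = [True, False, False, True, True, False]
sens, spec = sensitivity_specificity(mri, xray)
```

Because both scales are compared against the same reference readings, any error in the reference affects the two sets of estimates alike, which is the point made in the final sentence of the paragraph above.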
One important limitation of our study was the absence of quantitative longitudinal data on cartilage loss, that is, data derived from segmenting cartilage and reporting on its thickness and volume. (Currently, quantitative measures of bone marrow lesions and meniscal damage are not widely available.) There is controversy as to whether quantitative cartilage measures provide superior information to semiquantitative data, especially in early or mild osteoarthritis (9), but it was not within the capability of this study to examine this question.
We selected knees likely to progress in this study, and as a consequence most of these knees had meniscal pathology and bone marrow lesions. Comparisons between WORMS and BLOKS might have been easier if there had been more variability in knee findings.
The wide range of affected area covered by the frequently used moderate categories of cartilage loss (e.g. BLOKS: 10-75%) may limit the sensitivity of both methods to detect longitudinal change, especially for BLOKS, which has much larger tibiofemoral subregions than WORMS. This is also true for BML scores, where the grade 2 scores have even larger ranges. To increase sensitivity for scoring progression in longitudinal studies, allowing for “within-grade” changes or using categories with a smaller range of affected area may be useful.
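The loss of sensitivity caused by wide categories, and the way a “within-grade” change could recover some of it, can be illustrated with a short sketch. It assumes a four-level size grade in which grade 2 spans 10-75% of the subregion, as cited above; the remaining cut points, the 10-percentage-point threshold for a within-grade change, and all function and parameter names are assumptions made for illustration, not part of either published scale. In practice a reader would judge within-grade change visually rather than from a measured percentage.

```python
def grade_from_percent(pct_area):
    """Map percent of a subregion affected to an ordinal grade.

    Grade 2 = 10-75% follows the range cited in the text; the other
    cut points are assumptions for this illustration.
    """
    if pct_area <= 0:
        return 0
    if pct_area < 10:
        return 1
    if pct_area <= 75:
        return 2
    return 3


def score_change(baseline_pct, followup_pct, min_within_grade_delta=10.0):
    """Score longitudinal change as a full-grade change plus a
    within-grade flag (+1 worse, -1 better, 0 no definite change).

    min_within_grade_delta is a hypothetical threshold, in percentage
    points, for calling a change that stays inside one grade.
    """
    g0 = grade_from_percent(baseline_pct)
    g1 = grade_from_percent(followup_pct)
    if g1 != g0:
        return {"grade_change": g1 - g0, "within_grade": 0}

    delta = followup_pct - baseline_pct
    if delta >= min_within_grade_delta:
        within = 1
    elif delta <= -min_within_grade_delta:
        within = -1
    else:
        within = 0
    return {"grade_change": 0, "within_grade": within}


# A lesion growing from 15% to 70% of the subregion stays in grade 2,
# so grade-only scoring records no change; the within-grade flag does.
change = score_change(15, 70)
```

Under these assumptions, a lesion that expands from 15% to 70% of a subregion produces no full-grade change but is flagged as within-grade worsening, which is the kind of progression that grade-only scoring would miss.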
In summary, we recommend an amalgamated MRI reading system for OAI, a scoring system with elements from WORMS and BLOKS. For menisci, it is our view that the BLOKS system is superior, and for BMLs that WORMS is better. We recommend further that the WORMS regions be used so that investigators can examine whether certain lesions lie in the same small regions as other lesions. For cartilage scoring, the two systems produced comparable results, and the choice of one or the other may be based on other considerations, including ease of scoring and the psychometric properties of the scales.