This work was approved by the National University of Health Sciences institutional review board (IRB) for protection of human research subjects; this IRB operates under 45 CFR Part 46. One hundred and twelve (112) subjects completed the larger (main) study. Each subject received 2 MRI scans (Hitachi MRP 5000, 0.2 Tesla MRI unit) on each of 2 separate occasions (initially and after 2 weeks of chiropractic care), for a total of 448 scans. The first scan of each MRI appointment was taken in the neutral (supine) position and the second was taken following an intervention (side posture spinal adjusting, side posture positioning, or a no-intervention control). Z joint measurements were to be made at the left and right L4/L5 and L5/S1 levels (4 measurements per scan) of the 448 scans in the study if the results of the reliability study indicated that the measurements could be made reliably. The study radiologist (JC) chose the specific image to be measured for each Z joint from 5 images of each segmental level, using a rigorous method designed to identify the image that demonstrated the Z joint space to best advantage. This method was used in previous studies [4] and for the 112 subjects of the larger study. The images selected for each of the four levels were magnified 2X and printed together on one sheet of 14 in x 17 in x-ray film. The MRI scans were coded using random numbers so that no investigator (including the radiologist) knew whether a scan was from the first or second MRI appointment, or whether it was the first or second scan taken during an appointment. Five subject scans (20 Z joints) were randomly selected for use in this reliability study.
Three observers were chosen from students enrolled in a complementary and alternative medicine (CAM) professional program at the National University of Health Sciences. The students had completed the first year gross anatomy course, including spinal anatomy. Using gross anatomical sections and MRI scans that corresponded to those sections, the students received additional tutoring in cross sectional anatomy of the spine from an anatomist specializing in the spine (GC) [17]. Particular attention was paid to instruction regarding the cross sectional anatomy of the Z joints and their appearance on MRI.
Training Protocol for Observers
The primary radiologist of the project (JC) had served in the same capacity in the previous studies. He chose two representative patient scans from the project to be used in a tutorial for the observers' measurements. The 3 observers then met with the radiologist, who demonstrated the measurements on the two scans (8 measurements; 2 scans, each with 4 Z joints). The measurement made of each Z joint was the shortest anterior-posterior (A-P) distance between the superior and inferior articular processes at the center of the Z joint space (Figure 1). More specifically, each measurement began from the point of the superior articular facet closest to a point bisecting a line passing between the medial and lateral extremes of the joint. Measurements began and ended at the point of markedly low signal intensity adjacent to the joint space, passing through the region of intermediate signal intensity sometimes associated with Z joint spaces as seen on MRI scans. Measurements were made on a GTCO Calcomp Drawing Board III (backlit) digitizer (Source Graphics, Anaheim, CA), and points digitized using the tablet were converted to distance by Excel Distance/Length Digi digitizing software (Logic Group, Austin, TX). The observer measured each joint space 5 times. If 3 of the 5 measurements were within 0.1 mm of each other, the 3 values were averaged and recorded electronically. If 3 of the 5 measurements did not agree, the measurements were repeated until 3 measurements within 0.1 mm of each other were attained.
Once the observers expressed a clear understanding of the measurements, each scheduled a time to make them alone during the following week. After the 3 observers had completed their solo measurements, each met separately with the radiologist a second time. During this session, the radiologist discussed any difficulties the observer had experienced and worked to clarify the observer's understanding of the measurements. The observer then made another set of measurements under the observation of the radiologist to ensure that the observer was measuring according to the previous instructions. This process continued until the radiologist was satisfied that the observer was ready to begin the reliability study.
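The 3-of-5 agreement rule described above can be sketched in code. This is a minimal illustration and not the software used in the study: the function name `agreed_mean` is hypothetical, "within 0.1 mm of each other" is assumed to mean that the largest and smallest of the 3 values differ by no more than 0.1 mm, and choosing the tightest triple when several qualify is also an assumption, since the text does not say how such ties were handled.

```python
from itertools import combinations
from statistics import mean

TOLERANCE_MM = 0.1  # agreement window stated in the protocol


def agreed_mean(measurements_mm, tolerance=TOLERANCE_MM):
    """Return the mean of 3 agreeing measurements, or None if none agree.

    Assumption: three values "agree" when max - min <= tolerance.
    Assumption: if several triples agree, the one with the smallest
    spread is used (the source does not specify this).
    """
    best = None  # (spread, mean) of the tightest agreeing triple so far
    for triple in combinations(measurements_mm, 3):
        spread = max(triple) - min(triple)
        # Small epsilon guards against floating-point rounding at the boundary.
        if spread <= tolerance + 1e-9 and (best is None or spread < best[0]):
            best = (spread, mean(triple))
    return None if best is None else best[1]


# Hypothetical values in mm for one joint space measured 5 times.
recorded = agreed_mean([1.52, 1.60, 1.95, 1.58, 2.10])
needs_repeat = agreed_mean([1.0, 1.3, 1.6, 1.9, 2.2]) is None
```

A return value of `None` corresponds to the case in which the observer would repeat the measurements until 3 values within 0.1 mm were obtained.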
Figure 1 Illustration (A) and MRI scan (B) showing the central anterior to posterior (A-P) measurement made of the left and right L4/L5 and L5/S1 zygapophysial (Z) joints in this study.
Using the 5 images chosen by the radiologist (different from the scans used in the training sessions), the observers made the 20 L4/L5 and L5/S1 Z joint measurements from the subject scans on two separate occasions, separated by at least 4 weeks. The observers had no access to one another's measurements or to their own previous measurements.
Data were analyzed by calculating intra-class correlation coefficients (ICCs) [16]. For intra-observer reliability, ICCs were calculated comparing the first and second sets of measurements of each observer. Inter-observer reliabilities were calculated using ICCs of all three observers for the measurements of the first measurement session, the second measurement session, and the means of the measurements of the first and second sessions for each observer (overall reliability). ICCs assess all three observers simultaneously.
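As an illustration of how an ICC can be computed from such data, the sketch below implements one common form, the two-way random effects, absolute agreement, single-measurement ICC, often written ICC(2,1). The text does not state which ICC form was used, so this choice is an assumption, and the function name `icc_2_1` and the example values are hypothetical. Rows are Z joints; columns are observers (inter-observer) or sessions (intra-observer).

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one row per measured target (Z joint),
    one column per observer or per session.
    Assumption: this ICC form is illustrative; the source does not
    specify which form was calculated.
    """
    n = len(ratings)      # number of targets
    k = len(ratings[0])   # number of observers or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical joint-space values in mm: 4 joints measured by 3 observers.
example = [
    [1.6, 1.7, 1.5],
    [2.1, 2.3, 2.2],
    [1.2, 1.1, 1.3],
    [1.9, 2.0, 1.8],
]
inter_observer_icc = icc_2_1(example)
```

For intra-observer reliability the same function would be applied to a two-column table holding one observer's first and second session values.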
Reliability was also calculated using the methods of Bland and Altman [15], who suggest that reliability be estimated by examining the distribution of the differences between repeated measures. Consequently, for each observer (intra-observer reliability), the mean of the 3 measures recorded from each Z joint during the first session was subtracted from the mean of the 3 measures recorded during the second session. The resulting values were plotted against the average of the means of the first and second session measurements. In addition, the overall mean difference between the two sets of measurements was calculated. These methods were also used for the pooled data of the three observers for each measurement session (inter-observer reliability). Acceptable reliability was set at a mean difference with an absolute value of less than 0.4 mm, the value determined by the investigative team to be the minimum clinically relevant difference that could be assessed by MRI.
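The Bland and Altman calculation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name `bland_altman`, the dictionary keys, and the example values are hypothetical. Differences are taken as second session minus first session, and the limits of agreement are the mean difference plus or minus 1.96 standard deviations of the differences.

```python
from statistics import mean, stdev

ACCEPTABLE_BIAS_MM = 0.4  # acceptance threshold stated in the text


def bland_altman(first_mm, second_mm):
    """Summarize agreement between two paired sets of measurements.

    first_mm, second_mm: per-joint mean values from session 1 and session 2.
    """
    # Difference per joint: second session minus first session.
    diffs = [b - a for a, b in zip(first_mm, second_mm)]
    # Average of the two sessions per joint (X axis of the plot).
    avgs = [(a + b) / 2 for a, b in zip(first_mm, second_mm)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "points": list(zip(avgs, diffs)),  # (average, difference) pairs to plot
        "acceptable": abs(bias) < ACCEPTABLE_BIAS_MM,
    }


# Hypothetical values in mm for 3 joints measured in two sessions.
summary = bland_altman([1.0, 2.0, 3.0], [1.1, 2.3, 3.2])
```

The `points` list holds the coordinates for the plot of differences against averages; the `acceptable` flag applies the 0.4 mm criterion to the mean difference only, as in the text.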
All 3 observers successfully completed the training and measurement protocols, each producing two separate sets of measurements at least 4 weeks apart.
The values for the measurements made by the three observers are shown in the table. ICCs comparing the first and second sets of measurements of each observer (intra-observer reliabilities) were 0.95 (95% CI: 0.87-0.98), 0.83 (0.62-0.92), and 0.92 (0.83-0.96) for the 3 observers, respectively. Comparisons of the measurements of the three different observers (inter-observer reliabilities) were 0.90 (0.82-0.95), 0.79 (0.61-0.90), and 0.84 (0.75-0.90) for the first measurement session, the second measurement session, and overall reliability (means of the measurements of the first and second sessions for each observer), respectively. ICCs assess all three observers at once.
Measurements of Zygapophysial (Z) Joint Space from MRI Scans
The Bland and Altman (1986) method for assessing the mean difference between the first and second measurements resulted in values of -0.04 mm (±1.96 SD: -0.37 to 0.29), 0.23 mm (-0.48 to 0.94), 0.25 mm (-0.24 to 0.75), and 0.15 mm (-0.44 to 0.74) for each of the 3 observers and the overall agreement, respectively. A mean difference with an absolute value of less than 0.4 mm was considered clinically acceptable. Figure 2 shows the Bland and Altman plots [15] of the differences between the first and second sets of measurements plotted against the mean of the same values.
Figure 2 Bland and Altman (1986) plots showing the differences (Y axis) between Measurement 1 and Measurement 2 (separated by a minimum of 4 weeks) plotted against the averages (X axis) of the same two measurements. The plots include the values of Observers 1-3.