The gold standard for evaluating bone mineral density is dual energy x-ray absorptiometry (DEXA). Prior studies have shown poor reliability using analog wrist X-rays in diagnosing osteoporosis. Our goal was to investigate if there was improved diagnostic value to visual assessment of digital hand X-rays in osteoporosis screening. We hypothesized that similar to analog counterparts, digital hand X-rays have poor correlation and reliability in determining bone mineral density (BMD) relative to DEXA.
We prospectively evaluated female patients older than 65 years who presented to our hand clinic over six months and had digital hand and wrist X-rays as part of their evaluation. Patients who had a fracture or who had not had a DEXA scan within the past two years were excluded. Five fellowship-trained hand surgeons, blinded to DEXA T-scores, evaluated the X-rays in two assessments separated by four weeks and classified each as osteoporotic, osteopenic, or normal BMD. Accuracy relative to DEXA T-score and interobserver and intraobserver agreement were calculated.
Thirty-four patients met the inclusion criteria, and a total of 340 X-ray reviews were performed. The assessments were correct in 169 cases (49%) compared to the DEXA T-scores. The mean weighted kappa coefficient of agreement between observers was 0.29 (range 0.02-0.41), reflecting fair agreement. The mean intraobserver kappa between the first and second assessments for all five physicians was 0.46 (range 0.19-0.78), reflecting moderate agreement. When osteoporosis and osteopenia were grouped together and compared to normal, the accuracy, interobserver agreement and intraobserver agreement increased to 63%, 0.42 and 0.54, respectively.
Abnormally low BMD is a common occurrence in patients treated for upper extremity disorders. There is poor accuracy relative to DEXA scan and only fair agreement in diagnosing osteoporosis using visual assessments of digital x-rays.
Decreased BMD is the most important predictor of fracture risk (1-4). Approximately thirty million Americans are affected by or at risk of complications from osteoporosis, and nearly 1.5 million fractures are attributed to osteoporosis every year (3-5). Epidemiological studies predict that approximately one third of women over 50 years of age will experience a fragility fracture (3-5). With the availability of proven osteoporosis treatments and the cost-effectiveness of fracture prevention, diagnosis and treatment of osteoporosis have become a clinical priority. The gold standard for population screening of osteoporosis is dual energy x-ray absorptiometry (DEXA) measuring bone mass of the lumbar spine, hip and total body (6,7). However, DEXA scans can be cost prohibitive and time consuming, and are not available in some communities (8,9). Simpler, inexpensive, readily available screening tests, such as radiography, might improve the identification of patients at risk for osteopenia and osteoporosis or in need of further evaluation with DEXA.
As awareness of osteoporosis in the general population increases, patients frequently inquire whether their hand surgeon can diagnose osteoporosis from hand X-rays obtained during routine office visits. With the advent of digital X-ray imaging, resolution and overall image quality have improved (10). Digital films permit the user to focus on an area of interest and manipulate the image parameters (e.g. zoom, contrast and brightness). Other advantages of digital radiography include lower costs, decreased radiation exposure, post-image processing, and no need for film development (11-13). With improved image quality, decreasing costs and the ability to manipulate images, digital X-rays might have utility as a screening tool for osteoporosis.
The purpose of this study is to evaluate the use of visual assessment of digital x-rays of the hand and wrist to diagnose osteoporosis compared to BMD measured by DEXA scan. We hypothesize that, despite improvements in image quality, digital X-rays, similar to their analog counterparts, still have inadequate accuracy, reliability and physician agreement to determine BMD.
After obtaining IRB approval, we prospectively evaluated all female patients older than 65 years who presented to our hand clinic over a six month period and who had posteroanterior and lateral hand and wrist digital X-rays as part of their evaluation.
All X-rays were performed by certified radiology technicians in the office using digital radiography. X-rays were taken in standard position. For the PA and lateral views of the hand, at least 2.5 cm of the distal forearm was included. For the PA and lateral views of the wrist, patients were positioned so that the entire distal radius could be visualized. Peak kilovoltage ranged from 50 to 60 kVp and exposure from 3 to 4 mAs. Patients were excluded from this study if they had a fracture on X-ray or if they had not had a DEXA scan within the past 2 years.
The hand and wrist X-rays were randomized by one of the non-reviewing authors using the institution's Picture Archiving and Communication System (PACS, Sectra AB, Linköping, Sweden). In this system, the X-rays are obtained as full DICOM images and stored using a wavelet, lossless compression system at a 1:3 compression ratio. No additional post-processing modifications were made. The images were evaluated by five fellowship-trained hand surgeons who graded the bone density as normal, osteopenic or osteoporotic based on visual assessment only. The reviewers were blinded to patient DEXA T-scores. Reviewers were allowed to manipulate the images using PACS functions such as zoom, contrast, and brightness adjustments. However, the reviewers were asked not to make any measurements or perform other quantitative assessments, since our goal was to assess the reliability of visual assessment of bone quality only. Reviewers were not provided any measurement techniques prior to evaluation. X-ray images were viewed on diagnostic-grade clinician office desktops approved for PACS viewing. The X-rays were re-randomized and then re-evaluated four weeks after the initial assessment using the same procedure.
A power analysis was performed for an expected physician agreement rate of 50%. Using a 95% confidence interval for an expected agreement of 50% (range 0.4-0.6), we determined that at least 30 patients would be adequate to assess interobserver and intraobserver agreement.
The accuracy of the cumulative diagnoses was calculated by comparing the results to the lowest of the measured DEXA T-scores, including lumbar spine, femoral neck and total hip. All DEXA results except for two were based on the femoral neck and total hip. BMD classification was based on the World Health Organization system, in which a T-score < -2.5 is considered osteoporotic, a T-score between -2.5 and -1 is considered osteopenic, and a T-score > -1 is considered normal. The interobserver and intraobserver rates of agreement were calculated using a weighted kappa statistic. The weights were based on the respective ratios of the average normal, osteopenic and osteoporotic DEXA scores. To more closely replicate the test as a screening tool for “normal” versus “not normal” BMD, a separate analysis was performed grouping osteoporosis and osteopenia together. Test characteristics including positive and negative predictive values were calculated for the “normal” versus “not normal” grouping. All statistical analysis was performed using Microsoft Excel 2013 (Microsoft, Redmond, Washington).
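The classification rule described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was done in Excel), and the function names are hypothetical. The strict inequalities in the text leave the boundary values of exactly -2.5 and -1 unassigned; the sketch assumes the usual WHO convention, in which -2.5 counts as osteoporotic and -1 counts as normal.

```python
from typing import Iterable


def classify_t_score(t_score: float) -> str:
    """Map a single DEXA T-score to a BMD category.

    Assumption: boundary values follow the usual WHO convention
    (T <= -2.5 osteoporotic, T >= -1 normal); the text does not say.
    """
    if t_score <= -2.5:
        return "osteoporotic"
    if t_score < -1.0:
        return "osteopenic"
    return "normal"


def classify_patient(site_t_scores: Iterable[float]) -> str:
    """Classify a patient by the lowest T-score across measured sites."""
    scores = list(site_t_scores)
    if not scores:
        raise ValueError("at least one T-score is required")
    return classify_t_score(min(scores))


def to_screening_label(category: str) -> str:
    """Collapse the three categories into 'normal' versus 'not normal'."""
    return "normal" if category == "normal" else "not normal"


# Hypothetical patient: femoral neck -1.8, total hip -0.9
category = classify_patient([-1.8, -0.9])
print(category, "/", to_screening_label(category))
```

Taking the minimum across sites mirrors the statement that accuracy was judged against the lowest measured T-score.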
The observers performed a total of 340 assessments (34 patients, 5 physicians and 2 assessments per physician). The average age of the patients was 71.4 (range: 64-88). Twelve patients (35%) had a normal BMD by WHO classification; fifteen (44%) were osteopenic; seven (21%) were osteoporotic.
Forty-nine percent of the assessments were correctly classified compared to the corresponding DEXA T-score. Only one patient X-ray was correctly diagnosed by all physicians over both assessments (with a diagnosis of normal). Two X-rays were never correctly classified (both diagnoses were osteoporosis). Analysis of the incorrect assessments revealed a tendency to underestimate and overestimate BMD at similar rates (49% and 51%, respectively). When the observations were inaccurate, reviewers were off by two grades (normal vs. osteoporosis and vice versa) in 17% of the evaluations and by one grade (normal vs. osteopenia, osteopenia vs. osteoporosis and vice versa) in 83% of the evaluations. When asked for the top three features they relied on, reviewers cited the following qualitative criteria in making their assessment: metacarpal cortical thickness [4 reviewers], distal radial cortical thickness [4 reviewers], overall apparent density of the distal radius [3 reviewers], metacarpal shape [2 reviewers], proximal phalanx cortical thickness [1 reviewer], and osteoarthritic changes [1 reviewer].
The mean weighted kappa coefficient of agreement between observers was 0.29 (range 0.02-0.41), indicating fair agreement based on the Landis and Koch interpretations (14). The mean intraobserver agreement between the first and second assessment for all five physicians was 0.46 (range 0.19-0.78), indicating moderate agreement. The kappa values for all physician assessments are listed in Table 1.
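For readers unfamiliar with the statistic, the following Python sketch computes a weighted Cohen's kappa for two sets of ratings and maps the result to the Landis and Koch labels. It is an illustration only. It assumes linear disagreement weights, which is a substitute for, and not a reproduction of, the DEXA-ratio weights used in this study, whose exact values are not given. The function names and the example ratings are hypothetical.

```python
from collections import Counter
from typing import Sequence

CATEGORIES = ("normal", "osteopenic", "osteoporotic")


def weighted_kappa(ratings_a: Sequence[str], ratings_b: Sequence[str],
                   categories: Sequence[str] = CATEGORIES) -> float:
    """Weighted Cohen's kappa with linear disagreement weights (assumed)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(ratings_a)
    pairs = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at two grades apart
            observed += weight * pairs[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)
    if expected == 0.0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1.0 - observed / expected


def landis_koch_label(kappa: float) -> str:
    """Landis and Koch interpretation bands for a kappa value."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings from two reviewers for five X-rays
first = ["normal", "osteopenic", "osteopenic", "osteoporotic", "normal"]
second = ["normal", "normal", "osteopenic", "osteopenic", "osteopenic"]
kappa = weighted_kappa(first, second)
print(round(kappa, 2), landis_koch_label(kappa))
```

The same function applies to intraobserver agreement by passing one reviewer's first and second assessments as the two rating lists.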
When the osteopenic and osteoporotic patients were grouped together, accuracy improved to 63%, and interobserver and intraobserver agreement improved to 0.42 and 0.54, respectively, both reflecting moderate agreement. The average positive and negative predictive values of the “normal” versus “not normal” BMD assessment for all physicians were 67% and 45%, respectively. The sensitivity and specificity of the visual assessment compared to DEXA results were 68% and 52%, respectively.
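As a reminder of how these four test characteristics relate to one another, the sketch below derives them from a 2x2 table in which "positive" means "not normal" BMD and DEXA is the reference standard. The counts in the example are hypothetical and chosen only for easy arithmetic; they are not the study data, and the function name is an assumption.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: read as not normal, DEXA not normal
    fp: read as not normal, DEXA normal
    tn: read as normal,     DEXA normal
    fn: read as normal,     DEXA not normal
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column of the 2x2 table needs a case")
    return {
        "sensitivity": tp / (tp + fn),  # share of low-BMD patients flagged
        "specificity": tn / (tn + fp),  # share of normal patients cleared
        "ppv": tp / (tp + fp),          # share of 'not normal' reads that are right
        "npv": tn / (tn + fn),          # share of 'normal' reads that are right
    }


# Hypothetical counts for one reviewer (not taken from the study)
metrics = screening_metrics(tp=8, fp=4, tn=6, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The complement of the negative predictive value, 1 - NPV, is the share of "normal" reads that miss a patient with low BMD, which is the quantity that matters most for a screening test.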
The objective of this study was to evaluate the ability of fellowship trained hand surgeons to assess for osteoporosis in the office using a qualitative assessment of digital x-rays of the hand and wrist. While analog X-rays of the hand and wrist have been studied previously, digital x-rays have not. In 2001, Olschewski et al studied the accuracy of qualitative assessment of analog radiographs of both fractured distal radii and uninjured wrists for diagnosing osteoporosis (15). For the uninjured wrist, they found intraobserver agreement of 76% and an interobserver kappa score range of 0.43-0.56 with a sensitivity and specificity of 61% compared to the DEXA results. They concluded that analog radiographs were unreliable for evaluating osteoporosis. Since then, no other studies have evaluated the utility of standard or digital x-rays to predict osteoporosis based on visual assessment.
Since 1960, quantitative evaluations of hand radiographs have been used to investigate the association between cortical thickness and BMD (16). Webber et al found positive correlations between distal radius bicortical thickness and femoral bone density on DEXA using PA radiographs of the wrist. Similarly, positive correlations have been reported between radiograph-measured proximal humeral cortical thickness and DEXA scores (17-19).
Other imaging modalities, including X-ray radiogrammetry and quantitative CT, have also been investigated as alternative diagnostic tools for population screening (20,21). Digital X-ray radiogrammetry (DXR) is the quantitative assessment of skeletal quality using a formulaic conversion of cortical measurements and bone geometry of the metacarpals and forearm to output a corresponding BMD. Advancements in digital imaging have prompted a renewed interest in this technique. In 2002, a prospective study found that BMD of the hand and wrist as computed by DXR using digital radiographs was associated with fracture risk of the hip, radius and spine (22). Similarly, Ward et al demonstrated a moderately good correlation of 0.56 between BMD of the distal forearm and metacarpals and DEXA BMD of the spine, femoral neck, and hip (23). Dhainaut et al also found a positive correlation of 0.65 between hand BMD by DXR and femoral neck BMD by DEXA (24). Despite these alternative attempts to evaluate BMD and fracture risk, DEXA remains the gold standard in osteoporosis screening.
In our study, we found that using a visual, qualitative assessment of digital radiographs to screen for osteoporosis is only moderately accurate with fair reproducibility. Experienced, fellowship-trained hand surgeons were able to differentiate between normal and abnormal BMD in 63% of the assessments but had a much more difficult time differentiating between normal and osteopenia as well as between osteopenia and osteoporosis. Most importantly, to be used as a screening tool, a test should have a high negative predictive value to minimize the false negative rate. Our study showed a negative predictive value of 45%, meaning that 55% of patients judged to have normal BMD in the office would be inappropriately assured that further workup with a DEXA scan was unnecessary.
A few weaknesses of our study are worth noting. Our study size, while amplified by multiple physicians and observations, was relatively small. A larger patient cohort and a more powerful study would involve randomizing individuals of various age groups without any upper extremity complaint to prevent any selection or age related bias. In order to maximize the number of osteoporotic X-rays in our cohort, we limited our selection to female patients older than 65 years of age. Our results and conclusions therefore might not be applicable to male or younger female patients. Furthermore, including patients who already had a DEXA scan may have biased our results towards osteopenia and osteoporosis.
There is a limited role in using exclusively qualitative evaluations of digital hand and wrist films in the office for making clinical recommendations regarding osteoporosis. In the search to identify a better screening tool, future studies might look at the accuracy of basic quantitative hand and wrist digital x-ray measurements for predicting BMD.