Evaluation of the accuracy of ultrasound has yielded heterogeneous results. Our objective was to summarize the evidence on the accuracy of ultrasound compared to venography in asymptomatic patients, taking into account the variation due to threshold differences. Searches of journal tables of contents, computer databases (Medline, Embase, Biomed, Cochrane) and conference proceedings were performed. A study was eligible if it prospectively compared ultrasound to venography for the diagnosis of DVT in asymptomatic patients. Data from studies selected for inclusion were extracted independently by two authors. High quality studies with consecutive patient enrollment, blind evaluation of the two techniques, and absence of verification bias are summarized as Level 1, while those not fulfilling one or more of these criteria are considered Level 2. Original study authors were contacted to confirm accuracy and to provide missing data. A pooled estimate of the accuracy of ultrasound was obtained according to the method of Moses and coworkers. This method gives a summary diagnostic odds ratio (DOR). The DOR is a single indicator of test performance. It varies between 0 and infinity and exceeds 1 only when ultrasound is more often positive in patients with DVT relative to those without DVT. A higher DOR indicates better discriminatory test performance. Thirty-one studies were rated as potentially unbiased and graded as Level 1. The mean prevalence of DVT as determined by venography was 22%. In Level 1 studies, the odds of positive ultrasound in proximal veins was 379 times higher (95% confidence limits 65, 2,200) and in distal veins 32 times higher (7.5, 135) among patients with DVT than those without. Our results suggest that, particularly for proximal veins, ultrasound is accurate for the diagnosis of DVT in asymptomatic postoperative orthopedic patients. More research is needed in other clinical settings.
Deep venous thrombosis (DVT) is a common disease with potentially serious consequences such as pulmonary embolism (1). The incidence of DVT in the general population is reported to be between 1.6 and 1.8 per 1000 people per year (2-4). Symptomatic DVTs are less frequent, comprising only 3.4% of the total (5). Venography in asymptomatic surgical patients at high or highest risk, however, has found an incidence of 4-20% in proximal (thigh) veins and 20-80% in calf veins. In orthopedic surgery (highest risk), the incidence of DVT systematically screened by venography ranges from 50 to 60% (5). As a consequence of our inability to identify those patients with asymptomatic DVTs who will develop clinical venous thromboembolism, antithrombotic drugs are used as a prophylactic measure in all high-risk patients (6). These drugs are reportedly efficacious in the treatment of asymptomatic DVTs, but this risk reduction is accompanied by a high risk of adverse drug events, including major hemorrhage. For instance, the rate of major bleeding after total hip replacement in patients treated with low molecular weight heparin is estimated at 5% (5). Moreover, the rate of recurrence of asymptomatic DVT remains high (4-19%), even with prolonged out-patient treatment (5).
To balance this high risk of morbidity, the decision to treat a patient, or to withhold anticoagulant after discharge, should be based on accurate diagnostic tests with a high sensitivity and specificity. Diagnostic tests are also helpful for clinical management (e.g. localization of DVTs) and in monitoring disease, during or after treatment. Because of the inaccuracy of clinical examination (7), diagnostic techniques such as I-fibrinogen scanning, venography, plethysmography and ultrasonography have been developed.
Venography, first described in 1963 (8), is generally accepted as the gold standard in detecting DVT even in asymptomatic patients, although it has not been properly evaluated in this role because of the absence of a true reference. It allows the direct visualization of the veins from the calf to vena cava after opacification by contrast media. Contrast is injected into any superficial vein on the dorsum of the foot. Presence of a filling defect or abrupt termination of the opaque column are used as criteria for DVT (9). Nevertheless, venography presents several limitations: pain; induction of DVTs in up to 2% (especially when using ionic contrast agents (6)); general reliability, as indicated by kappa agreement coefficients of observers as low as 0.57 (though some are as high as 0.90) (10-14); and lack of reliability in special cases because of patient contraindication, patient refusal, or technical reasons (6, 15).
Because of these limitations, venography has never been accepted as a systematic screening tool, and non-invasive diagnostic tests like ultrasound have been evaluated to replace venography, particularly in the context of routine screening for DVT (16). Ultrasonography uses five different imaging techniques to evaluate the reflective properties of tissues: Brightness modulation (B-Mode), Doppler, Duplex (combination of B-mode and Doppler), Color Doppler and Triplex (combination of Duplex and Color Doppler). Several criteria have been evaluated for the diagnosis of DVT. The lack of compressibility of the vein due to the (presumed) presence of thrombosis is the most commonly accepted cut-off criterion or threshold.
Previous reviews have attempted to summarize the available evidence on the accuracy of ultrasound for the diagnosis of DVT (16-18). Only English-language articles indexed principally in Medline were included, and sensitivity and specificity were summarized using the pooled data. These reviews have found, however, significant heterogeneity between the selected studies. No attempt was made to explain the heterogeneity or the use of new, more appropriate methods developed for summarizing results of diagnostic test evaluation. These methods have the advantage of taking into account the variation in the thresholds used by different observers to determine which patients are diseased. When there is a tradeoff between sensitivity and specificity due to differences in judgment, pooled estimates of sensitivity and specificity can be misleading and should be avoided (19). Furthermore, one of the previous reviews (16) did not identify multiple publications (20, 21), and accuracy for all DVTs was calculated including studies that did not evaluate distal veins with ultrasonography (22-26).
The main objective of the present study is to summarize results of all available evidence on the accuracy of ultrasound compared to venography for the diagnosis of DVT in asymptomatic patients using modern meta-analytic methods.
A comprehensive review of the literature combining online and conventional library searches as well as searches to find unpublished studies was performed. All studies evaluating the accuracy of ultrasound compared to venography in asymptomatic patients were included, regardless of the language of publication. Database searches included: Medline from 1966 to April 2003 using the PubMed® interface, Embase® and Pascal Biomed® from 1989 to 2002 using the Silver Platter® interface, Science Citation Index (SCI) from 1979 to 2002 using the Web of Science interface, Database of Abstracts of Reviews of Effectiveness (DARE) through the University of York website by November 2002 and the Cochrane database through the Cochrane Library. Reference lists of all included studies, two systematic reviews (16, 18), and two reviews (27, 28) related to our meta-analysis were also reviewed for other potential studies that met inclusion criteria. We used the following terms (MeSH and free text) in our strategies: thrombosis, venous thrombosis, thrombophlebitis, phlebography, venography, phlebogram, venogram, ultrasonography, ultrasound, Doppler, CUS, duplex.
Fifty-six journals that had appeared at least once in the first Medline search were selected for conventional searching (BK, SS) for the year 1998. Published short communications and abstracts were identified by searching SCI from 1979 to 2000 and by inspecting the proceedings of the thirteenth through eighteenth congresses of the International Society of Thrombosis and Haemostasis published between 1993 and 2001.
Manufacturers of Ultrasound Acuson (Siemens), Biosound, Diasonic, Quantum QAD, GE Ultrasonic, ATL Ultramark (Philips), Aloka, Toshiba, Hitachi and investigators of all comparative studies were contacted by e-mail or mail for any unpublished studies. The Dissertation Abstracts International (DAI) database was searched for information about doctoral dissertations and master’s theses from 1980 to 2002.
A study was eligible if it prospectively compared ultrasound to venography for the diagnosis of deep venous thrombosis in lower limbs of asymptomatic patients. A study was excluded if venography was not the reference standard or lower limbs and deep veins were not evaluated. Eligibility was independently assessed by two authors (BK and SS). Differences were resolved by consensus.
Two authors independently reviewed all citation abstracts retrieved and excluded irrelevant studies according to prespecified exclusion criteria. All quality criteria and accuracy data were extracted blindly from the original papers by two authors (BK, SS). In addition, we asked all authors of included studies to confirm the accuracy of the extracted covariates and to supply any missing data. All covariates were selected a priori. The influence of these covariates on diagnostic accuracy was explored and will be reported in a separate paper.
The statistical units were patients in 24 studies and lower limbs in 23. For each study we extracted a 2×2 table of positive results in diseased (True Positive, TP), negative results in non diseased (True Negative, TN), positive results in non diseased (False Positive, FP) and negative results in diseased patients (False Negative, FN). When data were available a 2×2 table was created for each anatomical localization, i.e. all DVT (proximal + distal veins), proximal (thigh + popliteal) and distal (calf) DVT.
Sensitivity, specificity and the diagnostic odds ratio (diagnostic OR) were calculated from original data. The diagnostic OR is a single indicator of test performance. It varies between 0 and infinity. When the diagnostic OR is equal to 1, the test has no discriminatory value. A diagnostic OR > 1 indicates that the test is more often positive in diseased relative to non diseased patients; the higher the diagnostic OR, the better the discriminatory power (29).
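As a minimal sketch of the accuracy measures described above, the following Python function computes sensitivity, specificity and the diagnostic OR from a single 2×2 table. It assumes the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), diagnostic OR = (TP×TN)/(FP×FN)); the function name and the counts in the example are hypothetical and are not taken from any included study. Tables with a zero cell are rejected rather than smoothed, so the sketch does not reproduce the handling of empty cells used in this review.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return (sensitivity, specificity, diagnostic OR) for one 2x2 table.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives. All four must be > 0 here;
    zero cells would need a smoothing step that this sketch omits.
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive in this sketch")
    sensitivity = tp / (tp + fn)          # positive results among diseased
    specificity = tn / (tn + fp)          # negative results among non diseased
    diagnostic_or = (tp * tn) / (fp * fn)  # odds of a positive test, diseased vs. non diseased
    return sensitivity, specificity, diagnostic_or


# Hypothetical counts, for illustration only.
sens, spec, dor = accuracy_measures(tp=45, fp=5, fn=15, tn=135)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} diagnostic OR={dor:.0f}")
```

A diagnostic OR of 1 from this function would mean that the test is positive equally often in diseased and non diseased patients.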
Because of criticisms of quality scores (30), we examined quality items instead. However, we also summarized results separately for studies according to generally accepted quality criteria (31). Level 1 studies included consecutive patients, blindly evaluated the two techniques and verified the diagnosis with venography in all patients. In Level 2 studies one or more of the latter characteristics were missing or not clearly present.
Pooled estimates of sensitivity and specificity were obtained by the method of Moses and coworkers (see the appendix) (32). Unweighted least squares linear regression was used to estimate a summary diagnostic OR and plot summary Receiver Operating Characteristic curves (sensitivity against (1-specificity)), because it can better reflect the between-study variability of test accuracy (19, 32). Results from a weighted summary ROC could bias estimates and present unresolved methodological problems for diagnostic tests with equal sample sizes, because the highest weight (inverse of the variance) may be assigned to less accurate studies (19). We used an empirical Bayes smoothing method to fill in cells with zero frequency (33), because a continuity correction could considerably influence results (32).
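The unweighted regression step can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in this review: it assumes the usual formulation of the Moses model, D = a + bS, with D = logit(TPR) − logit(FPR) and S = logit(TPR) + logit(FPR), where TPR is sensitivity and FPR is 1 − specificity. The function names and the example data are hypothetical, and zero cells are assumed to have been handled before this step.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def fit_moses(studies):
    """Unweighted least squares fit of D = a + b*S.

    studies: list of (TPR, FPR) pairs, one per study.
    Returns (a, b). exp(a) is the diagnostic OR at S = 0 and serves as
    the summary diagnostic OR when b is close to 0.
    """
    d = [logit(tpr) - logit(fpr) for tpr, fpr in studies]
    s = [logit(tpr) + logit(fpr) for tpr, fpr in studies]
    n = len(studies)
    mean_s, mean_d = sum(s) / n, sum(d) / n
    sxx = sum((si - mean_s) ** 2 for si in s)
    sxy = sum((si - mean_s) * (di - mean_d) for si, di in zip(s, d))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_sensitivity(a, b, fpr):
    """Sensitivity on the summary ROC curve at a given false positive rate.

    Obtained by solving D = a + b*S for logit(TPR):
    logit(TPR) = a/(1-b) + ((1+b)/(1-b)) * logit(FPR).
    """
    x = a / (1 - b) + (1 + b) / (1 - b) * logit(fpr)
    return 1 / (1 + math.exp(-x))


# Hypothetical (TPR, FPR) pairs, for illustration only.
example = [(0.60, 0.01), (0.75, 0.03), (0.85, 0.05), (0.90, 0.12)]
a, b = fit_moses(example)
print("summary diagnostic OR at S = 0:", math.exp(a))
print("slope b:", b)
print("sensitivity at FPR = 0.05:", sroc_sensitivity(a, b, 0.05))
```

A slope b close to 0 indicates that the diagnostic OR does not change with the threshold, in which case the summary ROC curve is symmetric and exp(a) summarizes it in a single number.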
The Q-statistic was used to quantify heterogeneity between sensitivities or specificities and the t-test to detect if the diagnostic ORs were constant across the studies (slope of the regression model close to 0) for different levels of threshold (see the appendix). We explored methodological (study quality) and clinical heterogeneities using the meta-regression technique (34).
From a total of 2,000 identified citations, 233 compared ultrasound to venography, and 53 examined asymptomatic patients. Eight articles were excluded: four were considered multiple publications (20, 35-37), two studies examined only superficial veins in preoperative patients (38, 39), one was retrospective (40), and in one study (41) it was not possible to distinguish data for symptomatic versus asymptomatic patients. Three abstracts (42-44) not published in full were also included. For 19 articles (42%) authors provided missing data and verified the accuracy of extracted sensitivities and specificities.
Table 1 shows the 45 articles included in our systematic review. Forty-two studies were published in English, 2 in French (45, 46), and 1 in Italian (47). Two papers reported two sets of separate results in two different groups of patients, and were therefore considered as 4 independent studies (48, 49). Thus, a total of 47 studies including 4,914 patients were reviewed. The mean prevalence of DVT was 22% (range 5-65%). One study enrolled patients in internal medicine (50) and 46 studies in postoperative surgery (41 orthopedic, 5 in other surgery). The mean age of participants (reported by 34 studies) was 66 years and the sex ratio (28 studies) was 1.5 females to 1 male. The failure rate was 10% for venography (37 studies) and 5% for ultrasound (reported by 31 studies). Thirty-nine studies reported the accuracy of DVT screening in proximal veins, 29 in all veins and 22 in distal veins. Twelve studies (21-26, 49, 51-55) examined distal veins with venography, but not ultrasonography. These studies were not considered for further analysis of the accuracy in all veins. Three studies (47, 56, 57) examined DVT in proximal and distal veins, but it was not possible to distinguish the results of distal from proximal veins. Finally, two studies failed to identify any isolated DVT in proximal veins with venography (58, 59) and were not included in the analysis of accuracy in proximal veins.
Thirty-one studies (69%) included consecutive patients, assessed the two techniques independently and avoided verification bias (patients were examined with both venography and ultrasonography). Twenty-nine Level 1 studies evaluated patients after orthopedic surgery, one after neurosurgery (60), and one after thoracic and abdominal surgery (61).
We found large variability between studies for both sensitivity and specificity. We did not detect variability of the diagnostic OR, however, due to threshold (e.g. definition of compressibility) differences for Level 1 studies (p ≥ 0.05 for the test of b = 0) in all anatomical sites. The summary diagnostic ORs obtained with the unweighted model are shown in Table 3. The summary diagnostic OR of Level 1 studies minimizing the potential for bias is 379 (95% confidence limits 65, 2200) in proximal, 32 (7.5, 135) in distal veins and 29 (7.4, 114) in all veins. The summary diagnostic ORs estimated in Level 2 studies are probably overstated but nonetheless suggest large values for the true diagnostic OR. Figures 1 through 3 show the summary ROC curves for proximal, distal and all veins. These figures show that many of the TPR and 1-FPR points fall close to the summary ROC curves and suggest high accuracy for ultrasound relative to venography in proximal veins.
Summary results obtained from any systematic review can be misleading if publication bias is present (62). Even without empirical evidence of publication bias in studies evaluating diagnostic tests, it could be important. Although we did not identify any unpublished studies, funnel plots of diagnostic OR vs. the number of subjects did not suggest the presence of publication bias in our review.
Only 69% of selected studies appeared to be designed to minimize bias. Seventeen studies did not clearly state enrollment of consecutive patients (leaving a potential for spectrum bias) and four did not mention if patients were examined independently. Four authors (49, 53, 59, 63) confirmed that five of 17 studies enrolled consecutive patients and two (50, 64) had evaluated their results in a blinded manner.
When the between-studies variability of sensitivity and specificity is large, threshold differences are a likely source of heterogeneity, and the summary ROC method is more appropriate than classic pooling because it can control for this threshold variation. Some of the observed heterogeneity in our review could be explained by differences in the threshold of compressibility used. Compressibility could depend on, for example, the anatomical site, accessibility of the veins, or the extension and onset (recent or prolonged) of the thrombosis. The judgment of the examiner and the number of anatomical sites examined by ultrasonography could also be a source of variability (65). Interobserver agreement (kappa), reported by three studies (44, 50, 66) included in our systematic review, ranged from 0.56 to 0.85. A stringent cut-off would decrease sensitivity and increase specificity and vice versa. Pooling sensitivities and specificities ignores threshold differences and leads to underestimation of the accuracy (19). For instance, for proximal veins in Level 1 studies, the mean sensitivity and specificity were 0.64 (0.62, 0.65) and 0.98 (0.974, 0.983) respectively, with a large between-study heterogeneity (p value < 0.001). These results are close to the pooled sensitivity (0.62; CI 0.54, 0.70) and specificity (0.97; CI 0.96, 0.98) obtained by previous reviews (16, 18). With the summary ROC method (Fig. 2), however, a specificity of 0.98 corresponds to a sensitivity of 0.82.
Three Level 2 studies evaluating DVT in all veins used continuous wave Doppler ultrasonography. This technique, less sensitive and specific than newer methods, has been abandoned. When these studies were excluded, the diagnostic OR increased slightly in Level 2 studies, but this did not influence our conclusions based on Level 1 studies.
We also explored clinical and methodological heterogeneity between studies. The radiologists’ experience with ultrasonography, the level of quality, the ultrasound technique, and other covariates were included in a meta-regression model. Our results show that ultrasonography tends to be more accurate when used after orthopedic surgery compared to other clinical settings (P = 0.059) in proximal veins, and that other selected study characteristics do not explain a large part of the observed heterogeneity. Venography is an imperfect reference standard and could influence accuracy; this influence could be measured by adding the prevalence of the disease relative to venography in the regression model. From our results, however, it is not clear whether differences in prevalence are a source of heterogeneity.
As we used the summary ROC method to summarize the accuracy of selected studies, our results depend on the validity and limitations of this method (67). The diagnostic OR calculated by the summary ROC method for Level 1 studies is large for proximal veins in asymptomatic patients. Other than two studies performed after neurosurgery and abdominal or thoracic surgery (60, 61), all Level 1 studies were performed on postoperative orthopedic patients. The performance of ultrasonography measured in one of these 2 studies (60) appears to diverge from the other results in distal and all veins (Figs. 1 and 3). Discarding these outlier observations, however, did not influence our results. The summary diagnostic OR is 404 (58, 2800) when postoperative orthopedic studies are exclusively considered.
Results of meta-analysis expressed by a summary diagnostic OR may not appeal to practicing clinicians. In Level 1 studies evaluating proximal veins, the diagnostic OR is 379 (65, 2,200). This means that for ultrasonography, the odds of positivity among patients with DVT is 379 times higher than among those without DVT. The diagnostic OR cannot be used directly to calculate the probability of disease from a test result, and it is only possible to specify a summary estimate of specificity for a given sensitivity and vice versa. Despite these limitations, the summary ROC model is useful when both sensitivity and specificity show large variability and threshold differences might be a source of heterogeneity (68). The clinical usefulness of ultrasonography could be evaluated by calculating positive (probability of having a DVT when ultrasonography is positive, PPV) and negative (probability of being disease-free when ultrasonography is negative, NPV) predictive values, using the summary diagnostic OR (69). Figure 4 shows that in situations of low prevalence (10%), the PPV is good (>80%) when ultrasonography is read at a false positive rate (1-specificity) of no more than 5%. When prevalence is high (30%), the PPV is good (>80%) when images are read with a false positive rate as high as 10%. The NPV is good even in low-prevalence situations and with a high false positive reading rate of ultrasonography.
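The link between a summary diagnostic OR, a chosen false positive rate and the predictive values can be illustrated with a short sketch. The exact procedure behind Figure 4 is not given in the text, so the code below rests on an assumption: the diagnostic OR is taken as constant across thresholds (a symmetric summary ROC curve, slope b = 0), so that the odds of a true positive equal the diagnostic OR multiplied by the odds of a false positive. Values computed this way need not match those read from Figure 4. The function names are hypothetical.

```python
def sensitivity_from_dor(dor, fpr):
    """Sensitivity implied by a constant diagnostic OR at a given false positive rate.

    Assumes odds(TPR) = DOR * odds(FPR), i.e. a symmetric summary ROC curve.
    """
    odds_tpr = dor * fpr / (1 - fpr)
    return odds_tpr / (1 + odds_tpr)


def predictive_values(dor, fpr, prevalence):
    """Return (PPV, NPV) for a diagnostic OR, false positive rate and prevalence."""
    tpr = sensitivity_from_dor(dor, fpr)
    true_pos = prevalence * tpr
    false_pos = (1 - prevalence) * fpr
    false_neg = prevalence * (1 - tpr)
    true_neg = (1 - prevalence) * (1 - fpr)
    ppv = true_pos / (true_pos + false_pos)  # P(DVT | positive ultrasound)
    npv = true_neg / (true_neg + false_neg)  # P(no DVT | negative ultrasound)
    return ppv, npv


# Summary diagnostic OR for proximal veins in Level 1 studies (from the text);
# the prevalence and false positive rate grid is chosen for illustration.
for prevalence in (0.10, 0.30):
    for fpr in (0.01, 0.05, 0.10):
        ppv, npv = predictive_values(379, fpr, prevalence)
        print(f"prevalence={prevalence:.2f} FPR={fpr:.2f} PPV={ppv:.3f} NPV={npv:.3f}")
```

Under this assumption, the PPV falls as the false positive rate rises and as prevalence falls, while the NPV stays high across the grid, which is the qualitative pattern described for Figure 4.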
Ultrasound may be used for detecting asymptomatic DVT before starting antithrombotic treatment or for excluding DVT before administering a potentially life-threatening drug. Our results suggest that ultrasound is a useful technique for postoperative orthopedic patients. Nonetheless, more research is needed to explore if ultrasound is useful in other clinical settings. The usefulness of systematic screening by ultrasound in preventing thromboembolic events has been questioned by a clinical trial (70) and a cohort study (71) in orthopedic surgery. The prevalence of DVTs in these studies, however, was low (under 3%) and it is not surprising that the number of patients needed to be tested to avoid one event (72) was high in these settings. Venography has been found cost-effective in a cohort study of patients undergoing hip replacement where the prevalence of DVT was 23% (73).
Venography and ultrasonography are usually used to detect asymptomatic DVTs as surrogate outcomes for symptomatic DVTs and to prevent pulmonary embolism. One could argue that it would be of greater clinical significance to rule out asymptomatic DVTs than rule them in, therefore sensitivity is more important than specificity. Most asymptomatic DVTs, however, develop in calf veins, and it is not known whether or not detecting and treating those DVTs is clinically beneficial. Some studies have shown that most asymptomatic DVTs resolve spontaneously (5), and it is rare that they become symptomatic or lead to pulmonary embolism or postphlebitic syndrome, but other studies mainly based on autopsies have yielded conflicting results (66). Furthermore, because bleedings have major clinical impact in patients treated with anticoagulants (74), ruling out DVT will avoid unnecessary treatment.
It should be emphasized that the relevant therapeutic objective is more the prevention of pulmonary embolism than of DVT. Because of their high rate, asymptomatic DVTs detected by imaging techniques are largely used as surrogate outcome in clinical trials. When the accuracy of a diagnostic test used to detect asymptomatic DVTs is not perfect, the treatment benefit is systematically underestimated (75). Moreover, the clinical relevance of using asymptomatic DVT as a surrogate for venous thromboembolism has not been clearly established.
In conclusion, results from our systematic review suggest that ultrasound is accurate in proximal veins for the diagnosis of asymptomatic DVT in patients hospitalized for orthopedic surgery. More research is needed to evaluate its accuracy in other clinical settings. Managing and monitoring asymptomatic DVTs with ultrasonography is, however, more complex and should be evaluated by appropriately designed studies for the reasons enumerated above.
We wish to thank Evelyne Gauthier for her assistance in sending letters to authors. We especially thank Sander Greenland, Francois Gueyffier and André Van Tran Minh for their helpful comments and Dr. J. Jund and Dr. M. Agassarian for their help in reviewing citation abstracts and extracting the data. We are very grateful to the following authors who completed their data: Dr. M. T. Barrellier, Dr. R. W. Barnes, Dr. C. L. Barnes, Dr. M. M. W. Beaumont-Koopman, Dr. J. J. Cronan, Dr. B. L. Davidson, Dr. A. Elias, Dr. G. Elliot, Dr. M. K. Eskandari, Dr. J. S. Ginsberg, Dr. E. Kalodiki, Dr. I. Lausen, Dr. D. Leutz, Dr. M. Mantoni, Dr. M. Monreal, Dr. B. Morgan, Dr. E. Oger, Dr. S. Rose and Dr. E. Vanninen.
Financial support: Dr. Kassai was supported by a grant from Pharmacia & Upjohn.
The model proposed by Moses states D = bS + a where: