There is controversy regarding the sensitivity of fecal occult blood tests (FOBT) for detecting colorectal cancer. Many of the published studies failed to correct for verification bias, which may have inflated the reported sensitivity.
A meta-analysis of published studies evaluating the sensitivity and specificity of chemical-based FOBT for colorectal cancer was performed. Studies were included if both cancer and control subjects underwent confirmatory testing. We also included studies that attempted to correct for verification bias by either performing colonoscopy on all subjects regardless of the FOBT result or by using longitudinal follow-up. We then compared the sensitivity, specificity, and other diagnostic characteristics of the studies that attempted to correct for verification (n=10) vs. those that did not correct for this bias (n=19).
The pooled sensitivity of guaiac-based FOBT for colorectal cancer of studies without verification bias was significantly lower than those studies with this bias [0.36 (95% CI 0.25-0.47) vs. 0.70 (95% CI 0.60–0.80), p=0.001]. The pooled specificity of the studies without verification bias was higher [0.96 (95% CI 0.94–0.97) vs. 0.88 (95% CI 0.84–0.91), p<0.005]. There was no significant difference in the area under the summary receiver operating characteristic curves. More sensitive chemical-based FOBT methods (e.g., Hemoccult® SENSA®) had a higher sensitivity but a lower specificity than standard guaiac methods.
The sensitivity of guaiac-based FOBT for colorectal cancer has been overestimated as a result of verification bias. This test may not be sensitive enough to serve as an effective screening option for colorectal cancer.
Colorectal cancer is the second most common cause of cancer deaths in the United States1 but mortality may be reduced by screening.2 While screening colonoscopy is increasing in the United States, FOBT was the dominant method of screening for 16% of the population in 2006.3 The most common method for FOBT in the United States utilizes a guaiac-impregnated card.4,5 Other more “sensitive” chemical tests include re-hydration of the guaiac cards, adding an enhancer to the developer,2 or using toluidine instead of the guaiac reagent.6 Despite its widespread use, the diagnostic characteristics of the chemical-based FOBT have been difficult to estimate. Allison2 reported that the sensitivity of guaiac-based FOBT for colorectal cancer ranged between 21-81%. Other review articles have also reported a wide range of sensitivities.7–11 The lack of a precise estimate of sensitivity has resulted in confusion among clinicians when applying FOBT for colorectal cancer screening.
One reason for the wide range of sensitivities may relate to verification bias (also known as work-up bias). Many of the studies performed confirmatory testing such as colonoscopy in most of the FOBT positive subjects but only in a subset of FOBT negative patients. The FOBT negative patients who did not undergo confirmatory testing were excluded from the study sample. The failure to perform confirmatory testing in a large proportion of patients with a negative screening test will exclude patients with disease and a negative test, artificially increasing the sensitivity.12–16 The effect of verification bias on specificity depends on the study design. Many studies with this bias excluded potential subjects without disease who have a negative screening test because they did not undergo confirmatory testing. This type of exclusion reduces the number of “true negatives”, artificially decreasing the specificity.12–16 In contrast, other studies with verification bias assumed that subjects who have a negative screening test are free of disease even though they have not undergone confirmatory testing. This type of inclusion increases the number of “true negatives” and artificially increases the specificity.12,17
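The direction of these distortions can be checked with simple arithmetic. The sketch below uses an entirely hypothetical cohort (100 cancers, 9,900 subjects without cancer, true sensitivity 0.40, true specificity 0.96) and assumes that only 20% of FOBT-negative subjects receive confirmatory testing. The counts and function names are illustrative and are not taken from any of the included studies.

```python
def accuracy(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def exclude_unverified(tp, fn, fp, tn, frac_neg_verified):
    """Drop FOBT-negative subjects who never had a confirmatory test."""
    return tp, fn * frac_neg_verified, fp, tn * frac_neg_verified


def assume_unverified_healthy(tp, fn, fp, tn, frac_neg_verified):
    """Count unverified FOBT-negative cancers as if they were true negatives."""
    missed = fn * (1 - frac_neg_verified)
    return tp, fn - missed, fp, tn + missed


# Hypothetical fully verified cohort (assumed numbers, for illustration only).
full = dict(tp=40, fn=60, fp=396, tn=9504)

true_sens, true_spec = accuracy(**full)
excl_sens, excl_spec = accuracy(*exclude_unverified(**full, frac_neg_verified=0.2))
incl_sens, incl_spec = accuracy(*assume_unverified_healthy(**full, frac_neg_verified=0.2))

for label, sens, spec in [
    ("fully verified", true_sens, true_spec),
    ("unverified negatives excluded", excl_sens, excl_spec),
    ("unverified negatives assumed healthy", incl_sens, incl_spec),
]:
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.3f}")
```

Under these assumptions, excluding the unverified negatives raises the apparent sensitivity from 0.40 to about 0.77 and lowers the apparent specificity from 0.96 to about 0.83, while assuming the unverified negatives are disease-free raises both estimates.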
Two methods have been used to exclude verification bias. In the first method, all patients referred for colonoscopy (usually for screening or surveillance purposes) are also tested with FOBT. In the second approach, patients who are screened with FOBT are then followed longitudinally for a defined period of time. Those patients who are not diagnosed with colorectal cancer after a reasonable observation period are then assumed not to have had colorectal cancer at the time of FOBT screening.
We thus performed a meta-analysis of studies evaluating the sensitivity and specificity of chemical-based FOBT for detecting colorectal cancer. We wished to compare the diagnostic characteristics of studies that attempted to minimize verification bias to those studies that did not correct for this bias. We also reviewed the studies evaluating more sensitive chemical-based FOBT (e.g., re-hydration of the guaiac cards, adding an enhancer to the developer, or using toluidine).
All relevant published studies relating to fecal occult blood testing for diagnosing colorectal cancer were identified by computer-assisted search of the MEDLINE database using Ovid’s SilverPlatter platform (Ovid Technologies, New York, New York). We included all domestic and international reports written in English that were published until December 2008. We used a long time frame as we wanted to identify as many studies as possible with and without verification bias. Studies of FOBT were retrieved using key words that included “fecal or faecal or stool” and “guaiac or occult or Hemoccult or Haemoccult or SENSA” and “colorectal or colon” and “malignancy or cancer or carcinoma” and “colonoscopy or barium”. We also performed a search of published studies of “fecal occult blood” using the Cochrane Library’s Central Register of Controlled Trials (John Wiley & Sons, Hoboken, NJ). Studies were included if they reported both a sensitivity and specificity of a chemical-based FOBT for colorectal cancer and subjects classified as not having colorectal cancer had undergone a confirmatory test. The confirmatory tests consisted of the following: colonoscopy, barium enema with sigmoidoscopy, or longitudinal follow-up of controls (usually for at least 2 years). Studies were excluded for the following reasons: only a fecal immunochemical test was used, lack of confirmatory testing of control subjects, fewer than three cancer or control subjects, grouping cancer and polyps together to determine the sensitivity without a separate subgroup determination, and failure to report sufficient raw data to determine the number of true positives and true negatives. Studies that primarily investigated patients who were post-resection for colorectal cancers were also excluded as FOBT may have different diagnostic characteristics for recurrent cancers.
In cases of multiple papers from the same institution, the dates for study inclusion were evaluated to ensure that the patients were not overlapping. Both authors reviewed all of the potential written reports to determine final eligibility.
We performed separate analyses of studies that used standard guaiac tests (i.e., not using an enhancer or re-hydration) and those studies that used more “sensitive” chemical-based FOBT.
The chemical methods used to improve the detection of stool heme pseudoperoxidase activity include: re-hydration of the specimen cards, adding an enhancer to the developer (e.g., Hemoccult® SENSA®), or using toluidine instead of the guaiac reagent.
We analyzed studies for potential verification bias. Studies that failed to further investigate potential subjects who were FOBT negative with a confirmatory test were considered to have this type of bias. Other studies attempted to minimize verification bias by two approaches: performing colonoscopy on all patients regardless of FOBT result or following screened patients longitudinally for a defined period of time to determine that they have not developed colorectal cancer.
Sensitivities and specificities were calculated using standard formulas.18 Subjects with confirmed colorectal cancers were considered to have disease while subjects without known colorectal cancer were considered not to have disease. In studies that required colonoscopy for all subjects, we also excluded patients with benign neoplastic polyps from the control group. We also excluded subgroups of subjects with known conditions associated with chronic lower GI bleeding (e.g., colitis, angiodysplasias). In cases of unstable estimators, which can occur if the rate of an event is either 0 or 1.0, 0.5 was added to each cell in the two-by-two tables (number of true positives, false positives, true negatives, and false negatives) to calculate the sensitivity and specificity in order to avoid an undefined value.19 This correction resulted in a slight increase in the estimate of the pooled sensitivities and specificities, which were calculated using the inverse of the variance of the rates.18 A random effects model was used to pool the variables and calculate their 95% confidence intervals (CI).20 The homogeneity of the pooling was tested using Cochran’s Q statistic.21 We compared the diagnostic characteristics of those studies with verification bias to those that attempted to correct for this bias. Paired forest plots for displaying sensitivity and specificity were constructed using the format described by Leeflang et al.22 with the modifications proposed by Hyde et al.23 Stepwise multiple regression using SPSS for Windows version 10 (SPSS Inc., Chicago, Illinois) was also performed to determine the effect of verification bias and other study design factors on sensitivity and specificity.18,24,25
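The pooling steps described above can be sketched as follows. This is a minimal illustration and not the authors' actual computation: the text does not name the random effects estimator, so the DerSimonian-Laird method is assumed here, the binomial variance p(1-p)/n is assumed for each rate, and the study counts are hypothetical.

```python
import math


def rate_and_variance(events, total):
    """Proportion and binomial variance, adding 0.5 per cell when the rate is 0 or 1."""
    if events == 0 or events == total:
        events, total = events + 0.5, total + 1.0  # 0.5 added to each of the two cells
    p = events / total
    return p, p * (1 - p) / total


def pool_random_effects(rates, variances):
    """Inverse-variance pooling with a DerSimonian-Laird between-study variance (assumed)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * r for wi, r in zip(w, rates)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (r - fixed) ** 2 for wi, r in zip(w, rates))
    k = len(rates)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * r for wi, r in zip(w_star, rates)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), q


# Hypothetical (FOBT-positive cancers, total cancers) per study; not data from the tables.
studies = [(3, 10), (12, 30), (5, 5), (8, 25)]
rates, variances = zip(*(rate_and_variance(tp, n) for tp, n in studies))
pooled, ci, q = pool_random_effects(rates, variances)
print(f"pooled sensitivity={pooled:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, Q={q:.2f}")
```

Q is compared against a chi-square distribution with k-1 degrees of freedom to test homogeneity. The third hypothetical study (5 of 5 detected) triggers the 0.5 correction, which keeps its variance from collapsing to zero.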
Summary receiver operating characteristic (sROC) curves for the FOBT studies were constructed as previously described.26–29 In this method, the sensitivity and false positivity are transformed into their logistic form (also called the logits) defined as the natural log of the positivity rate/(1-positivity rate). Because some studies reported a sensitivity or specificity of 100%, 0.5 was added to each cell of the two-by-two tables for all of the studies in order to avoid an undefined logistic transformation.28 Linear regression was performed using the difference of the logits of sensitivity and false positivity as the dependent variable and the sum of the logits as the independent variable. The slope and y-intercept were then used to construct the sROC curve. A slope close to zero results in a symmetric sROC curve while a non-zero slope will make the curve asymmetric and S-shaped.30 The exact areas under the curves (AUC) for the sROC function and their standard error were calculated as described by Walter30 using MathCad 7 (MathSoft, Inc., Cambridge, MA) to perform numerical integration. In sROC analysis, an AUC close to 1.0 indicates a test with a high accuracy while an area close to 0.5 signifies a test with a poor accuracy. Statistical comparisons of the AUC of studies with and without verification bias were performed.18
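A minimal sketch of this type of sROC construction is given below. It assumes ordinary unweighted least squares for the regression and a simple midpoint rule for the numerical integration, which stands in for the MathCad integration and is not the method of Walter. The two-by-two tables are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def fit_sroc(tables):
    """Regress D = logit(TPR) - logit(FPR) on S = logit(TPR) + logit(FPR).

    tables: list of (tp, fp, tn, fn). Returns (intercept a, slope b).
    """
    d, s = [], []
    for tp, fp, tn, fn in tables:
        tpr = (tp + 0.5) / (tp + fn + 1)  # 0.5 added to every cell
        fpr = (fp + 0.5) / (fp + tn + 1)
        d.append(logit(tpr) - logit(fpr))
        s.append(logit(tpr) + logit(fpr))
    n = len(d)
    s_mean, d_mean = sum(s) / n, sum(d) / n
    sxy = sum((si - s_mean) * (di - d_mean) for si, di in zip(s, d))
    sxx = sum((si - s_mean) ** 2 for si in s)
    b = sxy / sxx
    return d_mean - b * s_mean, b


def sroc_tpr(fpr, a, b):
    """Solve D = a + b*S for the sensitivity at a given false positive rate."""
    return expit((a + (1 + b) * logit(fpr)) / (1 - b))


def sroc_auc(a, b, steps=10000):
    """Midpoint-rule integral of the sROC curve over FPR in (0, 1)."""
    return sum(sroc_tpr((i + 0.5) / steps, a, b) for i in range(steps)) / steps


# Hypothetical (tp, fp, tn, fn) tables, for illustration only.
tables = [(7, 20, 480, 13), (15, 10, 290, 15), (4, 30, 970, 6), (9, 12, 188, 11)]
a, b = fit_sroc(tables)
auc = sroc_auc(a, b)
print(f"intercept={a:.2f}, slope={b:.2f}, AUC={auc:.2f}")
```

With a slope of zero the curve reduces to logit(TPR) = a + logit(FPR), which is symmetric about the anti-diagonal. With an intercept of zero as well, it collapses onto the chance line and the AUC is 0.5.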
Using the specified search criteria, 833 articles were identified by the MEDLINE search while an additional 134 articles were identified by the Cochrane Library. After reviewing the titles and abstracts, 130 full text articles were retrieved. Ninety-four articles did not meet the inclusion criteria (as shown in Fig. 1) while 36 studies were included in our meta-analysis.
The studies that evaluated the standard guaiac-based FOBT that did not correct for verification bias are listed in Table 1.31–50 In the last study, the data were presented in two separate publications. These 19 studies included 713 subjects with colorectal cancer and 4181 controls. Both control and cancer subjects underwent a confirmatory test, usually a colonoscopy though some of the older studies also utilized barium enema and sigmoidoscopy, particularly if FOBT negative. Most studies included subjects who submitted 3 stool specimens31–34,36,37,39,41,44,46–50 while some studies included subjects who only submitted one specimen35,42,43 or the number was not specified.38,40,45 Almost all of the studies reported that the specimens were collected at home while one study did not specify the location.38 The studies listed in Table 1 recruited subjects by two methods: inclusion of patients undergoing colonoscopy for a variety of indications (e.g., FOBT positivity, GI symptoms, colon cancer screening, polyp surveillance)31–39,41,44–47 or selection of a group of known colorectal cancer patients and controls undergoing colonoscopy.40,42,43,48–50 Both of these methods contain potential verification bias as not all of the FOBT negative patients in the population underwent a confirmatory test.
The studies that attempted to reduce verification bias are listed in Table 2.51–60 These 10 studies included 185 subjects with colorectal cancer and 31,804 controls. Seven studies reduced verification bias by performing colonoscopy on all of the potential subjects51,54–56,58–60 while 3 studies used longitudinal follow-up for at least 2 years.52,53,57 All of the 10 studies specified that the specimens were collected at home.
Tables 3 and 4 include studies that evaluated more sensitive chemical-based FOBT either by adding an enhancer (e.g., Hemoccult® SENSA®), re-hydration of the specimen card, or using toluidine instead of guaiac. The studies with verification bias are listed in Table 3.33,61–63 The studies that attempted to correct for verification bias, either by colonoscopy of all subjects regardless of FOBT status51,65 or by longitudinal follow-up of subjects who did not undergo colonoscopy,53,57,64,66 are listed in Table 4.51,53,57,64–66
Paired forest plots displaying the sensitivities and specificities of studies with and without verification bias are shown in Figs. 2 and 3, respectively. The pooled sensitivities and specificities of the standard guaiac-based FOBT (combining the studies in Tables 1 and 2) were 0.60 (95% CI 0.50–0.70) and 0.91 (95% CI 0.90–0.93), respectively, with a high degree of heterogeneity for both results (p<0.001). The pooled sensitivity of the studies with verification bias (Table 1 and Fig. 2) was greater than that of the studies that corrected for this bias listed in Table 2 and Fig. 3 [0.70 (95% CI 0.60–0.80) vs. 0.36 (95% CI 0.25–0.47), p=0.001]. The pooled specificity of the studies with verification bias was lower than that of the studies correcting for this bias [0.88 (95% CI 0.84–0.91) vs. 0.96 (95% CI 0.94–0.97), p<0.005, Figs. 2 and 3]. Even in the subgroup of studies without verification bias, heterogeneity was still significant for both sensitivity (p=0.025) and specificity (p=0.001). When looking at the subgroup of studies in which 3 stool specimens had been submitted, studies with verification bias (n=13) still had a higher sensitivity [0.74 (95% CI 0.64–0.84) vs. 0.35 (95% CI 0.23–0.47), p=0.0006] and lower specificity [0.87 (95% CI 0.82–0.91) vs. 0.96 (95% CI 0.95–0.97), p=0.001] compared to those without this bias (n=9). When comparing studies using longitudinal follow-up vs. universal colonoscopy, there was no significant difference in either sensitivity [0.41 (95% CI 0.31–0.51) vs. 0.33 (95% CI 0.17–0.50), p=0.55] or specificity [0.97 (95% CI 0.95–0.99) vs. 0.94 (95% CI 0.92–0.97), p=0.17].
We also used stepwise multiple regression to measure the effects of the study design parameters on sensitivity and specificity. The presence of verification bias significantly increased sensitivity (p<0.0001) and decreased specificity (p=0.015). In contrast, sensitivity and specificity were not associated with the number of FOBT specimens, the year of publication of the study, or the primary method of cancer detection (endoscopic screening vs. referral based on symptoms or FOBT positivity) (p>0.25).
We also looked at the effect of study design on overall accuracy using sROC analysis. The sROC curves for studies with and without verification bias are shown in Fig. 4. The sROC curve for studies without verification bias was S-shaped due to the negative slope obtained from the linear regression of the logit transformations. The non-zero slope is probably related to the heterogeneity of the studies. The exact AUC (± standard error) for the sROC function for studies with verification bias was higher than for those without this bias but the comparison did not achieve statistical significance (0.91±0.01 vs. 0.69±0.26, p=0.37).
Table 3 lists the studies investigating more sensitive chemical-based FOBT which contained verification bias while Table 4 lists the studies which attempted to control for verification bias. The presence vs. absence of verification bias did not significantly affect sensitivity [0.79 (95% CI 0.58–0.99) vs. 0.68 (95% CI 0.52–0.84), p=0.56] or specificity [0.80 (95% CI 0.69–0.92) vs. 0.90 (95% CI 0.87–0.93), p=0.19]. Furthermore, there was no significant difference in the AUC of the sROC function between studies with vs. without this bias (0.88±0.04 vs. 0.91±0.04, p=0.60).
In the subgroup of the three studies which used Hemoccult II® SENSA® and minimized verification bias, the pooled sensitivity and specificity were 0.73 (95% CI 0.62–0.84) and 0.91 (95% CI 0.84–0.97). As compared to studies of standard guaiac-based FOBT without verification bias, Hemoccult II® SENSA® had a significantly higher sensitivity (0.36 vs. 0.73, p<0.001) but a lower specificity which did not reach statistical significance (0.96 vs. 0.91, p=0.22).
Our review shows that prior studies of chemical-based FOBT have reported a wide range of sensitivities for detecting colorectal cancer. In addition, the pooled sensitivity of FOBT in the studies that corrected for verification bias was significantly lower than in those with this bias (0.35 vs. 0.70, p=0.001). Studies without verification bias also had a higher specificity (0.95 vs. 0.88, p=0.03). We found no significant effect of verification bias on the AUC of the sROC function due to the large standard error for studies without verification bias. Furthermore, the AUC is a composite statistic reflecting both sensitivity (which is increased by verification bias) and specificity (which may be decreased by verification bias).67
Studies have used two different approaches to correct for verification bias: performing colonoscopy on all patients regardless of FOBT result or following screened patients longitudinally for a defined period of time. Both methods have some limitations. Many of the studies using screening or surveillance colonoscopy will exclude symptomatic patients and will be attempting to detect asymptomatic colorectal cancer. The sensitivity of FOBT for pre-clinical colorectal cancers may be lower than for symptomatic cases. Studies using longitudinal follow-up have other limitations as the optimal duration of follow-up has not been standardized. A two-year interval may fail to include some colorectal cancers with a long clinical latency. However, a longer observation interval may also include neoplasias that were not malignant at the time of the FOBT but then progressed to cancer during the follow-up period. An insufficient observation period may artificially increase the sensitivity of FOBT while a prolonged observation period will decrease the sensitivity. Finally, some patients with colorectal cancer at the time of screening may never present with clinical signs because they succumb to other diseases or have been lost to follow-up.
Another limitation of this meta-analysis relates to the heterogeneity of the FOBT studies. In order to have an adequate sample size, we had to include reports with a variety of study designs and different populations. Multiple regression was also used to correct for differences in study design (e.g., the year of publication, the primary method of cancer detection, number of FOBT specimens submitted) and still demonstrated a significant effect of verification bias on the sensitivity. Another source of heterogeneity may relate to the prevalence of benign gastrointestinal disorders such as internal hemorrhoids or gastritis which can result in recurrent GI bleeding. Thus, FOBT may have a higher sensitivity but lower specificity for colorectal cancer in populations with a higher consumption of alcohol or use of aspirin. Finally, the studies also had a wide variation in the number of cancer and control subjects. Most meta-analysis techniques give more weight to larger studies which generally have a lower variance.
Spectrum bias could also be a factor for the higher sensitivity in studies with verification bias as these studies probably had more advanced cases of colorectal cancer. We were not able to correct for this factor as almost none of the included studies reported the FOBT sensitivity by either tumor stage or method of detection. Nevertheless, the sensitivity of FOBT was relatively high in a study with verification bias which primarily included subjects referred for screening.34 Regardless of the source of bias, we believe that the sensitivity of FOBT is low when used to screen asymptomatic subjects for colorectal cancer.
Almost all of the studies analyzed the diagnostic characteristics of FOBT at a single point in time. As it is recommended that the test be performed annually for colorectal cancer screening, it is possible that serial determinations may improve its sensitivity. However, the improvement in sensitivity from serial testing cannot be readily determined from the studies included in our meta-analysis. Furthermore, it is possible that a colorectal cancer detected only after the 1–2 years it takes for the FOBT to become positive may be less resectable.
Two systematic reviews of FOBT have reported that this test reduced colorectal cancer mortality by only 14–16% as compared to no formal screening.68,69 Even after adjusting for patient compliance, the reduction in mortality was only 23%.69 Two possible explanations for the modest benefits of FOBT screening include a low sensitivity for colorectal cancer or failure of FOBT screening to detect most cancers when they are amenable to curative resection. We feel that the low sensitivity of FOBT is an important factor limiting its effectiveness. As described by Ransohoff and Lang’s aphorism,70 the sensitivity of a screening test determines the benefit of testing while the specificity determines the effort of screening. While this test may have limited sensitivity for screening, the relatively high specificity means that a positive result should prompt further confirmatory testing.
A variety of methods have been developed to improve the detection of stool heme pseudoperoxidase activity. Generally, these tests increase the sensitivity but decrease the specificity for colorectal cancer. After excluding studies with verification bias, the pooled specificity of Hemoccult II® SENSA® was 91%. Assuming appropriate confirmation of a positive test, a health care program would have to anticipate that at least 9% of screened patients would require a colonoscopy during the initial screening. While this type of screening may result in more colonoscopies than guaiac-based FOBT lacking an enhancer, this approach would probably use fewer endoscopic resources than screening colonoscopy, at least in the short run. Fecal immunochemical tests may have a similar sensitivity but higher specificity than Hemoccult II® SENSA®.64 As fecal immunochemical tests are significantly more expensive than guaiac-based FOBT,5 further comparative and cost-effectiveness studies are needed.
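The colonoscopy burden implied by a 91% specificity can be worked through as below. The pooled sensitivity (0.73) and specificity (0.91) for Hemoccult II® SENSA® are taken from the results. The cohort size of 10,000 and the cancer prevalence of 0.5% are assumptions chosen only to make the arithmetic concrete.

```python
def expected_positives(n_screened, prevalence, sensitivity, specificity):
    """Expected true-positive and false-positive FOBT results in a screened cohort."""
    cancers = n_screened * prevalence
    true_pos = cancers * sensitivity
    false_pos = (n_screened - cancers) * (1 - specificity)
    return true_pos, false_pos


# Assumed cohort size and prevalence; sensitivity and specificity from the pooled results.
n_screened = 10_000
true_pos, false_pos = expected_positives(n_screened, 0.005, 0.73, 0.91)

referral_rate = (true_pos + false_pos) / n_screened  # fraction needing colonoscopy
ppv = true_pos / (true_pos + false_pos)              # positive predictive value

print(f"colonoscopies per 10,000 screened: {true_pos + false_pos:.0f}")
print(f"referral rate: {referral_rate:.1%}, PPV for cancer: {ppv:.1%}")
```

Under these assumptions roughly 930 of 10,000 screened subjects would be referred for colonoscopy, and about 4% of those referrals would reveal a cancer. Because cancer is rare in a screening population, the referral rate is driven almost entirely by the false positive rate.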
In conclusion, prior studies have overestimated the sensitivity of chemical-based FOBT for screening as a result of verification bias and possibly spectrum bias. Many professional societies71,72 recommend a sensitive FOBT as a first-line option for colorectal screening while two gastroenterological societies73,74 consider this test as an alternative to the preferred option of screening colonoscopy. Thus, FOBT is an important screening modality for patients who refuse screening colonoscopy or in areas of inadequate endoscopic resources. We do not feel that most guaiac-based FOBT methods perform as “a highly sensitive test for colorectal cancer” as recommended by the American Cancer Society72 and the United States Preventive Services Task Force.71 Further studies are needed to determine whether non-endoscopic tests such as CT colonography, fecal immunochemical tests, or stool DNA tests can serve as adequate screening methods for colorectal cancer.
Funding Source Department of Veterans Affairs.
Conflict of interest None disclosed.
Author participation Both authors had access to the data and a role in writing the manuscript.