Objective To evaluate whether including a test for faecal calprotectin, a sensitive marker of intestinal inflammation, in the investigation of suspected inflammatory bowel disease reduces the number of unnecessary endoscopic procedures.
Design Meta-analysis of diagnostic accuracy studies.
Data sources Studies published in Medline and Embase up to October 2009.
Interventions reviewed Measurement of faecal calprotectin level (index test) compared with endoscopy and histopathology of segmental biopsy samples (reference standard).
Inclusion criteria Studies that had collected data prospectively in patients with suspected inflammatory bowel disease and allowed for construction of a two by two table. For each study, sensitivity and specificity of faecal calprotectin were analysed as bivariate data to account for a possible negative correlation within studies.
Results 13 studies were included: six in adults (n=670), seven in children and teenagers (n=371). Inflammatory bowel disease was confirmed by endoscopy in 32% (n=215) of the adults and 61% (n=226) of the children and teenagers. In the studies of adults, the pooled sensitivity and pooled specificity of calprotectin was 0.93 (95% confidence interval 0.85 to 0.97) and 0.96 (0.79 to 0.99) and in the studies of children and teenagers was 0.92 (0.84 to 0.96) and 0.76 (0.62 to 0.86). The lower specificity in the studies of children and teenagers was significantly different from that in the studies of adults (P=0.048). Screening by measuring faecal calprotectin levels would result in a 67% reduction in the number of adults requiring endoscopy. Three of 33 adults who undergo endoscopy will not have inflammatory bowel disease but may have a different condition for which endoscopy is inevitable. The downside of this screening strategy is delayed diagnosis in 6% of adults because of a false negative test result. In the population of children and teenagers, 65 instead of 100 would undergo endoscopy. Nine of them will not have inflammatory bowel disease, and diagnosis will be delayed in 8% of the affected children.
Conclusion Testing for faecal calprotectin is a useful screening tool for identifying patients who are most likely to need endoscopy for suspected inflammatory bowel disease. The discriminative power to safely exclude inflammatory bowel disease was significantly better in studies of adults than in studies of children.
The incidence of inflammatory bowel disease is on the increase in both adults and children.1 2 The disorder includes two major forms of chronic intestinal inflammation: Crohn’s disease and ulcerative colitis. Suspicion is raised in patients with persistent (≥4 weeks) or recurrent (≥2 episodes in six months) abdominal pain and diarrhoea. Additionally, rectal bleeding, weight loss, or anaemia increase the probability of the condition.3 4 Pathognomonic signs or symptoms do not exist. Endoscopic evaluation with histopathological sampling is generally considered indispensable in the investigation of patients with suspected inflammatory bowel disease.3 4 Many patients consider endoscopy and the required bowel preparation to be uncomfortable.5 In a relatively large proportion of people with suspected inflammatory bowel disease the results of endoscopy will be negative.6 A third of adults with bleeding related symptoms have no abnormalities on endoscopy, and this proportion increases to half with non-bleeding symptoms such as diarrhoea, abdominal pain, and weight loss. Identification of low risk patients would reduce the number of unnecessary invasive endoscopic procedures. Conversely, doctors would like to be able to identify those with a sufficiently high likelihood of inflammatory bowel disease to justify urgency for endoscopy.
Use of a simple, non-invasive, and cheap screening test to make a presumptive diagnosis of inflammatory bowel disease would help to reach these goals. Determination of calprotectin levels in stools could be a good screening method. Calprotectin is a major protein found in the cytosol of inflammatory cells.7 The protein is stable in stool samples for up to seven days at room temperature and one sample of less than 5 g is sufficient for a reliable measurement.8 These qualities allow for stool sample collection at home and potential delays in transport to the laboratory.
Since 2000, faecal calprotectin has been evaluated in numerous diagnostic studies in both adult and paediatric populations. Many of these studies included healthy people on one side of the patient spectrum and patients with known inflammatory bowel disease on the other. Both extremes give cause to overestimation of diagnostic accuracy relative to the practical situation, where screening is necessary because it is difficult to clinically distinguish between those who do and those who do not need urgent endoscopy. The doctor is then left with little guidance about the usefulness of faecal calprotectin as a screening test. We carried out a meta-analysis to evaluate whether adding faecal calprotectin testing to the investigation of patients with suspected inflammatory bowel disease reduced the number of unnecessary endoscopies.
Eligible studies were those that assessed the diagnostic accuracy of faecal calprotectin testing in patients with inflammatory bowel disease suspected on clinical grounds. Data collection had to be done prospectively with stool sampling (index test) before endoscopic evaluation including histopathological verification of segmental biopsies (reference standard).
We searched for diagnostic studies published in Medline and Embase up to October 2009. The search strategy for Medline was (“Leukocyte L1 Antigen Complex”[Mesh] OR “calprotectin”[tw]) AND (“Inflammatory Bowel Diseases”[Mesh] OR “inflammatory bowel disease”[tw] OR “inflammatory bowel diseases”[tw] OR “IBD”[tw] OR “Crohn”[tw] OR “Colitis”[tw]). For Embase we used (“calgranulin”/exp OR “calprotectin”/exp) AND (“enteritis”/exp OR “inflammatory bowel disease”/exp OR “inflammatory bowel diseases”/exp OR “ibd” OR “crohn” OR “colitis”/exp) AND [embase]/lim.
We restricted our search to studies published in English only. Duplicate articles identified in both Medline and Embase were manually deleted using Reference Manager, version 11 (Thomson Reuters, Philadelphia, PA). For further relevant studies we checked the reference lists of identified trials.
The first selection was carried out by one reviewer (PFvR), on the basis of the title and abstract. The full paper of each potentially eligible study was then obtained. Two reviewers (PFvR, EVdV) independently assessed eligible studies for inclusion. Disagreements were resolved by discussion. The following characteristics were extracted from each selected study: age range, prevalence of inflammatory bowel disease in the study population (pretest probability); percentage of patients with Crohn’s disease and percentage with ulcerative colitis in the group of confirmed cases with inflammatory bowel disease; reference standard; faecal calprotectin assay; cut-off value for faecal calprotectin; and data for construction of a two by two table. Authors were contacted in cases where information was missing to construct a two by two table.
Study quality was assessed using the QUADAS (QUality Assessment of studies of Diagnostic Accuracy included in Systematic reviews) checklist.9 Each item is scored as “yes,” “no,” or “unclear.” We did not calculate summary scores because their interpretation is problematic and potentially misleading.10 From the QUADAS checklist we chose seven of the best differentiating items (box).
If patients had suspected inflammatory bowel disease on the basis of their clinical presentation, we scored the studies as “yes.” We scored studies as “no” that excluded patients with “other somatic bowel disorders than inflammatory bowel disease or irritable bowel syndrome.” Studies that recruited a group of healthy controls and a group known to have inflammatory bowel disease were also scored as “no,” because diagnostic test accuracy is likely to be overestimated in such a design. If information was insufficient to make a judgment we scored the study as “unclear.”
No reference standard in the diagnosis of inflammatory bowel disease is 100% sensitive or 100% specific. However, the Porto criteria that have been formulated by the European Society for Paediatric Gastroenterology, Hepatology and Nutrition approach an optimal diagnostic strategy for patients with suspected inflammatory bowel disease.3 The investigation involves endoscopy of both the upper and the lower gastrointestinal tract, with biopsies from each segment of the gastrointestinal tract. To obtain a score of “yes,” the studies had to have a reference standard that consisted of at least ileocolonoscopy including histology. When the ileum was not intubated or no biopsies were taken, we coded the study as “no.” If colonoscopy and histology were done but information on ileal intubation was insufficient we scored the study as “unclear.”
Ideally, faecal sampling is done shortly before endoscopy, before preparation of the bowel. A delay of up to one month was not considered problematic as it is unlikely that mucosal inflammation spontaneously disappears within this period. We therefore scored studies with a delay of less than one month as “yes” and those with a delay of more than one month as “no.” If insufficient information was provided we scored the study as “unclear.”
Partial verification bias occurs when not all of the study group receives confirmation of the diagnosis by endoscopy. When it was clear from the study that all patients who collected faeces for measurement of calprotectin level had their disease status verified by endoscopy, we scored this item as “yes.” Studies scored “no” if some of the patients did not undergo the reference standard and the selection of patients to receive the reference standard was not random.
Differential verification bias occurs when the performance of the faecal calprotectin test is verified by a different reference standard. If patients had inflammatory bowel disease verified by the same type of endoscopy we scored this item as “yes.” If some patients received verification by sigmoidoscopy instead of another procedure, such as colonoscopy, we scored this item as “no,” as there is a risk of missing right sided colitis.
Faecal sampling for measurement of calprotectin level was carried out before endoscopy, and analysis was mostly done by laboratory technicians who had no information on the endoscopy results. However, this design did not preclude faecal calprotectin results sometimes being known to the endoscopist before endoscopic evaluation. This could influence the interpretation of macroscopic abnormalities seen during endoscopy. In that case we scored this item as “no.” If insufficient information was provided we scored the study as “unclear.”
When it was clear what happened to all patients who entered the study, we scored this item as “yes.” When withdrawals were not explained, we scored this item as “no.”
We calculated sensitivity and specificity for each study and analysed these as bivariate data by methods for diagnostic meta-analysis.11 This approach accounts for possible within study negative correlation between sensitivity and specificity. We present the data as forest plots and receiver operating characteristic curves. Forest plots display the diagnostic probabilities of individual studies, the corresponding 95% confidence intervals, and squares with area proportional to study weight in the meta-analysis. The receiver operating characteristic curves show individual study data points as circles, with size proportional to study weight, the 95% confidence and 95% prediction regions around the pooled estimate, and the hierarchical summary curve resulting from the hierarchical summary receiver operating characteristic model. We carried out predefined subgroup analyses for adults and for children. The z test (two sided at 5% level of significance) was used to separately compare the pooled estimates of sensitivity and specificity of the two groups. Finally, we calculated the average likelihood ratio of the positive and negative test result for both subgroups. Computations were carried out with the DiagMeta library for R (www.r-project.org/),12 and with Stata (version 11), in particular the metandi commands.
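As an illustration of the per study calculation described above, the following minimal Python sketch derives sensitivity and specificity from a two by two table. The class name, field names, and counts are hypothetical and are not taken from any included study. The logit helper is shown only because bivariate models are commonly fitted on the logit scale; that is an assumption here, not a description of the DiagMeta or metandi internals.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical field names)."""
    tp: int  # abnormal calprotectin, disease confirmed by endoscopy
    fp: int  # abnormal calprotectin, no disease on endoscopy
    fn: int  # normal calprotectin, disease confirmed by endoscopy
    tn: int  # normal calprotectin, no disease on endoscopy

    def sensitivity(self) -> float:
        # Proportion of confirmed cases with an abnormal index test.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-cases with a normal index test.
        return self.tn / (self.tn + self.fp)


def logit(p: float) -> float:
    """Log odds of a proportion; assumed scale for bivariate pooling."""
    return math.log(p / (1.0 - p))


# Illustrative counts only, not data from the meta-analysis.
study = TwoByTwo(tp=45, fp=8, fn=5, tn=42)
print(f"sensitivity={study.sensitivity():.2f}, specificity={study.specificity():.2f}")
print(f"logit pair=({logit(study.sensitivity()):.2f}, {logit(study.specificity()):.2f})")
```

Keeping the two values together as a pair for each study, rather than pooling each separately, is what allows a bivariate model to account for their correlation.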
The study includes results of electronic searches up to 14 October 2009. A total of 179 papers were identified, of which 99 were retrieved for full text review. Of these, 66 were excluded as they were unrelated to diagnostic accuracy studies or did not use endoscopy as the reference standard. Of 33 diagnostic accuracy studies that compared faecal calprotectin testing with endoscopy as the reference test, 13 focused on the desired patient spectrum and were included in the final analysis (fig 1). Table 1 lists the characteristics of the 33 studies in which endoscopy was used as the reference standard and explains why 20 were unsuitable for inclusion.
The final analysis included six studies in adults and seven in children and teenagers (age range 10 months to 19.9 years). The faecal calprotectin test was used in a total of 670 patients in the adult studies and 371 in the remainder. Inflammatory bowel disease was confirmed in 32% (n=215) of the adults and in 61% (n=226) of the children and teenagers (table 2). The methodological quality of the studies in children and teenagers was better than that of the studies in adults (fig 2). All studies used a prospective study design and enrolled consecutive outpatients with suspected inflammatory bowel disease. Selection bias in three adult studies was caused by the post hoc exclusion of patients.19 22 23 In four adult studies endoscopy was suboptimal as the ileum was not intubated or histology was not done.18 19 23 25 In three studies in children and teenagers withdrawals were not explained.17 20 24 Three studies excluded patients with gross rectal bleeding, as this symptom would usually prompt endoscopic evaluation without preliminary stool testing.15 18 23 Partial and differential verification was appropriately reported and bias was prevented in all but two adult studies.19 25 Blinding of index test results was reported in all but three studies in children and teenagers.17 20 24
Per age group analyses: Figure 3 presents the forest plots of sensitivity (true positive rate) and 1−specificity (false positive rate) for the 13 studies. Figure 4 presents the diagnostic values of the studies in a hierarchical summary receiver operating characteristic graph for adults and for children and teenagers. For adults the sensitivity was 0.93 (0.85 to 0.97) and specificity 0.96 (0.79 to 0.99), and the corresponding values for children and teenagers were 0.92 (0.84 to 0.96) and 0.76 (0.62 to 0.86). The difference between specificities of the two groups was significant (P=0.048).
Post-test probability of inflammatory bowel disease: On the basis of the pooled estimates of sensitivity and specificity, the average likelihood ratio of the positive and negative test result was calculated for adults and for children and teenagers. The use of faecal calprotectin testing changed the post-test probability of inflammatory bowel disease in both subgroups (fig 5). In adults with suspected inflammatory bowel disease and a pretest probability of 32% an abnormal test result for calprotectin concentration increases the probability of inflammatory bowel disease to 91% (95% confidence interval 77% to 97%), whereas a normal test result for calprotectin concentration reduces the probability to 3% (1% to 11%). In children and teenagers with suspected inflammatory bowel disease the pretest probability is 61%. An abnormal test result for calprotectin increases the probability to 86% (78% to 92%), whereas a normal test result for calprotectin reduces the probability to 15% (7% to 28%).
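The step from pooled accuracy to post-test probability can be sketched in a few lines of Python. This is an illustration using the standard odds form of Bayes' theorem, not the authors' own computation, and the function names are hypothetical. Because the inputs are the rounded pooled estimates reported here, the results may differ from the published percentages by about one percentage point.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive, negative) likelihood ratios."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert pretest probability to post-test probability via odds."""
    pretest_odds = pretest / (1.0 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Rounded pooled estimates and pretest probabilities as reported in the text.
adult_lr_pos, adult_lr_neg = likelihood_ratios(0.93, 0.96)
adult_after_abnormal = post_test_probability(0.32, adult_lr_pos)
adult_after_normal = post_test_probability(0.32, adult_lr_neg)

child_lr_pos, child_lr_neg = likelihood_ratios(0.92, 0.76)
child_after_abnormal = post_test_probability(0.61, child_lr_pos)
child_after_normal = post_test_probability(0.61, child_lr_neg)

print(f"adults: abnormal {adult_after_abnormal:.1%}, normal {adult_after_normal:.1%}")
print(f"children: abnormal {child_after_abnormal:.1%}, normal {child_after_normal:.1%}")
```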
In our meta-analysis we included six studies in adults and seven in children and teenagers, which were selected for their methodological robustness. In these studies data collection was done prospectively in a consecutive series of patients with suspected inflammatory bowel disease. All included studies used the fully paired design where patients first undergo faecal calprotectin testing and then endoscopy. In the adult studies the pooled sensitivity of faecal calprotectin testing was 0.93 (95% confidence interval 0.85 to 0.97) and the pooled specificity was 0.96 (0.79 to 0.99). The corresponding values in the studies in children and teenagers were 0.92 (0.84 to 0.96) and 0.76 (0.62 to 0.86).
The lower specificity in the studies of children and teenagers was significantly different from that in the adult studies. Five adult studies that included a relatively large proportion of patients with irritable bowel syndrome had significantly higher specificity.19 21 22 23 25 This gastrointestinal syndrome, characterised by chronic abdominal pain and altered bowel habits in the absence of any organic cause, was hardly diagnosed (7%, 27/371) in the study population of children and teenagers. According to British Society of Gastroenterology guidelines, patients with irritable bowel syndrome without alarm features (including anaemia, weight loss, and age >50 years) do not need endoscopic evaluation, because of a low likelihood of identifying organic disease.46 The five adult studies that included a large proportion of patients with irritable bowel syndrome did not report the presence of alarm features. Absence of alarm symptoms is likely to overestimate the specificity of faecal calprotectin. In theory, the inclusion of infants and young children (under 5 years) in four of the studies could be a reason for lower specificity.14 17 20 24 At this age stool samples are usually collected from a nappy. This sampling technique could increase the level of faecal calprotectin because water is absorbed by the nappy.47 As most young patients with newly diagnosed inflammatory bowel disease are teenagers, we do not think that this mechanism played an important role. The choice of the faecal calprotectin cut point could be another reason for the higher specificity in adults. However, we found no difference between the groups in a subgroup analysis comparing studies using a cut point of ≤50 μg/g and of >50 μg/g. Prevalence of inflammatory bowel disease had no effect on specificity, nor did the exclusion of patients with rectal bleeding. Quality items also had no significant influence on diagnostic characteristics.
We aimed to determine whether faecal calprotectin can serve as a screening test to reduce the number of people undergoing invasive endoscopy. To move from the evidence gathered in this meta-analysis to a recommendation for a screening strategy we used the comprehensive and transparent GRADE approach.48 Recognising that the diagnostic accuracy of faecal calprotectin is a surrogate for outcomes important to patients is central to this approach. Screening patients by measuring faecal calprotectin levels is of value only if it results in improved outcomes for patients. For this reason we infer the effect of faecal calprotectin screening on patient outcome from the pooled sensitivity and specificity. Key questions are whether the numbers of false negatives (missed cases) and false positives (cases without inflammatory bowel disease who go on to have endoscopy) are acceptable when faecal calprotectin is introduced as a screening test. In the “new” diagnostic pathway only patients with suspected inflammatory bowel disease and an abnormal faecal calprotectin result will be sent urgently for endoscopy (fig 6). Table 3 shows the implications of the testing scenarios. In a hypothetical population of 100 adults with suspected inflammatory bowel disease (and an overall mean prevalence of 32%) three patients without the disease would go on to have endoscopy and two patients with the disease would be missed. Faecal calprotectin screening would reduce the number of adults requiring endoscopy by 67%. In a hypothetical population of 100 children and teenagers with suspected inflammatory bowel disease (and an overall mean prevalence of 61%) nine without the disease would go on to have endoscopy, five with the disease would be missed, and faecal calprotectin screening would reduce the number requiring endoscopy by 35%.
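The arithmetic behind the hypothetical populations of 100 patients can be sketched as follows. This is an illustrative Python reconstruction from the rounded pooled estimates, with a hypothetical function name; it assumes each cell of the two by two table is rounded to whole patients before the cells are added up.

```python
def screen_cohort(n: int, prevalence: float, sensitivity: float, specificity: float) -> dict:
    """Expected outcomes when only test-positive patients go on to endoscopy."""
    diseased = n * prevalence
    not_diseased = n - diseased
    # Round each cell to whole patients (an assumption of this sketch).
    true_pos = round(diseased * sensitivity)
    missed = round(diseased * (1.0 - sensitivity))
    false_pos = round(not_diseased * (1.0 - specificity))
    endoscopies = true_pos + false_pos
    return {
        "endoscopies": endoscopies,
        "false_positives": false_pos,
        "missed": missed,
        "reduction_pct": round(100 * (n - endoscopies) / n),
    }


adults = screen_cohort(100, prevalence=0.32, sensitivity=0.93, specificity=0.96)
children = screen_cohort(100, prevalence=0.61, sensitivity=0.92, specificity=0.76)
print("adults:", adults)
print("children:", children)
```

The higher pretest probability and lower specificity in children and teenagers both push more patients into the test-positive group, which is why the reduction in endoscopies is smaller in that population.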
The clinical consequences of missing patients with inflammatory bowel disease should be balanced against patients without the disease who go on to have endoscopy. A false negative faecal calprotectin test result would lead to a failure to introduce effective treatment in a timely manner, with the resultant continuation of symptoms. A false positive test result means that people endure an invasive procedure. A considerable proportion of the patients with a false positive test result will, however, prove to have a gastrointestinal condition different from inflammatory bowel disease (table 4) for which endoscopy is inevitable. Complications of endoscopy, related to the invasiveness of the procedure itself (colonic perforation or tear) or to anaesthesia, are also important considerations, although they are rare. Several retrospective studies have reported the incidence of a small perforation after colonoscopy to be in the range 0.032% (1 in 3115 patients) to 0.9% (1 in 111).49 50 51
We consider faecal calprotectin a useful screening tool for identifying those patients who are most likely to need endoscopy for inflammatory bowel disease. Adding calprotectin testing to the diagnostic pathway, however, also resulted in delayed diagnosis in 6% (2 in 32 patients) of the adults and 8% (5 in 61) of the children and teenagers. Health professionals may be interested in finding ways to ease the pressure on overstretched endoscopy centres with long waiting lists. Increased faecal calprotectin levels may indicate a need for urgent endoscopy, whereas normal calprotectin levels are less likely to be associated with intestinal inflammation and further investigations can be tailored appropriately. The only exception to this rule is the presence of persistent rectal bleeding, which would justify urgency for endoscopy comparable to that for increased levels of faecal calprotectin.
A total of 22 narrative reviews have been published in recent years on the use of testing for faecal calprotectin levels in the diagnosis of intestinal inflammation or flare-up of inflammatory bowel disease (fig 1), but all were based on non-systematic methods.52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 One systematic approach summarised the findings of all available studies to 2006.74 The reviewers found higher sensitivity and specificity for the diagnosis of inflammatory bowel disease than we did. The meta-analysis of that review, however, had several methodological limitations. Sensitivity and specificity were pooled separately contrary to general recommendations and the reviewers included studies that featured a control group with healthy people, which leads to overestimation of diagnostic accuracy.75
We reviewed the diagnostic accuracy of faecal calprotectin levels according to the most recent insights and methods for diagnostic meta-analyses. The results can, however, be biased by the use of the reference standard. Although we included studies that used endoscopy with histopathological verification of segmental biopsies, we included two studies in adults that did not sample intestinal mucosa.18 25 It is possible that some patients were misclassified because of a normal macroscopic appearance of the mucosa, whereas microscopic evaluation would have shown abnormalities typical of the disease. However, even ileocolonoscopy combined with histology is not an ideal method. The gastrointestinal tract can only be partly visualised with conventional endoscopy. The reported pooled sensitivity of faecal calprotectin could thus be slightly overestimated.
We tried to reduce spectrum bias by including only studies with a patient population representative of patients seen in usual clinical care. None of the studies used a well defined set of clinical findings (clinical prediction rules) or flow chart that identifies patients with a high probability of inflammatory bowel disease.
Because of the limited number of studies included in this meta-analysis we were not able to assess the diagnostic accuracy of faecal calprotectin at different cut-off values. Most of the included studies used the cut-off as advised by the manufacturer (50 μg/g).16 17 19 20 21 22 24 Others based the cut-off on their own receiver operating characteristic curves,23 25 or on the 95th centile of the normal range in children and teenagers.14 15
We could not control for time between calprotectin testing and the reference standard. Ideally faecal sampling was done shortly before endoscopy, but a delay of up to one month was not considered problematic. One study in children and teenagers did not meet this requirement, with over 50% of the faeces samples being collected up to three months after endoscopy.17
We suspected a possible overlap of two patient cohorts described by one research group,21 22 and therefore contacted the authors by email. They replied that there was no overlap of the two patient cohorts as these were different study protocols. Each of them had been approved by the local institutional review board.
Finally, we restricted our search to studies published in English only. This could have been a potential source of bias.
The value of faecal calprotectin for screening of patients with suspected inflammatory bowel disease was evaluated in tertiary care facilities, with the exception of one secondary level hospital.19 The Fagan plot (fig 5) presents the predictive values corresponding to the prevalences in this tertiary level context. The plot readily facilitates reading off predictive values corresponding to a lower prevalence in primary care. For example, on decreasing the prevalence (pretest probability) in adults from 32% to 5%, the positive predictive value of the faecal calprotectin test decreases to about 55% whereas the negative predictive value increases above 99.8%. (This assumes that likelihood ratios remain constant across the spectrum of care.) The emphasis in tertiary care is usually on “ruling in”: increasing the probability of inflammatory bowel disease to carry out more expensive, time consuming, and invasive procedures; establish a firm diagnosis; and start appropriate treatment. At tertiary care level a diagnostic test with a high positive likelihood ratio is preferred. In primary care, where the prevalence of inflammatory bowel disease is low, the emphasis is on “ruling out”: lowering the probability of the target disease to provide reassurance, or to adopt a “watchful waiting” strategy. In these instances tests with a low negative likelihood ratio are preferred. In view of the above we are reserved about the utility of faecal calprotectin in primary care practice, and we certainly discourage its use to screen asymptomatic patients.
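How predictive values shift with prevalence can also be computed directly rather than read off a plot. The sketch below is illustrative only: it uses the rounded pooled adult estimates (sensitivity 0.93, specificity 0.96), assumes as above that accuracy is constant across care settings, and uses a hypothetical function name. Values computed from rounded inputs may differ slightly from those read off the Fagan plot.

```python
def predictive_values(prevalence: float, sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive, negative) predictive values at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalences from the tertiary care value (32%) down to a primary care value (5%).
for prevalence in (0.32, 0.20, 0.10, 0.05):
    ppv, npv = predictive_values(prevalence, sensitivity=0.93, specificity=0.96)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")

ppv_primary, npv_primary = predictive_values(0.05, sensitivity=0.93, specificity=0.96)
```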
Measuring faecal calprotectin levels is a useful screening tool for identifying patients who are most likely to need endoscopy for suspected inflammatory bowel disease. The discriminative power to safely exclude the disease (specificity) is significantly better in studies of adults than in studies of children and teenagers. At a tertiary care level faecal calprotectin levels can contribute important information and guide patient management. The pooled sensitivity and specificity, however, should be interpreted with caution. Despite a strict selection of studies based on proper patient recruitment and study design, heterogeneity was considerable.
We thank S van der Werf (medical librarian, University Medical Center, Groningen) for help with the design of the optimal search strategy for Medline and Embase.
Contributors: PFvR and EVdV conceived and designed the study; acquired, analysed, and interpreted the data; and drafted the manuscript. VF analysed and interpreted the data, provided statistical expertise, and critically revised the manuscript. All authors had full access to all of the data (including statistical reports and tables) in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.
Funding: This review received no funding.
Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare that: (1) they did not receive financial support for the submitted work; (2) they have no relationships with companies that might have an interest in the submitted work in the previous 3 years; (3) their spouses, partners, or children have no financial relationships that may be relevant to the submitted work; and (4) they have no non-financial interests that may be relevant to the submitted work.
Ethical approval: Not required.
Data sharing: No additional data available.
Cite this as: BMJ 2010;341:c3369