This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recent studies strongly suggest that due to the limitations and risks of biopsy, as well as the improvement of the diagnostic accuracy of biochemical markers, liver biopsy should no longer be considered mandatory in patients with chronic hepatitis C. In 2001, FibroTest ActiTest (FT-AT), a panel of biochemical markers, was found to have high diagnostic value for fibrosis (FT range 0.00–1.00) and necroinflammatory histological activity (AT range 0.00–1.00). The aim was to summarize the diagnostic value of these tests from the scientific literature; to respond to frequently asked questions by performing original new analyses (including the range of diagnostic values, a comparison with other markers, the impact of genotype and viral load, and the diagnostic value in intermediate levels of injury); and to develop a system of conversion between the biochemical and biopsy estimates of liver injury.
A total of 16 publications were identified. An integrated database was constructed from the individual data of 1,570 subjects, to which the recommended analytical procedures had been applied. The control group consisted of 300 prospectively studied blood donors. For the diagnosis of significant fibrosis by the METAVIR scoring system, the areas under the receiver operating characteristics curves (AUROC) ranged from 0.73 to 0.87. For the diagnosis of significant histological activity, the AUROCs ranged from 0.75 to 0.86. At a cut off of 0.31, the FT negative predictive value for excluding significant fibrosis (prevalence 0.31) was 91%. At a cut off of 0.36, the ActiTest negative predictive value for excluding significant necrosis (prevalence 0.41) was 85%. In three studies there was a direct comparison in the same patients of FT versus other biochemical markers, including hyaluronic acid, the Forns index, and the APRI index. All the comparisons favored FT (P < 0.05). There were no differences between the AUROCs of FT-AT according to genotype or viral load. The AUROCs of FT-AT for consecutive stages of fibrosis and grades of necrosis were the same for both moderate and extreme stages and grades. A conversion table was constructed between the continuous FT-AT values (0.00 to 1.00) and the expected semi-quantitative fibrosis stages (F0 to F4) and necrosis grades (A0 to A3).
Based on these results, the use of the biochemical markers of liver fibrosis (FibroTest) and necrosis (ActiTest) can be recommended as an alternative to liver biopsy for the assessment of liver injury in patients with chronic hepatitis C. In clinical practice, liver biopsy should be recommended only as a second line test, i.e., in case of high risk of error of biochemical tests.
One of the major clinical problems is how to best evaluate and manage the increasing numbers of patients infected with the hepatitis C virus (HCV). Liver biopsy is still recommended in most patients [2,3]. However, numerous studies strongly suggest that due to the limitations [4-6] and risks of biopsy, as well as the improvement of the diagnostic accuracy of biochemical markers [8,9], liver biopsy should no longer be considered mandatory.
Among the non-invasive alternatives to liver biopsy, several studies have demonstrated the predictive value of two combinations of simple serum biochemical markers in patients infected with HCV: FibroTest (FT; Biopredictive, Paris, France; HCV-Fibrosure, Labcorp, Burlington, USA) for the assessment of fibrosis; and ActiTest (AT; Biopredictive, Paris, France) for the assessment of necroinflammatory activity (necrosis) [8,9,11-21]. Similar results have not been obtained with other diagnostic tests [10-17]. Since September 2002 these tests (FT-AT) have been used in several countries as an alternative to liver biopsy. In a recent systematic review, it was concluded that these panels of tests might have the greatest value in predicting fibrosis or cirrhosis. It was also stated that biochemical and serologic tests were best at predicting no or minimal fibrosis and at predicting advanced fibrosis/cirrhosis, and were poor at predicting intermediate levels of fibrosis.
The aim of this study was to summarize the diagnostic value of these tests by an overview of the scientific literature and to respond to the following frequently asked questions by performing original new analyses: 1) what is the range of the FT-AT diagnostic values across the different studies? 2) What are the base evidence comparisons between FT-AT and other published biochemical markers? 3) Are there differences in diagnostic values according to HCV genotype or viral load? 4) Are there differences between the FT-AT diagnostic values according to stages and grades? – In other words, is FT better at predicting no or minimal fibrosis (F0 vs F1) or advanced fibrosis/cirrhosis (F3 vs F4) than at predicting intermediate levels of fibrosis (F1 vs F2)? And 5) what is the conversion between FT-AT results and the corresponding fibrosis stages and necrosis grades?
For 12 groups of patients detailed in 6 publications [8,11,12,14,19,26], it was possible to assess the prevalence of significant fibrosis and the FT area under receiver operating characteristics curve (AUROC) values, as well as the sensitivity and specificity for the 4 different FT cut offs (Table 1). For the diagnosis of significant fibrosis by the METAVIR scoring system, the AUROC ranged from 0.73 to 0.87, significantly different from random diagnosis in each study (Table 1), in meta-analysis (mean difference in AUROC = 0.39, random effect model Chi-square = 529, P < 0.001) (Figure 1, upper panel), or after pooling data in the integrated database (Table 2). For the cut off of 0.31, the FibroTest negative predictive value for excluding significant fibrosis (prevalence 0.31) was 91% (Table 2).
For four groups of patients detailed in two publications [8,11], it was possible to assess the prevalence of significant necrosis and the AT AUROC values, as well as the sensitivity and specificity for 4 different AT cut offs (Table 3). For the diagnosis of significant necrosis by the METAVIR scoring system, the AUROC ranged from 0.75 to 0.86, significantly different from random diagnosis in each study (Table 3), in meta-analysis (mean difference in AUROC = 0.29, random effect model Chi-square = 556, P < 0.001), or after pooling data in the integrated database (Table 4). For the cut off of 0.36, the ActiTest negative predictive value for excluding significant necrosis (prevalence 0.41) was 85% (Table 2).
In four studies there was a direct comparison in the same patients of FT versus other biochemical markers, including hyaluronic acid, the Forns index, the APRI index and the GlycoCirrhoTest. All the comparisons were in favor of FT (Table 1; Figure 1, lower panel), except for the GlycoCirrhoTest, which had a similar AUROC (0.87 vs 0.89 for FT).
A total of 1,570 subjects were included in the integrated database. Of these, 1,270 were patients with chronic hepatitis C who tested PCR positive before treatment and who had had a liver biopsy and METAVIR staging and grading performed. Of these patients, 453 were from our center [11,14], including 130 patients coinfected with HCV and HIV. Eight hundred and seventeen (817) patients were from a multicentre study, with a total of 398 patients assessed at inclusion and 419 at the end of follow-up six months after treatment, 352 of whom were investigated twice. Three hundred (300) healthy blood donors were also included.
There was no difference between the AUROC of FT-AT for the diagnosis of significant fibrosis (F2F3F4) (Figure 2A) and significant necrosis (A2A3) (Figure 2B) between 4 classes of genotype (1, 2, 3 and the rarer genotypes 4, 5, 6 grouped together). There was also no difference between the AUROC of FT-AT of patients with high or low viral loads for the diagnosis of significant fibrosis (Figure 2C) or significant necrosis (Figure 2D).
Among the 13 published studies of FT (detailed in Table 1), 9 studies estimated FT and 4 studies compared FT to other non-invasive tests. Among the 9 studies estimating FT, 5 were performed by the same single center (non-independent center), two were performed in totally independent centers, and two were performed in multiple centers, including the non-independent center. The AUROCs for the diagnosis of F2F3F4 versus random AUROCs at 0.50 were all significant and similar between these 3 groups in a meta-analysis: mean difference in AUROC = 0.29 (random effect model Chi-square = 549, P < 0.001), including 0.24 for independent, 0.25 for mixed and 0.36 for dependent studies. In the Callewaert et al. study the AUROC of FT for the diagnosis of F4 was 0.89.
The AUROCs between different stage combinations are given in Table 5. Between two contiguous stages (one stage difference), the AUROCs were not significantly different and ranged from 0.63 to 0.71. Between patients with a two-stage difference, the AUROCs were not significantly different and ranged from 0.75 to 0.86. Between patients with a three-stage difference, the AUROCs were not significantly different and ranged from 0.87 to 0.95. Between patients with a four- or five-stage difference (blood donors versus F3 or F4, and F0 versus F4), the AUROCs were not significantly different and ranged from 0.95 to 0.99.
The AUROCs between different grade combinations are given in Table 6. Between two contiguous grades (one grade difference), the AUROCs were not significantly different and ranged from 0.60 to 0.70. Between patients with a two-grade difference, the AUROCs were not significantly different and ranged from 0.75 to 0.86. Between patients with a three-grade difference, the AUROCs were not significantly different and ranged from 0.87 to 0.95. Between patients with a four-grade difference (blood donors versus F3 and F0 versus F4), the AUROCs were not significantly different and ranged from 0.95 to 0.99.
FT-AT is a continuous linear biochemical assessment of fibrosis stage and necroinflammatory activity grade. It provides a numerical quantitative estimate of liver fibrosis ranging from 0.00 to 1.00, corresponding to the well-established METAVIR scoring system of stages F0 to F4 and of grades A0 to A3. Among the 300 controls, the median FT value (± SE) was 0.08 ± 0.004 (95th percentile, 0.23) and the median AT value was 0.07 ± 0.004 (95th percentile, 0.26). Among the 1,270 HCV-infected patients, the FT conversion was 0.000 – 0.2100 for F0; 0.2101 – 0.2700 for F0–F1; 0.2701 – 0.3100 for F1; 0.3101 – 0.4800 for F1–F2; 0.4801 – 0.5800 for F2; 0.5801 – 0.7200 for F3; 0.7201 – 0.7400 for F3–F4; and 0.7401 – 1.00 for F4 (Figure 3A). The AT conversion was 0.00 – 0.1700 for A0; 0.1701 – 0.2900 for A0–A1; 0.2901 – 0.3600 for A1; 0.3601 – 0.5200 for A1–A2; 0.5201 – 0.6000 for A2; 0.6001 – 0.6200 for A2–A3; and 0.6201 – 1.00 for A3 (Figure 3B). The conversions are summarized in Figure 4.
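The conversion bands listed above can be expressed as a simple lookup. The following is a minimal sketch, not the proprietary FT-AT algorithm: it only maps an already computed score to the band reported in the text, and the function names `ft_to_stage` and `at_to_grade` are illustrative assumptions.

```python
from bisect import bisect_left

# Inclusive upper bound of each band, copied from the conversion in the text.
FT_UPPER = [0.21, 0.27, 0.31, 0.48, 0.58, 0.72, 0.74, 1.00]
FT_STAGE = ["F0", "F0-F1", "F1", "F1-F2", "F2", "F3", "F3-F4", "F4"]

AT_UPPER = [0.17, 0.29, 0.36, 0.52, 0.60, 0.62, 1.00]
AT_GRADE = ["A0", "A0-A1", "A1", "A1-A2", "A2", "A2-A3", "A3"]


def _convert(score, upper_bounds, labels):
    """Return the label of the first band whose upper bound is >= score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0.00 and 1.00")
    return labels[bisect_left(upper_bounds, score)]


def ft_to_stage(ft_score):
    """Map a FibroTest score to the expected METAVIR fibrosis stage band."""
    return _convert(ft_score, FT_UPPER, FT_STAGE)


def at_to_grade(at_score):
    """Map an ActiTest score to the expected METAVIR activity grade band."""
    return _convert(at_score, AT_UPPER, AT_GRADE)


if __name__ == "__main__":
    # Median values reported for the 300 blood donors.
    print(ft_to_stage(0.08), at_to_grade(0.07))
```

Because each upper bound is inclusive, a score exactly on a boundary (for example FT = 0.31) falls in the lower band, matching the ranges as written.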
Based on the limitations of liver biopsy and the present overview of the diagnostic value of FT-AT, it seems that these non-invasive markers should be used as a first line assessment of liver injury in patients with chronic hepatitis C.
Liver biopsy has three major limitations, which are the risk of adverse events [2,3,7], sampling error [4-6], and inter- and intra-pathologist variability. An overview of published studies summarizes the risks of liver biopsy as pain (around 30%), severe adverse events (3/1,000) and death (3/10,000) [2,3,7]. Sampling variation is the major cause of variability [4-6]. In a study of patients with chronic hepatitis C that included only good quality biopsies, 30 of 124 patients (24.2%) had a difference of at least one grade, and 41 of 124 patients (33.1%) had a difference of at least one stage between the right and left lobes. In 18 patients (14.5%), an interpretation of cirrhosis was made in one lobe, whereas stage 3 fibrosis was made in the other. Recently, Bedossa et al. observed very high coefficients of variation (55%) and high discordance rates (35%) for fibrosis staging in biopsies measuring 15 mm in length. The variability significantly improved in biopsies measuring 25 mm in length but was still very high, with a 45% coefficient of variation and 25% discordance rate; the minimal variability was reached for biopsies that were 40 mm in length.
Liver biopsy also has potential advantages. Biopsy could be of diagnostic value for other unrecognized liver disease. These events are probably rare in practice, as we observed no such case in a prospective study of 537 consecutive patients with chronic hepatitis C. For FT-AT it must be realized that the same predictive values were observed for patients coinfected with HIV, and in patients with other causes of liver fibrosis such as chronic hepatitis B, alcoholic liver disease or non-alcoholic steato-hepatitis.
It is possible that biochemical markers such as those described here may provide a more accurate (quantitative and reproducible) picture of fibrogenic and necrotic events occurring within the liver than hepatic biopsy. The greater accuracy of FT-AT when assessed against biopsy specimens greater than 15 mm, versus smaller biopsies, suggests that some of the discordance between FT-AT and histology was due to biopsy specimen sampling error. Several case reports have observed false negatives of liver biopsy versus biochemical markers [8,9,11]. The error was attributable to biopsy because there were overt clinical signs of cirrhosis such as esophageal varices, low platelet counts or a dysmorphic liver on ultrasound. In a recent prospective study we estimated that 18% of discordances between FT-AT and histology were attributable to biopsy failure (mostly due to small length) and 2% to FT-AT failure.
The present work allowed frequently asked questions to be answered, the first being whether the diagnostic values of FT-AT had been confirmed in all studies performed to date. A major strength of the studies pertaining to FT-AT is that they were carried out on a large number of patients with chronic hepatitis C, and the results were reproducible in different populations, including patients coinfected with HIV. There was a small variability in the AUROCs, both for the diagnosis of significant fibrosis (0.73 to 0.87) and significant necrosis (0.75 to 0.86).
A weakness of this study was that most of the published studies were performed by the same group that developed these tests. However, the independent published studies found the same significant diagnostic values as the non-independent or multicentre studies. Several recent independent studies confirmed the predictive value of FT-AT [26,30].
The second question concerned the comparison of FT-AT to other tests. In their recent review, Gebo et al. concluded that panels of markers might have the greatest value in predicting the absence or no more than minimal fibrosis on biopsy, and in predicting the presence of cirrhosis on biopsy (Evidence Grade B). They pointed out that five studies [11,32-35] used large panels of markers and achieved the greatest predictive values. Among these 5 studies were the first FT-AT study and another study developed by the same group (combining age and platelets). A recent study compared FT-AT to the age and platelets index in the same patients and found that FT-AT was significantly better. Three studies directly compared FT-AT to hyaluronic acid, the Forns index and the Wai index in the same patients. FT-AT had higher diagnostic values (the AUROC was significantly higher). In particular, FT was more sensitive for discriminating between F1 and F2, and more linearly correlated with stages than those 3 other markers [12,16,17]. An additional weakness of the Forns index is the inclusion of cholesterol, which varies greatly in patients with genotype 3. The limitations of these three comparisons [12,16,17] are that they were retrospective and were performed by the same group. These comparisons, however, had no evident sources of bias. The comparison with the Forns Index included all patients of the Imbert-Bismut et al. study (n = 323), as the parameters belong to the routine biochemical tests. The comparison with the APRI index included 249/323 patients (77%) without any difference between included or non-included patients when all characteristics were compared. The comparison with hyaluronic acid included a total of 165 out of the 244 (68%) randomized patients pre-included. The 165 included patients did not differ from the 79 non-included patients according to the main characteristics. Among the 165 patients, the fibrosis index was assessed in 461 samples and hyaluronic acid in 457 samples.
Recently, a study using profiles of serum protein N-glycans found that one profile had an AUROC similar to that of FT for the diagnosis of compensated cirrhosis. When combined with FT, this marker had 100% specificity and 75% sensitivity for the diagnosis of compensated cirrhosis, which is not significantly different from the 92% specificity and 67% sensitivity of FT alone. This study was independent and prospectively designed to take FT as the comparison test. Only 24 patients with cirrhosis were included and no details were given concerning the causes of discordance between biopsy and biochemical markers.
However, FT-AT is the only panel of markers identified by an independent overview that has been compared in the same patients with most of the other proposed markers. No studies were found that compared FT-AT with a panel of extra-cellular matrix markers. Compared to other panels, FT-AT also allowed an estimation to be made not only of the fibrosis stage but also of the necroinflammatory (histological) activity.
The present analysis of the integrated database demonstrated that the diagnostic value of FT-AT did not depend on HCV genotype or viral load. However, because of the small number of patients included, studies in genotype 4, 5 and 6 would be useful.
The present analysis also answered another frequently asked question concerning the predictive values for the intermediate stages of fibrosis. Contrary to the initial hypothesis, the diagnostic values of FT-AT for consecutive stages of fibrosis and grades of necroinflammatory activity were the same for both moderate and extreme stages and grades. Our interpretation is that the same overlap exists between all stages, which is mainly related to the sampling error of the biopsy. It is very reassuring that the medians of FT-AT are linearly associated with stages and grades (Figures 3A,3B). The linearity of this association became even more evident as a larger number of patients were included (data not shown).
Finally, the integrated database allowed a simple conversion system to be proposed to clinicians between liver injury as estimated by the FT-AT and that as estimated by liver biopsy (Figure 4). One conventional way to express the diagnostic values of FT-AT was summarized using the cutoffs of the distribution by stages and grades (Tables 2 and 4). The negative predictive value of FT for excluding significant fibrosis was excellent for the 0.31 cutoff (91%), as was the negative predictive value for excluding significant activity at the 0.36 cutoff of AT (85% negative predictive value). The positive predictive value of the 0.72 cutoff of FT for significant fibrosis was also high at 76%. This, however, may appear lower than the negative predictive value. There is a technical explanation owing to the prevalence of significant fibrosis, which was only 0.31 in this population. Given the excellent specificity (above 0.95), the positive predictive value increased rapidly in populations with more fibrosis (data not shown). We recently observed that the main reason for this was probably that most of the so-called false positives of the FT were in fact false negatives of biopsy due to the small sampling size of liver biopsies [5,9]. The same comments can be made concerning the positive predictive value of AT for significant necrosis, which was 77% at the 0.60 cutoff. Again, it is probable that a large proportion of so-called false positives of AT were in fact false negatives of biopsy due to liver biopsies which were too small. The ideal study would be one using biopsies measuring 40 mm in length, as two samples of 20 mm each during laparoscopy. Only this very high quality biopsy can be considered as a true gold standard. Obviously this type of biopsy cannot be performed routinely as first line, but it could be recommended for clinical research.
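The dependence of predictive values on prevalence described above follows directly from Bayes' rule. The sketch below illustrates it; the sensitivity of 0.40 is a purely hypothetical value chosen for illustration (the text gives only the specificity, "above 0.95"), and the function name `predictive_values` is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Specificity 0.95 is taken from the text; sensitivity 0.40 is hypothetical.
    for prevalence in (0.31, 0.50, 0.70):
        ppv, npv = predictive_values(0.40, 0.95, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is the behaviour the paragraph describes.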
Based on these results, the use of the biochemical markers of liver fibrosis (FibroTest) and necrosis (ActiTest) can be recommended as an alternative to liver biopsy for the first line assessment of liver injury in patients with chronic hepatitis C. In clinical practice, liver biopsy should be recommended only as a second line test, i.e., in case of high risk of error of biochemical tests or in transplanted patients. For clinical research, only very high quality liver biopsy (as two samples of 20 mm each) can be considered as a gold standard for validation of new alternatives.
We did a search for all publications and communications between February 2001 and March 2004 with the key words "FibroTest" and "ActiTest" in Medline and in the abstract books of hepatology, gastroenterology, internal medicine and infectious diseases annual meetings. Only publications or abstracts concerning FT-AT in chronic hepatitis C were included.
For each study we assessed the diagnostic value for the diagnosis of significant fibrosis (bridging fibrosis or stages F2, F3, F4 according to the METAVIR scoring system) and significant necroinflammatory activity (moderate or severe necrosis, grades A2 or A3 according to the METAVIR scoring system) by the area under the receiver operating characteristics curve (AUROC).
For several databases it was possible to re-analyze the individual data and we looked at the sensitivity and specificity according to different thresholds (0.10, 0.30, 0.60 and 0.80). When FT-AT was compared to other biochemical tests, we also assessed the corresponding sensitivity and specificity according to several thresholds.
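As a minimal sketch of this re-analysis, sensitivity and specificity at each threshold can be computed from individual scores and biopsy results. The data below are invented for illustration, and treating a score strictly above the threshold as a positive test is an assumption; the source does not state which side of the threshold counts as positive.

```python
THRESHOLDS = (0.10, 0.30, 0.60, 0.80)  # thresholds named in the text


def sensitivity_specificity(scores, has_condition, threshold):
    """Return (sensitivity, specificity) when score > threshold means positive."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score > threshold  # assumption: strict inequality
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical scores and biopsy results (True = significant fibrosis).
    scores = [0.05, 0.20, 0.35, 0.50, 0.65, 0.90]
    biopsy = [False, False, False, True, True, True]
    for t in THRESHOLDS:
        sens, spec = sensitivity_specificity(scores, biopsy, t)
        print(f"threshold={t:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```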
We selected studies using direct comparisons of diagnostic values in the same patients. The AUROCs were compared for the diagnosis of significant fibrosis (F2F3F4) and significant necrosis (A2A3).
Patients were included in an integrated database if they belonged to a published population of patients with chronic hepatitis C. Liver biopsy was scored using the METAVIR scoring system and FT-AT was assessed using the recommended pre-analytical and analytical procedures [18,20]. A published population of 300 prospectively analyzed blood donors was included as a control group.
Using the integrated database, we compared the AUROCs of FT-AT for the diagnosis of significant fibrosis (F2F3F4) and significant activity (A2A3) between 4 classes of genotype (1, 2, 3 and the rarer genotypes 4, 5, 6 grouped together). For viral load, only those assessed in the same laboratory were included in the comparison between AUROCs, and the median was used to define low and high viral loads (3,800,000 copies/ml).
Using the integrated database, we compared the diagnostic values according to different stages or grades. We compared the AUROCs for all possible combinations of stages and grades, including combinations with blood donors. This allowed, for example, a comparison to be made of the diagnostic value of FT for discriminating between F1 and F2 after excluding all other stages of the database.
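The pairwise comparison can be sketched as follows. The empirical (non-parametric) AUROC is computed here as the Mann-Whitney probability that a score from the higher stage exceeds a score from the lower stage, with ties counted as one half. The scores are invented for illustration, the function names are assumptions, and no significance testing is included.

```python
from itertools import combinations


def empirical_auroc(lower_group, higher_group):
    """P(score in higher_group > score in lower_group), ties counted as 0.5."""
    wins = 0.0
    for high in higher_group:
        for low in lower_group:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(lower_group) * len(higher_group))


def pairwise_aurocs(scores_by_stage, ordered_stages):
    """AUROC for every pair of stages, excluding all other stages each time."""
    return {
        (low, high): empirical_auroc(scores_by_stage[low], scores_by_stage[high])
        for low, high in combinations(ordered_stages, 2)
    }


if __name__ == "__main__":
    # Hypothetical FT scores per stage; "BD" stands for blood donors.
    scores = {
        "BD": [0.05, 0.08, 0.12],
        "F1": [0.15, 0.28, 0.33],
        "F2": [0.30, 0.45, 0.55],
        "F4": [0.70, 0.82, 0.95],
    }
    for pair, auc in pairwise_aurocs(scores, ["BD", "F1", "F2", "F4"]).items():
        print(pair, round(auc, 2))
```

Restricting each calculation to two groups is what allows, for example, the F1 versus F2 discrimination to be assessed on its own.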
In the integrated database, liver biopsies were processed using standard techniques. A pathologist who was unaware of the biochemical markers evaluated fibrosis stage and necrosis grade according to the METAVIR scoring system [22,23].
Fibrosis was staged on a scale of 0 to 4: F0 = no fibrosis, F1 = portal fibrosis without septa, F2 = few septa, F3 = numerous septa without cirrhosis, F4 = cirrhosis. The grading of activity by the METAVIR system (based on the intensity of necroinflammatory activity, mainly on necrosis) was scored as follows: A0 = no necroinflammatory activity, A1 = mild activity, A2 = moderate activity, A3 = severe activity [22,23].
We used the previously validated FT-AT [8,9,11-21]. FT-AT is a non-invasive blood test that combines the quantitative results of six serum biochemical markers [alpha2-macroglobulin, haptoglobin, gamma glutamyl transpeptidase (GGT), total bilirubin, apolipoprotein A1 and alanine aminotransferase (ALT)] with the patient's age and gender in a patented artificial intelligence algorithm (USPTO 6,631,330) to generate a measure of fibrosis stage and necroinflammatory grade in the liver.
Corresponding stages and grades were calculated from the median scores and 95% confidence intervals observed in 1,270 patients and 300 healthy blood donors. The AUROC was used as a measure of discrimination; AUROCs were estimated using the empirical (non-parametric) method of DeLong et al. and compared using the paired method of Zhou et al. All analyses were performed using NCSS software (Kaysville, Utah).
TP and MM conceived the study, performed the statistical analysis, and wrote the manuscript. FIM, BH and DM carried out biochemical analyses. RP, DT, VR, and YB participated in the coordination of the study, and drafted the manuscript. AM participated in the design and coordination of assays in the control group. All authors read and approved the final manuscript.
Thierry Poynard has grants from the Association pour la Recherche sur le Cancer (ARECA) and from the Association de Recherche sur les Maladies Virales Hépatiques