Pathologists participating in the NIH-sponsored Biliary Atresia (BA) Research Consortium (BARC) developed and then evaluated a standardized system for histological reporting of liver biopsies from infants with cholestasis.
A set of 97 anonymous liver biopsy samples was sent to 10 pathologists at BARC centers. A semi-quantitative scoring system comprising 16 histologic features was developed and then used by the pathologists, who had no knowledge of clinical history, imaging results, or laboratory data. Inter-observer agreement on the scoring of each feature, and agreement between the pathologists’ diagnosis and the final clinical diagnosis, were evaluated using weighted kappa statistics.
There was moderate to substantial inter-observer agreement in identification of bile plugs in ducts, giant-cell transformation, extramedullary hematopoiesis, and bile duct proliferation. The pathologists’ diagnosis of obstruction in clinically proven cases of BA ranged from 79% to 98%, with a positive predictive value (PPV) of 90.7%. Histological features that best predicted BA, based on logistic regression, included bile duct proliferation, portal fibrosis, and absence of sinusoidal fibrosis (each P<0.0001).
The BARC histological assessment system identified features of liver biopsies from cholestatic infants, with good inter-observer agreement, that might be used in diagnosis and determination of prognosis. The system diagnosed BA with a high level of sensitivity and identified infants with biliary obstruction with reasonable inter-observer agreement. However, distinguishing between BA and disorders such as total parenteral nutrition-associated liver disease and alpha-1-antitrypsin deficiency is not possible without adequate clinical information.
Biliary atresia (BA) is a progressive fibroinflammatory process involving the extrahepatic biliary tree that results in loss of patency of the lumen and obstruction to bile flow, leading to chronic liver damage. BA occurs in one in 8,000-18,000 live births in various populations and results in 250-400 new cases per year in the US. It accounts for 25% of all cases of conjugated hyperbilirubinemia in infants, and is the most common indication for liver transplantation in children1, 2. Timely diagnosis of biliary obstruction is a major goal of the evaluation of cholestatic infants, as early surgical restoration of bile flow results in better outcome and offers the prospect of normal growth and long-term survival without liver transplantation3, 4.
Liver biopsy is a cornerstone of the diagnostic work-up of infants with cholestatic jaundice, and it is standard practice in most pediatric centers to obtain a percutaneous liver biopsy prior to surgical intervention5, 6. However, interpretation of the biopsies in this clinical setting is challenging. The differential diagnosis of infantile cholestasis is perhaps the broadest of any age group and encompasses numerous obstructive as well as non-obstructive disorders5. Furthermore, the histologic features of many cholestatic disorders of infancy may change over time. The earliest histologic changes of BA may be relatively non-specific, and biopsies performed too early in the course of the disease may result in a falsely negative diagnosis7. In addition to its role in diagnosis, evaluation of the liver biopsy may also reveal prognostically significant histological features, such as the degree of fibrosis, which may help predict outcome following Kasai portoenterostomy.
The relatively few studies evaluating the accuracy of liver biopsies in jaundiced infants have been based in single institutions with interpretation by a limited number of pathologists8-10. The Biliary Atresia Research Consortium was formed in 2002 as a National Institutes of Health-sponsored collaborative network of 10 pediatric institutions and a data coordinating center with the goal of conducting prospective clinical and basic research in BA. The resources of this network provided the opportunity to evaluate biopsy material from jaundiced infants from multiple institutions. The aims of this study were to: 1) develop and validate a standardized assessment of histological features that typify biopsies in cases of neonatal cholestasis; 2) identify which histologic features were best predictive of BA; and 3) determine the inter-observer variability among pathologists in distinguishing key histologic features.
The BARC Pathology Committee and data coordinating center (DCC) met to determine the aims of the study, to identify histologic diagnostic categories and to devise an evaluation method for each biopsy based on a semi-quantitative scoring system. A study set of slides from each institution was assembled based on the following inclusion criteria: 1) a liver biopsy performed during the calendar year 2002 in a BARC center; 2) it was obtained in an infant < 181 days of age with clinical cholestasis (with the assumption that BA would be clinically apparent by 6 months of age); 3) a definitive diagnosis of BA cases had been made by intraoperative cholangiogram and/or examination of the excised biliary remnants from the Kasai operation, and diagnosis of all non-BA cases had been established on clinical grounds with adequate follow-up to confirm the absence of BA; and 4) adequate material was available to provide study slides. For each case, one H+E and one Masson-Trichrome –stained slide were included, and a clinical case report form was completed by the BARC study coordinator at each institution. The information abstracted from the patients’ charts consisted of: 1) the final clinical diagnosis; 2) age at onset of jaundice; 3) age at liver biopsy; 4) date of birth, gestational age at birth, gender, racial/ethnic information, if available; 5) laboratory results (liver function panel obtained before or at the time of biopsy); 6) imaging assessments of the biliary tree; 7) imaging or clinical evidence of co-existent congenital anomalies (e.g., heterotaxy, polysplenia, asplenia); 8) date of Kasai surgery, if performed. All case report forms and slides included a research study identifier, but were otherwise de-identified prior to shipment to the DCC. Approval for this study was obtained from each institution’s IRB.
At a pre-study meeting, the BARC pathologists devised a semi-quantitative scoring system including several histological features thought to be important in the evaluation of a liver biopsy for infantile cholestasis. Approximately half of the study set of slides was then circulated among the pathologists to validate the scoring system. At a second meeting, the scoring system was refined and discrepancies in interpretation were resolved. A final scoring system using 16 histologic features was agreed upon, and the entire set of slides was then re-circulated among the participating pathologists for scoring. Each case was then assigned by the pathologist into one of the following histologic categories: 1) favor BA, 2) obstructive changes noted but favor diagnosis other than BA 3) no obstruction, and 4) indeterminate. At the time of the study, the distinction between “favor BA” and “favor obstruction other than BA” was not based upon explicit criteria. Nonetheless, it was agreed upon by all the participating pathologists that one or more of the following features may have contributed to doubt that obstructive changes were due to BA: rarity or absence of bile plugs in proliferating ducts; overall mild degree of bile duct proliferation; excessive nonuniformity or absence of duct proliferation in some portal areas; absence of portal fibrosis; inclusion in the study set of infants up to 180 days old which is 3-4 months beyond the usual age of diagnosis of BA.
The DCC collated the pathology scores and diagnostic assessments, comparing them with the final clinical diagnoses, and evaluated inter-observer variability. Agreement on scoring and histological diagnoses was evaluated using percent agreement among pathologists and weighted kappa statistics. Kappa (κ) generally varies between 0 and 1, where 1 is perfect agreement and 0 is agreement no better than chance; negative values, which indicate agreement worse than chance, are possible but uncommon. Negative and positive predictive values of the pathologist diagnosis of BA were determined from percentage agreement of assignments by histology with the clinical diagnoses. The individual features that best indicate obstruction were identified by logistic regression analysis.
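As an illustration of the weighted kappa statistic described above, the following is a minimal sketch for two raters scoring the same cases on an ordered scale. The choice of linear disagreement weights, the function name `weighted_kappa`, and the example grades are assumptions made for illustration; the weighting scheme and the multi-rater procedure actually used by the DCC are not specified in the text.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters (illustrative sketch).

    `categories` lists the ordered grades, lowest to highest.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint distribution of the two raters' scores, as proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted disagreement: observed versus expected under independence.
    d_obs = 0.0
    d_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # assumed linear weight; 0 on the diagonal
            d_obs += w * observed[i][j]
            d_exp += w * row[i] * col[j]

    if d_exp == 0:
        # Both raters used a single identical category; kappa is undefined,
        # and this sketch reports perfect agreement by convention.
        return 1.0
    return 1.0 - d_obs / d_exp


# Hypothetical grades (0-3) from two readers on eight biopsies.
reader_1 = [0, 1, 2, 3, 2, 1, 0, 3]
reader_2 = [0, 1, 3, 3, 2, 0, 0, 2]
print(round(weighted_kappa(reader_1, reader_2, [0, 1, 2, 3]), 2))
```

With ordered grades, a one-grade disagreement is penalized less than a three-grade disagreement, which is the reason for preferring a weighted over an unweighted kappa for semi-quantitative scores.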
The data set in this study comprised 891 interpretations of 97 liver biopsy specimens (63 needle cores and 34 surgical wedge biopsies). The pathologists scoring the slides were provided no clinical information other than the age of the infant at the time of biopsy. There were 49 cases of BA, 17 cases of idiopathic neonatal hepatitis, and 31 other causes of neonatal cholestasis. In this latter group, the diagnoses included cholestasis secondary to total parenteral nutrition (n=14), Alpha-1-antitrypsin deficiency (n=3), Alagille syndrome (n=2), Choledochal cyst (n=2), PFIC (n=3), Bile acid synthetic defect (n=1), Spontaneous perforation of bile duct (n=1), Intrahepatic cholestasis - not specified (n=3), Niemann-Pick type C (n=1), and Biliary obstruction due to pancreatic cyst (n=1). The cases were felt to be representative of the various causes of neonatal cholestasis seen in pediatric referral centers.
Table 1 details the histologic features evaluated and the pathologists’ responses, expressed as a percentage of the total responses for each item. For purpose of comparison, the cases are divided into cases of BA and non-BA according to clinical diagnoses. From the distribution of responses it is clear that no histologic feature was either uniformly identifiable by BARC pathologists or predictive of the diagnosis of BA. The items showing the greatest difference in response between BA and the non-BA cases were primarily those indicating obstruction: bile plugs in bile ducts and canaliculi, portal tract edema, the more severe grades of portal fibrosis and bile ductular proliferation. Conversely, practically no difference in the gradient of response between BA and non-BA cases was observed for those features indicative of parenchymal injury and inflammatory reaction, such as hepatocellular swelling, steatosis, pseudorosette formation, hepatocellular multinucleation, necrosis, extramedullary hematopoiesis, and portal tract and peri-biliary inflammation. The presence of lobular fibrosis, especially if prominent, somewhat ruled against a diagnosis of BA. Logistic regression was used to identify the histological features that were best predictive of BA. The best multivariate model included bile duct proliferation and portal fibrosis and absence of sinusoidal fibrosis (each p< 0.0001).
Inter-observer agreement was assessed as the percent agreement for each response (the proportion of pathologists choosing the same answer for each question) and by weighted kappa values. These are summarized separately for needle and wedge biopsies (Table 2). Inter-observer agreement was similar for most features in needle and wedge biopsies. The features for which agreement was reasonably good were: bile plugs in ducts, multinucleated giant cell transformation, extramedullary hematopoiesis, and bile duct/ductular proliferation (kappa 0.65, 0.60, 0.52, 0.56, respectively). Agreement was not as strong for hepatocellular swelling and steatosis, and was poorest for features of inflammation, such as cholangitis, peribiliary neutrophils and mononuclear cells in bile ducts, and for the presence of portal tract edema.
The correlation of the pathologists’ diagnostic assignments with the three clinical diagnostic groups is shown in Table 3. A total of 454 histologic interpretations were obtained on the 49 cases of BA. The histologic assignment of “favor BA” was chosen in 75% of the total readings, whereas the category of “favor obstruction other than BA” was chosen in 11%. The category of “no obstruction” was favored in 9% of the observations, and “indeterminate” in 5%. In 36 of the 49 cases of BA, there was agreement (defined as 6 or more readers choosing the same answer) for a diagnosis of “favor BA”. Unanimous agreement for the diagnosis of BA (“favor BA”) was observed in 18 of 49 cases, one of which is illustrated in Figure 1a. In 7 cases, 6 or more pathologists favored BA or an obstruction other than BA. In the remaining 6 cases, fewer than 6 pathologists favored BA or obstruction other than BA. Four of these cases were fragmented samples with fewer than 3 portal tracts per slide, emphasizing the difficulties of arriving at a pathologic diagnosis with an inadequate specimen. One case was a biopsy from a 2-week-old infant, in which only mild biliary proliferation without bile plugs was present (Figure 1b), illustrating the difficulty of confirming a diagnosis of BA in some children less than 1 month of age7. Figure 2 illustrates the percentage of cases rated by each pathologist as either 1) consistent with BA (dark bar) or 2) obstructive disorder other than BA (light bar) in clinically proven cases of BA. The pathologists’ diagnosis of obstruction (BA or obstruction other than BA) in clinically proven cases of BA ranged from 79% to 98% with a mean of 89%.
For the cases of INH, there was agreement for the category of “no obstruction” in 13 of 17 cases. Seventy-nine percent of the pathologists’ readings were “no obstruction” or “indeterminate”; conversely, 21% of the pathologists’ diagnoses were either “favor BA” or “obstructive changes other than BA”. In 2 cases, a majority favored either BA or obstruction other than BA; one of these cases is illustrated in Figure 1c. There was no agreement for any of the diagnostic categories in 2 cases. The pathologists’ assignments were distributed more evenly across the four different histologic categories in the third group of cases (“Other”), as might be expected. However, more specific patterns in the distribution of the pathologists’ diagnoses could be discerned for some of the clinical diagnoses. There was agreement for a diagnosis of either BA or “obstruction other than BA” in 14 of the 15 cases of TPN-associated liver disease, and in all 3 cases of alpha-1 antitrypsin deficiency, as illustrated in Figure 1d. It should be pointed out that the pathologists were not provided information regarding TPN administration or alpha-1 antitrypsin status. Conversely, a majority of pathologists favored a diagnosis of “no obstruction” in the 3 cases of progressive familial intrahepatic cholestasis and 1 case of bile acid synthetic disorder. In cases of INH, the percentage of cases read by each pathologist as “no obstruction” ranged from 57% to 93% with a mean of 69%.
The measure of agreement between the histologic diagnosis and the clinical cases of BA was computed for each pathologist by dividing the pathologists’ diagnoses into two groups: obstructive (favor BA and favor obstruction other than BA) and non-obstructive (no obstruction and indeterminate) for the clinical diagnoses of BA and INH. The resulting kappa values and positive (PPV) and negative (NPV) predictive values for each are expressed in Table 4. Only the results from 9 of the pathologists are included, as one did not complete the study. There was good to substantial agreement between histologic and clinical diagnosis for 6 of the pathologists, and agreement was moderate for 3. Similarly, there was some variation in positive predictive value and negative predictive value between pathologists. Overall, however, the histologic diagnosis of either “favor BA” or “favor obstruction other than BA” had an average positive predictive value of 90.7% for cases of BA, and a negative predictive value of 67.0%.
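The predictive values reported for each pathologist follow from a two-by-two table of histologic assignment (obstructive versus non-obstructive) against clinical diagnosis (BA versus INH). A minimal sketch of that calculation is given here; the function name `predictive_values` and the counts in the example are hypothetical and are not taken from Table 4.

```python
def predictive_values(true_pos, false_pos, true_neg, false_neg):
    """Return (PPV, NPV) as fractions from 2x2 counts (illustrative sketch).

    true_pos:  BA cases read as obstructive
    false_pos: INH cases read as obstructive
    true_neg:  INH cases read as non-obstructive
    false_neg: BA cases read as non-obstructive
    """
    called_obstructive = true_pos + false_pos
    called_non_obstructive = true_neg + false_neg
    if called_obstructive == 0 or called_non_obstructive == 0:
        raise ValueError("each histologic category must contain at least one reading")
    ppv = true_pos / called_obstructive
    npv = true_neg / called_non_obstructive
    return ppv, npv


# Hypothetical counts for a single reader, for illustration only.
ppv, npv = predictive_values(true_pos=9, false_pos=1, true_neg=6, false_neg=4)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Because both values depend on the proportion of BA cases in the study set, they would be expected to differ in a population with a different case mix.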
The Biliary Atresia Research Consortium was established to promote clinicopathological and translational research in BA. It was essential to develop a standardized system of histological reporting in the context of a multi-institutional study. The primary goal of the current study was to establish a semi-quantitative assessment system for the histological evaluation of liver biopsy specimens from infants with cholestasis that would lead to a better understanding of the pathogenesis of BA, aid in the recognition of other cholestatic disorders with which BA initially may be confused, and support prognostication in BA following Kasai portoenterostomy. The features that were used expanded upon those reported in several retrospective studies11-13, and the Ishak grading system was employed to assess the degree of fibrosis14. The number of choices for each histologic feature was limited to as few grades as possible to enhance interobserver reproducibility15.
The second goal of this study was to evaluate the predictive value of the liver biopsy for the diagnosis of BA and to identify which histologic features were most associated with a diagnosis of BA. It needs to be stressed that the clinical information available to the pathologist was restricted to age at biopsy in order to increase the objectivity of the interpretation of histological findings. Key clinical data critical for interpretation of biopsies in infants with cholestasis, such as TPN status and the alpha-1 antitrypsin phenotype, were withheld.
This study shows that inter-observer variability is problematic for some of the included histological features. This may reflect insufficient emphasis on training before the individual blinded assessments were undertaken, the inclusion in this study of 18 of 97 biopsies from children beyond the usual age for diagnostic purposes (12 weeks), or that the definitions used were ambiguous. On the other hand, despite being blinded to clinical information, consortium pathologists consistently recognized histological features of biliary obstruction in approximately 90% of cases of BA. Furthermore, the assessment of a limited number of key histologic features provides the critical information needed to conclude that obstruction is present. The complete clinical follow-up provided an accurate denominator (49 cases of BA) for calculating diagnostic accuracy. The study pathologists agreed on diagnosis of BA in 36 and of obstruction other than BA in 7 of the 49 cases of proven BA (87.8% accuracy). Either of these histologic “diagnoses” would lead to the need for surgical exploration and thus to confirmation of the diagnosis of BA. Four of the remaining six “false negative” liver biopsies were inadequate specimens for proper evaluation (small and fragmented) and one was from a 2-week-old infant, an age when BA may not be fully expressed. Thus, if the inadequate specimens are excluded from the analysis, the histologic diagnosis would have been BA or obstruction other than BA in 96% of infants eventually shown to have BA. Therefore, if adequate biopsy specimens are provided, it can be expected that there would be a very high degree of accuracy in prediction of BA. Based on the consensus of the pathologists from this study, an adequate liver biopsy for interpretation in an infant should be a minimum of 2.0 cm long, and 0.2 mm wide or contain at least 10 portal areas, and if a surgical wedge, sufficiently deep to include 6 complete portal tracts independent of the liver capsule.
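The two accuracy figures quoted above can be reproduced from the case counts given in the text. The short calculation below assumes that only the four inadequate specimens are removed from the denominator for the second figure (the biopsy from the 2-week-old infant is retained); this is an inference from the reported 96%, and the variable names are illustrative.

```python
# Counts taken from the text for the 49 clinically proven BA cases.
ba_cases = 49
agreed_favor_ba = 36
agreed_other_obstruction = 7
inadequate_specimens = 4  # small, fragmented biopsies among the six "false negatives"

read_as_obstructive = agreed_favor_ba + agreed_other_obstruction

# All proven BA cases in the denominator.
accuracy_all = 100 * read_as_obstructive / ba_cases

# Assumed: only the inadequate specimens are excluded from the denominator.
accuracy_adequate = 100 * read_as_obstructive / (ba_cases - inadequate_specimens)

print(f"{accuracy_all:.1f}%")       # 87.8%
print(f"{accuracy_adequate:.0f}%")  # 96%
```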
Several published reports from single institutions on the accuracy of liver biopsy for the diagnosis of BA involved a limited number of pathologists5, 16-18. Brough and Bernstein retrospectively compared the original pathologic diagnosis in 158 consecutive cases with the ultimate clinical diagnosis 8. The original pathological diagnosis was correct in 148 cases, an accuracy rate of 93.7%. Six of 10 cases of hepatocellular disease histologically misdiagnosed as favoring obstruction could not, even upon review, be differentiated from mechanical obstruction. In a similarly designed retrospective study by Ferry et al, the initial liver biopsy correctly predicted the clinical diagnosis in 94% of 143 cases19. It was against this background that the current study was designed to evaluate the predictive value of liver biopsy for the diagnosis of BA. The pathologists prospectively interpreted liver biopsies blinded to clinical information and the complete clinical follow-up provided an accurate denominator for calculating diagnostic accuracy.
Zerbini et al. applied logistic regression analysis to 100 liver biopsy specimens and confirmed that bile duct proliferation and bile plugs were the best histologic predictors of obstruction13. Ferry et al. also reported that bile duct proliferation is the key feature in biopsies from patients with BA19.
Similar to these investigators, the current study found that items in the scoring system showing the greatest difference in the gradient of responses between BA and non-BA cases were bile duct proliferation, bile plugs in ducts and canaliculi, and the more severe grades of portal fibrosis. In addition, portal tract edema was a feature recognized more often in BA than in non-BA cases, and conversely the presence of significant lobular (sinusoidal) fibrosis militated against BA. Steatosis, hepatocellular swelling, necrosis, multinucleation, pseudorosette formation, extramedullary hematopoiesis, cholangitis and peri-ductular inflammation were seen as frequently in BA as in non-BA cases. No other features examined showed significant value in diagnosing or excluding BA.
According to Landis and Koch, a kappa value of 0.61-0.8 indicates substantial agreement, 0.41-0.6 moderate, 0.21-0.4 fair and 0-0.2 slight20. The most reproducible feature was bile plugs in ducts (κ = 0.65). Interobserver agreement as measured by kappa values was similar overall for needle and for wedge biopsies. Moderate interobserver agreement was reached for bile duct proliferation, the grade of portal fibrosis, extramedullary hematopoiesis, giant cell transformation and steatosis. Agreement was fair to slight for the rest. Taking into account those features that were predictive of BA and also those that had reasonably good inter-observer agreement, logistic regression resulted in a “best multivariate model” for diagnosing BA that included bile duct proliferation and portal fibrosis and absence of sinusoidal fibrosis.
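The Landis and Koch bands cited above can be expressed as a simple lookup, sketched here. The function name `landis_koch` is illustrative. The labels "almost perfect" (above 0.80) and "poor" (below 0) come from the original Landis and Koch scale rather than from the text of this study.

```python
def landis_koch(kappa):
    """Map a kappa value to its Landis and Koch agreement label (sketch)."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"            # from the original scale, not quoted in the text
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"      # from the original scale, not quoted in the text


# Kappa values reported in the Results for four features.
reported = {
    "bile plugs in ducts": 0.65,
    "giant cell transformation": 0.60,
    "extramedullary hematopoiesis": 0.52,
    "bile duct/ductular proliferation": 0.56,
}
for feature, value in reported.items():
    print(f"{feature}: {value:.2f} -> {landis_koch(value)}")
```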
Kappa values observed in this study for many features such as steatosis, hepatocellular necrosis and ballooning and bile duct inflammation were similar to those observed in other multiobserver studies of liver biopsies, such as for hepatitis C21-23 and non-alcoholic fatty liver disease15. On the other hand, kappa values are an imperfect measure of agreement, being dependent on the prevalence of a given feature23. For the ductal plate malformation, for example, the kappa values were low despite a relatively high percent agreement, best explained by the low frequency of that feature in the biopsies evaluated.
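The dependence of kappa on prevalence can be seen in a small numerical example. The sketch below computes percent agreement and unweighted Cohen's kappa for two readers scoring a rare feature as present or absent. The counts are hypothetical and chosen only to show the effect; they are not data from this study, and the function name `agreement_and_kappa` is illustrative.

```python
def agreement_and_kappa(both_present, a_only, b_only, both_absent):
    """Percent agreement and unweighted Cohen's kappa for a 2x2 table (sketch)."""
    n = both_present + a_only + b_only + both_absent
    if n == 0:
        raise ValueError("the table must contain at least one case")
    p_observed = (both_present + both_absent) / n
    p_a = (both_present + a_only) / n   # reader A: proportion scored present
    p_b = (both_present + b_only) / n   # reader B: proportion scored present
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1:
        raise ValueError("kappa is undefined when both readers use one category only")
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical rare feature: each reader marks it present in 5 of 100 biopsies,
# but they coincide on only one of those biopsies.
p_obs, kappa = agreement_and_kappa(both_present=1, a_only=4, b_only=4, both_absent=91)
print(f"percent agreement = {p_obs:.0%}, kappa = {kappa:.2f}")
```

In this example the readers agree on 92 of 100 biopsies, yet almost all of that agreement is expected by chance because the feature is rarely present, so kappa falls in the "slight" range.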
The pathologists agreed on the presence of obstructive features (either “BA” or “obstruction other than BA”) in 14 of 15 cases of TPN-associated liver disease and in the 3 cases of alpha-1-antitrypsin deficiency. Both these entities are known to present with obstructive histologic features and may be impossible to differentiate from BA in the absence of clinical information and with only H+E and trichrome stains available. Thus, it is important for the pathologist to have this clinical information on hand when interpreting the liver biopsy of infants with cholestasis.
This study did not address whether histological features in a biopsy of a cholestatic infant can provide prognostic information in addition to providing a diagnosis, as suggested by other investigators. Reproducing the current study on a larger scale might permit better use of the scoring system to predict outcome. This should become possible as many-fold more patients have been prospectively enrolled into an ongoing longitudinal study in BARC, into which extensive clinical information is entered and follow-up is closely maintained.
In summary, we have developed a systematic histological evaluation system for assessment of liver biopsies of cholestatic infants and have shown reasonable to substantial inter-observer agreement on a number of features that have diagnostic utility. We have also shown that experienced pediatric pathologists can correctly identify BA with a high degree of sensitivity and good interobserver agreement, even with minimal clinical information. However, distinguishing between BA and disorders such as TPN liver disease and alpha-1 antitrypsin deficiency is not possible on biopsy alone without adequate clinical information provided to the pathologist, which is the standard of practice for clinical interpretation of these biopsies. The BARC scoring system appears to be a useful semi-quantitative assessment tool of liver biopsies from infants with cholestasis.
Supported by NIH grants: U01DK062436, U01DK062445, U01DK062452, U01DK062453, U01DK062456, U01DK062481, U01DK062497, U01DK062500, U01DK062503 and U01DK062470
Children’s Hospital Medical Center, Cincinnati: Jorge Bezerra, M.D., John Bucuvalas, M.D., Susan Krug, M.S.
Children’s Hospital of Philadelphia: Barbara Haber, M.D., Jessi Erlichman
Children’s Hospital, Pittsburgh: Benjamin Shneider, M.D., David Perlmutter, M.D., Robert Squires, Jr, M.D., Beverly Bernard
Children’s Memorial Hospital, Chicago: Peter Whitington, M.D., Susan Kelly, R.N.
Johns Hopkins School of Medicine, Baltimore: Kathleen Schwarz, M.D., Robert A. Jurao
The Mount Sinai School of Medicine, New York City: Federick J. Suchy, M.D., Sanobar Parker M.D., Nanda Kerkar, M.D.
Texas Children’s Hospital and Baylor College of Medicine: Saul J. Karpen, M.D.,Ph.D., Kim Pieplow
University of California, San Francisco: Phillip Rosenthal, M.D., Whitney Lieb
University of Colorado Denver School of Medicine and The Children’s Hospital, Denver: Ronald J. Sokol, M.D., Michael Narkewicz, M.D., Sandi Lindahl, R.N., Elizabeth Esterl, R.N.
Washington University School of Medicine, St. Louis: Ross Shepherd, M.D., FRACP, Rosemary Nagy
University of Michigan, Ann Arbor (Data Coordinating Center): John Magee, M.D., Trivellore Raghunathan, Ph.D., Morton Brown, Ph.D., Yuezhou Jing, M.S.
NIH/NIDDK: Patricia Robuck, Ph.D., MPH, Edward Doo, M.D., Jay Hoofnagle, M.D.
Contribution of individual authors: Pierre Russo: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
John C. Magee: study concept and design; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
John Boitnott: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Kevin E. Bove: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Trivellore Raghunathan: analysis and interpretation of data; critical revision of the manuscript for important intellectual content; statistical analysis
Milton Finegold: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Joel Haas: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Ronald Jaffe: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Grace E. Kim: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Margret Magid: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Hector Melin-Aldana: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Frances White: study concept and design; acquisition of data; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content
Peter Whitington: analysis and interpretation of data; revising manuscript critically for important intellectual content
Ronald J. Sokol: study concept and design; analysis and interpretation of data; drafting of the manuscript; critical revision of the manuscript for important intellectual content.
Disclosures: All authors have no relevant disclosures related to the content of this manuscript