Placenta. Author manuscript; available in PMC 2010 November 1.
Published in final edited form as:
PMCID: PMC2814241

Methodologic issues in the study of the relationship between histologic indicators of intraamniotic infection and clinical outcomes

Carolyn M. Salafia, M.D., M.S.,1,2 Dawn Misra, Ph.D,3 and Jeremy N. V. Miles, Ph.D4



To determine the structure of the relationships of the histology scores for acute intraamniotic infection collected in the Collaborative Perinatal Project (CPP).

Materials and Methods

44,427 subjects of the CPP had complete histology scores available for the 9 measures that related to acute intraamniotic infection (i.e., neutrophil infiltrates in umbilical cord, amnion of extraplacental membranes and chorionic plate, decidua, chorionic plate and fetal chorionic vessels). Confirmatory factor analysis was used to determine the relationships among the different markers of maternal inflammatory responses (in amnion, chorion and decidua) and fetal inflammatory responses (in umbilical cord and fetal chorionic vessels).


A single CFA model could not be developed across all CPP sites. A well-fit model was developed from the Boston site (N=10,803) and the factor loadings applied to the histology scores from the other CPP sites. The resultant scores for the latent variables (maternal and fetal inflammatory responses) were compared across sites. Not only was there considerable variability in factor loadings, but the signs of factor loadings were also inconsistent across sites.


Histopathology scores of neutrophil infiltrates performed by different observers do not have the same interrelationships and, by extension, the latent variables they are supposed to reflect may not be equivalent. The lack of measurement invariance renders their use as indicators of the underlying processes of maternal and fetal inflammatory responses problematic in analysis with any clinical outcome.


Chorioamnionitis, the presence of intraamniotic microbial organisms triggering maternal and/or fetal inflammatory responses, plays a significant role in reproductive and childhood pathology. Numerous investigators have identified ascending infection as a key pathway in the etiology of preterm birth, particularly early preterm births (less than 35 weeks gestation, as reviewed in 1). Ascending infection has also been proposed as a potential explanatory factor for the substantial racial disparity in risk of preterm birth. 2 Neonates born with funisitis, a prime histologic marker of fetal inflammatory response, are at increased risk for neurologic handicap and cerebral palsy. 3 However, it is the minority of infants born from such environments that develop any neurodevelopmental disorder and causal inference remains problematic. 4 Evidence has begun to accumulate that gene-environment interactions determine the likelihood of preterm labor and delivery and, probably, the risk of fetal injury.5

Holzman et al 6 recently and elegantly summarized the conflicting literature regarding the role of infection diagnosed histologically in preterm birth. In their own analysis, the choice of inflammatory cell threshold (the number of infiltrating neutrophils required to make a diagnosis of infection) dramatically influenced disease prevalence; the rates of histologic chorioamnionitis ranged “from 85 percent … to 7 percent [in term] and in PTD from 63 percent … to 4 percent” at different inflammatory cell thresholds. 6 They also documented variability in the specific tissue components included, the number of tissue samples reviewed, and the specific features detailed (location, density, and degeneration). These additional factors would affect the prevalence of diagnosis of histologic chorioamnionitis and, by extension, complicate our understanding of its gestational effects.

Controversy remains in pathology circles regarding whether a multi-category (0–4 stage and 0–4 grading) histologic chorioamnionitis scoring system or a more simplified system (with fewer categories or a “present/absent” categorization) is optimal. Inter-rater reliability is optimized with a “present/absent” system, 7 but such a system must blur the subtleties of the complex mix of genes, cytokines, specific bacterial and other environmental stressors that is inflammation. How best to analyze the individual histology scores derived from the different tissues is also controversial. Should the scores of inflammation in amnion, chorion, decidua and chorionic plate be summed to reflect an overall “maternal inflammatory response”, or should a “threshold” level of “normal neutrophil infiltration” in sites such as subchorionic fibrin be used to determine “intraamniotic infection greater than would be common in normal term births”? 8

Summing scores does not allow finer distinctions among the relative “value” of the different histology indicators. For example, neutrophil infiltrates in the amnion may be a stronger indicator of histologic chorioamnionitis than, for example, decidual neutrophil infiltrates. (e.g., 8) One can empirically assign weights to different indicator scores, and thus tinker with the sum. Alternatively, factor analysis can be used to derive weights (or factor loadings) that reflect the actual intercorrelations among the indicator variables. Exploratory factor analysis is employed when little is known of the underlying structure. Confirmatory factor analysis can be applied when we have biologically based, and theoretically derived concepts regarding the underlying structure, which we want to test.

Given the richness of the histology data in the National Collaborative Perinatal Project (NCPP) data, and the clinical importance of reliable and reproducible diagnoses of histologic chorioamnionitis estimated from histologic slides, we determined to apply confirmatory factor analysis to the histology scores related to histologic chorioamnionitis in the NCPP. Our goal was to explore the structure of the relationships of histologic measures of the maternal and fetal inflammatory responses, respectively, within and among institutions and observers.



The study and analytic sample

Subjects were a subset of the National Collaborative Perinatal Project. Details of the study have been described elsewhere. 9, 10 Briefly, from 1959 to 1965, women who attended prenatal care at 12 hospitals were invited to participate in the observational, prospective study. At entry, detailed demographic, socioeconomic and behavioral information was collected by in-person interview. A medical history, physical examination and blood sample were also obtained. In subsequent prenatal visits, women were repeatedly interviewed and physical findings were recorded. During labor and delivery, placental gross morphology was examined and samples were collected for histologic examination. The children were followed up to seven years of age.

The analytic sample for the present analysis was derived from all delivered infants, live or stillborn, regardless of gestational age, and included both singletons and multifetal pregnancies; these clinical data should not prejudice or bias the scoring of neutrophil infiltrates by pathologists blinded to other clinical data. The sample was restricted to those with complete data on the nine measures of neutrophil infiltrates that were specified by the protocol. 11 Expert pathologists at each of 12 institutions were provided a scoring sheet with a written description of the grading scale for neutrophil infiltrates of amnion, chorion and decidua of the membranes, amnion and chorion of the chorionic plate, umbilical artery, vein and Wharton’s jelly, and fetal chorionic vessels (9 separate scores).

Analysis Plan

Our a priori understanding led us to formulate a confirmatory factor analysis with 2 latent variables, one reflecting the maternal inflammatory response (indicated by the scores of amnion, chorion and decidua of the membranes, and amnion and chorion of the chorionic plate) and one reflecting the fetal inflammatory response (indicated by the umbilical cord and fetal chorionic vascular scores). We fitted confirmatory factor analysis models to the data using Mplus 4.2. 12 The data were treated as ordered categorical (that is, we modeled the probability of each response, rather than the mean of the responses). Parameters were estimated using the weighted least squares – mean and variance corrected algorithm; this approach has been shown to work well with categorical data. 13 We followed the methods described by Jöreskog, 14 first attempting a strictly confirmatory approach and then using a model generation approach to modify the model.
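To make concrete what "modeling the probability of each response" means, the following is a minimal sketch of an ordinal probit measurement model for a single histology score. It is not the Mplus implementation: the function names, the loading, and the thresholds are hypothetical values chosen for illustration, and the residual variance is assumed to be 1.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def category_probabilities(eta: float, loading: float, thresholds: list[float]) -> list[float]:
    """Probability of each ordered score (0, 1, 2, ...) given a latent value eta.

    Assumes an ordinal probit model with unit residual variance:
    P(score <= k | eta) = Phi(threshold_k - loading * eta).
    """
    # Cumulative probabilities, padded with 0 below the lowest and 1 above the highest category.
    cumulative = [0.0] + [normal_cdf(t - loading * eta) for t in sorted(thresholds)] + [1.0]
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]


# Hypothetical 0-3 score with three thresholds and a loading of 0.8.
low = category_probabilities(eta=0.0, loading=0.8, thresholds=[0.5, 1.5, 2.5])
high = category_probabilities(eta=2.0, loading=0.8, thresholds=[0.5, 1.5, 2.5])
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

Under these assumptions, a higher value of the latent inflammatory response shifts probability away from a score of 0 and toward the higher grades, which is the relationship the factor loading summarizes.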

To assess model fit, we used the Χ2 statistic in conjunction with its associated p-value. The Χ2 statistic assesses the difference between the model and the data. Larger, and more statistically significant, values of Χ2 are indicative of worse model fit, that is, a greater mismatch between the model and the data. However, Χ2 suffers from well known problems when fitting models based on large samples: specifically, it has a large amount of power to detect models which differ from the data by only trivial and inconsequential amounts. Because of this, a wide range of other indices have been developed alongside Χ2 to aid in determining when good model fit has been found. For this analysis, we also used the Root Mean Square Error of Approximation (RMSEA 15), the Comparative Fit Index (CFI 16) and the Tucker Lewis Index (TLI, also referred to as the non-normed fit index, NNFI). The RMSEA can be thought of as a correction to Χ2 to account for the sample size and model complexity; values below 0.05 are often seen as indicative of adequate fit. The CFI and TLI both compare the Χ2 of the fitted model to that of the null model, the null model being the worst model that it would be possible to have, with no relations between any of the variables; 17 values above 0.95 are usually considered to show good fit. 18
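The way these indices rescale Χ2 can be sketched with their commonly used textbook definitions. This is an illustration only: software packages differ in details (for example, N versus N − 1 in the RMSEA denominator, or adjusted Χ2 values under weighted least squares estimation), and the numbers below are hypothetical, not results from this study.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation (one common form, using N - 1)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index: 1 minus the ratio of model to null-model misfit."""
    misfit_model = max(chi2_model - df_model, 0.0)
    misfit_null = max(chi2_null - df_null, misfit_model)
    return 1.0 if misfit_null == 0.0 else 1.0 - misfit_model / misfit_null


def tli(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Tucker Lewis Index (non-normed fit index), based on chi-square/df ratios."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Hypothetical values: a fitted model and a null (independence) model, N = 10,000.
print(round(rmsea(120.0, 10, 10_000), 4))
print(round(cfi(120.0, 10, 50_000.0, 21), 4))
print(round(tli(120.0, 10, 50_000.0, 21), 4))
```

With a large sample, a Χ2 many times its degrees of freedom can still correspond to an RMSEA below 0.05 and CFI and TLI above 0.95, which is why these indices are read alongside the Χ2 p-value rather than replaced by it.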

Confirmatory factor analysis models can be fitted to single groups, or to multiple groups. In a multiple group model, parameters are estimated for each group, and these parameters can then be tested across groups using Wald tests or Χ2 difference tests.
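A Wald test comparing one parameter between two groups can be sketched as follows. This is a simplified single-parameter version that assumes the two groups are independent, so the estimates are uncorrelated; the loadings and standard errors in the example are hypothetical and are not taken from Table 2.

```python
import math


def wald_test_two_groups(est_a: float, se_a: float, est_b: float, se_b: float) -> tuple[float, float]:
    """Wald z statistic and two-sided p-value for the difference between two
    independently estimated parameters (for example, the same factor loading
    estimated at two sites)."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical loadings (standard errors): 0.90 (0.02) at one site, 0.70 (0.05) at another.
z, p = wald_test_two_groups(0.90, 0.02, 0.70, 0.05)
print(round(z, 2), p < 0.001)
```

A small p-value suggests the parameter differs between the groups; when many loadings are compared at once, a multivariate Wald test or a Χ2 difference test between nested models would be the more usual choice.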


We first examined the frequencies of scores for each of the 9 measures at each CPP site (Table 1). Of note, certain sites in effect used a 0–2 scale, rather than the 0–3 scale specified by the protocol, 11 and the highest severity score was used infrequently even at the sites that used it.

Table 1
Frequencies of histology scores by study site

Next, using Mplus 4.2 12 and considering the histology scores as ordered categorical variables representing underlying continuous processes, we attempted to fit a multiple group model, with hospital sites defining the groups. This model had convergence problems, which we identified as being related to particular sites where measures either varied inconsistently or were perfectly correlated. As the Boston cohort (N=10,803) was the largest of the 12, we elected to develop a model in this cohort and then test its generalizability to the other cohorts. 12 The close correlation between scores of neutrophil infiltrates in membrane chorion and membrane decidua forced removal of the membrane chorion score from the model; of the two variables, the membrane decidua score provided slightly better fit. Figure 1 shows the final model, which had excellent fit according to established criteria (e.g. CFI, TLI each 0.999, RMSEA 0.033). Of interest, model fit was significantly improved by removing the fetal chorionic vessel score as an indicator of fetal inflammatory response; the covariance of this fetal indicator with maternal inflammation was stronger than with fetal inflammation (0.848 vs. 0.693, Table 1).

Figure 1
Path diagram of the best-fit model for the relationships of scores of neutrophil infiltrates in the specified placental tissues, developed in Boston cohort and applied across study sites (see Table 2)

We then applied this model to each of the other 11 cohorts, and achieved generally as good a fit as for the Boston cohort. However, the loadings for the different histology scores differed significantly from the Boston cohort (Table 2, Wald tests). In addition, comparing the loadings for the group of indicators of maternal and fetal inflammatory responses showed that there were multivariate significant differences from the Boston cohort. In other words, the latent variables of maternal and fetal inflammation are not indicated by the histology scores uniformly across the cohorts. Further inspection of the data revealed other disturbing patterns. In general, maternal and fetal inflammatory responses tend to coincide; there may be variability in the relative strengths of each response, but they tend to be present together. The extent of covariance of maternal and fetal inflammatory responses was widely different among cohorts, ranging from 0.435 (Providence) to 2.094 (Pennsylvania). Moreover the means of the latent variables not only differed from that of the Boston cohort (indicating different prevalences of the histology scores, which would not be unexpected), but they differed in opposite directions (e.g., Buffalo, New Orleans, NY/Columbia, Virginia, Minnesota, NY/Medical, Oregon, Pennsylvania, Providence and Tennessee). These comparisons are, however, difficult to interpret because the measurements are not directly comparable across the cohorts.

Table 2
Factor analysis scores, using Boston fitted model, across study sites.


These data demonstrate that, in the CPP, individual histology scorings of neutrophil infiltrates, markers of intraamniotic infection, differ significantly in their contributions to more general constructs of maternal inflammation and fetal inflammation. While it is possible that demographic and genetic factors may account for part of these differences, at least some of the variability must be due to inter-observer factors. The lack of measurement invariance means that these scorings cannot be used to represent the same construct (or underlying biological process) in different cohorts. In the psychometric literature this is termed “differential item functioning”, or “DIF”, 19 and threatens the validity of the measurement instrument. In psychometrics, items showing DIF are rewritten or removed from the instrument in order to generate measures of the latent constructs that can be generalized across groups.

Despite the lack of measurement invariance we have identified in the graded scores of the CPP, we strongly reject one alternative model, namely, collapsing the multiple category scoring system, as has been suggested because “these distinctions are of no documented clinical significance”. 7 Generally, information is expensive and difficult to collect, and should not be discarded lightly. Certainly, if the diagnostic categories are discarded, there will be no chance to document clinical significance moving forward. Our goals should instead be to explore methods that allow improved reliability, including image segmentation from digitized slides. 20

A more immediate and concrete criticism of collapsing the scoring system is that the neutrophil infiltrates in amnion, chorion and umbilical cord (for example) are of interest to us only insofar as they reflect aspects of the process of intraamniotic infection, a process we cannot otherwise directly access. Neutrophil infiltrates are indicators of the underlying latent (and not directly observable or measurable) variable in which we are truly interested. In modern pathology practice, we are forced to employ categorical scores as representations of one (or more) underlying continuous variables. However, as the categorical scale is progressively reduced from 0–4 to 0–1 (that is, “absent/present”), the correlation of those scores with the underlying latent variable is also reduced. 22 The simpler scale may be more “reliable” but it is less representative of the latent/unobservable process(es) in which we are truly interested. We may trade an appearance of reliability for a long-term limitation on the explanatory value of histology scorings and, ultimately, their utility in both research and clinical contexts.
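The loss of correlation that comes with collapsing categories can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions that are ours, not the study's: the latent variable is standard normal, scores are obtained by cutting it at fixed, arbitrarily chosen cutpoints, and there is no observer error.

```python
import math
import random


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def categorize(value: float, cutpoints: tuple[float, ...]) -> int:
    """Ordered category = number of cutpoints the latent value exceeds."""
    return sum(value > c for c in cutpoints)


def simulate(n: int = 20_000, seed: int = 0) -> tuple[float, float]:
    """Correlation of a latent variable with a 0-4 score and with a 0-1 score."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    five_level = [float(categorize(v, (-1.5, -0.5, 0.5, 1.5))) for v in latent]  # 0-4 scale
    two_level = [float(categorize(v, (0.0,))) for v in latent]                   # absent/present
    return pearson(latent, five_level), pearson(latent, two_level)


r_five, r_two = simulate()
print(round(r_five, 3), round(r_two, 3))
```

In this idealized setting the five-level score tracks the latent variable more closely than the present/absent score does, and real scores with observer error would be expected to show lower correlations on both scales.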

The HuGE project data underscore the potential disadvantages of “lumping” vs. “splitting” with regard to such information. The understanding that genetic polymorphisms modify aspects of the maternal and fetal inflammatory responses to a commonly perceived intraamniotic infectious stimulus is relatively recent. 23 All histologic scores may not be created equal; some neutrophilic infiltrates may represent an “uphill battle” with gene polymorphisms that would down-regulate inflammatory responses. Other inflammatory responses may have been facilitated by the genetic environments of the mother, the fetus or both. Collapsing a continuous process (recruitment of neutrophils and diapedesis from their site of origin) into, at the extreme, present vs. absent 7 precludes ever disentangling the complex interplay between maternal and fetal genetic capacities and the infectious stimulus.

As noted above, chorioamnionitis, defined as the presence of intraamniotic microbial organisms triggering maternal and/or fetal inflammatory responses, plays a significant role in reproductive and childhood pathology. While risk of morbidity rises with severity of inflammation, most infants will not experience adverse outcomes. This suggests that the key exposure is heterogeneous and that the heterogeneity is not reflected in commonly used summary measures of infection. The underlying process of acute intraamniotic infection is physiologically complex, involving cytokines, chemokines, prostanoids, proteases, matrix metalloproteinases, and almost innumerable other biologically active compounds. Is the categorical quantification of neutrophils the only facet of inflammation that is physiologically relevant to the outcomes that have been associated with acute intraamniotic infection? It is not unreasonable to suggest that the answer to this question may be “No”. Rather than being collapsed into fewer categories, histology scoring may need to be expanded to cover other features (such as connective tissue characteristics, fibroblast proliferation, neutrophil karyorrhexis) that may mark other facets of the complex pathophysiology of intraamniotic infection.

We propose that perinatal researchers should, to use a worn but appropriate cliché, step outside the box and consider alternative approaches to the measurement of histology slides that would yield adequate reliability to allow cross-institutional analysis of the latent construct(s) involved in intraamniotic infection and, ultimately, to achieve a fuller understanding of the infection-preterm birth pathway.


1. Romero R, Espinoza J, Gonçalves LF, Kusanovic JP, Friel LA, Nien JK. Inflammation in preterm and term labour and delivery. Semin Fetal Neonatal Med. 2006;11(5):317–26. [PubMed]
2. Fiscella K. Race, perinatal outcome, and amniotic infection. Obstet Gynecol Surv. 1996;51(1):60–6. [PubMed]
3. Yoon BH, Park CW, Chaiworapongsa T. Intrauterine infection and the development of cerebral palsy. BJOG. 2003;110 (Suppl 20):124–7. [PubMed]
4. Dammann O, Leviton A. Inflammatory brain damage in preterm newborns--dry numbers, wet lab, and causal inferences. Early Hum Dev. 2004;79(1):1–15. [PubMed]
5. Gibson CS, MacLennan AH, Goldwater PN, Dekker GA. Antenatal causes of cerebral palsy: associations between inherited thrombophilias, viral and bacterial infection, and inherited susceptibility to infection. Obstet Gynecol Surv. 2003;58(3):209–20. [PubMed]
6. Holzman C, Lin X, Senagore P, Chung H. Histologic chorioamnionitis and preterm delivery. Am J Epidemiol. 2007;166(7):786–94. [PubMed]
7. Redline RW, Faye-Petersen O, Heller D, Qureshi F, Savell V, Vogler C. Society for Pediatric Pathology, Perinatal Section, Amniotic Fluid Infection Nosology Committee. Amniotic infection syndrome: nosology and reproducibility of placental reaction patterns. Pediatr Dev Pathol. 2003;6:435–48. [PubMed]
8. Salafia CM, Weigl C, Silberman L. The prevalence and distribution of acute placental inflammation in uncomplicated term pregnancies. Obstet Gynecol. 1989;73(3 Pt 1):383–9. [PubMed]
9. Niswander K, Gordon M. The Collaborative Perinatal Study of the National Institute of Neurological Diseases and Stroke: The Women and Their Pregnancies. W.B. Saunders; Philadelphia, PA: 1972.
10. Hardy J. The Collaborative Perinatal Project: Lessons and Legacy. Annals of Epidemiology. 2003;13(5):303–311. [PubMed]
11. Benirschke K. Examination of the placenta. 1961. Prepared for the Collaborative Study on Cerebral Palsy, Mental Retardation and other Neurological and Sensory Disorders of Infancy and Childhood, National Institute of Neurological Diseases and Blindness, US Department of Health, Education and Welfare, Public Health Service.
12. Muthén B, Muthén L. Mplus 4.2 [Computer program] Los Angeles: Muthén and Muthén; 2006.
13. Flora DB, Curran PJ. An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychological Methods. 2004;9:466–491. [PMC free article] [PubMed]
14. Jöreskog KG. In: Testing structural equation models. Bollen KA, Long JS, editors. Thousand Oaks, CA: Sage; 1993.
15. Steiger JH, Lind J. Statistically Based Tests for the Number of Common Factors. Paper presented at the annual meeting of the Psychometric Society; Iowa City. 1980.
16. Bentler PM, Bonett DG. Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin. 1980;88:588–606.
17. Miles JN, Shevlin M, McGhee PC. Gender differences in the reliability of the EPQ? A bootstrapping approach. Br J Psychol. 1999 Feb;90(Pt 1):145–54. [PubMed]
18. Hu LT, Bentler P. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling. 1999;6:1–55.
19. Millsap RE, Meredith . Factorial invariance: historical perspectives and new problems. In: Cudeck R, MacCallum R, editors. Factor analysis at 100: historical developments and future directions. Hillsdale, NJ: Erlbaum; 2007.
20. Thomas K, Salafia CM, Buhimschi I, Buhimschi CS, Zambrano E, Sottile M. Reliability of automated neutrophil quantitation in digitized H&E stained slides: Pilot analysis of correlation with amniotic fluid proteomics score. Abstract accepted for presentation International Federation of Placenta Associations; October 2009.
22. MacCallum RC, Zhang S, Preacher KJ, Rucker DD. On the practice of dichotomization of quantitative variables. Psychological Methods. 2002;7:19–40. [PubMed]
23. Crider KS, Whitehead N, Buus RM. Genetic variation associated with preterm birth: A HuGE review. Genet Med. 2005;7(9):593–604. [PubMed]