Mortality prediction scores have been used for a long time in ICUs; however, numerous studies have shown that they over-predict mortality in the obstetric population. With sepsis remaining a major cause of obstetric mortality, we aimed to look at five mortality prediction scores (one obstetric-based and four general) in the septic obstetric population and compare them to a nonobstetric septic control group.
Women in the age group of 16–50 years with an admission diagnosis or suspicion of sepsis were included. In a multicenter obstetric population (n = 797), these included all pregnant and postpartum patients up to 6 weeks postpartum. An age- and gender-matched control nonobstetric population was drawn from a single-center general critical care population (n = 2,461). Sepsis in Obstetric Score, Acute Physiology and Chronic Health Evaluation II, Simplified Acute Physiology Score II, Sequential Organ Failure Assessment, and Multiple Organ Dysfunction Scores were all applied to patients meeting inclusion criteria in both cohorts, and their area under the receiver-operator characteristic curves was calculated to find the most accurate predictor.
A total of 146 septic patients were found for the obstetric cohort and 299 patients for the nonobstetric control cohort. The Sepsis in Obstetric Score, Acute Physiology and Chronic Health Evaluation II, Simplified Acute Physiology Score II, Sequential Organ Failure Assessment, and Multiple Organ Dysfunction Scores gave area under the receiver-operator characteristic curves of 0.67, 0.68, 0.72, 0.79, and 0.84 in the obstetric cohort, respectively, and 0.64, 0.72, 0.61, 0.78, and 0.74 in the nonobstetric cohort, respectively. The Sepsis in Obstetric Score performed similarly to all the other scores with the exception of the Multiple Organ Dysfunction Score, which was significantly better (p < 0.05).
The Sepsis in Obstetric Score, designed specifically for sepsis in obstetric populations, was not better than general severity of illness scoring systems. Furthermore, the Sepsis in Obstetric Score performance was no different in an obstetric sepsis population compared to a nonobstetric sepsis population. The Multiple Organ Dysfunction Score is a simple organ-based score, and this result supports the use of organ-based outcome predictors in ICU even in an obstetric sepsis population.
The use of severity of illness scoring systems in sepsis patients was recommended by the American College of Chest Physicians/Society of Critical Care Medicine Consensus Conference Committee as an adjunctive tool to assess mortality (1). The severity of illness scoring systems are useful tools for comparing different populations, for hospital planning, and as research tools for critical illnesses. There are many tools available for this use with a wide variety of methods of scoring. Four widely used day of admission scores in current clinical use include Acute Physiology and Chronic Health Evaluation (APACHE) II, Simplified Acute Physiology Score (SAPS) II, Sequential Organ Failure Assessment (SOFA), and Multiple Organ Dysfunction Score (MODS). APACHE II and SAPS II are based on routinely measured physiologic variables, whereas SOFA and MODS are organ failure–based scores. These tools are validated for use in the general critical care population.
With the altered physiology of pregnancy further complicated by the physiologic changes of sepsis, the question arises whether an obstetric-specific score may be more accurate in the obstetric population. For obstetric sepsis patients, the Sepsis in Obstetrics Score (SOS) has recently been developed to predict ICU admission from the emergency department (2). Although the score was shown to be a good predictor of ICU admission, mortality prediction (a secondary outcome) could not be assessed owing to the absence of mortality in the population used to develop the score.
The aim of this study is to assess the newly developed SOS score, in comparison to existing severity of illness scoring systems (APACHE II, SAPS II, SOFA, and MODS), to predict mortality owing to sepsis in an obstetric population. We aim to determine whether the SOS score, which was developed specifically for septic obstetric patients, performs better than the general ICU scores in these patients. As an additional control, we also assessed the performance of the four scores in a nonobstetric septic population.
This study is a retrospective case–control study of obstetric and nonobstetric septic patients comparing outcome prediction scores for mortality, which were developed specifically for the general ICU population. Obstetric patients were recruited from the Collaborative Integrated Pregnancy High-dependency Estimate of Risk (CIPHER) database. This is a large database comprising data from 797 obstetric patients admitted to the ICU for greater than 24 hours between the years 2000 and 2012 from 14 sites in 11 different countries worldwide. This cohort included a mix of both higher and lower and middle-income countries (LMICs) as defined by the World Bank (3). A nonobstetric age-matched control group was taken from three general ICU databases at St Paul’s Hospital, Vancouver, to determine whether the SOS score exhibited specificity for an obstetric population. That is, did the SOS score perform well in an obstetric population and less well in a nonobstetric population? Both databases included demographic data, physiologic and laboratory data, admission diagnoses, and patient outcomes. The Providence Health Care and Children’s and Women’s Health Centre/University of British Columbia Institutional Review Boards reviewed and approved the study. All patients or their representatives provided written informed consent.
Inclusion criteria were 1) female patients, 2) between ages 16 and 50, and 3) admitted to the ICU with an admission indication of sepsis. For the obstetric cohort, this included women at any gestation up to 6 weeks postpartum. All patients with incomplete data were excluded.
The SOS score was adjusted for the ICU setting (Table 1). The adjustments were as follows: ventilated patients were given the maximum score (+4) for both the saturation and respiratory rate (RR) components of SOS, and patients on vasopressor infusions were given the maximum score (+4) for the systolic pressure component of SOS. One of the variables, immature neutrophil %, could not be used as it is not commonly measured in the sites included in the study and was therefore absent from all records in the database.
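The ICU adjustment described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the authors' code: the class and field names are hypothetical, and the per-component points are assumed to have already been assigned from the SOS table.

```python
from dataclasses import dataclass

MAX_COMPONENT_SCORE = 4  # maximum points for a component, as stated in the text


@dataclass
class SosComponents:
    """Per-component SOS points for one patient (field names are hypothetical)."""
    saturation: int
    respiratory_rate: int
    systolic_bp: int
    other: int = 0  # sum of the remaining components (HR, temperature, lactate, WBC)


def adjust_for_icu(points: SosComponents, ventilated: bool,
                   on_vasopressors: bool) -> SosComponents:
    """Return a copy with the ICU adjustments applied.

    Ventilated patients get the maximum for saturation and respiratory rate;
    patients on vasopressor infusions get the maximum for systolic pressure.
    """
    sat = MAX_COMPONENT_SCORE if ventilated else points.saturation
    rr = MAX_COMPONENT_SCORE if ventilated else points.respiratory_rate
    sbp = MAX_COMPONENT_SCORE if on_vasopressors else points.systolic_bp
    return SosComponents(sat, rr, sbp, points.other)


def total(points: SosComponents) -> int:
    """Sum all component points into the adjusted SOS total."""
    return (points.saturation + points.respiratory_rate
            + points.systolic_bp + points.other)
```

Under this rule a ventilated patient contributes 8 points from the two respiratory components regardless of the recorded values.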
The adjusted SOS, APACHE II score, SAPS II, SOFA, and MODS score were calculated retrospectively for both the obstetric and nonobstetric databases. The scores were calculated using the most abnormal value of each variable recorded in the first 24 hours of admission to ICU. The SOS scores were calculated using seven physiologic and laboratory variables (heart rate [HR], RR, oxygen saturation [Sats], systolic blood pressure [SBP], temperature, lactate, and WBC count).
APACHE II scores were calculated as described by Knaus et al (4) in 1985 using 12 physiologic variables (temperature, HR, RR, SBP, Sats, WBC count, arterial pH, serum sodium, serum potassium, creatinine, hematocrit, and Glasgow Coma Score [GCS]) as well as the age and chronic health status of the patient. The SAPS II was calculated as described by Le Gall et al (5) in 1993 using 12 physiology variables (HR, SBP, temperature, GCS, ventilation status-linked to Po2-to-Fio2 ratio, urine output, serum urea, serum sodium, serum potassium, serum bicarbonate, and serum bilirubin) as well as age, type of admission (scheduled surgical, unscheduled surgical, or medical), and three underlying disease variables (AIDS, metastatic cancer, and hematologic malignancy). The MODS was calculated as described by Marshall et al (6) in 1995 using six variables (pressure-adjusted HR, GCS, serum creatinine, serum bilirubin, serum platelet count, and Pao2-to-Fio2 ratio). The SOFA score was calculated using similar variables with different cutoff values and with a simplified cardiovascular score as described by Vincent et al (7).
Areas under the receiver-operating characteristic (AUROC) curves were used to assess the predictive performance of each score using the R package pROC (R Foundation for Statistical Computing, c/o Institute for Statistics and Mathematics, Vienna, Austria; http://www.r-project.org) (8), functions roc and auc (partial.auc = FALSE) with differences in ROC curves tested using “roc.test.” Patients were propensity score matched to adjust for differences in baseline variables associated with outcome. Propensity score matching seeks to create patient groups with similar propensity for the same level of variable under study using logistic regression. We calculated propensity scores for patients being above or below median scores separately for APACHE II, MODS, SAPS, and SOS scores using linear logistic regression with age, minimum mean arterial pressure (MAP), and presence of any chronic disease. No constraints were applied to the number of patients above and below median score for each stratum. The caliper width for propensity score (output of the logistic regression model) for each matched stratum was set at 0.2. Calipers were also set for age (caliper of 5 yr), minimum MAP (10 mm Hg), and presence of chronic disease (0; i.e., all patients in a stratum are identical for having any chronic disease or not). The propensity score matching was performed using R package “optmatch” version 0.9–3 (9) and RItools (10). The weighted standardized difference output from the function xBalance (RItools package) was used as a balance diagnostic as it is not confounded by sample size (9). A p value of less than 0.05 was considered statistically significant.
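To make the AUROC concrete, the following Python sketch computes it as the probability that a randomly chosen nonsurvivor has a higher score than a randomly chosen survivor, with ties counted as one half. It is not the pROC code used in the study; the function name and the example data are hypothetical, and the sketch assumes that higher scores indicate higher risk.

```python
def auroc(scores, died):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    scores: severity score for each patient
    died:   truthy if the patient died, falsy if the patient survived
    """
    nonsurvivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not nonsurvivors or not survivors:
        raise ValueError("need at least one survivor and one nonsurvivor")

    concordant = 0.0
    for dead_score in nonsurvivors:
        for alive_score in survivors:
            if dead_score > alive_score:
                concordant += 1.0   # score ranks this pair correctly
            elif dead_score == alive_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(nonsurvivors) * len(survivors))


# Made-up scores for six patients, three of whom died.
example_scores = [2, 5, 7, 3, 9, 4]
example_died = [0, 0, 1, 0, 1, 1]
print(round(auroc(example_scores, example_died), 3))
```

A value of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every nonsurvivor above every survivor.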
The CIPHER database, captured using REDCap software (Vanderbilt University, Nashville, TN), included 877 obstetric patients from 14 different centers in 11 different countries admitted to the ICU. Of these, 189 patients had a primary diagnosis of sepsis (Table 2). The age range was 16–48 years. Forty-three patients were excluded for incomplete data.
The nonobstetric database used in this study included 2,933 patients admitted to the ICU. From this, there were 319 nonpregnant women between 16 and 50 years old admitted for sepsis. Of these, 21 patients were excluded for incomplete data. The nonobstetric group had a greater median age. They also had a greater median GCS, maximum temperature, chronic disease status, length of stay, and proportion of ventilated patients. The obstetric population had a greater median WBC count, consistent with the neutrophilia of pregnancy, and minimum MAP. All other variables were similar in both groups (Table 3).
In the obstetric population, puerperal sepsis was recorded as the most common reason for admission because of sepsis. However, the highest proportion of women admitted to ICU overall had a nonobstetric cause of sepsis. The most common causes of nonobstetric sepsis were respiratory tract infection and urinary tract infections although these causes of nonobstetric sepsis individually were still less frequent than puerperal sepsis. Puerperal sepsis was the biggest cause of mortality; this was mainly in the lower and middle income subgroups (Table 4). The majority of women were admitted to the ICU in the postpartum period, although proportionally mortality was greatest in the third trimester (Table 5).
The SOS, APACHE II, SAPS II, SOFA, and MODS scores gave AUROC curves of 0.67, 0.68, 0.72, 0.79, and 0.84 for prediction of mortality in the obstetric cohort, respectively, and 0.64, 0.72, 0.61, 0.78, and 0.74 for prediction of mortality in the nonobstetric cohort, respectively (Figs. 1 and 2). In the obstetric population, the MODS score was the best performing mortality predictor, and the AUROC curve was significantly different from APACHE II, SAPS II, and SOS with p values of 0.0069, 0.0401, and 0.0176, respectively. Because SOFA and MODS scores share similar variables and are calculated in a similar way, it was not surprising that there was no significant difference between SOFA and MODS scores.
We propensity matched for score (above or below median) with calipers restricting the range of age, minimum MAP, and presence or absence of chronic disease. We obtained 28 strata (matched groups that include survivors and nonsurvivors with similar characteristics) for obstetric patients (in 22 strata, chronic disease was absent; in six, chronic disease was present), and 87 strata for nonobstetric patients (in 44, chronic disease was absent, and in 43, chronic disease was present). We found the covariates to be well balanced with standardized differences less than 0.1 and large p values for the significance of these differences (Table 6). This indicated to us that the matched groups (strata) produced by propensity matching gave good control of severity of disease for analyzing the performance of each score on mortality outcome (i.e., each matched group contained a mixture of patients of similar age, and minimum MAP, and for each stratum, the presence or absence of chronic disease was the same). Therefore, logistic regression using the score of interest and these defined strata would be a good test of the association of the score and the clinical outcome of mortality.
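The caliper restrictions can be illustrated with a short Python check of whether two patients are eligible to share a stratum. This is a sketch only and does not reproduce the optmatch procedure: the names are hypothetical, the thresholds are the caliper values reported in the methods, and it assumes the 0.2 caliper applies to the raw propensity score rather than to a standardized value.

```python
from dataclasses import dataclass

# Caliper widths as reported in the methods. Treating the propensity
# caliper as raw score units is an assumption of this sketch.
PROPENSITY_CALIPER = 0.2
AGE_CALIPER_YEARS = 5
MAP_CALIPER_MMHG = 10


@dataclass
class Patient:
    """Matching covariates for one patient (field names are hypothetical)."""
    propensity: float       # output of the logistic regression model
    age: float              # years
    min_map: float          # minimum mean arterial pressure, mm Hg
    chronic_disease: bool   # any chronic disease present


def within_calipers(a: Patient, b: Patient) -> bool:
    """True if the two patients may be placed in the same matched stratum."""
    return (
        abs(a.propensity - b.propensity) <= PROPENSITY_CALIPER
        and abs(a.age - b.age) <= AGE_CALIPER_YEARS
        and abs(a.min_map - b.min_map) <= MAP_CALIPER_MMHG
        and a.chronic_disease == b.chronic_disease  # caliper of 0: must be identical
    )
```

A matching algorithm would then assemble strata only from pairs that pass this check, which is why chronic disease status is uniform within every stratum.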
As seen in Table 7, the MODS score was significantly superior to other scores in assessing risk of death after controlling for other variables, with a p value nearly an order of magnitude more significant than SOS and SAPS scores (p = 0.003 with 90 patients vs 0.02 or 0.03 for SOS and SAPS of 122 and 73 patients, respectively). The MODS score was more significant than the APACHE II score (p = 0.003 vs 0.02 for 90 and 123 patients, respectively). For nonobstetric patients, the MODS score was superior to SOS and SAPS but inferior to APACHE II (Table 7). However, the availability of data to calculate APACHE II scores hinders its usability, making the MODS potentially a better choice of score.
The best mortality predictor in the obstetric population was the MODS score as shown in the AUROC results. This score was significantly better than the physiology-based scores SAPS II and APACHE II and the obstetric-specific SOS score. This result was further supported by propensity matching, which showed that the MODS score was superior. Propensity-matched patient strata provided adjustment for covariates in assessing outcome without assumption of linearity of covariates and confirmed our independent assessment using ROC curves and the basic scores.
The obstetric-specific SOS score is shown to have poor predictive value with respect to mortality in both the septic obstetric and nonobstetric populations. Thus, the SOS score did not demonstrate specificity for an obstetric population—it was no better in the obstetric population than in the control nonobstetric population. However, the performance of the score is not substantially different from the performance of SAPS II and APACHE II scores. The SOS, APACHE II, and SAPS II scores performed similarly in both the obstetric and nonobstetric populations. This suggests that the septic obstetric population may behave similarly in many respects to septic nonobstetric women of a similar age. This also suggests the possibility that the physiologic response to sepsis may be similar, regardless of baseline physiology (peripartum or not). Conversely, the MODS and SOFA scores performed better in the obstetric population suggesting that maybe an organ-based system is more accurate. This may not be surprising. In a clinical setting, it would be reasonable to expect a poorer outcome in patients with more organ dysfunction than those where organ function is preserved. Early warning scores for nonobstetric and obstetric patients that incorporate organ failure–based data have also shown significant promise (11).
Outcome prediction scores are used widely throughout ICUs worldwide. The scores that exist for the general medical population (e.g., APACHE, SAPS, and SOFA) have been repeatedly shown to over-predict maternal mortality. The MODS score has not been previously investigated in the obstetric population (12–15), whereas the Mortality Probability Model II score has shown promise in a developing country population (16). When investigated in the obstetric population, the APACHE II score was found to over-predict mortality (12, 17–19) as was the SAPS II score (12, 20). The MODS score has been shown to perform moderately well in the general population and in sepsis, better than APACHE II according to Peres Bota et al (21), although no current data exist for the obstetric population (21, 22).
Lagu et al (23) developed a sepsis disease risk score and found that incorporating additional information about the use of interventions such as mechanical ventilation and vasopressors was superior to models that were based solely on demographic information and comorbidities. This study does not support this and instead shows that a simple, more organ-based model (MODS score) is more effective in the septic obstetric population than the models that incorporate intervention information (SAPS II).
Sepsis is a serious threat to pregnant women throughout the world. Although ICU admission and death are uncommon in pregnant and postpartum women (24), sepsis remains an important problem. Maternal deaths from sepsis are highest in Africa, Asia, Latin America, and the Caribbean, responsible for approximately 10% of all maternal deaths in Africa and Asia (25). Sepsis was recently identified as the most common cause of maternal death in the United Kingdom (26). Although there are specific sepsis scores, it is important to know the performance of more general scores in septic populations as well. If a general score performs similarly to a specific score, it cuts down on the number of scores that need to be applied to a population.
This is a large cohort of patients with a global relevance, having included a mix of both higher and LMICs. This is important when assessing the performance of scores in all countries and assessing its worldwide relevance. The more resource-poor LMICs would benefit even more from an accurate mortality prediction score to better enable planning of areas in need of more attention.
This study has a number of important limitations. First, the SOS score was originally developed for emergency department patients but has been used here in the ICU setting. Although this is primarily an emergency department score, it was felt that the variables would also apply to patients recently admitted to ICU (in the first 24 hr after admission) with adjustments made as mentioned above. The patient mix in the obstetric cohort is heterogeneous, which produces many confounding factors, although this also allows testing of the scores' predictive value in a worldwide setting. Conversely, the control group is from Vancouver only and so comparison between the two groups is limited. To address the issue of differences between populations, we used propensity matching. Second, the study is retrospective in design, and the addition of a prospective validation cohort would be optimal. We found that obtaining sufficient numbers of obstetric patients leading to ICU admission and mortality required a large multinational collaborative effort. Thus, the retrospective design was chosen as a feasible approach to conduct this discovery research. Retrospective data collection also limited the different prediction scores that could be used to compare with the SOS score. Third, this data limitation also meant the absence of one of the original SOS score predictors, the immature neutrophil %. Thus, we modified the SOS scoring system to allow its application to our retrospective cohort. This weakens our conclusions but, we believe, the main results stand, that organ failure–based scores, such as MODS and SOFA, serve the obstetric population equally well and are superior to the SOS score. In addition, the modification we made to the SOS score may skew toward high scores in some patients; for example, a ventilated patient automatically gets 8 points.
Finally, despite the large numbers of patients in the original databases, the final septic cohort numbers are small, and therefore, there is a significant geographic skew in the mortality figures.
Although many other studies show that mortality predictors over-predict mortality as in this study, their areas under the ROC curve for APACHE II are generally higher than those in this study. No study has looked purely at sepsis and the performance of prediction scores for mortality specifically in this population. This difference could also be attributed to the differences in sample size and the low mortality even in a large original cohort of patients. The different populations may well also play a role in these different figures. Sixty-one percent of the septic obstetric patients were from LMICs, with all but one (3%) of the mortalities from an LMIC. A literature search found studies from LMICs had an obstetric ICU mortality of 16.6–40.4% (27–29). Studies from high-income countries showed a mortality range of 1.3–12.5% (12, 30–32). This is in keeping with this study that shows a mortality rate from sepsis in LMICs of 31.5% (29/89) and a mortality rate from sepsis in higher income countries of 1.7% (1/57).
In summary, organ failure–based severity of illness scores, such as MODS, are superior to the obstetric-specific SOS score in an obstetric population. Indeed the MODS score performs equally well in obstetric and nonobstetric (age and gender equivalent) populations.
We thank CIPHER study data for their contribution to this study. We would also like to thank each of the CIPHER study site collaborators including Stephen Lapinsky and Gareth Seaward (University of Toronto, Mt Sinai); Zulfiqar Bhutta, Rahat Quereshi, and Sheikh Irfan (Aga Khan University, AKU Medical Centre); J.W. Ganzevoort, AMC de Pont, and Ben Mol (Academic Medical Centre, Amsterdam, The Netherlands); Guilherme Cecatti (University of Campinas, Sao Paulo, Brazil); Daniela Vasquez and Vanina Aphalo (Hospital Interzonal General de Agudos Gral, Buenos Aires, Argentina); Dena Goffman and Cynthia Chazotte (Montefiore Medical Center, New York, NY); Euan Wallace and Tim Crozier (Monash Medical Centre, Melbourne, Australia); Tao Duan and Vivian Zhou (Shanghai 1st Maternity and Infant Hospital, Shanghai, China); Michael Geary and Mary Bowen (Rotunda Hospital Dublin, Ireland); Fionnuala McAuliffe and Colm O’Herlihy (University College Dublin, Dublin, Ireland); Turkan Togal and Oktay Demirkiran (Inonu University, Malatya, Turkey); Isam Lataifeh and Ramzy Tadros (King Abdullah University Hospital, Ar Ramtha, Jordan). Along with their teams of data collectors at each site, their collaboration and assistance with collection of the study data are greatly appreciated.
Dr. Ryan disclosed other support (The Collaborative Integrated Pregnancy High-dependency Estimate of Risk [CIPHER] study in University of British Columbia of which she was the coordinator received a 1-year $100,000 Canadian Institutes of Health Research [CIHR] Catalyst grant: Maternal health: from Preconception to Empty Nest grant on January 9, 2010. This was used to fund data collection and analysis for the CIPHER pilot study, as well as funding for a part-time research assistant to carry out data entry data collection in two of the participating CIPHER sites). Her institution received funding from the CIHR. Dr. Magee received funding from the CIHR. Her institution received funding from the Bill & Melinda Gates Foundation; the Research CIHR (government granting agency); from expert testimony for the Canadian Medical Protective Association (CMPA); and Other CIHR grants for CHIPS Trial, CHIPS-Child, MAG-CP, Meeting grant Bill & Melinda Gates Foundation funding for Preeclampsia and Eclampsia monitoring, prevention and treatment (co-investigator). Dr. von Dadelszen received funding from the CMPA (medico-legal expert testimony unrelated to this article). His institution received funding from the CIHR. Dr. Walley disclosed other support (He is a founder and shareholder of Cyon Therapeutics). His institution received funding from the CIHR. The remaining authors have disclosed that they do not have any potential conflicts of interest.