Conceived and designed the experiments: CN. Performed the experiments: ICP RAW. Analyzed the data: ICP RAW MAC RLD SMW CN. Wrote the paper: ICP MAC RLD CN SMW.
The adequacy of informed consent in the Surfactant, Positive Pressure, and Pulse Oximetry Randomized Trial (SUPPORT) has been questioned. SUPPORT investigators and publishing editors, heads of government study funding agencies, and many ethicists have argued that informed consent was adequate because the two oxygen saturation target ranges studied fell within a range commonly recommended in guidelines. We sought to determine whether each oxygen target as studied in SUPPORT and four similar randomized controlled trials (RCTs) was consistent with usual care.
PubMed, EMBASE, Web of Science, and Scopus were searched for English articles back to 1990 providing information on usual care oxygen management in extremely premature infants. Data were extracted on intended and achieved oxygen saturation levels as determined by pulse oximetry. Twenty-two SUPPORT consent forms were examined for statements about oxygen interventions.
While the high oxygen saturation target range (91 to 95%) was consistent with usual care, the low range (85 to 89%) was not used outside of the SUPPORT trial according to surveys and clinical studies of usual care. During usual care, similar lower limits (< 88%) were universally paired with higher upper limits (≥ 92%), and providers skewed achieved oxygen saturations toward the upper end of these intended ranges. Blinded targeting of a low narrow range resulted in significantly lower achieved oxygen saturations and a doubling of time spent below the lower limit of the intended range compared to usual care practices. The SUPPORT consent forms suggested that the low oxygen saturation arm was a widely practiced subset of usual care.
SUPPORT does not exemplify comparative effectiveness research studying practices or therapies in common use. Descriptions of major differences between the interventions studied and commonly practiced usual care, as well as potential risks associated with these differences, are essential elements of adequate informed consent.
The Surfactant, Positive Pressure, and Pulse Oximetry Randomized Trial (SUPPORT) (Clinical Trials number NCT00233324) [1–3] sought to identify an optimal target range of oxygen saturation (SpO2) in extremely premature infants. Infants were randomized to a high (91–95%) or low (85–89%) SpO2 target range. The primary outcome was a composite of severe retinopathy of prematurity (ROP) and death before discharge from the hospital.
In a letter to the SUPPORT coordinating center in 2013, the U.S. Office for Human Research Protections (OHRP) found that the informed consent procedure failed to describe “reasonably foreseeable risks and discomforts,” including the risks of blindness and death (see consent forms in S2 Text). Strong criticism of SUPPORT appeared in the lay press and major scientific journals [5–7]. SUPPORT investigators, the editor of the journal that published the SUPPORT results, many bioethicists, and heads of government study funding agencies defended the consent procedure, arguing that SUPPORT represented comparative effectiveness research and that additional risks could not have been foreseen because all interventions were within usual care.
It has been argued that informed consent can be simplified or may not even be necessary for randomized trials in which the interventions being compared: 1) are part of “usual care”; 2) have been used long enough to assume that their associated risks are comparable; and 3) involve patients who would be unlikely to prefer one of the interventions over any other. Accordingly, it has been suggested that SUPPORT should have been eligible for a waiver of informed consent because the investigated oxygen saturation target ranges were within the lower and upper limits of usual care.
Although contemporaneous oxygen management in neonatal intensive care units (NICUs) has been described [14, 15], management in SUPPORT has not been rigorously compared to actual usual care. We sought to determine whether oxygen therapy interventions in SUPPORT were consistent with concurrent usual care as documented in the scientific literature. We analyzed and compared usual care to the protocol-specified interventions in SUPPORT and four methodologically similar trials run concurrently: the Benefits of Oxygen Saturation Targeting trials (BOOST II) in Australia, New Zealand (Australian and New Zealand Clinical Trials Registry numbers, ACTRN12605000055606 and ACTRN12605000253606), and the U.K. (Current Controlled Trials number, ISRCTN0084266); and the Canadian Oxygen Trial (COT) (Clinical Trials number NCT00637169). We found that the trial interventions deviated substantially from published routine clinical practices at the time of the trials.
To characterize usual care practices concurrent to the five clinical trials, four databases (PubMed, EMBASE, Web of Science, Scopus) were searched (most recently May 15, 2014) for: 1) SpO2 target ranges used in NICUs for extremely premature infants since 1990; 2) achieved SpO2 levels in the same setting; 3) calibration of and SpO2 values from Masimo pulse oximeters (Masimo Radical Pulse Oximeter; Masimo Corporation; Irvine, California), the brand used in SUPPORT and the four similar randomized trials; or 4) data from these five trials. The search was limited to publications in English with additional search terms detailed for each database (see S1 Text). Follow up searches were performed periodically to identify further publications related to the five clinical trials.
Of 470 publications found, 19 provided data on SpO2 target ranges or achieved SpO2 levels in usual care settings [14, 15, 18–34], four provided relevant information regarding Masimo pulse oximeters [35–38], and eight reported results from the five randomized trials [1, 2, 16, 17, 39–42]. Studies were excluded if they did not contain relevant data, were duplicates, or focused on populations dissimilar from those enrolled in the five trials.
To determine how oxygen management interventions were described in SUPPORT consent forms, institutional review board-approved forms were obtained (M.A.C.) for all institutions enrolling infants from the National Institutes of Health (NIH) through the Freedom of Information Act (available in S2 Text).
Two investigators (I.C.P. and M.A.C.) independently reviewed each article and the consent forms. Patient characteristics, SpO2 target ranges, achieved SpO2 values, and pulse oximeter monitoring practices were extracted from each article. Written descriptions of oxygen ranges and potential risks, as provided to parents of potential SUPPORT subjects, were directly excerpted from the consent forms.
Because of similarities in gestational ages, monitors used, and sites where care was delivered, detailed analyses were conducted comparing the five trials to corresponding data from the AVIOx study. From 2003 to 2004, the AVIOx study of usual care enrolled 84 infants born at less than 28 weeks gestation and requiring oxygen therapy at 14 NICUs in the U.S., U.K., and New Zealand (including some NICUs that participated in the randomized trials). Notably, infants in the AVIOx study would have met major enrollment criteria for the five clinical trials. During the first four weeks of life, a second pulse oximeter, the Masimo model used in the five randomized trials, was attached to these infants receiving usual care. SpO2 readings were recorded continuously each week over 72 hours with the Masimo pulse oximeter, but not displayed to caregivers. Graphs were generated comparing SpO2 target ranges and achieved levels for usual care at the 14 AVIOx NICUs to those for the low and high saturation arms studied in the five randomized trials.
The 95% prediction ellipse, for the plot of lower versus upper limits of the intended SpO2 ranges for each AVIOx NICU, was calculated assuming a bivariate normal distribution between the lower and upper limits within each AVIOx NICU. SAS version 9.3 (SAS Institute Inc., Cary, NC) was used; two-sided p-values of 0.05 or less were considered significant. Achieved median SpO2 levels for the high and low groups in the clinical trials were compared to usual care at the AVIOx sites using linear mixed models (LMMs), with a random effect accounting for the variability of results from AVIOx NICUs and the country of study for COT and BOOST II. Similar LMMs were used to compare the percentage of time actual SpO2 was below 85% in the low oxygen arms versus: (i) the high oxygen arms from the clinical trials; and (ii) the percentage of time actual SpO2 was below the lower limit of the intended range during usual care at the AVIOx NICUs with saturation lower limits ≤88%.
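The prediction-ellipse check described above can be sketched as follows. This is a minimal illustration, not the SAS procedure used in the study: it assumes the large-sample chi-square form of the ellipse (statistical packages often scale by an F quantile instead when the mean and covariance are estimated from few points), and the `hypothetical_limits` pairs are invented placeholders, not AVIOx data. The function names are likewise hypothetical.

```python
import math

def ellipse_threshold(level=0.95):
    """Chi-square quantile with 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - level)

def mahalanobis_sq(point, points):
    """Squared Mahalanobis distance of `point` from the sample of 2-D `points`."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the explicit inverse of the 2x2 covariance matrix.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def inside_ellipse(point, points, level=0.95):
    """True if `point` lies within the approximate prediction ellipse."""
    return mahalanobis_sq(point, points) <= ellipse_threshold(level)

# Hypothetical (lower limit, upper limit) pairs in percent; NOT study data.
hypothetical_limits = [(85, 92), (85, 95), (86, 94), (88, 95),
                       (88, 96), (90, 96), (90, 98), (92, 98)]

if __name__ == "__main__":
    for arm in [(85, 89), (91, 95)]:
        print(arm, inside_ellipse(arm, hypothetical_limits))
```

Because the input pairs are placeholders, the printed result says nothing about Fig 1C; the sketch only shows how a target range can be tested against an ellipse fitted to site-level limits.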
We compared the SpO2 target ranges studied in SUPPORT, BOOST II and COT with those intended for use in a comparable population of infants at the 14 centers included in the AVIOx study. The SpO2 target range used for the low arm of the clinical trials was lower and narrower than those applied during usual care. Specifically, the upper limit of the low SpO2 target range arm (89%) was lower than the upper limit of intended ranges (92 to 98%) used during usual care at all 14 AVIOx NICUs (Fig 1A). Across the 14 AVIOx NICUs, as the lower limit of the intended range decreased, the width of the range increased (Fig 1B). While the high target range in the clinical trials was consistent with this relationship, the low target range was not, being narrower than usual care ranges with comparable lower limits. Unlike the high SpO2 target arm, the low arm did not fall within a 95% prediction ellipse for the relationship between the low versus high saturation range limits for usual care (Fig 1C).
Published intended SpO2 ranges applied during usual clinical care at other NICUs worldwide are remarkably consistent with the AVIOx study data. Two surveys of usual care for preterm infants in the U.S., one presenting intended SpO2 ranges for 120 NICUs in 2001 and the other for 40 NICUs in 2004, found that the upper limit of the intended target range was always 92% or greater. Collectively, for more than 100 unique centers worldwide, usual care was reported in surveys, observational studies, and randomized controlled trials to have an SpO2 upper limit of 92% or greater with one exception (Table 1). A single study, reporting data collected between 1990 and 1994, had an SpO2 target range upper limit as low as 90%. The cohort in this early study experienced a high mortality rate of approximately 50% compared to the 15 to 25% commonly observed in more recent reports.
Next, we compared achieved SpO2 values during usual care at the AVIOx centers with those achieved by the low and high arms of the BOOST II and COT trials (median achieved SpO2 values were not available for SUPPORT). During COT and BOOST II in Australia and the U.K., Masimo pulse oximeter calibration software was revised to correct a 1% to 2% overestimation of oxygen saturation measurements, especially between values of 87% to 90% (see S1 Fig). Data before and after recalibration have been analyzed separately for these three trials. Notably, median achieved SpO2 values during usual care in AVIOx NICUs were skewed toward or above the upper limit of intended ranges at all centers but one, center C (Fig 2A). Thus, achieved saturation values in clinical practice extensively overlapped with those targeted by the high, but not the low SpO2 arms of the clinical trials (Fig 2B). Achieved SpO2 values during usual care at all AVIOx NICUs were well above the low target range of the five randomized trials. In all but three AVIOx centers, the 25th percentile of achieved SpO2 values was above the upper limit of the low target range (Fig 2B). Accordingly, median achieved SpO2 levels in the low target arms of BOOST II and COT were significantly lower than those achieved at the nine AVIOx NICUs that targeted ranges with similar lower limits (≤88%) (p = 0.003, Fig 2C). In contrast, median achieved SpO2 levels in the high target arms were not significantly different from the AVIOx NICUs, whether compared to the AVIOx centers with relatively low (≤88%; 9 NICUs) or high (≥90%; 5 NICUs) lower limits.
Infants randomized to the low SpO2 arms of COT and BOOST II (for which published data are available) spent almost twice as much time below the lower limit of their intended target range (85%) as those receiving usual care at the nine AVIOx NICUs with lower limits ≤88% (p = 0.04). Subjects randomized to the low SpO2 arms of COT and BOOST II also spent significantly more time below a true saturation value of 85% than infants randomized to the high arms (p<0.0001) (Fig 3), as expected from their target ranges.
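The achieved-saturation measures compared above (median, 25th percentile, and percentage of time below a lower limit) can be computed from a recording as in the sketch below. It assumes equally spaced readings, so that the fraction of samples below the limit equals the fraction of time; the function name and the example readings are hypothetical and are not taken from any of the trials or from AVIOx.

```python
from statistics import median, quantiles

def summarize_recording(readings, lower_limit):
    """Summarize equally spaced SpO2 readings (percent) against a lower limit."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    below = sum(1 for r in readings if r < lower_limit)
    # First quartile; "inclusive" treats the readings as the full population.
    p25 = quantiles(readings, n=4, method="inclusive")[0]
    return {
        "median": median(readings),
        "p25": p25,
        "pct_time_below": 100.0 * below / len(readings),
    }

if __name__ == "__main__":
    example = [84, 86, 88, 90, 92, 94, 96, 98]  # invented values
    print(summarize_recording(example, lower_limit=85))
```

Applying the same function with each site's own lower limit, and with 85% for the low arms, gives the quantities that the text compares between usual care and the trial arms.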
Finally, none of the consent forms acknowledged that the low SpO2 arm was an experimental intervention, not a widely practiced subset of usual care, and therefore posed risks, some of which were foreseeable, some less well-understood. Twenty of twenty-two SUPPORT consent forms explicitly or implicitly described the oxygen ranges studied as standard of care, usual care, or as a desired approach in some units (Table 2). Eleven consent forms had statements indicating that there was no predictable increase in risk to infants enrolled in the study, and two had statements indicating that there was no more risk to subjects than those seen in premature infants needing NICU management. Two forms (institutions I and V) did not have such characterizations of the oxygen ranges and risks. Consent forms for the BOOST II and COT trials were not available to us and were not analyzed.
In five randomized trials of supplemental oxygen for extremely preterm infants, the high SpO2 arms, with target saturations of 91 to 95%, reflected a range well within the scope of usual care. In contrast, for the low arms, targeting saturations of 85 to 89%, the upper limit was lower and the target range much narrower than concurrent usual clinical practice. The full range of clinical practice does encompass the bottom end (85%) of the SpO2 targets investigated in these studies. However, relatively low, bottom-end saturation limits in usual care were universally paired with upper limits of 92% or greater, creating wider ranges. Importantly, caregivers appear to have a strong tendency to skew actually achieved saturations toward or above the upper end of these ranges. Consistent with this tendency, low alarm limits in usual care are adhered to more stringently than upper alarm limits [27, 31]. However, in the trials, the narrow low SpO2 arm target range together with protocolized care blinded by offset pulse oximeters [21, 35, 38] resulted in infants spending significantly more time below an SpO2 of 85% compared to either usual care or the high saturation arm. As such, these infants experienced significantly more severe desaturation events.
All five trials used pulse oximeters programmed to display offset SpO2 values, to mask caregivers to trial group assignments. A careful analysis by COT investigators indicated that the transition zones from the 3% offset to the true saturation values impacted bedside care. Each arm used one rapid and one slow transition zone to taper the 3% offset back to true values at each end of the target ranges. In the rapid transition zones, displayed SpO2 values changed up to 4% over the course of a 1% change in true values. In the slow transition zones, the displayed oxygen saturation remained fixed (e.g., at 84%), while true values decreased 3% (e.g., 87% to 84%) [3, 35, 43]. According to the COT investigators, “the masking algorithm and its transition from offset to true values may have had an important and unexpected impact on the titration of oxygen therapy”. The COT investigators suspected that caregivers avoided the instability of displayed SpO2 values in the rapid transition zones by favoring saturation values at the bottom of the high target range and at the top of the low target range, in order to reduce the frequency of alarms.
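The behavior of the offset display can be illustrated with a piecewise-linear sketch for an oximeter that reads 3% below the true value. The slow-zone breakpoints (true 84% to 87%, display held at 84%) follow the example given in the text; the rapid-zone breakpoints (true 95% to 96%) are an assumption chosen only so that the display rises 4% over a 1% change in true saturation. The actual device algorithm is not specified here, and the function name is hypothetical.

```python
# (true SpO2, displayed SpO2) knots for a display offset of -3%.
# The upper two knots are assumed for illustration.
KNOTS = [(84.0, 84.0), (87.0, 84.0), (95.0, 92.0), (96.0, 96.0)]

def displayed_spo2(true_spo2):
    """Map a true SpO2 (percent) to the value an offset oximeter would show."""
    x = float(true_spo2)
    # Outside the masked region the display shows the true value.
    if x <= KNOTS[0][0] or x >= KNOTS[-1][0]:
        return x
    # Inside it, interpolate linearly between neighboring knots.
    for (x0, y0), (x1, y1) in zip(KNOTS, KNOTS[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise AssertionError("unreachable: knots cover the masked region")

if __name__ == "__main__":
    for true_value in (80, 85.5, 87, 91, 95, 95.5, 96, 98):
        print(true_value, displayed_spo2(true_value))
```

Under these assumed knots, the display is flat across the slow zone and changes four times faster than the true value across the rapid zone, which is the instability the COT investigators suspected caregivers tried to avoid.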
Prior to starting BOOST II, an audit was initiated at participating centers to evaluate the performance of Masimo pulse oximeters. Selected centers evaluated 176 preterm infants receiving usual care with the Masimo device. This study found that the Masimo pulse oximeters had a calibration error that overestimated SpO2, especially between values of 87 and 90% (see S1 Fig). As a consequence of this study, Masimo corrected their calibration algorithm, improving the accuracy of this monitoring device. Thus, before this correction, infants were placed on less accurate pulse oximeters as part of enrollment in SUPPORT and other similar trials. Other commercially available pulse oximeters, more commonly used in the United States, did not have this problem [36, 37].
As the COT investigators demonstrated, blinding caregivers using the masking algorithm “may have adversely affected the implementation of the protocol”. Both study arms were differentially managed in an unanticipated manner relative to one another and to usual care, confounding the interpretation of study outcomes. Blinding can be necessary for the validity of research, but needs to be carefully designed and preliminarily assessed in pilot studies to avoid unanticipated problems. This is particularly important in critically ill patients with high mortality rates, where blinding caregivers to a vitally important clinical parameter has the potential to increase risks. Additional pilot studies evaluating the offset pulse oximeters might have avoided changes to the calibration algorithm after the start of enrollment and provided information important to safety monitoring.
A literature review of oxygen exposure in extremely premature infants yielded only one prospective, high-quality, observational study. Despite this notable limitation to our analysis, the AVIOx study collected robust data for comparing the low and high SpO2 treatment arms in these five trials to usual care. Further, the intended ranges in the AVIOx study centers were consistent with reported practices from two U.S. surveys, one presenting intended SpO2 ranges for 120 NICUs and the other for 40 NICUs. Similar European surveys were not identified with fully comparable premature infants. However, in a survey of 228 NICUs in the UK, 92% of responding centers maintained premature infants with respiratory distress syndrome or bronchopulmonary dysplasia at SpO2 levels between 90 and 98%. Overall, more than 100 unique centers worldwide reported usual care practices compatible with AVIOx (Table 1). SUPPORT, therefore, was not representative of comparative effectiveness research, as commonly understood.
Unfortunately, problems in study design and informed consent processes often only come to public attention with the occurrence of harm. A recent meta-analysis of these five clinical trials found a significant increase in mortality in the low versus the high oxygen saturation arms, but only after recalibration of the Masimo pulse oximeters. ROP also showed significant heterogeneity across trials, but, unlike mortality, this variability was not associated with changes to the calibration of the pulse oximeters during the course of some of the trials. A patient-level meta-analysis (NeoPROM) is planned that may clarify some sources of this unresolved heterogeneity. Of note, the incidence of necrotizing enterocolitis (NEC), a condition associated with a high mortality rate, was consistently higher in the low oxygen saturation arms than in the high arms with no significant heterogeneity. This was the only major toxicity consistently found across all trials and calibration schemes. The potential for real harm to subjects in complex clinical trials that alter delivered clinical care underscores the need for a consent process that fully discloses whether research subjects will receive an intervention as commonly practiced at the institutions enrolling subjects or an experimental intervention that significantly deviates from usual care practices and that may pose both foreseeable and less well-understood risks. This is particularly true for therapies routinely titrated based on perceptions of clinical need in critically ill patients and other vulnerable, high-risk populations.
In rapidly lethal conditions with high mortality rates, basic interventions such as oxygen therapy may be lifesaving, and protocol-driven changes in their administration can have serious consequences. A thorough review of available literature, combined with detailed surveys of usual care and appropriately designed pilot studies, can provide important information regarding how trial interventions might affect care relative to usual clinical practices. These achievable steps might have preemptively uncovered the differential impact of the masking algorithm on oxygen saturation targeting and clarified for investigators and institutional review boards that one of the interventions differed markedly from usual care.
SUPPORT consent forms have been at the core of the controversy surrounding this trial. For subjects to make informed decisions, consent forms must disclose how the interventions studied differ from usual care. Our analysis of the scientific literature indicates that the narrow, low saturation target range studied in these oxygen trials was not commonly used. In addition, the COT investigators elegantly demonstrated that the offset pulse oximeters also altered oxygen management in unexpected ways. Describing how oxygen management in at least one of the study arms differed from usual care, and describing the potential risks posed by such modifications, were both critical to providing adequate informed consent.
Despite being within the 85 to 95% target range recommended by the American Academy of Pediatrics, the low SpO2 target range studied in SUPPORT and the other four trials had an upper limit of 89% that was below those upper limits used during usual care. Similarly, many other sub-ranges, such as 85 to 86% or 94 to 95%, would not have been usual or standard of care and cannot be assumed to be safe. At the time of these five trials, our literature review found that most NICUs targeted SpO2 ranges with a lower limit between 85 and 89%, but always combined with an upper limit between 92 and 95%. In addition, achieved SpO2 values measured at the bedside often skewed higher than these target ranges. Notably, our literature review of usual care was limited to publications written in English and therefore most reports were from North America, the UK, Australia and New Zealand. As such, we cannot rule out the possibility that different SpO2 target ranges were being used in non-English speaking regions or countries.
In conclusion, our findings highlight the need for investigators, prior to designing clinical trials, to rigorously evaluate actual clinical practices at institutions intending to enroll subjects. Likewise, institutional review boards need access to such data before approving protocols and consent forms. This is particularly important for research purported to be testing interventions consistent with usual care.
Figs 1A and 2B were presented at the October 29, 2014 meeting of the Department of Health and Human Services Secretary’s Advisory Committee on Human Research Protections during a discussion of the Office for Human Research Protections’ draft guidance on disclosing reasonably foreseeable risks in research evaluating standards of care.
The opinions expressed in this article are the authors’ own and do not represent any position or policy of the National Institutes of Health, the Department of Health and Human Services, or the United States government. The corresponding authors confirm that they had access to all the data in the study and had final responsibility for the decision to submit for publication.
The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.
This study was funded by intramural sources at the Critical Care Medicine Department, Clinical Center at the NIH.
All relevant data are within the paper and its Supporting Information files.