Mortality in National Heart, Lung and Blood Institute–sponsored clinical trials of treatments for acute lung injury (ALI) has decreased dramatically during the past two decades. As a consequence, design of such trials based on a mortality outcome requires ever-increasing numbers of patients. Recognizing that advances in clinical trial design might be applicable to these trials and might allow trials with fewer patients, the National Heart, Lung and Blood Institute convened a workshop of extramural experts from several disciplines. The workshop assessed the current state of clinical research addressing ALI, identified research needs, and recommended: (1) continued performance of trials evaluating treatments of patients with ALI; (2) development of strategies to perform ALI prevention trials; (3) observational studies of patients without ALI undergoing prolonged mechanical ventilation; and (4) development of a standardized format for reporting methods, endpoints, and results of ALI trials.
This report summarizes the findings of a workshop convened by the Division of Lung Diseases of the National Heart, Lung and Blood Institute (NHLBI) on August 5 and 6, 2009. The goal of the workshop was to assess the current state of clinical research addressing acute lung injury (ALI) and the acute respiratory distress syndrome (ARDS), identify research needs, and develop recommendations for clinical research in the near future.
Data on mortality attributable to ALI are conflicting (1, 2), but both longitudinal observations at single institutions and the experience of the NHLBI ARDS Clinical Trials Network report a substantial decline over the past 2 decades. It should be noted that mortality is often higher in observational studies, which do not exclude patients with high-risk comorbidities, than it is in interventional trials. Studies from the Network permit comparison of mortalities in patients of similar disease severity and source, and in these studies mortality has decreased from almost 40% in studies conducted in the mid to late 1990s (3, 4) to approximately 25% in the most recent reports (5–7) (Figure 1). Consequently, clinical trials powered to detect differences in mortality may require ever-increasing numbers of patients to detect significant relative risk reduction. The panel was convened to examine whether novel investigational strategies might allow trials to enroll fewer patients and to consider other types of clinical research that would advance the science and treatment of ALI. The group included pulmonary, critical care, and cardiovascular clinician-scientists experienced in the design and conduct of clinical research as well as experts in biostatistics and genetics. References provided herein are by no means exhaustive but are provided to illustrate key statements. Throughout the text, references to ALI include ALI and ARDS, the subset of ALI with more severe hypoxemia.
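The link noted above between falling mortality and growing trial size can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions; the 20% relative risk reduction, the two-sided alpha of 0.05, the 80% power, and the function name `patients_per_arm` are illustrative assumptions, not values taken from the workshop.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(control_mortality, relative_risk_reduction,
                     alpha=0.05, power=0.80):
    """Approximate patients per arm for comparing two mortality proportions.

    Uses the standard normal-approximation formula (unpooled variance,
    two-sided test). All default values are illustrative assumptions.
    """
    p_control = control_mortality
    p_treated = control_mortality * (1 - relative_risk_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type II error
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical 20% relative risk reduction at the two mortality levels in the text.
n_at_40_percent = patients_per_arm(0.40, 0.20)
n_at_25_percent = patients_per_arm(0.25, 0.20)
print(n_at_40_percent, n_at_25_percent)
```

Under these assumptions the requirement roughly doubles when control mortality falls from 40% to 25%, because the same relative reduction corresponds to a smaller absolute difference.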
Panelists were asked to consider issues in the following areas:
Descriptions of ALI and ARDS have existed since the Civil War. In 1994, a working definition of ALI and ARDS that is now widely used was developed by the American European Consensus Conference (AECC) (8). This definition depends on evaluation of gas exchange, the chest radiograph, and exclusion of a cardiogenic cause for lung edema. Although it facilitates study of comparable patient populations, this definition is open to criticism. For example, conditions under which gas exchange is measured, such as level of positive end-expiratory pressure, are not specified and often significantly influence the measurement (9). Widespread use of oximetry may replace the use of arterial blood gas measurements (10). Assessment of cardiac function has evolved as use of Swan-Ganz catheters has diminished and use of serum brain natriuretic peptide levels to assess cardiac function has increased. Interpretation of chest radiographs is inconsistent, although it may be made more consistent with training (11). Use of the AECC definitions results in recruitment of populations of patients with heterogeneous predispositions for development of ALI and with a spectrum of comorbidities that may influence outcome. Furthermore, categorizing patients with ALI by gas exchange impairment into ARDS and non-ARDS groups does not necessarily identify patients with different underlying disease mechanisms, pathophysiology, prognosis, or responsiveness to treatment. A fundamental challenge for investigators is that patients with ALI are likely to have significant heterogeneity in each of these dimensions.
Rather than propose new definitions of ALI or ARDS, the panel advocated flexibility in deciding inclusion criteria. Investigators should consider modifying inclusion and exclusion criteria for individual studies, basing those modifications when possible on results of Phase II studies or application of relevant biomarkers or genetic information. Strategies that would enrich study populations with patients likely to respond to a treatment under study may help define appropriate study populations. For example, study of lung recruitment or use of high positive end-expiratory pressure might require the presence of extensive bilateral opacities on the chest radiograph or CT scan, profound gas exchange abnormalities, and evidence that the lungs are acutely recruitable, whereas studies of pharmacologic interventions with low potential for toxicity might allow enrollment of patients with only modest gas exchange abnormalities and less extensive chest radiographic abnormalities. Exclusion of patients with comorbidities expected to influence the study endpoints but unlikely to respond to the study treatment may be appropriate. Inclusion and exclusion criteria, however, should be reproducibly and feasibly applicable by front-line clinicians. The panel recognized that limiting eligibility may make identification of target populations more challenging, make achievement of target sample size more difficult, and may limit the generalizability of results. Furthermore, the hypotheses underlying exclusion of patients on a genetic or physiologic basis may not, in fact, be valid, and if they are not, their spuriousness may never be discovered. Some exclusion criteria (e.g., those based on genetic testing) may be particularly problematic if there is little prospect of the requisite tests being widely and rapidly available. Additional testing also has the added risk of increasing study costs without benefit. 
Large, simple trials with broad eligibility criteria may in many instances be worthy of consideration.
Some interventions are appropriately applied across a heterogeneous and large group of patients. These often include process-of-care studies, such as those targeting ventilator or fluid management. Other interventions are better evaluated in groups of patients who share common pathogenetic mechanisms that result in ALI. For example, study of immune modulating therapy for patients with ALI triggered by influenza virus infection would properly be performed in patients demonstrated to be infected with that agent. Both large simple trials and more focused studies targeted at specific subgroups are needed.
Mortality and incidence of ALI differ substantially over the spectrum from infants to aged adults (12). The incidence and mortality in young adults and children are similar, and are markedly lower than in the elderly. Studies of ALI in adults outnumber pediatric ALI studies, but there have been some recent notable successes in the latter group (13, 14). It may not be appropriate to include all patients across the entire age range in all trials. To increase the evidence supporting use of ARDS treatments for children, recent recommendations have been published for use of Bayesian and other novel statistical approaches for combining data from children and from adults in ARDS clinical trials (15).
Biomarkers might potentially be used to predict outcome, identify subgroups with different response to therapy, or serve as surrogate endpoints. A variety of plasma and urine biomarkers are associated with mortality in large multicenter studies of ALI (Table 1)(16–18). Inflammatory markers (IL-6, IL-8, tumor necrosis factor receptors 1 and 2, urine nitric oxide), epithelial cell markers (receptor for advanced glycation end-products, surfactant protein D), adhesion molecules (intercellular adhesion molecule), markers of endothelial injury (von Willebrand factor antigen), extracellular matrix (desmosine) and coagulation proteins (protein C, plasminogen activator inhibitor 1) were all associated independently with mortality in the NHLBI ARDS Clinical Trials Network study of a lung-protective ventilation strategy (3). Procollagen peptide III has been shown to be correlated with outcome (19). However, only a few of these markers (surfactant protein-D, IL-6, IL-8, protein C, receptor for advanced glycation end-products, and urine nitric oxide) had changes associated with the low tidal volume treatment effect, and the differences were modest, suggesting that we have yet to identify good surrogate biomarkers for ventilator treatment effects (17, 20–24).
Biomarkers have the potential to allow selection of patient groups for interventional studies that may have a higher predicted mortality than the general population of patients with ALI, thereby reducing the required sample size. In addition, biomarkers could potentially be useful in facilitating both the prediction and early diagnosis of ALI, allowing earlier testing of preventive and therapeutic interventions. For these applications, information derived from combinations of biomarkers is superior to the use of individual biomarkers (16), but the added value of biological markers over clinical predictors is currently small, and further investigation is warranted. Conducting proteomic studies for discovery of novel biomarkers for diagnosis and prognosis in ALI is an attractive goal, although appropriate sample collection and preparation is of critical importance (25).
The group noted that surrogate biomarkers of outcome, which may or may not be in the causal pathway of a disease, and which might substitute for a clinical outcome, have been very difficult to find and validate. Even well-studied markers, such as lipid levels in cardiovascular diseases, have been shown in some studies to be unreliable (26, 27).
Genetic studies may help elucidate pathogenesis, identify clinically important subgroups, and provide explanations for observed variability in outcomes or responses to treatment. Genetic factors can influence the outcomes and adverse events in interventional studies of ALI in a variety of ways. Important genetic factors may be related to: susceptibility to the primary disease; drug metabolism, transport, or receptor interaction; or the biological pathway being targeted by the intervention. Pharmacogenetic effects tend to be much larger than primary disease genetic effects, and therefore may be easier to detect. Many panel members agreed that acquiring specimens for genetic analysis from patients enrolled in large clinical trials is of significant value, and that even when trials are negative, genetic information can be useful. However, it was pointed out that the complexity of the ALI syndrome, the strong influence of environment on development and outcome of ALI, and the challenges inherent in analyzing massive amounts of information may limit the value of genetic studies. At present there are studies on individual gene polymorphisms (28–30) showing association with ALI, and large genetic discovery efforts (genome-wide association studies, GWAS) are in progress. Although large sample sizes are usually required for GWAS, if there are genome–phenome interactions with large effects, smaller samples may detect such effects. In addition, samples may be combined across studies, depending on outcomes, to enlarge the effective sample. Children with ALI present the opportunity to use family-based statistical genetics approaches and offer a more robust control group, although sample sizes for family-based studies are usually much larger than in case-control studies.
Both the number and age of patients in the ICU undergoing prolonged mechanical ventilation (e.g., greater than 24 h) are increasing (31, 32) (Figure 2), yet little is known about this population if ALI is not present. The fraction of ventilated patients with ALI is relatively small (33) and some patients develop ALI after intubation and ventilation for other indications, making studies that clarify the risk factors for this evolution attractive. It would be desirable to develop a greater understanding of the demographics, pathology, and outcomes of these patients, including the frequency with which they progress to ALI (34). Observational studies to discover meaningful phenotypes should be considered. Interventional studies, including wider application of lung-protective ventilation early in the course of respiratory failures of various types, could prove to be valuable in reducing the incidence of ALI. A variety of care processes is common in this heterogeneous group of patients with respiratory failure and includes noninvasive ventilation, sedation, nutrition, and immobilization. Very little evidence underlies best practices with these treatments, and comparative effectiveness research may be valuable for evaluating these processes. Coordination of studies across international boundaries offers the possibility of taking advantage of population-based databases that currently exist in countries with single-payer health-care systems. Moreover, design of a minimum data set for collection on these patients, as part of hospital- or ICU-based clinical information systems, may be beneficial in addressing fundamental gaps in knowledge regarding these patients.
It is now clear that survivors of critical illness and ALI experience a substantial and prolonged negative impact across many domains of their subsequent quality of life (35–37). Survivors have deficits in functional status, physical and psychological symptoms, and cognition (38). There are currently no randomized trials of interventions during the period of critical illness showing improved quality-of-life outcomes. For most studies of ALI, quality-of-life measurements will serve as secondary, rather than primary, endpoints. Quality of life as a primary outcome may be most appropriate in Phase III and IV trials when there is a low predicted mortality rate or in circumstances where the intervention may affect multiple domains evaluated by standard quality-of-life instruments. Assessing quality of life after hospital discharge in ALI survivors also may contribute to cost–utility analyses (39). For these reasons, quality-of-life assessment (40) should be considered a fundamental component of large-scale ALI studies. Quality of life measured using validated instruments and administered via a centralized telephone call center may be a cost-effective, feasible, and valid component of large multisite trials. Generic measures in use today, such as the Short Form-36 (SF36), are appropriate for ALI studies, and development of ALI-specific quality-of-life measures is unlikely to be needed. Further work is required to develop and validate methods for estimating individuals' premorbid baseline function, enhance complete data collection during both the acute and follow-up phases of illness, and evaluate measures in pediatric illness.
The group agreed that mortality continues to be the critical outcome in studies of specific treatments for ALI. When considering other endpoints of “clinical significance,” it is important to note that there is little clarity or consensus regarding the meaning of that term (41). Certain patient-centered or “patient-important” outcomes may be easier to define and agree on. However, complexity is provided by variability in patient and provider knowledge and preferences. For patients with ALI, survival, cognitive function, pulmonary function, and quality of life are clearly important, whereas substantial decreases in time on a ventilator, in the ICU, or in hospital may also have some, though likely less, importance. Endpoints such as organ failure, if unaccompanied by changes in mortality or quality of life, are also likely to be of lesser importance from patients' perspectives.
There are no proven surrogate measures for mortality in ALI. For example, treatments that improve oxygenation in ALI do not reliably result in a reduction in mortality (3, 4, 42, 43). This observation is consistent with other treatments in critical illness where organ failure reversal does not reliably correlate with mortality reduction (44–46). Duration of mechanical ventilation or the number of days alive and free of mechanical ventilation during the first 28 days after diagnosis with ALI (referred to as “ventilator-free days”) are potentially biased outcomes because the decision to discontinue mechanical ventilation is often subjective unless driven by an explicit protocol. The number of ventilator-free days was implemented to account for the reduced duration of mechanical ventilation in patients with early death. However, there are several known limitations to use of ventilator-free days as an outcome measure, and few demonstrated advantages. Studies that show no change between treatment and control groups in ventilator-free days can have a significant difference in mortality (14). Furthermore, this outcome measure is not particularly patient centered, is not a surrogate for cost, and assigns equal weights to death and prolonged mechanical ventilation. When the number of ventilator-free days to day 28 is used as a primary endpoint, investigators must also include mortality as a coprimary endpoint, because the former metric may fail to achieve statistical significance when mortality does (14). If the decision is made to have these outcomes as coprimary endpoints, studies must be powered for both, and creation of stopping rules becomes considerably more complex, though attainable (3).
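How ventilator-free days can mask a mortality difference may be easier to see with a small calculation. The sketch below assumes the common convention that a patient who dies within the 28-day window scores zero, which is consistent with the equal weighting of death and prolonged ventilation described above but is not spelled out in this report. The function name `ventilator_free_days` and both patient groups are hypothetical, and reintubation is ignored.

```python
from statistics import mean


def ventilator_free_days(days_ventilated, died_within_window, window=28):
    """Days alive and free of mechanical ventilation within the window.

    Assumption: death within the window scores 0, a common convention that
    the text does not spell out. Reintubation is ignored for simplicity.
    """
    if died_within_window:
        return 0
    return max(0, window - days_ventilated)


# Hypothetical arms of 10 patients each: (days ventilated, died by day 28).
arm_a = [(5, True)] * 2 + [(13, False)] * 8   # 20% mortality
arm_b = [(5, True)] * 4 + [(8, False)] * 6    # 40% mortality

for name, arm in (("A", arm_a), ("B", arm_b)):
    vfd = mean(ventilator_free_days(d, died) for d, died in arm)
    mortality = sum(died for _, died in arm) / len(arm)
    print(f"arm {name}: mean VFD = {vfd:.1f}, mortality = {mortality:.0%}")
```

In this made-up example both arms average the same number of ventilator-free days even though mortality in one arm is twice that in the other, because faster weaning among survivors offsets the additional deaths.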
Use of a primary composite endpoint as a strategy for reducing sample size must be considered with great care. Composite endpoints can be of value only when they satisfy several conditions. Each component should be of similar importance to the patient, the more and less patient-important outcomes should occur with equal frequency, and they should have similar risk reductions in response to effective interventions (47). Without these conditions, composite outcomes can be misleading. For example, a composite endpoint may be predominately determined by a component of lesser patient importance, such as time on mechanical ventilation, whereas a component of greater patient importance, such as mortality, shows no change or even potential harm.
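The way a composite can be driven by its less important component can be shown with simple arithmetic. The counts below are hypothetical and chosen only for illustration; the composite is assumed to be "death or prolonged mechanical ventilation," with prolonged ventilation counted among survivors only so that the two components do not overlap.

```python
def relative_risk(events_treated, events_control, n_per_arm):
    """Risk in the treated arm divided by risk in the control arm."""
    return (events_treated / n_per_arm) / (events_control / n_per_arm)


# Hypothetical counts per 1,000 patients in each arm (not trial data).
# Prolonged ventilation is counted among survivors only, so the two
# components do not overlap and can simply be added.
n_per_arm = 1000
control = {"death": 250, "prolonged_ventilation": 400}
treated = {"death": 260, "prolonged_ventilation": 300}

rr_death = relative_risk(treated["death"], control["death"], n_per_arm)
rr_composite = relative_risk(sum(treated.values()), sum(control.values()), n_per_arm)

print(f"composite RR = {rr_composite:.2f}")
print(f"mortality RR = {rr_death:.2f}")
```

Here the composite relative risk falls below 1 while the mortality relative risk sits slightly above 1, so a report of the composite alone would suggest benefit that the more patient-important component does not support.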
The distinction between survival and mortality as an endpoint is often lost in reports of clinical ALI studies. The former is correctly applied to the length of time between the onset of ALI and death, whereas the latter is a measure of those who have died at a specific point in time. Survival is of limited value in ARDS clinical trials, because the timing of death is often determined by comorbid conditions and by decisions regarding withdrawal of life support. Further, death after a few days may not be a worse outcome than death after a few weeks.
The panel concluded that mortality remains the essential endpoint in ALI trials, but that measures of longer-term quality of life, functional status, and cost should also be measured given the high frequency of important and long-lasting deficiencies in those outcomes.
When considering trials that focus on prevention of ALI, it has been customary to identify populations at high risk of developing ALI, such as patients with sepsis, and consider them as the target population. As the interval between risk exposure and development of ALI may be very short, and as only a small percentage of patients at risk actually proceed to develop ALI, the number of patients required to study treatments to modify the course of ALI has appeared prohibitive. However, several recent developments may alter this conclusion. These include broad application of informatics technology to identify high-risk patients across many centers, identification of very specific populations at great risk (such as pediatric recipients of bone marrow transplants or certain postoperative patient groups), and development of methods to accomplish cluster randomization of critical care units or hospitals across many medical centers and medical disciplines. Identification of patients at risk early in the course of their illness (e.g., in the emergency department or operating room) may allow earlier implementation of ALI prevention strategies.
Interventions appropriate for testing in prevention trials should be able to be rapidly administered, inexpensive, and safe. Such interventions may occasionally be identified in hospital quality assurance programs.
Other prevention trials to consider include those designed to decrease the incidence of specific adverse outcomes of ALI, including cognitive, psychological, and neuromuscular complications.
The group considered the status and value of Phase II studies that screen new therapeutic approaches to provide the rationale for larger Phase III efficacy trials of promising interventions. Phase II randomized clinical trials are preferable to Phase II trials that use historical controls. Useful endpoints for Phase II studies include those that indicate safety, elucidate mechanism of action, and provide suggestion of efficacy. Phase II studies can be investigator initiated and performed at a single site, or can have a more complex recruiting structure. The ARDS Network has generally done Phase II/III trials, designed such that new drug therapies can be tested in Phase II and retained in a Phase III trial if prespecified efficacy, safety, and proof-of-concept outcomes have been demonstrated. The Phase III trials often have a factorial design, include a process-of-care question, and have a fairly liberal futility stopping boundary. If the initial results at the first interim analysis are not strongly positive, the trial is stopped. The opportunity cost of demonstrating unequivocally that a novel drug is not efficacious is considered high when other agents are available to be tested. The panel discussed whether it would be better to conduct a number of Phase II studies simultaneously, perhaps with a common control group, to choose the most promising for further study, but found that approach to be complex and controversial. In general, the group believed there are a number of promising therapies available for testing and that Phase II/III studies should continue to be a priority.
All clinical trials are becoming increasingly complex, slow, and costly. The clinical trial is a tool appropriate for many, but not all, research questions. The design depends on whether the objective is determination of efficacy or discovery of mechanisms. Study size is determined by event rate, the size of the treatment effect expected, the desired precision of the answer (Type I error, the probability of falsely rejecting a true null hypothesis, and Type II error, the probability of failing to reject a false one), and availability of resources. Bayesian trial designs have the potential to reduce sample size (48). The group warned that excessive reliance on a single primary outcome and a single P value is perilous. Instead, multiple results from a clinical trial or trials should be considered for subsequent clinical decision making, including evaluation of secondary endpoints and plausible biomarkers. The panel acknowledged that pragmatism is important in designing trials. Often there must be a compromise between assumptions and precision, on the one hand, and feasibility and available resources on the other.
Most trials in ALI have been explanatory or mechanistic in nature, performed to determine the impact of an intervention on clinical outcomes and biological endpoints. The participants may be highly selected and the intervention is usually strictly defined. Studies under such conditions should be reproducible and give mechanistic insights to guide clinical decision making.
In contrast, large simple trials, which are often designed as pragmatic trials addressing patient-important outcomes in a real-world context, are an appealing possibility in critical care. As opposed to smaller Phase III efficacy trials, these studies include a sample size chosen to detect changes in outcomes that are unequivocally patient-important and to detect relatively small, but still important, effect sizes. They generally have broad inclusion and minimal exclusion criteria, require limited data collection, and permit patient management to remain as close as possible to normal clinical practice. There are examples of unequivocal practical trials in critical care (49, 50). Enthusiasm for large simple trials may be tempered by the limited new knowledge that is often obtained.
The group recognized that design of clinical trials is to some extent an art. Design must take into account practical issues, such as patient availability, time of accrual, and available budget. With the current status of ALI care, it is likely that few if any sponsors have the resources to support appropriately sized studies to detect small mortality differences. Similarly, practical issues usually preclude multiple studies of the same intervention. Multiple smaller studies of similar nature, analyzed by metaanalysis, can provide guidance for clinical decision making. Prospectively planned individual patient metaanalysis is a particularly powerful tool, and probably the only way to reliably define subgroup effects. Stopping trials for futility is controversial (51). Although stopping may provide the opportunity for performing additional studies of new interventions, it may have a negative impact on the ability to perform subsequent individual patient data metaanalyses. International cooperation is recommended so that similar studies with common minimum data sets can be conducted in multiple countries.
In conclusion, progress in understanding and treating ALI has been incremental and is likely to remain so. The preponderance of the evidence, including clinical and mechanistic studies, should be used to guide clinical decision making. The reporting and interpretation of results of ALI clinical studies could be improved by use of a standardized description of methods, endpoints, and results. A standardized format in ALI trials for collecting data, including long-term quality-of-life and functional status outcomes, would facilitate patient-level metaanalyses from multiple trials and enhance collaborative efforts, and development of such an instrument is recommended.
Supported by National Heart, Lung and Blood Institute, National Institutes of Health.
Other workshop participants include John H. Alexander, M.D. (Durham, North Carolina); Nancy Cox, Ph.D. (Chicago, Illinois); Lydia Gilbert-McClain, M.D., F.C.C.P. (FDA); David Reboussin, Ph.D. (Winston-Salem, North Carolina); Myron Waclawiw, Ph.D., and Colin Wu, Ph.D. (NHLBI).
Originally Published in Press as DOI: 10.1164/rccm.201001-0024WS on March 11, 2010
Conflict of Interest Statement: R.G.S. has received consultancy fees from Nycomed GmbH ($10,001–$50,000); he is employed by the University of California, San Diego ($10,001–$50,000) and has received advisory board fees from Veterans Medical Research Foundation ($10,001–$50,000). G.R.B. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. W.C. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.R.C. has received royalties from Oxford University Press (up to $1,000); he has received sponsored grants from NIH (over $100,000). O.G. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. G.G. has received consultancy fees from Up To Date ($60,000); he has received expert witness fees from Fosamax ($8,000). J.H. has received expert witness fees from various law firms and insurance companies ($50,001–$100,000); he has received industry-sponsored grants from Eli Lilly ($10,001–$50,000) and Johnson and Johnson ($10,001–$50,000); he has received royalties from McGraw Hill ($10,001–$50,000); he owns health care focused mutual funds (over $100,000); he has received advisory board fees from ATS (up to $1000), lecture fees as ATS SOTA director ($1,000–$5,000), and fees from SEEK Committee as a contributor to CME content ($5,001–$10,000). He is also an Editorial Board member of both Chest and Critical Care Medicine. E.I. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. M.J. has received sponsored grants from the Cystic Fibrosis Foundation ($50,001–$100,000). D.M.N. has received consultancy fees from Passy Muir Inc. ($1,000–$5,000) and Bay Back Strategies LLC (up to $1,000); he has received sponsored grants from NIH (over $100,000). A.G.R. 
has received consultancy fees from Eisai Pharmaceuticals (up to $1,000); he has received advisory board fees from Discovery Laboratories ($1,000–$5,000) and industry-sponsored grants from MedImmune Inc. ($10,001–$50,000). G.D.R.'s institution has received the following grants from nonprofit agencies or foundations: National Institutes of Health (approximately $10 million), Robert Wood Johnson Foundation (approximately $500,000); his institution has received the following grants from for-profit companies: Advanced Lifeline Systems (approximately $150,000), Siemens (approximately $50,000), Bayer (approximately $10,000), Byk-Gulden (approximately $15,000), Astra-Zeneca (approximately $10,000). He has received the following honoraria, consulting, editorship, and DSMB membership fees: Bayer (approximately $500), DHD (approximately $1,000), Lilly (approximately $5,000), Hospira (approximately $15,000), Cerner (approximately $5,000), Pfizer (approximately $1,000), KCI (approximately $7,500), American Association for Respiratory Care (approximately $10,000), American Thoracic Society (approximately $7,500), National Institutes for Health (approximately $500), Alberta Heritage Foundation for Medical Research (approximately $250), Faron Pharmaceuticals (approximately $5,000), and Cerus Corporation (approximately $11,000). D.S. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.T.T. has received consultancy fees from Eli Lilly ($10,001–$50,000), AstraZeneca ($5,001–$10,000), and Abbott ($5,001–$10,000); he has received lecture fees from Eli Lilly ($1,001–$5,000). He has received various sponsored grants from NHLBI ($25,000–$110,000), and advisory board fees from NHLBI ($1,001–$5,000). L.B.W. has received industry-sponsored grants from Luminex ($1,001–$5,000) and Sirius Genomics ($10,001–$50,000); he has received sponsored grants from NIH (over $100,000). D.Y. 
has no financial relationship with a commercial entity that has an interest in the subject of this manuscript. A.L.H. has no financial relationship with a commercial entity that has an interest in the subject of this manuscript.