Helicopter emergency medical services and their possible effect on outcomes for traumatically injured patients remain a subject of debate. Because helicopter services are a limited and expensive resource, a methodologically rigorous investigation of their effectiveness compared with ground emergency medical services is warranted.
To assess the association between the use of helicopter vs ground services and survival among adults with serious traumatic injuries.
Retrospective cohort study involving 223 475 patients older than 15 years, having an injury severity score higher than 15, and sustaining blunt or penetrating trauma that required transport to US level I or II trauma centers and whose data were recorded in the 2007–2009 versions of the American College of Surgeons National Trauma Data Bank.
Transport by helicopter or ground emergency services to level I or level II trauma centers.
Survival to hospital discharge and discharge disposition.
A total of 61 909 patients were transported by helicopter and 161 566 patients were transported by ground. Overall, 7813 patients (12.6%) transported by helicopter died compared with 17 775 patients (11%) transported by ground services. Before propensity score matching, patients transported by helicopter to level I and level II trauma centers had higher Injury Severity Scores. In the propensity score–matched multivariable regression model, for patients transported to level I trauma centers, helicopter transport was associated with an improved odds of survival compared with ground transport (odds ratio [OR], 1.16; 95% CI, 1.14–1.17; P<.001; absolute risk reduction [ARR], 1.5%). For patients transported to level II trauma centers, helicopter transport was associated with an improved odds of survival (OR, 1.15; 95% CI, 1.13–1.17; P < .001; ARR, 1.4%). A greater proportion (18.2%) of those transported to level I trauma centers by helicopter were discharged to rehabilitation compared with 12.7% transported by ground services (P < .001), and 9.3% transported by helicopter were discharged to intermediate facilities compared with 6.5% by ground services (P < .001). Fewer patients transported by helicopter left level II trauma centers against medical advice (0.5% vs 1.0%, P < .001).
Among patients with major trauma admitted to level I or level II trauma centers, transport by helicopter compared with ground services was associated with improved survival to hospital discharge after controlling for multiple known confounders.
Trauma remains the leading cause of death and disability among young people around the world. In the United States, more than 50 million people are injured per year, resulting in approximately 169 000 annual deaths and a lifetime cost of $406 billion.1,2
Over the past several years, significant improvements in survival after trauma have been achieved. One reason for this has been improvements in emergency medical services (EMS) and life-saving transport of trauma patients to a center capable of providing definitive care. The utility of helicopter EMS and its possible effect on outcomes for traumatically injured patients remains the subject of debate.3–8 Because helicopter transport is a limited and expensive resource, a methodologically rigorous investigation of its effectiveness compared with ground EMS is warranted.
Several studies have used the National Trauma Data Bank (NTDB) to assess outcomes for traumatically injured adults transported by either option.9–11 Although each of these studies concluded that helicopter transport was associated with improved odds of survival, they also reported limitations to their conclusions because of the lack of testing for the assumptions of a regression model, and none was able to account for the differences in how patients were assigned treatments (helicopter or ground transport). Additionally, the high proportion of missing data in the NTDB may have introduced bias.12
The purpose of this study was to compare the association between the 2 transport modes and survival among adults with traumatic injuries by performing a robust analysis and controlling for known confounders.
The NTDB is the largest repository of trauma data in the world, with data collected from more than 900 centers in the United States.13 Since 2007, the quality of data in the NTDB has markedly improved following adoption of the National Trauma Data Standard. After receiving approval from the Johns Hopkins School of Medicine’s institutional review board, a merged data set, using data from the 2007–2009 NTDB, was created with all available variables. The data sets were merged using source codes provided by the NTDB.12
All hospitalized patients with an International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code of 800–959 were eligible for inclusion. Adults older than 15 years and admitted to a level I or II trauma center were included. The analysis was restricted to records with complete information regarding transport and disposition. Other forms of transportation to trauma centers, such as private conveyance, police, and walk-ins, were excluded. The Injury Severity Score (ISS) was used to quantify the severity of trauma. An ISS higher than 15 was used for inclusion because this has been shown to be associated with a greater need for specialized trauma care.14,15 Injury Severity Score and sex were also found to be statistically significant covariates in a previous study comparing urban and rural helicopter transport of patients with blunt trauma injuries.16
The primary intervention was transport by either helicopter or ground EMS. The outcome of interest was survival to discharge from the hospital. This outcome was evaluated with 3 analytical models: a multivariate logistic regression, a multivariate logistic regression model with generalized estimating equations and robust variance calculations to control for clustering by trauma center, and a logistic regression model incorporating the results of propensity score matching.
Covariates included demographic, physiologic, and hospital data. Demographics included information on age, sex, and race. Covariates were carefully selected based on the assumption that none was affected directly by the intervention. Other a priori selected variables were planned for inclusion in the final statistical models. These variables included the type of trauma (blunt vs penetrating), initial recorded vital signs (systolic blood pressure, respiratory rate, heart rate), Glasgow Coma Scale (GCS; motor component) score, and the locally calculated ISS or ISS calculated from ICD-9-CM admission codes. Illicit drug use, alcohol use, and comorbidities were also tabulated but were found to have a prevalence of missing data greater than 40% (eFigure available at http://www.jama.com) so were not included. Patients who were dead on arrival to the emergency department (ED) were excluded.
Variables were considered for inclusion in the final models after calculating correlation coefficients, examining scatterplot matrices, and ensuring that the proportion of missing data was below 20%. Final models included the following independent variables: systolic blood pressure, respiratory rate, heart rate, motor component of GCS, mechanism of injury derived from ICD-9-CM e-codes for primary and secondary diagnoses, age, sex, type of trauma (blunt vs penetrating), and transport mode. Total GCS scores were excluded because of significant collinearity with the motor component and because more than 30% of data for the verbal component was missing. Moreover, a previous study has shown that the motor component is equally predictive of the main outcome measure of survival in the NTDB.17 All vital signs consisted of those first recorded in the ED because of a high proportion of missing data for prehospital vital signs. A sensitivity analysis was conducted to ensure that initial ED vital signs had a high correlation with prehospital vital signs (eTable 1). Total elapsed EMS times from dispatch to ED arrival were excluded as a variable because of a 57.8% prevalence of missing data (eFigure). A sensitivity analysis was performed for all complete cases to examine the role of total EMS times as an independent variable.
In the absence of a posited assignment mechanism for how patients are assigned to helicopter or ground transport, formal causal inference is not credible.18 The goal of causal inference is to assess the average effect of a treatment on a subsequently measured outcome. The goal of an observational study should be to create an analysis that resembles conditions that would otherwise be achieved under a randomized design.19, 20 Systematic differences between treatment groups should be balanced at the beginning of observational studies to control for bias.
Propensity score methods, as first described by Rubin,21 are 1 way of creating subgroups of treated units (helicopter) and control units (ground) that are similar with respect to distributions of observed background characteristics and potential confounders.22 Propensity scores reflect the likelihood of a study participant being assigned to a particular treatment group, conditional on multiple variables thought to influence such an assignment.21 Once a propensity score is calculated, the score can be used in a multivariable model as an independent variable or the score can be used to match participants with different treatment assignments, thereby creating a matched patient cohort, reducing the risk of confounding by indication.
Propensity score–based analyses were conducted with subclassification matching. Subclassification matching was selected after assessing standardized mean differences compared with nearest neighbor and full matching protocols. Multiple imputation, using techniques previously established for use with the NTDB,23 was performed for each variable associated with missing values. All variables, including the missing data indicators, were included in a propensity score model. Multiple imputation was performed using the multiple imputation (mi) suite of commands available in Stata version 11 (Stata Corp). Imputation was used for the following missing data: systolic blood pressure, heart rate, motor component of GCS score, and ISS. First, the proportion of missing data for variables of interest was calculated. The mi set of commands was used to generate a regression model to impute missing data based on other available variables. This process was repeated 5 times, creating 5 separate imputed data sets. These 5 data sets were combined using commands from the mi suite to create a full data set with no missing values. Multiple imputation was used only for variables with less than 20% missing data. The proportion of missing data for variables used in all regression analyses is described in eTable 2.
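The text does not spell out how estimates from the 5 imputed data sets were pooled. One standard approach, shown here only as a minimal sketch, is Rubin's rules: average the per-imputation estimates and add the between-imputation variance to the average within-imputation variance. The function name `pool_rubin` and the log odds ratios and variances below are hypothetical illustration values, not results from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules.

    estimates: point estimates (e.g., log odds ratios), one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)            # average of the m point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical values for m = 5 imputed data sets (not taken from the study).
log_odds_ratios = [0.14, 0.15, 0.16, 0.15, 0.15]
squared_errors = [0.0001] * 5

pooled_estimate, pooled_se = pool_rubin(log_odds_ratios, squared_errors)
print(f"pooled log OR = {pooled_estimate:.3f}, SE = {pooled_se:.4f}")
```

The pooled standard error is larger than any single-data-set standard error because it carries the extra uncertainty introduced by imputing the missing values.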
The following independent variables were used to calculate the propensity score: age, sex, ISS, systolic blood pressure, respiratory rate, heart rate, type of trauma, GCS motor component score, mechanism of injury (ICD-9-CM e-codes), and trauma facility. Propensity score matching was performed after the propensity scores were estimated. For the subclassification method, 5 subclasses were used because this has been shown to remove at least 90% of bias in the estimated treatment effect due to the covariates used to estimate a propensity score.24 Balance among the covariates after propensity score matching was assessed with numerical diagnostics, jitter plots, histograms, quantile-quantile plots, and standardized bias plots. Standardized mean differences were reduced by 91.3% for the level I and 91.8% for the level II groups, following matching by propensity scores. The jitter plot for the samples demonstrated few outliers, and all available data were used for both groups. Overall, the matching quality achieved was excellent for level I and level II trauma center data sets, presumably due to the large number of patients in the control group. A logistic regression analysis was performed on the matched samples, and effects were estimated within subclasses and then combined for a final effect estimate with an associated standard error.
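The subclassification step can be sketched as follows. This is a simplified illustration, not the analysis pipeline used in the study: it assumes propensity scores have already been estimated, uses a within-subclass difference in survival proportions rather than a logistic regression, and runs on a small synthetic data set. All names (`subclassify`, `stratified_risk_difference`, `make_stratum`) and all numbers are hypothetical.

```python
def subclassify(records, n_subclasses=5):
    """Sort records by propensity score and split into equal-sized subclasses."""
    ordered = sorted(records, key=lambda r: r["ps"])
    size = len(ordered) / n_subclasses
    return [ordered[round(i * size):round((i + 1) * size)]
            for i in range(n_subclasses)]


def survival_rate(group):
    """Proportion of records in the group that survived."""
    return sum(r["survived"] for r in group) / len(group)


def stratified_risk_difference(records, n_subclasses=5):
    """Average the within-subclass survival difference, weighted by subclass size."""
    weighted_sum, n_used = 0.0, 0
    for stratum in subclassify(records, n_subclasses):
        treated = [r for r in stratum if r["treated"]]
        control = [r for r in stratum if not r["treated"]]
        if not treated or not control:
            continue  # skip subclasses with no overlap between groups
        diff = survival_rate(treated) - survival_rate(control)
        weighted_sum += diff * len(stratum)
        n_used += len(stratum)
    if n_used == 0:
        raise ValueError("no subclass contains both treated and control records")
    return weighted_sum / n_used


def make_stratum(ps, n_treated, treated_alive, n_control, control_alive):
    """Build synthetic records that share one propensity score."""
    rows = [{"ps": ps, "treated": True, "survived": int(i < treated_alive)}
            for i in range(n_treated)]
    rows += [{"ps": ps, "treated": False, "survived": int(i < control_alive)}
             for i in range(n_control)]
    return rows


# Synthetic data: sicker patients (lower survival) are more likely to be treated.
records = (make_stratum(0.2, 4, 4, 16, 12) + make_stratum(0.4, 8, 6, 12, 6)
           + make_stratum(0.5, 12, 9, 8, 4) + make_stratum(0.6, 12, 6, 8, 2)
           + make_stratum(0.8, 16, 8, 4, 1))

crude = (survival_rate([r for r in records if r["treated"]])
         - survival_rate([r for r in records if not r["treated"]]))
adjusted = stratified_risk_difference(records)
print(f"crude difference = {crude:.3f}, subclassified difference = {adjusted:.3f}")
```

In this synthetic example the crude comparison understates the benefit because treated patients are concentrated in the subclasses with the worst baseline survival. Comparing like with like within subclasses is the purpose of the matching step described above.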
The data mergers and construction of the imputed data sets were conducted using Stata version 11. Stata was also used to calculate odds ratios (ORs) with 95% confidence intervals for the logistic regression with generalized estimating equations analyses. Propensity scores and propensity score matching, as well as all propensity score–based logistic regression analyses, were performed using the MatchIt and optmatch packages available in the 64-bit version of R 2.12.1 (R Foundation for Statistical Computing). With 90% power and an α level of .05, a total of 8802 patients were required to detect a 2% mortality difference between groups. P values of less than .05 were considered statistically significant, and all tests were 2-sided. Absolute risk reduction (ARR) calculations were made after calculating the number needed to treat, based on the adjusted ORs as previously described by Lindenauer et al.25
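A power calculation of this kind can be sketched with the usual normal-approximation formula for comparing 2 proportions. The baseline mortality and the exact formula behind the figure of 8802 are not stated in the text, so the sketch below uses a hypothetical baseline of 10% vs 12% and is not expected to reproduce that number. The function name `n_per_group` is an illustrative choice.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Patients needed per group to detect p1 vs p2 with a 2-sided test.

    Uses the unpooled normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical mortality rates (assumed for illustration only).
per_group = n_per_group(0.10, 0.12)
print(f"per group: {per_group}, total: {2 * per_group}")
```

The required sample size depends strongly on the assumed baseline rate, which is why the baseline has to be stated before a figure such as 8802 can be checked.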
Regression diagnostics were performed for all multivariate logistic regression models. Leverage was assessed with hat matrices, and influence assessed with changes in Pearson residuals when covariates were fitted to the model. The Hosmer-Lemeshow and Pearson goodness-of-fit tests were used to confirm that the models adequately fit the data (P > .10). Collinearity among the independent variables was assessed by calculating variance inflation factors.
The Figure depicts the study profile. A total of 1 816 982 records were initially available in the 2007, 2008, and 2009 NTDB data sets. After stratification and confirmation of data availability for disposition, transport mode, type of trauma, and injury severity and after excluding patients who died before reaching the ED (1897 by ground; 324 by helicopter), there were 159 511 patients transported to level I and 63 964 patients transported to level II trauma centers available for analysis. Of the NTDB covariates initially considered for inclusion in the models, 38% had more than 40% missing values (eFigure). The final study population included 61 909 patients transported by helicopter and 161 566 by ground.
Patient demographics and characteristics are summarized in Table 1 and Table 2. The mean age was similar between level I and level II trauma centers. There were more men in the level I group (70.1%) than in the level II group (56.3%). Unadjusted mortality was significantly higher for those transported by helicopter (n=7813; 12.6%) than those by ground (n=17 775; 11%); however, a higher proportion of both level I and level II patients transported by helicopter had an ISS higher than 24. Patients transported by helicopter had statistically significantly higher heart rates and lower GCS motor scores, lower respiratory rates, and lower systolic blood pressure compared with patients transported by ground EMS (P < .001 for all 4 variables).
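The unadjusted mortality figures follow directly from the counts reported above. The sketch below recomputes the crude mortality in each group, the crude risk difference, and the crude odds ratio for death. The function name `crude_summary` is an illustrative choice, and these crude measures are not adjusted for any confounder.

```python
def crude_summary(deaths_a, n_a, deaths_b, n_b):
    """Crude risks, risk difference (a minus b), and odds ratio of death (a vs b)."""
    risk_a = deaths_a / n_a
    risk_b = deaths_b / n_b
    odds_a = deaths_a / (n_a - deaths_a)
    odds_b = deaths_b / (n_b - deaths_b)
    return {
        "risk_a": risk_a,
        "risk_b": risk_b,
        "risk_difference": risk_a - risk_b,
        "odds_ratio_death": odds_a / odds_b,
    }


# Counts reported in the text: helicopter (a) vs ground (b).
summary = crude_summary(7813, 61909, 17775, 161566)
print(f"helicopter mortality: {summary['risk_a']:.1%}")
print(f"ground mortality:     {summary['risk_b']:.1%}")
print(f"crude risk difference: {summary['risk_difference']:.2%}")
print(f"crude odds ratio of death: {summary['odds_ratio_death']:.2f}")
```

The crude comparison favors ground transport only because it ignores the higher injury severity of patients transported by helicopter, which the regression and propensity score models are intended to address.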
The results of the logistic regression models are listed in Table 3. Unadjusted mortality was higher for patients transported by helicopter to level I (ARR, 1.6%; 95% CI, 1.4%–1.7%) and level II trauma centers (ARR, 1.7%; 95% CI, 1.4%–1.8%). However, in all regression models, helicopter transportation was associated with a statistically significantly greater odds of survival. For level I patients, standard logistic regression revealed a greater odds of survival (OR, 1.31; 95% CI, 1.27–1.38; P < .001; ARR, 2.9%). This association remained stable when generalized estimating equations with robust variance calculations were applied (OR, 1.32; 95% CI, 1.20–1.45; ARR, 3.0%). In the propensity score–matching analysis, patients in the helicopter group had an attenuated increased odds of survival (OR, 1.16; P < .001) and narrower confidence intervals were observed (95% CI, 1.14–1.17) with a smaller ARR (1.5%, 95% CI, 1.4%–1.6%). For level II patients, similar results were observed in each model, with a greater odds of survival in the standard regression model (OR, 1.37; 95% CI, 1.28–1.48; P < .001; ARR, 4.3%) and the model accounting for clustering by trauma center (OR, 1.37; 95% CI, 1.23–1.53; P < .001; ARR, 3.4%). The survival benefit associated with helicopter transport remained statistically significant but was attenuated in the propensity score analysis (OR, 1.15; 95% CI, 1.13–1.17; P < .001; ARR, 1.4%; Table 3 and eTable 3).
When considering patient disposition, the results in Table 4 suggest a higher injury severity in the helicopter group than in the ground transport group. Fewer patients in the helicopter groups were discharged home without services (47.6%) than in the ground transport group (57.3%; P < .001) at level I centers. A higher proportion of those transported by helicopter to level I trauma centers were discharged to rehabilitation (18.2% vs 12.7% in ground transport group) and to intermediate facilities (9.3% vs 6.5%, respectively). Fewer patients in the helicopter group left level II trauma centers against medical advice. Patients transported by ground services were more likely to be discharged from level I centers to a nursing home.
The results from this study indicate that helicopter EMS transport is independently associated with improved odds of survival for seriously injured adults. In regression analyses performed after propensity score matching, helicopter transport was associated with a 1.5% absolute increase in survival compared with ground EMS among the 159 511 patients transported to level I trauma centers. Among the 63 964 patients transported to level II trauma centers, an absolute survival advantage of 1.4% was found for those transported by helicopter compared with ground transport. Thus, for patients transported to level I trauma centers by helicopter, 65 patients would need to be transported to save 1 life; for patients transported to level II trauma centers, the number needed to treat is 69.
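The number needed to treat (NNT) is the reciprocal of the absolute risk reduction. The short sketch below shows the relationship in both directions; the function names are illustrative. Applying it to the rounded ARRs of 1.5% and 1.4% gives values slightly above the reported NNTs of 65 and 69, which presumably reflects rounding of the ARRs for presentation.

```python
def number_needed_to_treat(arr):
    """NNT for a given absolute risk reduction (expressed as a proportion)."""
    if arr <= 0:
        raise ValueError("ARR must be positive to define an NNT")
    return 1.0 / arr


def implied_arr(nnt):
    """Absolute risk reduction (proportion) implied by a given NNT."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return 1.0 / nnt


# NNT from the rounded ARRs reported for level I and level II centers.
for label, arr in (("level I", 0.015), ("level II", 0.014)):
    print(f"{label}: ARR {arr:.1%} -> NNT {number_needed_to_treat(arr):.1f}")

# ARR implied by the reported NNTs of 65 and 69.
for label, nnt in (("level I", 65), ("level II", 69)):
    print(f"{label}: NNT {nnt} -> ARR {implied_arr(nnt):.2%}")
```

The ARRs implied by NNTs of 65 and 69 round to 1.5% and 1.4%, so the reported figures are mutually consistent.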
These results are congruent with, although more conservative than, the results of the few multivariate analyses comparing these 2 modes of EMS. Frankema et al26 found that after adjustment for injury severity, time of day, and other physiological variables, transportation of a highly trained medical crew to the scene was associated with a statistically nonsignificant difference in survival (OR, 2.2; 95% CI, 0.92–5.9; P = .08). Blunt trauma patients transported by helicopter in the study by Frankema et al had a statistically significant survival improvement (OR, 2.8; 95% CI, 1.07–7.52; P = .04). However, Frankema et al investigated only the effect of helicopter delivery by a highly trained medical crew, including a physician, and did not evaluate the specific effect of helicopter transport of trauma patients to trauma centers.
Brown et al9 used a logistic regression model including the GCS score, demographics, prehospital times, vital signs, and other hospital variables such as intensive care unit admission and length of hospital stay to calculate the association of helicopter transport with a greater odds of survival (OR, 1.22; 95% CI, 1.18–1.27; P < .01). The authors acknowledged major limitations with this study because there were no adjustments for missing data or consideration for clustering by trauma center.
In another study using the 2007 NTDB data, Sullivent et al10 reported a lower odds of death for patients 18 to 54 years old transported by helicopter (OR, 0.51; 95% CI, 0.44–0.60; P < .001). However, Sullivent et al did not control for missing data. In a trauma-related ISS study by Mitchell et al,8 patients with an ISS of 12 or higher had a W score of 6.4, indicating 6.4 more survivors per 100 patients for patients transported by helicopter compared with patients transported by ground services. Although efforts were made to control for selection bias because all trauma patients who received tertiary care were accounted for, 84% of all patients transported by helicopter were transferred to a trauma center and not flown directly from the scene.
Thomas et al27 performed a retrospective, registry-based cohort study with 16 699 patients to investigate the role of helicopter transport for patients with blunt trauma. Helicopter transport was found to be associated with a statistically significant mortality reduction (OR, 0.76; 95% CI, 0.59–0.98; P=.03); however, Thomas et al examined only blunt trauma patients in the state of Massachusetts over a 4-year period.
In another registry-based study conducted by Stewart et al,11 propensity scores were used to calculate hazard ratios for patients transported by both helicopter and ground EMS in the state of Oklahoma. The propensity score in the study by Stewart et al was used as a composite variable in a multivariable regression analysis and was calculated by using multiple covariates, including prehospital vital sign data. Although overall mortality was 33% lower for patients transported by helicopter (hazard ratio [HR], 0.67; 95% CI, 0.54–0.84), no significant difference was found for patients with an ISS between 16 and 24 (HR, 0.96; 95% CI, 0.62–1.48).
Each of the above mentioned studies has methodological limitations related to the use of a regression model to estimate causal effects. In many cases, it is not clear how the assumptions for each of their models were tested because regression diagnostics were not reported. The potential correlation of patient outcomes across different trauma centers—ie, clustering—was not assessed in any of the studies except for Thomas et al.27 Thus, although helicopter transportation was shown to be beneficial, it is possible that the analyses did not fully adjust for known confounders. Indeed, when propensity scores were used in this study to adjust for known confounders, the strength of the association between helicopter and an improved odds of survival was diminished.
In observational studies, the use of logistic regression without balancing background covariates between the treatment and control group can produce biased estimates, especially if the imbalance of covariates is extreme or if the treatment effect is not constant across values of the covariates.20 A benefit in using propensity scores is that covariates that are speculated to be causing an imbalance between treated and control groups can be balanced and used in a postmatching analysis to control for selection bias and known confounders. Propensity scores have the advantage of producing more accurate effect estimates, especially when the outcomes of interest are relatively rare.28 For observational, nonrandomized studies, propensity scores represent one of the best available methods to adjust for baseline differences and to simulate the results of a randomized trial.29
Some limitations to this research are worth noting. The proportion of missing data in the NTDB is high for many variables. At least one study has suggested that some missing data in the NTDB are not missing at random30; this has implications for the imputation methods used to provide plausible values for missing data. Multiple imputation may be more advantageous for valid statistical inference and prevention of type I errors, especially when using large data sets,31 but this technique requires significant computing power and knowledge about how to combine the results of imputed values.32,33 Although multiple imputation has been used with the NTDB to impute missing physiological data34 and has been used in other studies to impute missing baseline covariate data,33 there is no substitute for actual data. In our study, we limited the use of multiple imputation to variables with no more than 20% missing data because this was the threshold established in a previous study that validated the use of multiple imputation for missing variables in the NTDB.23
In addition, the NTDB is described as a “convenience sample,” and contains a disproportionate number of larger hospitals with younger and more severely injured patients.12 Hence, the NTDB is not a population-based sample, so selection bias may be a problem because not all types of hospitals or patients may be represented.
Although the methods used for propensity score matching resulted in good balance among covariates, as assessed with numerical and graphical diagnostics, it is possible that differences attributed to helicopter vs ground EMS transport might have been due to unobserved confounders. This study represents an attempt to estimate the average treatment effect for patients transported by helicopter vs ground transport using national-level data from a large database and controlling for known confounders. Invariably, any beneficial effect of helicopter over ground transport is the result of some combination of speed, crew expertise, and disposition to a designated trauma center.4
Crew configuration and distance data are not available in the NTDB. If distance information were available, alternative methods, such as the use of instrumental variables, might be considered, potentially using distance as the instrument.35 Crew configuration varies regionally, ranging from the presence of a physician and nurse on the helicopter to a single paramedic. The effect of clinical interventions remains unknown, and in some studies, the presence of a physician failed to confer a survival or quality-of-life benefit.5, 36 Clustering of patients within each trauma center might be another source of residual bias; previous work with the NTDB has shown that failure to account for clustering may lead to artificially narrow confidence intervals.30 We attempted to control for this by using deidentified facility identifiers in both our propensity score analysis and generalized estimating equations analysis. We examined the role of total EMS time for helicopter and ground transportation (ie, time from dispatch to arrival in the ED) as an independent variable in a sensitivity analysis that included 42.2% of the data from the level I and level II groups. The results were not qualitatively different from our primary analyses, but any analysis based on EMS times from the 2007–2009 NTDB is highly likely to be biased due to the large amount of missing data, which cannot be assumed to be missing at random.30
The survival benefit found in this study may be the result of not only the covariates available in the NTDB but also other unmeasured variables. It is not clear which aspect of helicopter transport is responsible for the mortality benefit in this highly stratified sample. Future studies should investigate specific components of helicopter EMS such as prehospital interventions, total prehospital time, crew configuration, and distance as factors that may in part or whole explain the benefit of helicopter EMS for adults with major trauma because understanding the effectiveness of each may help determine which patients benefit most from this resource. To date, the development and use of effective prehospital triage tools that can identify adults with a high ISS have remained elusive.37 Future studies should focus on efficient and user-friendly prehospital assessment tools to properly identify injured adults who will be the most likely to benefit from helicopter transport.
Additional outcomes besides mortality, such as health-related quality of life, should also be considered in future helicopter EMS transportation studies.38 Because it is one of the most expensive interventions in contemporary health care, cost must be considered. In a recent systematic review, the annual cost of helicopter transportation ranged from $114 777 to $4.5 million per institution.39 Five studies showed helicopter transportation to be a more expensive transport modality whereas 8 studies indicated that the cost per life-year saved ranged from $2227 to $3292 per trauma.35 In the state of Maryland, the average estimated cost of a helicopter transport is $5000.40 Using the number needed to treat estimate of 65 for patients transported to level I trauma centers, approximately $325 000 would have to be spent to save 1 life. However, this figure does not account for the number needed to treat to prevent disability or other health-related quality-of-life outcomes. Indeed, helicopter transportation represents 1 of the highest cost prehospital modalities used in contemporary trauma care. Hence, policy makers should consider funding a formal cost-effectiveness analysis to help inform policy decisions regarding its use.
Because it is highly unlikely that a randomized clinical trial will be practicable or endorsed by the public to further study the effectiveness of helicopter EMS for adults with major trauma, future studies to estimate its treatment effects for trauma patients will need to rely on designs and techniques that control for selection biases and confounding. Propensity score–based methods appear to be viable in achieving balance among covariates and in making valid statistical inference. Given the attenuation of the association between helicopter EMS and survival observed with propensity score matching, future studies should use this technique and account for missing data or risk overestimating treatment effects.
Among patients with major trauma admitted to level I or level II trauma centers, transport by helicopter compared with ground EMS was associated with improved survival to hospital discharge after controlling for multiple known confounders.
Funding/Support: Dr Galvagno was funded in part by an institutional training grant (T-32 Ruth Kirschstein grant) from the National Institutes of Health when this study was initiated as part of his PhD program at Johns Hopkins Bloomberg School of Public Health. Dr Haut receives support from a mentored clinical scientist development award (1K08HS017952-01) from the Agency for Healthcare Research and Quality. Dr Haider receives support from the American College of Surgeons (C. James Carrico fellowship in trauma and critical care) and from the National Institute of General Medical Sciences (career development award K23GM093112-01).
Role of the Sponsors: None of the funding sources had a role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.
Conflict of Interest Disclosures: The authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Haut reported that he receives royalties from Lippincott Williams & Wilkins for Avoiding Common ICU Errors; and has provided expert testimony in various medical malpractice cases. No other disclosures were reported.
Online-Only Material: The eFigure, eTables 1–3, and the Author Video Interview are available at http://www.jama.com.
Author Contributions: Dr Galvagno had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Galvagno, Haut, Millin, Efron, Koenig, Pronovost, Haider.
Acquisition of data: Galvagno, Haut, Zafar, Haider.
Analysis and interpretation of data: Galvagno, Zafar, Baker, Bowman, Haider.
Drafting of the manuscript: Galvagno, Haut, Millin, Efron, Baker, Bowman, Pronovost, Haider.
Critical revision of the manuscript for important intellectual content: Galvagno, Koenig, Baker, Bowman, Pronovost, Haider.
Statistical expertise: Galvagno, Zafar, Bowman, Haider.
Administrative, technical, or material support: Galvagno, Haut, Efron, Pronovost, Haider.
Study supervision: Galvagno, Haut, Haider.