A challenging feature of outcomes prediction is the capability of measuring and including numerous potential predictor variables. The inclusion of numerous predictors, some or many of which provide little or no predictive power to a model, reduces the overall significance of the model. For example, as additional predictors are added to a regression model, the R-squared statistic generally increases, but the adjusted R-squared, which includes a penalty for each additional predictor in the model, may sharply decrease if the added predictor(s) do not significantly contribute added value to the true predictive power of the model.
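
The penalty described above can be made concrete with the standard adjusted R-squared formula. The following is a minimal sketch; the function name and the numbers (sample size, predictor counts, R-squared values) are hypothetical and are not taken from this study.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard formula: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof


# Hypothetical scenario: 10 weak predictors are added to a 3-predictor model
# fitted on 60 observations, nudging R^2 from 0.40 up to 0.42.
base = adjusted_r_squared(0.40, n_obs=60, n_predictors=3)
padded = adjusted_r_squared(0.42, n_obs=60, n_predictors=13)

# R^2 went up, yet the adjusted value goes down because of the per-predictor penalty.
print(f"adjusted R^2 with 3 predictors:  {base:.3f}")
print(f"adjusted R^2 with 13 predictors: {padded:.3f}")
```

In this illustrative case the small gain in raw R-squared does not offset the loss of degrees of freedom, so the adjusted value falls.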

A more serious drawback of including too many predictor variables is that two or more of the predictors may not be independent of one another, but rather may be correlated because they are driven, at least in part, by the same underlying principle or mechanism. If two or more significantly correlated predictor variables are entered into a regression analysis, the result may be a spurious lack of statistical significance for one or more of these predictors, inaccurate regression coefficients, and erroneous conclusions regarding the relationships among the dependent and predictor variables.
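
One common way to quantify this kind of collinearity is the variance inflation factor (VIF). The VIF is not a technique used in this study; it is shown here only as a hedged illustration on synthetic data, and the variable names are stand-ins rather than actual study measurements.

```python
import numpy as np


def variance_inflation_factors(X: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept plus remaining predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic data (assumption): two predictors share a common driver, one does not.
rng = np.random.default_rng(0)
n = 200
shift = rng.normal(size=n)                    # stand-in for midline shift
hematoma = shift + 0.1 * rng.normal(size=n)   # strongly tied to the first predictor
age = rng.normal(size=n)                      # unrelated predictor

vifs = variance_inflation_factors(np.column_stack([shift, hematoma, age]))
print(vifs)  # large values for the two correlated columns, near 1 for the unrelated one
```

A VIF near 1 indicates that a predictor is nearly uncorrelated with the others, whereas large values indicate that its regression coefficient is poorly determined.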

Principal component analysis is an accepted method for simplifying a set of numerous variables into a smaller fundamental set of variables that (1) are not correlated with one another, and (2) contain most of the variability in the predictor variables (Shlens, 2009). In the current study, reduction of numerous predictor variables to a smaller number of uncorrelated predictors had several advantages over prior models: (1) demonstration of the collinearity among certain predictor variables, which in itself yields important insights, (2) reduction of collinear variables to a smaller set of independent (uncorrelated) predictors, allowing for more stable and consistent regression results, (3) corroboration of certain key aspects of the prior Rotterdam CT classification, and (4) improvement of predictive power over previous outcomes models, through the computer-aided quantitative analysis of the admission head CT.
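
The two properties of principal components noted above (mutually uncorrelated, and capturing most of the variability) can be checked in a short sketch. This is not the analysis pipeline used in the study: the data are synthetic, the latent "mass effect" factor is a hypothetical construct for illustration, and PCA is computed here by eigendecomposition of the correlation matrix.

```python
import numpy as np


def pca_correlation(X: np.ndarray):
    """PCA on standardized columns; returns loadings, scores, explained-variance ratios."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort so the largest component comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                         # component scores for each observation
    return eigvecs, scores, eigvals / eigvals.sum()


# Synthetic data (assumption): three predictors driven by one latent factor, plus one
# unrelated predictor.
rng = np.random.default_rng(1)
n = 300
mass_effect = rng.normal(size=n)
X = np.column_stack([
    mass_effect + 0.3 * rng.normal(size=n),
    mass_effect + 0.3 * rng.normal(size=n),
    mass_effect + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
])

loadings, scores, ratios = pca_correlation(X)
score_corr = np.corrcoef(scores, rowvar=False)   # should be close to the identity matrix
print(np.round(ratios, 3))
print(np.round(loadings[:, 0], 3))
```

In this toy setting the first component loads mainly on the three predictors that share the latent factor, which is analogous in spirit to a component dominated by a few related CT features.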

Our initial “global” exploratory PCA performed on the set of variables including both GOS-E and predictor variables demonstrated not only a significant correlation of GOS-E with certain predictors, but also indicated significant covariance among certain predictor variables themselves. This finding justified a second PCA analysis, which was performed on the predictor variables alone.

The second PCA analysis yielded important insights into underpinnings of previous successful outcomes models, such as the Rotterdam classification. In the PCA analyses of clinical and CT predictors, principal components that clearly corresponded to intracranial mass effect were identified. These components were characterized by significant contributions from midline shift, basal cistern effacement, and subdural hematoma volume, and little contribution from other variables. These mass effect principal components, governed primarily by subdural hematoma size, midline shift, and cistern effacement, were shown to be powerful predictors of 6-month GOS-E scores. This result strongly corroborates the Rotterdam CT classification, in which features of mass effect account for 3 points in a maximum possible score of 6 points in the prediction of likelihood of 6-month mortality after TBI.

Several other observations of the loading coefficients of the PCA analysis of clinical and qCT predictors were also intuitively satisfying. In particular, GCS was negatively correlated with the severity of cistern effacement. The volume of subdural hematoma was significantly correlated with degree of midline shift and with severity of basal cistern effacement, but was not significantly correlated with volume of subarachnoid/intraparenchymal hemorrhage. Although epidural hematoma volume was not found to be a statistically significant predictor in our model, this was most likely attributable to the small number of epidural hematoma cases; a trend toward significance (*p*=0.25) was observed. Similarly, although subarachnoid/intraparenchymal hemorrhage volume was not a statistically significant predictor, a trend toward significance (*p*=0.06) was demonstrated. Patient age, epidural hematoma volume, and subarachnoid/intraparenchymal hemorrhage volume were not significantly correlated with each other or with other predictors.

Our study also shows that with incorporation of quantitative rather than qualitative CT features, the results of logistic regression of 6-month GOS-E score upon clinical and CT features improve substantially. Two of the best-known head CT classification systems in acute TBI are the Marshall CT classification and subsequent Rotterdam classification. The Marshall CT classification (Marshall et al., 1991) divided severe head trauma patients into 6 groups according to head CT findings, and has been widely used for descriptive purposes; later, it was also used for prediction of mortality. The subsequently developed Rotterdam CT classification (Maas et al., 2005) achieved an improved discriminative value for the prediction of long-term outcome (6-month mortality) through regrouping of some CT features underlying the Marshall classification, and inclusion of CT evidence of epidural hematoma and traumatic subarachnoid hemorrhage as additional predictors in the model. Although the Marshall and Rotterdam classifications have demonstrated prognostic power, the head CT features in these classification schemes are qualitative features that are susceptible to observer bias. Some features, particularly the assessment of basilar cistern effacement and subfalcine herniation (Maas et al., 2007), and the presence or absence of parenchymal contusions/hematomas (Chun et al., 2007; Laalo et al., 2009) have been demonstrated to be limited by interobserver variability. In addition, trace quantities of intracranial hemorrhage in these classifications are not differentiated from larger volumes of hemorrhage, even though it has been suggested that in milder TBI, the volume of intracranial subarachnoid hemorrhage may be a marker of overall severity of brain injury (Chieregato et al., 2005). We demonstrate that use of quantitative rather than qualitative CT features along with age and GCS as predictors results in a substantial increase in the adjusted R-squared, to 0.50 from 0.43. Thus, qCT features, age, and GCS accounted for 50% of the variability in 6-month GOS-E score after acute head injury, compared to 43% when qualitative CT features, age, and GCS were used.

The purpose of the current study was not to construct a comprehensive outcome model, as in prior studies (Hukkelhoven et al., 2005; MRC CRASH Trial Collaborators, 2008; Murray et al., 2007; Perel et al., 2009; Steyerberg et al., 2008). Rather, we sought primarily to compare the effectiveness of quantitative versus qualitative CT measures in predicting outcome, using a fixed set of clinical features that have previously been established as strong predictors of outcome. Steyerberg and associates (2008) did essentially the inverse, using a fixed CT classification system (Marshall CT classification) as a predictor, while varying the clinical and laboratory predictors. Other previous models of outcomes have also used different inclusion criteria and different statistical approaches, making direct comparison of our results to these prior studies difficult. The three-tier outcomes prediction model by Steyerberg and colleagues (2008) for the prediction of mortality and GOS score at 6 months after injury was developed using the IMPACT database. A core model based only on age, motor GCS score, and pupillary reactivity provided baseline discriminatory ability with areas under the curve (AUCs) ranging from 0.66–0.84. Slightly higher discriminative ability with AUCs ranging from 0.71–0.87 could be achieved through augmentation by additional tiers of predictors, including Marshall CT classification, hypotension, hypoxia, presence or absence of epidural hematoma, presence or absence of traumatic subarachnoid hemorrhage, and serum glucose and hemoglobin on admission. Binomial logistic regression and proportional odds logistic regression analysis were used for that study, appropriate for the binary outcome (fatality versus nonfatality at 6 months), and ordinal outcome (GOS score at 6 months) measures considered in that study. Although our study has some similarities, a direct quantitative comparison of our results to these is difficult, as this and many prior studies (Eisenberg et al., 1990; Jacobs et al., 2010b, 2011; Hukkelhoven et al., 2005; Maas et al., 2005; 2007; Steyerberg et al., 2008) or meta-analyses (Chestnut et al., 2000; Husson et al., 2010) included only moderate and severe head-injury patients, while fully half of our study population consisted of mild head injury (GCS score 13–15), with the other half divided between moderate and severe head injury (GCS score 3–12). Furthermore, prior studies have used 6-month mortality (Maas et al., 2005; Nelson et al., 2010; Steyerberg et al., 2008), or a dichotomized version of the 6-month GOS score (Nelson et al., 2010; Perel et al., 2008; Steyerberg et al., 2008), in contrast to the 6-month GOS-E score used in the current study that includes the wide spectrum of mild, moderate, and severe head injury.

A recent study did analyze the relationship between 6-month mortality and quantitative CT measures of hematoma volume and midline shift in 605 patients, finding a monotonically increasing relationship between mortality at 6 months and both hematoma volume and midline shift (Jacobs et al., 2011). Our findings corroborate these results, but suggest through a global simultaneous analysis of numerous CT features through PCA that subdural hematoma volume, midline shift, and basal cistern effacement are not independent predictors, but are significantly correlated with one another. Our results, in fact, are very closely related to those in the relatively few prior studies that have used a multivariate approach that explicitly takes into account the covariance among clinical and CT predictors (Eisenberg et al., 1990; Nelson et al., 2010), rather than univariate or stepwise multivariate regression approaches that do not. Recently, Nelson and colleagues (2010) performed a detailed analysis of features of the admission head CT and their relationship to outcome; as part of this analysis, they demonstrated a significant interdependence of certain quantitative CT predictors such as midline shift and hematoma volume (measured manually by a radiologist). They concluded, for example, that a separate measure of hematoma volume is redundant if quantitative midline shift is included, because of the strong collinearity between these predictors. Furthermore, they found that quantitative midline shift, measured manually by the radiologist, was the single strongest CT predictor affecting outcome. Although our approach differs from that of Nelson and colleagues (2010) in that (1) our quantitative measures of midline shift and hematoma volume are derived from computer analysis of images, and (2) Nelson and associates (2010) used a very different statistical technique, the support vector machine, rather than PCA to demonstrate strong covariances among quantitative CT predictor variables, our results are strikingly similar. It is intuitively appealing from these results that the three CT predictors of midline shift, hematoma volume, and basal cistern effacement may be important predominantly as surrogates for a more fundamental predictor such as prolonged elevated intracranial pressure.

A limitation of our study is that the sample size is far smaller than that analyzed in many prior studies, such as the Marshall CT classification study (746 patients; Marshall et al., 1991), the Rotterdam CT study (2269 patients from the tirilazad trials; Maas et al., 2005), the CRASH trial (10,008 patients; Perel et al., 2008), the Karolinska study (861 patients; Nelson et al., 2010), and the IMPACT study (8509 patients; Steyerberg et al., 2008). Additional important limitations are the 40% rate of loss to follow-up or denial of the GOS-E interview, and exclusion of an additional 17% of the initial cohort whose admission head CT could not be evaluated by the software due to the presence of contrast, non-traumatic acute intracranial hemorrhage, severe motion artifact, or other technical difficulties, raising concern for potential effects of selection bias on our results. Patients ≤16 years old or >75 years old were not included in the study, and our results therefore cannot be generalized to those at the extremes of age. In addition, the initial cohort consisted only of patients admitted to the neurosurgical intensive care unit (ICU), and therefore would tend to exclude milder head injuries from the study population. Finally, it is certainly possible that the quantitative parameters we present here (hematoma volume and quantitative midline shift) actually derive their predictive value mainly as surrogates for a confounding factor of which we are not aware (e.g., not intracranial mass effect, but a different fundamental parameter such as coagulopathy, which could result, for example, in a larger rate of hematoma growth, and thus larger measured hematoma volumes and/or midline shifts at any given time point when averaged across patients).

In summary, in the current study, we applied two new approaches to the problem of TBI classification and outcomes prediction. Our first strategy was to pursue a more objective, quantitative description of the abnormal features of the admission head CT. More quantitative, objective analyses of brain imaging studies in head trauma, with development of standardized metrics for abnormal features, may improve prediction of outcome as well as more immediate triage to appropriate early treatment. Second, we employed principal component analysis, first in a global exploratory analysis for possible relationships among outcome and predictor variables, and second in an analysis of the predictor variables alone as groundwork for construction of an ordinal logistic regression model of the 6-month GOS-E. Reduction of numerous predictor variables, including quantitative rather than qualitative CT predictors, to a smaller set of independent predictors demonstrated a substantial collinearity among certain predictor variables, corroborating the prior Rotterdam CT classification and more recent work of Nelson and associates (Nelson et al., 2010), while demonstrating an improvement in predictive power through the computer-aided quantitative analysis of CT images. The finding of an improvement in predictive power using quantitative CT features, and employing two different types of statistical analysis (global PCA on outcome and predictors, and PCA on predictors followed by logistic regression), is a promising result. Given the small study population and modest follow-up rate, this should be regarded as hypothesis-generating work that warrants further investigation. Future directions will include improvement of software accuracy using enhanced algorithms and possible exploration of different CT slice-reconstruction methods, and validation of our results in a larger multicenter population with a higher follow-up rate to lessen the effects of selection bias. Study of a larger population will also allow consideration of a larger variety of clinical variables in conjunction with qCT predictors. We believe that further development and application of objective computational tools and data-driven analytical methods hold great promise for neurotrauma research and may ultimately have a role in image analysis for clinical care.