The admission noncontrast head computed tomography (CT) scan has been demonstrated to be one of several key early clinical and imaging features in the challenging problem of prediction of long-term outcome after acute traumatic brain injury (TBI). In this study, we employ two novel approaches to the problem of imaging classification and outcome prediction in acute TBI. First, we employ the novel technique of quantitative CT (qCT) image analysis to provide more objective, reproducible measures of the abnormal features of the admission head CT in acute TBI. We show that the incorporation of quantitative, rather than qualitative, CT features results in a significant improvement in prediction of the 6-month Extended Glasgow Outcome Scale (GOS-E) score over a wide spectrum of injury severity. Second, we employ principal components analysis (PCA) to demonstrate the interdependence of certain predictive variables. Relatively few prior studies of outcome prediction in acute TBI have used a multivariate approach that explicitly takes into account the potential covariance among clinical and CT predictive variables. We demonstrate that several predictors, including midline shift, cistern effacement, subdural hematoma volume, and Glasgow Coma Scale (GCS) score are related to one another. Rather than being independent features, their importance may be related to their status as surrogate measures of a more fundamental underlying clinical feature, such as the severity of intracranial mass effect. We believe that objective computational tools and data-driven analytical methods hold great promise for neurotrauma research, and may ultimately have a role in image analysis for clinical care.
Noncontrast head computed tomography (CT) is the imaging test of choice in the initial rapid assessment and triage of acute head trauma patients. The initial head CT directly guides immediate management decisions, including the need for surgical decompression or for invasive intracranial pressure (ICP) monitoring. Follow-up head CT at 2–6h after injury is often performed to assess for growth of intracranial hematomas or brain herniation requiring surgical decompression.
In addition to its role in immediate management decisions, the initial noncontrast head CT is known to carry longer-term prognostic significance following acute traumatic brain injury (TBI). Previous studies have converged upon several key early clinical and imaging parameters that are most highly predictive of long-term outcome: age, components of the Glasgow Coma Scale (GCS) score at admission, pupillary reactivity, hypotension, hypoxia, and certain features of the admission head CT (Chestnut et al., 2000; Eisenberg et al., 1990; Hukkelhoven et al., 2005; Husson et al., 2010; Jacobs et al., 2010a,2010b,2011; Maas et al., 2005,2007; Marshall et al., 1991; MRC CRASH Trial Collaborators, 2008; Murray et al., 2007; Nelson et al., 2010; Perel et al., 2009; Steyerberg et al., 2008).
In the current study, we focus on two new approaches to the problem of imaging classification and outcomes prediction in acute TBI. Our first strategy focuses on features of the admission head CT. The best-known head CT classification systems in acute TBI, the Marshall CT classification (Marshall et al., 1991) and more recent Rotterdam classification (Maas et al., 2005), are based on qualitative features of the admission head CT. These include the presence or absence of any acute traumatic intracranial abnormality (Marshall et al., 1991), presence or absence of significant midline shift, usually defined as exceeding 5mm (Maas et al., 2005; Marshall et al., 1991), presence or absence and severity of basal cistern effacement (Maas et al., 2005; Marshall et al., 1991), presence or absence of a large intracranial hematoma, defined as a hematoma with volume exceeding 25 cubic centimeters (Marshall et al., 1991), presence or absence of acute traumatic subarachnoid hemorrhage (SAH; Maas et al., 2005), and presence or absence of epidural hematoma (EDH; Maas et al., 2005). We explore the feasibility and potential strengths of quantitative CT (qCT), a quantitative approach to the evaluation and description of the abnormal features of the admission head CT. In particular, we use computer-aided image analysis to derive quantitative measures of those qualitative or subjective CT features that have been previously validated as predictors of outcome (Chestnut et al., 2000; Eisenberg et al., 1990; Hukkelhoven et al., 2005; Husson et al., 2010; Jacobs et al., 2010b,2011; Maas et al., 2005,2007; Marshall et al., 1991; MRC CRASH Trial Collaborators, 2008; Murray et al., 2007; Nelson et al., 2010; Steyerberg et al., 2008), and explore the potential of these quantitative CT features to improve outcome prediction. 
It has been proposed in general that more quantitative, objective analysis of brain imaging studies in head trauma, with development of standardized metrics for abnormal features, may improve prediction of outcome, as well as triage to more appropriate early treatment (Saatman et al., 2008).
Our second strategy is to apply principal component analysis (PCA) to predictor variables prior to constructing a regression model of outcome on those variables. A challenging feature of outcomes prediction in general is the existence of numerous potential predictor variables. Many predictors are not independent, but rather are correlated with one another because they are governed at least in part by the same underlying mechanism. The inclusion of variables that are significantly correlated with one another can invalidate the results of regression analyses. In particular, this approach may result in a spurious lack of statistical significance of one or more predictor variables, inaccurate regression coefficients, and erroneous conclusions regarding relationships among the dependent and predictor variables. In the current study, we use PCA to demonstrate partial collinearity of certain predictor variables. Then, through reduction of a set of clinical and quantitative CT predictors to a smaller number of non-covarying predictors using PCA, we both corroborate certain key features of the prior Rotterdam CT classification, and demonstrate improved predictive power through the use of computer-aided quantitative image analysis.
One hundred ninety-one patients (ages 13–97 years) admitted to the neurosurgery intensive care unit of San Francisco General Hospital for acute closed head trauma from September 2008 to June 2009 were scheduled for assessment of the Extended Glasgow Outcome Scale (GOS-E) score at 6 months following injury. Admission GCS scores ranged from 3–15, with a mean of 10.6, standard deviation of 4.4, and median of 13. Of these, 115 patients (60%) completed the GOS-E assessment at 6 months, 71 patients (37%) were lost to follow-up, and 5 patients (3%) declined the GOS-E interview. Of the 115 patients who completed GOS-E assessment at 6 months, 2 patients with a concurrent diagnosis of acute aneurysm rupture at the time of TBI, 4 patients whose earliest head CT had been performed after acute surgical intervention, 2 patients whose initial head imaging consisted of post-contrast head CT, 22 patients at the extremes of age (≤16 years of age or >75 years of age), and 19 patients whose head CT studies demonstrated motion, severe rotation, or exclusion of a portion of the head from the field of view or other technical difficulties precluding automated computer evaluation, were excluded. For the final study population of 66 patients, admission GCS scores ranged from 3–15, with a mean of 11.2, standard deviation of 4.2, and median of 13. All GOS-E scores were assigned through structured interviews performed by a research assistant or research nurse who generally had no specific knowledge of the patient's admission head CT results or admission GCS score, although he or she was generally aware of the patient's age or age range. The interviewers had no knowledge of the hypotheses investigated in this study regarding the use of quantitative CT for outcome prediction.
We used a suite of computer algorithms written in the MATLAB 7.0.1 programming environment (The Mathworks, Natick, MA) and described previously (Yuh et al., 2008), to perform semiautomated evaluation of the initial trauma head CT obtained upon admission for each of the following features: volume of acute subarachnoid hemorrhage (SAH) and intraparenchymal hemorrhage (IPH), volume of acute subdural hemorrhage (SDH), volume of acute epidural hematoma (EDH), midline shift defined as the shift of the centroid of the lateral ventricles relative to the midline falx plane, and volume of the basal cisterns. The software displays these features as color overlays on the original noncontrast head CT data (Fig. 1). For example, in Figure 1 the software depicts suspected SDH and SAH. Each pixel classified as acute hemorrhage represents a volume of 0.923 cubic millimeters, based on a field of view (FOV) of 22cm, a 512×512 matrix, and a slice thickness of 5mm for each head CT image. Normal structures such as the dorsum sellae and midline falx plane are identified automatically, and allow localization and automated measurement of the basal cistern volume and midline shift.
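The per-pixel volume quoted above follows directly from the acquisition parameters. The short Python sketch below reproduces that arithmetic; the function names are ours for illustration and are not part of the software described in this study.

```python
def voxel_volume_mm3(fov_mm: float, matrix: int, slice_thickness_mm: float) -> float:
    """Volume represented by one pixel: in-plane pixel area times slice thickness."""
    pixel_side_mm = fov_mm / matrix
    return pixel_side_mm ** 2 * slice_thickness_mm


def lesion_volume_cm3(n_pixels: int, voxel_mm3: float) -> float:
    """Convert a count of flagged pixels to cubic centimeters (1 cm^3 = 1000 mm^3)."""
    return n_pixels * voxel_mm3 / 1000.0


# Acquisition parameters stated in the text: 22 cm FOV, 512x512 matrix, 5 mm slices.
voxel = voxel_volume_mm3(fov_mm=220.0, matrix=512, slice_thickness_mm=5.0)
print(round(voxel, 3))  # 0.923
```

A hemorrhage volume is then simply the number of pixels confirmed as hemorrhage multiplied by this per-pixel volume.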
Previous work (Yuh et al., 2008) described sensitivity and specificity of the software for acute intracranial hemorrhage, midline shift, and basal cistern effacement, when used in a fully automated fashion. As the original software was designed to have very high sensitivity for acute intracranial hemorrhage at the cost of lower specificity, for the current study we used the software in semiautomated, supervised fashion, allowing a neuroradiologist to either confirm or invalidate each computer designation of an area of acute intracranial hemorrhage. The basis for this was that any spurious intracranial hemorrhage identified by the software would add variability to predictor variables and decrease the strength of the model. To investigate the interobserver reliability of these supervised software interpretations, a second board-certified neuroradiologist reviewed the head CT software interpretations for a random selection of 10 of the 66 study subjects, and classified 77 areas of software-detected acute intracranial hemorrhage as subdural, subarachnoid, or epidural in location, or likely artifactual. Cohen's kappa for agreement between the two neuroradiologists was 0.92 (p<10^-30). The only discrepancies were in 4 of 77 areas, each less than 0.5 cubic centimeter in volume; three of these were designated as subarachnoid in location by one neuroradiologist and subdural in location by the other; and one was designated artifactual by one neuroradiologist and subdural in location by the other.
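For readers unfamiliar with the agreement statistic, the following Python sketch shows how Cohen's kappa is computed from two raters' category labels. The eight labels are a made-up toy example, not the 77 areas reviewed in this study, and the function is an illustration rather than the routine used for the reported value.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for 8 areas; the raters disagree on one of them.
reader_1 = ["SDH", "SDH", "SAH", "SAH", "EDH", "artifact", "SAH", "SDH"]
reader_2 = ["SDH", "SDH", "SAH", "SDH", "EDH", "artifact", "SAH", "SDH"]
kappa = cohens_kappa(reader_1, reader_2)
```

Because kappa discounts agreement expected by chance, it is lower than the raw proportion of matching labels whenever the raters use some categories much more often than others.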
Statistical analyses were performed using SPSS Statistics 18 (SPSS, Chicago, IL) with OMS syntax for bootstrapping analysis. First, for exploratory purposes, we applied PCA to the entire set of variables consisting of both the outcome measure (GOS-E at 6 months), and the predictor variables (age, admission GCS score, and CT features). PCA is a method for uncovering covariances among variables in a data set. Our aim in this first analysis was to use PCA (1) in a global, exploratory fashion to determine which predictor variables, if any, were strongly correlated with GOS-E at 6 months, and (2) to determine whether covariances among the predictor variables themselves (independent of the outcome measure) were significant. The initial, “exploratory” PCA was performed on two sets of variables. One set of variables consisted of (1) GOS-E score at 6 months, (2) patient age, (3) admission GCS score, and (4) the component features of the Rotterdam CT classification, consisting of qualitative assessments of the admission head CT for the presence or absence of midline shift exceeding 5mm, the presence or absence and severity of basal cistern effacement, the presence or absence of acute traumatic SAH, and the presence or absence of EDH. The component features of the Rotterdam CT classification were determined by review of CT studies by a board-certified neuroradiologist (E.L.Y.), without concurrent access to the patient's age, admission GCS score, or 6-month GOS-E score. The other set of variables consisted of GOS-E score at 6 months, age, GCS score, presence or absence and severity of basal cistern effacement, and quantitative CT parameters, including the volumes of subarachnoid, subdural and epidural hemorrhage, and severity of midline shift. The exploratory PCA was coupled to a bootstrapping procedure to mitigate the impact of spurious relationships among variables and to increase the reliability of pattern detection.
We employed PCA analysis again, this time for the purpose of reducing the set of predictor variables to an equivalent, non-covarying set of variables. We emphasize that for this second PCA analysis, the outcome measure (GOS-E at 6 months) was omitted from the variable set. This is because the aim of this second PCA analysis was to transform the original, covarying predictor variables to an equivalent set of independent predictor variables suitable for use in construction of a regression model for the prediction of 6-month GOS-E. PCA was thus again performed on two sets of candidate predictor variables. The first set of predictor variables consisted of (1) patient age, (2) admission GCS score, and (3) the qualitative component features of the Rotterdam CT classification as described above. The second set of predictor variables consisted of age, GCS score, and quantitative CT parameters, including volumes of subarachnoid, subdural and epidural hemorrhage, and severity of midline shift. As before, both sets of candidate predictor variables included a qualitative assessment of basal cistern effacement (none, partial effacement, or complete effacement).
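The transformation from covarying predictors to uncorrelated component scores can be sketched as follows. This is a minimal NumPy illustration on synthetic data, assuming PCA on the correlation matrix of standardized variables; the analyses in this study were run in SPSS, and the variable stand-ins and the latent "mass effect" factor below are hypothetical.

```python
import numpy as np


def pca_scores(X):
    """PCA on the correlation matrix: eigenvalues (descending), eigenvectors, scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each column
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                               # project cases onto components
    return eigvals, eigvecs, scores


# Synthetic stand-ins (not study data): three predictors share one latent factor.
rng = np.random.default_rng(0)
n_cases = 66
latent = rng.normal(size=n_cases)
X = np.column_stack([
    latent + 0.5 * rng.normal(size=n_cases),    # stand-in for SDH volume
    latent + 0.5 * rng.normal(size=n_cases),    # stand-in for midline shift
    -latent + 0.5 * rng.normal(size=n_cases),   # stand-in for GCS score
    rng.normal(size=n_cases),                   # stand-in for age
])
eigvals, eigvecs, scores = pca_scores(X)
score_cov = np.cov(scores, rowvar=False)        # diagonal up to rounding error
```

The covariance matrix of the scores is diagonal, with the eigenvalues on the diagonal, which is the property that makes the component scores suitable as non-covarying regression predictors.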
Following principal component analysis of each of the two sets of predictors, ordinal logistic regression of the 6-month GOS-E score upon each of the two sets of PCA-derived predictors was performed using a logit link function. Significance, R-squared, coefficients for the ordinal logistic regression model, and their corresponding p values and 95% confidence intervals were determined for each of the two models.
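As a brief illustration of what an ordinal logistic model with a logit link predicts, the sketch below converts a linear predictor into probabilities for the eight GOS-E categories. It assumes one common cumulative-logit parameterization, logit P(Y ≤ k) = θ_k − η; the threshold values are hypothetical and are not estimates from this study.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, eta):
    """P(Y = k) for ordered categories, given increasing thresholds and linear predictor eta."""
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)   # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = c
    return probs


# Seven hypothetical thresholds separate the eight GOS-E categories (1 through 8).
thresholds = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
probs_baseline = category_probabilities(thresholds, eta=0.0)
probs_shifted = category_probabilities(thresholds, eta=1.0)
```

Under this parameterization, a one-unit increase in the linear predictor multiplies the odds of falling at or below any given category by exp(−1), the same factor at every threshold; this is the proportional odds property.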
Figures 2 through 4 demonstrate characteristics of the study population. Figure 2b shows that slightly over half of the study population had an admission GCS score between 13 and 15, with the remainder divided between GCS scores in the moderate (9–12) and severe (3–8) head injury categories.
Figure 3 demonstrates qualitative features of the admission head CT for the study population, including all of the individual component features of the Rotterdam CT classification. Subdural hematoma was identified in over half of patients (Fig. 3a). Subarachnoid and/or intraparenchymal hemorrhage was seen in over 70% of patients (Fig. 3b). Epidural hematoma occurred in fewer than 10% of cases (Fig. 3c), while slightly over 10% of patients had midline shift exceeding 5mm (Fig. 3d). Figure 3e shows that approximately 25% of patients had some basal cistern effacement, divided evenly between partial and complete effacement.
Figure 4 also demonstrates admission head CT results. Unlike Figure 3, however, Figure 4 shows quantitative, rather than qualitative, versions of the CT features. These include volumes of subdural hematoma (Fig. 4a), subarachnoid/intraparenchymal hemorrhage (Fig. 4b), and epidural hematoma (Fig. 4c). Quantitative midline shift consisted of the measured displacement of the centroid of the lateral ventricles relative to the falx cerebri plane (Fig. 4d). The largest midline shift was greater than 1.5cm (Fig. 4d). Over half of patients had at least trace subdural hemorrhage (Fig. 4a). Subdural hemorrhages ranged from very small volumes up to nearly 40 cubic centimeters; small subdural hemorrhages up to 5 cubic centimeters were most common (Fig. 4a). Though seen in over 70% of patients, subarachnoid/intraparenchymal hemorrhage was observed most commonly in small quantities up to 3 cubic centimeters (Fig. 4b).
Figure 5 shows the distribution of the outcome measure, the GOS-E score at 6 months after head injury. The mean 6-month GOS-E score was 5.2 with a standard deviation of 2.5, and median of 6.
Tables 1 and 2 show results from the “exploratory” PCA applied to the entire set of variables consisting of both 6-month GOS-E and quantitative CT features. We used a bootstrapping technique, performing 2000 iterations of the analysis, with each iteration carried out on a random sample of 60% of the original cases. Table 1 shows the loading matrix and Table 2 the eigenvalues and percentage of variance explained by each of the components, where each value in these tables is derived from averaging over the 2000 iterations. Although the size of our study population is relatively small, prior highly-cited work in PCA (Nunnally, 1978; Tabachnick and Fidell, 2007) has suggested 5 to 10 samples for each predictor variable is adequate; this criterion is satisfied in our analysis.
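The resampling scheme can be sketched in Python as below. This is an illustrative NumPy version on synthetic data, not the SPSS/OMS procedure used in the study. It assumes each 60% sample is drawn without replacement, and it adds a sign-alignment step (our assumption, needed because eigenvector signs are arbitrary) before averaging loadings.

```python
import numpy as np


def pca_loadings(X, n_components):
    """Leading eigenvalues and loadings (eigenvector * sqrt(eigenvalue)) of the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = eigvals[order]
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings


def subsample_pca(X, n_iter=2000, fraction=0.6, n_components=2, seed=0):
    """Average eigenvalues and loadings over repeated random subsamples of the cases."""
    rng = np.random.default_rng(seed)
    n_cases = X.shape[0]
    n_draw = int(round(fraction * n_cases))
    _, reference = pca_loadings(X, n_components)      # full-sample loadings fix the signs
    eig_sum = np.zeros(n_components)
    load_sum = np.zeros_like(reference)
    for _ in range(n_iter):
        rows = rng.choice(n_cases, size=n_draw, replace=False)
        eigvals, loadings = pca_loadings(X[rows], n_components)
        signs = np.sign(np.sum(loadings * reference, axis=0))
        signs[signs == 0] = 1.0
        eig_sum += eigvals
        load_sum += loadings * signs                  # flip columns to match the reference
    return eig_sum / n_iter, load_sum / n_iter


# Synthetic stand-ins (not study data): three variables share one latent factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=66)
X = np.column_stack([
    latent + 0.5 * rng.normal(size=66),    # stand-in for SDH volume
    latent + 0.5 * rng.normal(size=66),    # stand-in for midline shift
    -latent + 0.5 * rng.normal(size=66),   # stand-in for GCS score
    rng.normal(size=66),                   # stand-in for age
])
mean_eigvals, mean_loadings = subsample_pca(X)
```

This sketch does not handle components that swap order between subsamples when their eigenvalues are close, and it would need a guard for variables that happen to be constant within a subsample.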
Examination of the eigenvalues and percent variance explained (Table 2) demonstrates the presence of 2 principal components with eigenvalues equal to or greater than 1.0. The first principal component accounts for 41% of the variance in the data. The loading matrix (Table 1) demonstrates significant loading (coefficient ≥0.3) on this first principal component by the following variables: 6-month GOS-E score, GCS score, subdural hematoma volume, quantitative midline shift, and cistern effacement. The loading of subarachnoid hemorrhage/hemorrhagic contusion volume on the first component is 0.22, falling short of the conventional 0.3 threshold for significant loading (Tabachnick and Fidell, 2007). Table 2 shows that the second principal component accounts for an additional 16% of the variance in the data; Table 1 demonstrates strong loading on this second principal component by epidural hematoma size, with smaller loading by the patient's age. The significant co-loadings by GOS-E, GCS, and certain quantitative CT features on the first principal component in this case suggest that (1) certain quantitative CT features are strongly correlated with, and therefore should be predictive of, the outcome measure (6-month GOS-E), and (2) there is, in addition, a substantial relationship (covariance) among several clinical and quantitative CT predictor variables themselves, making their direct unaltered use in a regression model for prediction of 6-month GOS-E possibly problematic.
Tables 3 and 4 show results for a similar PCA applied to the set of variables consisting of 6-month GOS-E, age, GCS score, and qualitative CT features. Examination of the eigenvalues and percent variance explained (Table 4) demonstrates the presence of 3 principal components with eigenvalues equal to or greater than 1.0. This global PCA of clinical and qualitative CT features again demonstrates an interdependence of 6-month GOS-E, GCS, and qualitative CT predictors related to intracranial mass effect (SDH volume, shift, and cistern effacement). Although similar to the results for quantitative CT, the loading coefficients for GOS-E and quantitative CT in Tables 1 and 2 were larger. This suggests that the 6-month GOS-E is more strongly correlated with quantitative CT predictors than with qualitative CT predictors. Therefore, a regression model of GOS-E on age, GCS, and quantitative CT features should be stronger than one employing only qualitative CT features.
The initial global exploratory PCA revealed a substantial covariance among certain clinical and qCT predictors: subdural hematoma volume, quantitative midline shift, and basal cistern effacement, and GCS. This observation is separate and independent of the finding of a substantial covariance of the outcome measure (6-month GOS-E) with these predictors. Therefore, we next performed PCA on the predictors alone, to further elucidate the relationship among these predictors, as well as to construct a new set of non-covarying predictors suitable for use in a regression model for prediction of 6-month GOS-E.
Table 5 shows the loading matrix resulting from PCA applied to predictor variables only, comprised of age, admission GCS, and quantitative CT features. This PCA analysis of clinical factors and qCT features yielded a set of 7 components. The components were of two types: those driven primarily by one feature and those driven by a combination of features. A principal component that is driven purely by a single predictor demonstrates a coefficient of exactly 1.00 for a single predictor, and a coefficient of 0.00 for all other predictors under consideration. This idealized scenario does not arise in the situation of real experimental data. However, we can define, for the purpose of this discussion, a principal component driven predominantly by a single predictor as one that is characterized by a single very large coefficient (≥0.98), and no other coefficient ≥0.3. For example, Table 5 shows that components 3, 5, and 6 were each driven predominantly by a single feature (age, SAH/IPH volume, and epidural hematoma volume, respectively), indicating that these features had little collinearity with each other or with other predictors. In contrast, components 1, 2, 4, and 7 consisted of significant contributions from more than one qCT predictor, including subdural hematoma volume, severity of midline shift, and severity of basal cistern effacement, indicating multicollinearity among these predictors. Component 2 demonstrated significant collinearity of subdural hematoma volume, cistern effacement, and severity of midline shift, and Component 1 demonstrated a negative correlation of admission GCS with severity of cistern effacement; both of these result from an entirely data-driven PCA approach, yet are also intuitively satisfying.
Table 6 shows the loading matrix resulting from PCA analysis of age, GCS, and qualitative features of the admission CT, including all components of the Rotterdam CT classification. Components 4, 5, and 6 were each driven predominantly by a single feature (subarachnoid/intraparenchymal hemorrhage, epidural hematoma, and age, respectively). As expected, there was re-demonstration of a negative correlation between GCS score and severity of basal cistern effacement, as seen in Component 2.
Table 7 compares the results of ordinal logistic regression of 6-month GOS-E score upon clinical and CT features. Both qualitative CT and quantitative CT models were statistically significant (p≤10^-5). With incorporation of quantitative rather than qualitative CT features, the Nagelkerke R-squared improved from 43% to 51%. Thus, quantitative CT features, age, and GCS account for approximately 51% of the variability in 6-month GOS-E scores after acute head injury, compared to 43% when qualitative CT features, age, and GCS are used.
To address the issue of whether the difference in prediction strength between the quantitative and qualitative CT models is more than could be expected by chance, we performed permutation testing. First, we calculated the absolute values of the residuals for each of the two different models. The mean difference between the absolute values of the residuals for the quantitative and qualitative CT models was 0.27, signifying that the mean improvement in GOS-E prediction per patient by the quantitative CT model relative to the qualitative CT model, averaged across all 66 subjects, was 0.27. We then constructed a permutation distribution of the mean difference, based on 10^5 random permutation resamples of the 66 subjects with the assumption that there is no difference between the two models. The p-value was 0.03 for the likelihood that the difference in means of the absolute values of the residuals between the two models would equal or exceed 0.27 by random chance.
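One way to carry out such a test is sketched below in Python. The text does not specify the exact resampling scheme, so this sketch assumes a paired design in which, under the null hypothesis, the two models' absolute residuals for each subject are exchangeable; this amounts to randomly flipping the sign of each per-subject difference. The residual differences shown are hypothetical values, not study data.

```python
import random


def paired_permutation_p(differences, n_resamples=10_000, seed=0):
    """One-sided p-value for the mean paired difference under random sign flips."""
    rng = random.Random(seed)
    n = len(differences)
    observed = sum(differences) / n
    hits = 0
    for _ in range(n_resamples):
        # Under the null, swapping the two models' labels flips the sign of a difference.
        total = sum(d if rng.random() < 0.5 else -d for d in differences)
        if total / n >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_resamples + 1)


# Hypothetical per-subject differences: |residual, qualitative| - |residual, quantitative|.
# Positive values mean the quantitative model was closer to the observed outcome.
example_diffs = [0.9, -0.3, 0.5, 0.1, -0.6, 1.2, 0.4, -0.1, 0.7, 0.2]
observed_mean, p_value = paired_permutation_p(example_diffs)
```

The number of resamples is a parameter; a larger value gives a more precise p-value at the cost of run time.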
Table 8 shows the estimated B coefficients and corresponding significance levels in the ordinal logistic regression of 6-month GOS-E on age, GCS, and quantitative CT predictors using a logit link function. As shown, Components 1, 3, 4, and 7 (in boldface) are statistically significant predictors of the dependent variable, the 6-month GOS-E, in this model. The major clinical and quantitative CT predictors that drive each component (extracted from Table 5) are summarized in the second column. As shown, Component 4, driven primarily by subdural hematoma volume, and to lesser degree midline shift, demonstrates the largest magnitude of the B coefficient (with corresponding odds ratio, exp-B, per unit reduction in the 6-month GOS-E). Components 1, 3, and 7, driven predominantly by GCS, age, and cistern effacement, also demonstrated significant contributions to the model.
A challenging feature of outcomes prediction is the capability of measuring and including numerous potential predictor variables. The inclusion of numerous predictors, some or many of which provide little or no predictive power to a model, reduces the overall significance of the model. For example, as additional predictors are added to a regression model, the R-squared statistic generally increases, but the adjusted R-squared, which includes a penalty for each additional predictor in the model, may sharply decrease if the added predictor(s) do not significantly contribute added value to the true predictive power of the model.
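The penalty described above can be made concrete with the usual adjusted R-squared formula for linear regression, 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of cases and p the number of predictors. The Python sketch below uses hypothetical R-squared values chosen only to show the effect; they are not results from this study.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R-squared: penalizes R-squared for each predictor in the model."""
    degrees_of_freedom = n_samples - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / degrees_of_freedom


# Hypothetical example with 66 cases: adding a sixth, nearly uninformative
# predictor raises R-squared slightly but lowers the adjusted value.
adj_five = adjusted_r_squared(0.400, n_samples=66, n_predictors=5)
adj_six = adjusted_r_squared(0.402, n_samples=66, n_predictors=6)
```

In this example the raw R-squared rises from 0.400 to 0.402 while the adjusted value falls, because the small gain in fit does not offset the lost degree of freedom.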
A more serious drawback of inclusion of too many predictor variables is that two or more of the predictors may not be independent of one another, but rather may be correlated with one another, because they are driven at least in part by the same underlying principle or mechanism. If two or more predictor variables that are significantly correlated with one another are considered within a regression analysis, there may be a resulting spurious lack of statistical significance of one or more of these predictor variables, inaccurate regression coefficients, and erroneous conclusions regarding the relationships among the dependent and predictor variables.
Principal component analysis is an accepted method for simplifying a set of numerous variables into a smaller fundamental set of variables that (1) are not correlated with one another, and (2) contain most of the variability in the predictor variables (Shlens, 2009). In the current study, reduction of numerous predictor variables to a smaller number of uncorrelated predictors had several advantages over prior models: (1) demonstration of the collinearity among certain predictor variables, which in itself yields important insights, (2) reduction of collinear variables to a smaller set of independent (uncorrelated) predictors, allowing for more stable and consistent regression results, (3) corroboration of certain key aspects of the prior Rotterdam CT classification, and (4) improvement of predictive power over previous outcomes models, through the computer-aided quantitative analysis of the admission head CT.
Our initial “global” exploratory PCA performed on the set of variables including both GOS-E and predictor variables demonstrated not only a significant correlation of GOS-E with certain predictors, but also indicated significant covariance among certain predictor variables themselves. This finding justified a second PCA analysis, which was performed on the predictor variables alone.
The second PCA analysis yielded important insights into underpinnings of previous successful outcomes models, such as the Rotterdam classification. In the PCA analyses of clinical and CT predictors, principal components that clearly corresponded to intracranial mass effect were identified. These components were characterized by significant contributions from midline shift, basal cistern effacement, and subdural hematoma volume, and little contribution from other variables. These mass effect principal components, governed primarily by subdural hematoma size, midline shift, and cistern effacement, were shown to be powerful predictors of 6-month GOS-E scores. This result strongly corroborates the Rotterdam CT classification, in which features of mass effect account for 3 points in a maximum possible score of 6 points in the prediction of likelihood of 6-month mortality after TBI.
Several other observations of the loading coefficients of the PCA of clinical and qCT predictors were also intuitively satisfying. In particular, GCS was negatively correlated with the severity of cistern effacement. The volume of subdural hematoma was significantly correlated with degree of midline shift and with severity of basal cistern effacement, but was not significantly correlated with volume of subarachnoid/intraparenchymal hemorrhage. Although epidural hematoma volume was not found to be a statistically significant predictor in our model, this was most likely attributable to the small number of epidural hematoma cases; a trend toward significance (p=0.25) was observed (Table 8). Similarly, although subarachnoid/intraparenchymal hemorrhage volume was not a statistically significant predictor, a trend toward significance (p=0.06) was demonstrated (Table 8). Patient age, epidural hematoma volume, and subarachnoid/intraparenchymal hemorrhage volume were not significantly correlated with each other or with other predictors.
Our study also shows that with incorporation of quantitative rather than qualitative CT features, the results of logistic regression of 6-month GOS-E score upon clinical and CT features improve substantially (Table 7). Two of the best-known head CT classification systems in acute TBI are the Marshall CT classification and subsequent Rotterdam classification. The Marshall CT classification (Marshall et al., 1991) divided severe head trauma patients into 6 groups according to head CT findings, and has been widely used for descriptive purposes; later, it was also used for prediction of mortality. The subsequently developed Rotterdam CT classification (Maas et al., 2005) achieved an improved discriminative value for the prediction of long-term outcome (6-month mortality) through regrouping of some CT features underlying the Marshall classification, and inclusion of CT evidence of epidural hematoma and traumatic subarachnoid hemorrhage as additional predictors in the model. Although the Marshall and Rotterdam classifications have demonstrated prognostic power, the head CT features in these classification schemes are qualitative features that are susceptible to observer bias. Some features, particularly the assessment of basilar cistern effacement and subfalcine herniation (Maas et al., 2007), and the presence or absence of parenchymal contusions/hematomas (Chun et al., 2007; Laalo et al., 2009) have been demonstrated to be limited by interobserver variability. In addition, trace quantities of intracranial hemorrhage in these classifications are not differentiated from larger volumes of hemorrhage, even though it has been suggested that in milder TBI, the volume of intracranial subarachnoid hemorrhage may be a marker of overall severity of brain injury (Chieregato et al., 2005). 
We demonstrate that use of quantitative rather than qualitative CT features along with age and GCS as predictors results in a substantial increase in the adjusted R-squared, to 0.50 from 0.43 (Table 7). Thus, qCT features, age, and GCS accounted for 50% of the variability in 6-month GOS-E score after acute head injury, compared to 43% when qualitative CT features, age, and GCS were used.
The purpose of the current study was not to construct a comprehensive outcome model, as in prior studies (Hukkelhoven et al., 2005; MRC CRASH Trial Collaborators, 2008; Murray et al., 2007; Perel et al., 2009; Steyerberg et al., 2008). Rather, we sought primarily to compare the effectiveness of quantitative versus qualitative CT measures in predicting outcome, using a fixed set of clinical features that have previously been established as strong predictors of outcome. Steyerberg and associates (2008) did essentially the inverse, using a fixed CT classification system (Marshall CT classification) as a predictor, while varying the clinical and laboratory predictors. Other previous models of outcomes have also used different inclusion criteria and different statistical approaches, making direct comparison of our results to these prior studies difficult. The three-tier outcomes prediction model by Steyerberg and colleagues (2008) for the prediction of mortality and GOS score at 6 months after injury was developed using the IMPACT database. A core model based only on age, motor GCS score, and pupillary reactivity provided baseline discriminatory ability with areas under the curve (AUCs) ranging from 0.66–0.84. Slightly higher discriminative ability with AUCs ranging from 0.71–0.87 could be achieved through augmentation by additional tiers of predictors, including Marshall CT classification, hypotension, hypoxia, presence or absence of epidural hematoma, presence or absence of traumatic subarachnoid hemorrhage, and serum glucose and hemoglobin on admission. Binomial logistic regression and proportional odds logistic regression analysis were used for that study, appropriate for the binary outcome (fatality versus nonfatality at 6 months), and ordinal outcome (GOS score at 6 months) measures considered in that study.
Although our study has some similarities, a direct quantitative comparison of our results to these is difficult, as that study and many other prior studies (Eisenberg et al., 1990; Jacobs et al., 2010b, 2011; Hukkelhoven et al., 2005; Maas et al., 2005, 2007; Steyerberg et al., 2008) or meta-analyses (Chestnut et al., 2000; Husson et al., 2010) included only moderate and severe head-injury patients, while fully half of our study population consisted of mild head injury (GCS score 13–15), with the other half divided between moderate and severe head injury (GCS score 3–12). Furthermore, prior studies have used 6-month mortality (Maas et al., 2005; Nelson et al., 2010; Steyerberg et al., 2008), or a dichotomized version of the 6-month GOS score (Nelson et al., 2010; Perel et al., 2008; Steyerberg et al., 2008), in contrast to the 6-month GOS-E score used in the current study, which includes the wide spectrum of mild, moderate, and severe head injury.
A recent study did analyze the relationship between 6-month mortality and quantitative CT measures of hematoma volume and midline shift in 605 patients, finding a monotonically increasing relationship between mortality at 6 months and both hematoma volume and midline shift (Jacobs et al., 2011). Our findings corroborate these results, but suggest, on the basis of a global simultaneous PCA of numerous CT features, that subdural hematoma volume, midline shift, and basal cistern effacement are not independent predictors, but are significantly correlated with one another. Our results, in fact, are very closely related to those in the relatively few prior studies that have used a multivariate approach that explicitly takes into account the covariance among clinical and CT predictors (Eisenberg et al., 1990; Nelson et al., 2010), rather than univariate or stepwise multivariate regression approaches that do not. Recently, Nelson and colleagues (2010) performed a detailed analysis of features of the admission head CT and their relationship to outcome; as part of this analysis, they demonstrated a significant interdependence of certain quantitative CT predictors such as midline shift and hematoma volume (measured manually by a radiologist). They concluded, for example, that a separate measure of hematoma volume is redundant if quantitative midline shift is included, because of the strong collinearity between these predictors. Furthermore, they found that quantitative midline shift, measured manually by the radiologist, was the single strongest CT predictor affecting outcome.
Our approach differs from that of Nelson and colleagues (2010) in two respects: (1) our quantitative measures of midline shift and hematoma volume are derived from computer analysis of images, and (2) Nelson and associates (2010) used a very different statistical technique, the support vector machine, rather than PCA, to demonstrate strong covariances among quantitative CT predictor variables. Nevertheless, our results are strikingly similar. These results support the intuitively appealing interpretation that the three CT predictors of midline shift, hematoma volume, and basal cistern effacement may be important predominantly as surrogates for a more fundamental predictor, such as prolonged elevated intracranial pressure.
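The surrogate interpretation can be illustrated with a small simulation. In the sketch below, a single unobserved variable drives three noisy predictors, and the first principal component of their correlation matrix then absorbs most of the total variance, with loadings of the same sign on all three. This is a toy illustration on synthetic data under assumed noise levels; it does not use the study data or the analysis software, and the variable names are only labels borrowed from the discussion above.

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        standardized.append([(v - mean) / sd for v in col])
    k = len(columns)
    return [[sum(a * b for a, b in zip(standardized[i], standardized[j])) / (n - 1)
             for j in range(k)] for i in range(k)]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    return eigenvalue, vec


# Synthetic data: one hidden driver plus independent noise (assumed SD 0.5).
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(500)]
midline_shift = [x + rng.gauss(0, 0.5) for x in latent]
hematoma_volume = [x + rng.gauss(0, 0.5) for x in latent]
cistern_effacement = [x + rng.gauss(0, 0.5) for x in latent]

corr = correlation_matrix([midline_shift, hematoma_volume, cistern_effacement])
eigenvalue, loadings = leading_component(corr)
explained = eigenvalue / len(corr)  # share of total standardized variance
print(f"first component explains {explained:.0%} of the variance")
print("loadings:", [round(v, 2) for v in loadings])
```

A dominant first component of this kind is what one would expect if the measured predictors largely reflect one shared underlying quantity, although the simulation cannot say what that quantity is in real patients.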
A limitation of our study is that the sample size is far smaller than that analyzed in many prior studies, such as the Marshall CT classification study (746 patients; Marshall et al., 1991), the Rotterdam CT study (2269 patients from the tirilazad trials; Maas et al., 2005), the CRASH trial (10,008 patients; Perel et al., 2008), the Karolinska study (861 patients; Nelson et al., 2010), and the IMPACT study (8509 patients; Steyerberg et al., 2008). Additional important limitations are the 40% rate of loss to follow-up or denial of the GOS-E interview, and the exclusion of an additional 17% of the initial cohort whose admission head CT could not be evaluated by the software due to the presence of contrast, non-traumatic acute intracranial hemorrhage, severe motion artifact, or other technical difficulties, raising concern for potential effects of selection bias on our results. Patients ≤16 years old or >75 years old were not included in the study, and our results therefore cannot be generalized to those at the extremes of age. In addition, the initial cohort consisted only of patients admitted to the neurosurgical intensive care unit (ICU), and therefore would tend to exclude milder head injuries from the study population. Finally, it is certainly possible that the quantitative parameters we present here (hematoma volume and quantitative midline shift) actually derive their predictive value mainly as surrogates for a confounding factor of which we are not aware (e.g., not intracranial mass effect, but a different fundamental parameter such as coagulopathy, which could result, for example, in a higher rate of hematoma growth, and thus larger measured hematoma volumes and/or midline shifts at any given time point when averaged across patients).
In summary, in the current study, we applied two new approaches to the problem of TBI classification and outcomes prediction. Our first strategy was to pursue a more objective, quantitative description of the abnormal features of the admission head CT. More quantitative, objective analyses of brain imaging studies in head trauma, with development of standardized metrics for abnormal features, may improve prediction of outcome as well as more immediate triage to appropriate early treatment. Second, we employed principal component analysis, first in a global exploratory analysis for possible relationships among outcome and predictor variables, and second in an analysis of the predictor variables alone as groundwork for construction of an ordinal logistic regression model of the 6-month GOS-E. Reduction of numerous predictor variables, including quantitative rather than qualitative CT predictors, to a smaller set of independent predictors demonstrated a substantial collinearity among certain predictor variables, corroborating the prior Rotterdam CT classification and more recent work of Nelson and associates (Nelson et al., 2010), while demonstrating an improvement in predictive power through the computer-aided quantitative analysis of CT images. The finding of an improvement in predictive power using quantitative CT features, and employing two different types of statistical analysis (global PCA on outcome and predictors, and PCA on predictors followed by logistic regression), is a promising result. Given the small study population and modest follow-up rate, this should be regarded as hypothesis-generating work that warrants further investigation. Future directions will include improvement of software accuracy using enhanced algorithms and possible exploration of different CT slice-reconstruction methods, and validation of our results in a larger multicenter population with a higher follow-up rate to lessen the effects of selection bias. 
Study of a larger population will also allow consideration of a larger variety of clinical variables in conjunction with qCT predictors. We believe that further development and application of objective computational tools and data-driven analytical methods hold great promise for neurotrauma research and may ultimately have a role in image analysis for clinical care.
We thank Pratik Mukherjee, M.D., Ph.D., for evaluation of a subset of computer-aided head CT interpretations for the purpose of determination of interrater reliability for the semi-automated quantitative CT measurements. We gratefully acknowledge support from National Institutes of Health grants NS067092 and NS069537 (PI: A.R.F.) and NS069409 (PI: G.T.M.).
No competing financial interests exist.