|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: NK CAB MLL. Performed the experiments: NK MLL. Analyzed the data: NK MLL. Contributed reagents/materials/analysis tools: NK CAB MK MLL. Wrote the paper: NK CAB MK MLL.
Group-wise analyses of DTI in mTBI have demonstrated evidence of traumatic axonal injury (TAI), associated with adverse clinical outcomes. Although mTBI is likely to have a unique spatial pattern in each patient, group analyses implicitly assume that location of injury will be the same across patients. The purpose of this study was to optimize and validate a procedure for analysis of DTI images acquired in individual patients, which could detect inter-individual differences and be applied in the clinical setting, where patients must be assessed as individuals.
After informed consent and in compliance with HIPAA, 34 mTBI patients and 42 normal subjects underwent 3.0 Tesla DTI. Four voxelwise assessment methods (standard Z-score, “one vs. many” t-test, Family-Wise Error Rate control using pseudo t-distribution, EZ-MAP) for use in individual patients, were applied to each patient’s fractional anisotropy (FA) maps and tested for its ability to discriminate patients from controls. Receiver Operating Characteristic (ROC) analyses were used to define optimal thresholds (voxel-level significance and spatial extent) for reliable and robust detection of mTBI pathology.
ROC analyses showed EZ-MAP (specificity 71%, sensitivity 71%), “one vs. many” t-test and standard Z-score (sensitivity 65%, specificity 76% for both methods) resulted in a significant area under the curve (AUC) score for discriminating mTBI patients from controls in terms of the total number of abnormal white matter voxels detected while the FWER test was not significant. EZ-MAP is demonstrated to be robust to assumptions of Gaussian behavior and may serve as an alternative to methods that require strict Gaussian assumptions.
EZ-MAP provides a robust approach for delineation of regional abnormal anisotropy in individual mTBI patients.
Measures of fractional anisotropy (FA) derived from Diffusion Tensor Imaging (DTI) reveal white matter abnormalities in mTBI, consistent with traumatic axonal injury (TAI), the presumptive pathologic substrate of adverse clinical outcomes after TBI (e.g.,–). Voxelwise analyses applied to mTBI research, almost universally compare groups of individuals. These studies thus implicitly assume that the spatial distribution of mTBI pathology will be the same across subjects, as only changes affecting a common location across the patient group will be identified as abnormal. This approach is inherently insensitive to intersubject variation in location of pathology. Since the spatial distribution of mTBI pathology among individual patients depends upon location and mechanism of injury, and given the wide variation in mechanism of injury and patient characteristics, this is a highly questionable assumption , . Furthermore, clinical use of DTI requires assessment of individual patients. An approach to identifying loci of brain injury in individual mTBI patients is needed to fully understand the nature and extent of mTBI pathology toward personalizing and improving clinical practice.
Several studies have assessed DTI in individuals , , –. Viviani, et al.  applied a pseudo t-statistic with spatially smoothed standard deviation and degrees of freedom (DF) calibrated by cross-validation. They identified abnormal regions of the apparent diffusion coefficient (ADC) for single stroke and glioblastoma patients, with thresholds optimized for the Family-Wise Error Rate (FWER) based on the calibrated pseudo t-distribution. In many neuroimaging studies focused on lesion detection, spatial smoothing has not been carried out due to the risk for blurring of lesion margins. However, the FWER for control of Type-I errors in neuroimaging data may be overly conservative, especially when the images are not smoothed sufficiently . The “one vs. many” T-test approach, employing a priori thresholds (individual voxel and cluster level) has been previously applied to mTBI patients , ,  and the standard Z-score approach , . However, these authors did not report validation or effectiveness testing of their thresholds.
This study aims to validate “Enhanced Z-score Microstructural Assessment of Pathology” (EZ-MAP) described by Lipton et al. , for detection of regional FA abnormalities in individual mTBI patients, and to compare EZ-MAP to previously reported methods. Like other studies , , –, , EZ-MAP compares a patient’s FA value to those from a normal reference group at each voxel. Therefore, assessment of abnormality for each voxel entirely depends on summary statistics, i.e., mean and standard deviation, from the chosen reference group. It follows that final results may vary with the composition of the reference group, with potential for highly unreliable inferences when the reference group is small as it was in previous studies (10–11 subjects in the reference groups reported by , , , ). We employed a bootstrap procedure to overcome the potential for sample-to-sample variation of Z-scores. We also address limitations of all the prior approaches including EZ-MAP and perform specific validation addressing robustness, sensitivity, specificity and diagnostic utility.
After Albert Einstein College of Medicine Institutional Review Board (IRB) approval, Health Insurance Portability and Accountability Act (HIPAA) compliance and written informed consent, subjects were prospectively enrolled, distinct from clinical care. Thirty-four mTBI patients from one hospital emergency department met inclusion/exclusion criteria (Table 1) and were enrolled between August 2006 and May 2010. Forty-two control subjects with no history of head injury were recruited through advertisements.
Imaging was performed at 3.0-T (Achieva; Philips Medical Systems, Best, the Netherlands) using an eight-channel head coil (Sense Head Coil; Philips Medical Systems). T1-weighted whole-head structural imaging was performed using sagittal three-dimensional magnetization-prepared rapid acquisition gradient echo (MP-RAGE; 9.9/4.6; field of view, 240 mm; matrix, 240×240; and section thickness, 1 mm). T2-weighted whole-head imaging was performed using axial two-dimensional turbo spin-echo (4000/100; field of view, 240 mm; matrix, 384×512; and section thickness, 4.5 mm) and axial two-dimensional fluid-attenuated inversion recovery turbo spin-echo (1100/120; inversion time, 2800 msec; field of view, 240 mm; matrix, 384×512; section thickness, 4.5 mm; and average number of signals acquired, one) imaging. DTI was performed using single-shot echo-planar imaging (3800/88; field of view, 240 mm; matrix, 112×89; section thickness, 4.5 mm; independent diffusion sensitizing directions, 32; and b=1000 sec/mm2).
The American Board of Radiology certified neuroradiologist reviewed MR images of all subjects (patients and controls) in random sequence, blind to clinical information and group membership (patient or control). The 33 diffusion-weighted image sets (32 diffusion sensitizing directions and the b=0 sec/mm2 image) were corrected for head motion and eddy current effects using an affine registration algorithm. FA was derived from DTI at each voxel using the FMRIB Diffusion Toolbox . Preprocessing procedures implemented for DTI included skull stripping, echo-planar imaging distortion correction, intermediate rigid-body registration, registration to standard space, transformation of DTI to standard space, and white matter segmentation, in sequence. Non-brain voxels were removed from the MP-RAGE and turbo spin-echo images using FMRIB-FSL software . Each brain volume was inspected section-by-section, and residual non-brain voxels were removed manually. Turbo spin-echo images were acquired with the same section thickness, position and orientation as DTI. Distortion correction was accomplished using a nonlinear deformation algorithm to match each echo-planar image to the corresponding turbo spin-echo volumes . For intermediate rigid-body registration, each subject’s turbo spin-echo images were registered to their three-dimensional MP-RAGE volume using the Automated Registration Toolbox  three-dimensional rigid-body approach . For registration to standard space, the nonlinear registration module in ART was used to register each subject’s three-dimensional MP-RAGE volume to a standard T1-weighted template (Montreal Neurological Institute atlas; MNI) . For transformation of DTI to standard space, distortion correction, intermediate rigid-body registration, and standard space registration were applied to the calculated FA maps in a single resectioning operation using ART. Final cubic voxel size was 1 mm3, masked to exclude non-brain voxels from the analysis. For white matter segmentation, the fast automated segmentation tool in the FMRIB-FSL package  was used to generate a white matter mask for the three-dimensional MP-RAGE template brain images and restrict subsequent statistical analysis of FA to white matter voxels.
Because application of our methods to clinical settings will require use of ready control data to assess a new patient, we chose not to match controls one-to-one with patients. However, controls were chosen with an even distribution of age, gender and educational attainment that fully brackets the range of the patients; no patient age or educational attainment exceeds all controls at either extreme. For the purpose of our validation experiments, we subdivided the control group into two similar subgroups of 21 controls each. We adjusted for the potential effects of age, gender and education using a linear regression model estimated from one of the subgroups (the reference group). FA images used in subsequent analyses were first adjusted by applying regression coefficients to voxels where effects were significant. Regression coefficients thus determined were applied to FA images of the remaining 21 control subjects (“normal control subjects”) and patients, but only at locations where effects on individual voxels were significant at p<0.05 and where more than 100 significant voxels formed a contiguous cluster. This approach was taken because application of the regression model to all regions will only add noise to the Gaussian Random Field (GRF), diminishing sensitivity .
We computed the Z-score defined by with at each voxel within a subject’s FA volume with reference values (mean and Standard Deviation (SD)) computed from the reference control group (n=20), where one subject of the reference group in Table 2 was excluded for this calculation of mean and SD. Since the bootstrap procedure described in Text S1 requires n+1 control subjects to simulate Z-scores, each of which is a realization of deviation of a control subject’s FA from n control reference subjects, the reference group (Table 2) comprises 21 subjects. For each bootstrap implementation, (n+1) subjects were resampled, where for each resampling, first n subjects were grouped as a reference control group and the (n+1)th subject served as the control test subject. Ideally, although practically impossible, the reference mean and SD used to compute any Z-score should be derived from a very large control group. In our case, the ideal reference population would include all normal subjects within the demographic parameters defined above; we extracted subsets of this population as our control sample. In practice, the limited size of a control group (e.g., 10–11 subjects as previously reported , ), relative to the size of the entire reference population may mean that the control group does not optimally represent the full population from which it was selected. Therefore, the control group mean and SD may change with the composition of the selected control group, causing a bias (away from zero) and adding variance to the Z-score. Since the control group mean converges to the population mean at a faster rate than SD, additional variance is likely to be the most important factor contributing to variation of Z-scores across different control subgroups. Inferences based on Z-scores computed using only the control group SD might thus yield an unacceptably high rate of false positive results. Uniform application of higher Z-score thresholds to all voxels can be adopted in an effort to minimize false positive results, but may result in decreased detection power. We account for the potential excess variance in the Z-score at each voxel nonparametrically by employing a bootstrap procedure as described in Text S1. We term the Z-score based on this bootstrap-adjusted variance the Enhanced Z-score (EZ-score). The EZ-score at voxel (i) is then given in Equation (1).
where is the bootstrap SD estimate of Z-scores at voxel i from the bootstrap procedure (Text S1). The estimated variance from the bootstrap procedure for estimation of sample-to-sample variation of Z-scores may be greater than 1 (Figure 1). Therefore, determining abnormalities based on the EZ-score in Equation (1) will be more conservative than the assessment with the standard Z-score. The EZ-score approach adjusts each Z-score, with its potential variability induced by differing the reference group, and produces more robust results. This approach may provide a better coverage rate for FA values from new normal control subjects who were not part of the control group used for estimation of the mean and SD, in that the coverage rate is defined by the proportion of voxels lying between ±z1-λ/2, with a target coverage rate (1-λ)×100(%). This result is demonstrated in Figure 2. For all three coverage rates tested (90%, 95%, 99%). Coverage rates for the EZ approach are closer to target rates than those for the standard Z-score approach.
Potential variability induced by differing reference groups, which was estimated by the bootstrap SD procedure, was maximal with a smaller reference group and decreased as the number of reference subjects increased (Figure 1 (top: n=10, middle: n=20, bottom: n=40)). Determination of an abnormality with standard Z-scores based on a small control group may therefore include a substantial number of false discoveries because sample-to-sample variation in FA among the normal control subjects is not accounted for.
We initially determined significance by assessing the Tail Probability at a voxel i (TPi) of the EZ-score from the standard Gaussian distribution (Equation (2)).
We applied two levels of thresholding to identify significantly abnormal voxels. First, each voxel must meet a threshold (α1) for the TPi (Equation (2)) in order to be classified as abnormal. Second, the subset of these voxels that forms contiguous clusters meeting a size threshold (α2) is ultimately classified as abnormal. We determined the threshold for cluster size using GRF theory , which determines the significance of each suprathreshold cluster, a set of contiguous voxels which meet the individual voxel threshold (α1), and tested thresholds both uncorrected and corrected for multiple comparisons.
Receiver operating characteristic (ROC) analysis is suited to the assessment of complex diagnostic methods, such as neuroimaging, where theoretical validation of all aspects of a diagnostic procedure may be impossible. ROC is particularly useful in our case because it allows simultaneous assessment of multiple threshold values, α1 and α2, which may have a complex relationship to classification power.
ROC analysis requires explicit definition of true positive and false positive states. In our sample, however, no observable structural brain abnormalities were present on which to base a decision as to the presence of mTBI pathology. Furthermore, we expect that microstructural pathology will be present even in the absence of overt imaging abnormalities. Thus, we used ROC analysis to test the utility of EZ-MAP for classification of subjects as patients or normals in terms of the number of abnormal voxels detected. EZ-MAPs were generated for both 34 mTBI patients and 21 “normal control subjects”. We used a separate unique subgroup of normal subjects (not members of the “normal control group” tested [as mentioned at the end of the previous sentence]) as the “reference group” for computation of the EZ-MAP in each patient or “normal control subject”. That is, none of the “normal control subjects” for whom we computed EZ-MAPs were members of the “reference group” used to provide mean and SD for computation of the EZ-MAPs. The sole role of the “normal control group” in this study was to serve as test subjects or “pseudo-patients”. Using a range of combinations of the two thresholds (α1 and α2), ROC analysis identified optimal levels of the two thresholds, where AUC was maximal.
However, the t-score strictly requires a Gaussian distribution in order for FA values at each voxel to have the assumed theoretical t distribution with n-1 DF. Although the two-group t-test is known to be robust to deviation from the Gaussian distribution, this property cannot be applied to “one vs. many” t-test because when a patient is compared to a group of control subjects, the central limit theorem, which provides robustness in the two-group t-test, is no longer valid. One way to validate this Gaussian assumption is to compare theoretical variance and variance estimated from the data. The SD of the t-score can be estimated from the data simply by dividing the bootstrap SD estimate of the Z-score at each voxel by . A comparison of the theoretical SD (=) and bootstrap SD estimates of t-scores is demonstrated in Figure S1. The distribution of the bootstrap SD estimates of t-scores is approximately centered (Median=1.05; Mean=1.07) on the theoretical value 1.06 with n=20, but widely spread, indicating deviation from the theoretical t distribution. Inferences that voxels are abnormal, based on the theoretical t distribution, may therefore be substantially biased for those voxels located at the tail regions of the histogram shown in Figure S1. Since about 60% of voxels showed smaller SD estimates than the theoretical SD, determination that voxels are abnormal based on “one vs. many” t-test tends to be conservative and subject to false negative inferences. Further, voxelwise bootstrap SD estimates were classified into 3 classes: (a) under-dispersion, (b) over-dispersion and (c) close to the theoretical SD value by comparing bootstrap SD estimates to the theoretical SD. Over-dispersion was frequently found in peripheral WM regions while under-dispersion was found in the deep white matter (Figure 3). Methods that apply a theoretical SD threshold uniformly across all voxels may thus produce higher false positive or false negative decisions for the over- and under- dispersion regions. Deviation from the theoretical variance suggests deviation from the Gaussian assumption for FA measures. For example, the distribution of FA at a voxel can be a mixture of two Gaussian distributions as in Equation (4).
with the standard Gaussian density , 0<π1<1, and σi >0 (i=1, 2). The distribution of FA values in Equation (4) corresponds to a mixture of two subpopulations, each of which has Gaussian distribution with the same mean μ, but different variances. This model may be plausible in certain populations, for example, above a certain age, variance may increase or decrease significantly. Although, in the present study, effects of age on the mean FA were removed by linear regression, heterogeneous variance among age groups may remain. Under an assumption that the sample size (n) for the reference group is sufficiently large, the distribution of t-scores in Equation (3) derived with samples from a mixture Gaussian distribution is found to be a mixture of two t-distributions. Accordingly, classification of a voxel as abnormal based on the theoretical t-distribution is not valid; the coverage rate bounds (- tn-1,α/2, tn-1,α/2) from the theoretical t distribution cannot achieve the desired coverage rate [(1-α)×100%] in the presence of a mixed Gaussian distribution as in Equation (4). Details of this derivation are provided in Text S2.
Characteristics of the patient and control samples (Table 2) show that the range of age, gender, and years of education of controls fully encompasses those of patients. Furthermore, no significant difference in age (t-test; p=0.289, 0.324) or gender distribution (Chi-squared test; p=0.864, 0.511) was found between each control group and the patient group. However, a significant difference in years of education (t-test; p=0.005, 0.004) was found. 27,290 white matter voxels (4.5% of total white matter volume) met significance criteria for the effects of demographic covariates, dominated by the effect of education. Four axial slices with significant effects of demographic covariates on FA are shown in Figure 4.
Figure 5 shows example maps of abnormal FA from three different mTBI patients, demonstrating multiple areas of abnormally high and low FA with significant variation in the size and spatial distribution of FA abnormalities across patients. The determination of abnormality at this stage was defined using thresholds (α1=0.05 (2-tails); α2=0.05 (corrected for multiple comparisons)) determined from the ROC analysis (below).
In assessing diagnostic utility in individual patients, we initially investigated three global metrics: (1) all abnormal white matter voxels across the whole brain, (2) all white matter voxels where FA was significantly lower than normal and (3) all white matter voxels where FA was significantly higher than normal. Table 3 shows AUC and p-values (Wilcox-Mann-Whitney test, 1-tailed) calculated at various levels of the two thresholds, α1 and α2, for the first global metric (all abnormal white matter voxels across the whole brain). Greatest efficacy in making the discrimination between patients and controls, based on maximizing AUC, was found at lower thresholds for α1 and higher thresholds for α2.
Similarly, ROC studies were conducted for the “one vs. many” t-test and the standard Z-approach to find optimal thresholds; these results are summarized in Table 3. All three methods showed maximal AUC scores at α1=0.05 (2-tails); α2=0.05 (corrected for multiple comparisons). This pattern is opposite to that for FWER control, where thresholds less strict for α1 but fairly strict for α2 yield optimal discrimination power. Table 4 shows the sensitivity and specificity achieved when applying optimized thresholds (α1 and α2) for each method.
Diagnostic utility of the different analysis methods were compared for each of the three global metrics derived at the optimized thresholds [(α1=0.05 (2-tails); α2=0.05 (corrected for multiple comparisons) for EZ-MAP, standard Z-score and one vs. many t-test, while FWER-control was tested at 5% for each tail area]. Overall, all of the three global metrics attained significant power to discriminate mTBI patients from controls using EZ-MAP, standard Z-score or “one vs. many” t-test. However, the discrimination power of the FWER control approach was not significant as shown in Table 5 and Figure 6. The first global metric (all abnormal FA voxels) was somewhat more significantly different between groups than the other two metrics (all abnormally low FA voxels or all abnormally high FA voxels). Two example axial slices from two different patients are presented in Figure 7. As shown in Table 5 and Figure 7, the standard Z-score approach identifies the largest number of voxels, while EZ-MAP identifies fewer and the “one vs. many” T-test still fewer abnormalities. The FWER-control identified the fewest voxels as abnormal. Interestingly, the number of abnormal voxels from standard Z-score was about 1.5 times greater than the number from EZMAP, and again the number from EZ-MAP was about 1.5 times greater than the number from “one vs. many” t-test. Abnormal regions detected by the standard Z-score should contain more false positives due to underestimation of variance in comparison to EZ-MAP. Since 60% of voxels showed under-dispersion when compared to the expected theoretical variance of the t-distribution, the number of voxels classified as abnormal by the “one vs. many” t-test is excessively conservative. The FWER-control approach is also overly conservative. Robustness of the spatial extent of abnormalities for individual patients derived with different control subjects were explored, as demonstrated in Table 6 and Figure 8. Two sets of reference groups (n=20 and n=40; the n=40 group includes the n=20 group plus an additional 20 control subjects) were used separately to assess individual mTBI patients. For each of the three global metrics the number of abnormal voxels detected by standard Z-score approach greatly decreased as the size of the reference group increased, while “one vs. many” T-test showed the opposite pattern. The regions of abnormally low FA detected with the EZ-score were stable across reference group. Regions of abnormally high FA detected with the EZ-score, however, decreased as the size of the reference group increased.
Individualized assessments are needed to guide personalized therapeutic interventions –. Personalized medicine is generally understood to encompass genotype-tailored treatment –, but other unique manifestations of disease demand individualized diagnostic and therapeutic approaches. Individualized assessment of DTI has been reported in only a few studies of TBI, which applied group-wise methods to individuals , , , . However, individualized assessments are especially relevant to TBI, where the nature of the injury and its pathologic manifestations will be unique in each individual . Additionally, the analytic methods we have described could, after appropriate validation, be generalized to the assessment of many brain diseases.
We carefully addressed several important considerations in the implementation and validation of our approach. First, any study must in practice employ a control group that is a small subset of the population against which determinations of abnormality are to be inferred. This sampling limitation may lead to underestimation of variance in Z-scores and consequent erroneous inferences. EZ-MAP accounts for this potential additional variance by bootstrap, a nonparametric method which resamples the Z-score (i.e., by resampling the composition of the control group). An alternative bootstrap method would first resample the deviation of an individual FA from the mean FA obtained from a control group arriving at a new bootstrap SD quantity, which would then be used in computing the individual patient Z-scores. We employed the former approach because it yields a more robust and stable approximation of the true distribution .
We did not incorporate bias (variability of the control group mean based on its composition) in calculating the EZ-MAP, assuming that bias would be very close to zero. The narrow distribution (particularly around zero) of the mean of resampled Z-scores at each voxel confirms the validity of this assumption (Figure S2).
We further confirmed that differences between the patient and control groups were carefully assessed and accounted for to minimize the chance that covariates such as age, gender and education would be detected as real effects. Notably, we found that significant effects of age, gender and education were modest in magnitude and spatial extent.
EZ-MAP discriminated mTBI patients from normals, showing statistical significance on assessment of area under the ROC curve and significant differences between patients and controls in the number of abnormal voxels detected. In terms of discrimination power, the standard Z-score and “one vs. many” t-test approaches also attained significance while an approach employing FWER-control did not. Although all three methods achieved significant discrimination power, the extent of abnormal regions varied among the methods. Inferences based on the standard Z-score approach tend to produce more false positive findings and those identified with the “one vs. many” t-test yield more false negative inferences. EZ-MAP inferences fall in between these two extremes. Because EZ-MAP is a data-adaptive approach, it is inherently less sensitive to underlying assumptions regarding the composition of the reference group than standard Z-score approach and “one vs. many” t-test.
Several potential limitations of our study should be considered. The assumption that the distribution of FA at each voxel would be symmetric, implicit in the standard Z-score, EZ-MAP and the “one vs. many” t-test, was explored because we suspected that such assumptions would lead to erroneous inferences. The density functions of EZ-scores from each normal control subject (Figure S3, top) and all subjects (Figure S3, bottom), were estimated by mixture modeling . In the estimated density functions for individuals, we found a thicker tail to the right (highly positive EZ-scores). The estimated density function for all control subjects also showed a thicker tail to the right in comparison to the standard Gaussian distribution. As a consequence of this asymmetry, abnormally high FA voxels (i.e., the right tail) were likely to be identified in “normal control subjects”. Such “abnormalities” might be detected due to deviation from underlying assumptions about the control group (i.e., the absence of a symmetric distribution across subjects at each voxel) not due to actual abnormalities. Therefore, a further improvement for analysis of DTI images in individual patients beyond the EZ-MAP may be achieved by developing methods to account for potential asymmetry in the FA distribution.
It is also important to critically assess the likelihood that the effects seen in the mTBI patients we studied are due to mTBI rather than some other white matter abnormality. Although, strictly speaking, it would never be possible to accept a null hypothesis that our patient and control subjects do not differ other than due to mTBI, we went to great lengths to ensure, to the maximum extent possible, that differences between patients and controls are reasonably attributable to mTBI. The mTBI patients enrolled in this study were carefully screened to exclude pre-injury medical, neurological or psychiatric disorders, including substance use, which could possibly cause white matter pathology. In addition to adjusting for age, we excluded patients at extremes of the lifespan, where developmental or senescent changes may affect FA. Although a significant difference in education was found between patients and controls, we found only minimal effects on FA, which were adjusted for in our analyses. Finally, abnormal findings occurred in areas expected to be affected in TBI (e.g., , –, , –) and are consistent with numerous prior studies of DTI in TBI (e.g., , –, , –).
Bootstrap SD estimates of T-scores. Bootstrap SD estimates of t-scores from all white matter voxels across the whole brain is compared to the theoretical SD (=1.06) from the t-distribution with DF=19. The thin red bar indicates the theoretical SD. About 60% of the voxels are located below the theoretical SD.
Histogram of the mean of resampled Z-scores. Histogram of the mean of resampled Z-scores at each voxel, based on the bootstrap procedure (Text S1) is shown. The histogram is approximately centered at zero with a narrow width (±0.1).
Estimated density function of EZ-scores. The density function of EZ-scores was estimated for each normal control subject (top, blue) and all subjects (top, red), by concatenating individual volumes. The estimated density function for all control subjects (bottom, red) was compared to the standard Gaussian density function (bottom, black).
Bootstrap procedure for estimation of the SD for use in Enhanced Z-scores.
Derivation of the distribution of T-score with random samples from a mixture of two Gaussian distributions.
The authors thank Tova M. Gardin, B.A. and Miriam B. Hulkower, B.A. for assistance in data processing and Jeremy L. Smith, Ph.D. for programming implementation.
This work was supported by The Dana Foundation, David Mahoney Neuroimaging Program and by National Institutes of Health/National Institute of Neurological Disorders and Stroke (R01 NS082432). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.