Landmark-based high-dimensional diffeomorphic mapping of the hippocampus, although accurate, is highly dependent on the rater's anatomic knowledge of the hippocampus in magnetic resonance images. It is therefore vulnerable to rater drift and errors unless a substantial amount of effort is spent on quality assurance, training, and re-training. A fully-automated, FreeSurfer-initialized large-deformation diffeomorphic metric mapping procedure for small brain substructures, including the hippocampus, has been previously developed and validated in small samples. In this report, we demonstrate that this fully-automated pipeline can be used in place of the landmark-based procedure in a large-sample clinical study to produce similar statistical outcomes. Some direct comparisons of the two procedures are also presented.
As large imaging databases of hundreds and sometimes thousands of cross-sectional and longitudinal magnetic resonance (MR) images become available (e.g., Alzheimer Disease Neuroimaging Initiative), automated tools are needed to perform brain structure segmentation reliably and accurately. Recent advances in computational anatomy have begun to provide such segmentation tools. Powerful whole-brain segmentation tools such as FreeSurfer (Fischl et al., 2002) can produce an initial automated segmentation of brain substructures such as the hippocampus and deep brain nuclei. However, the segmentations are not smooth enough to be suitable for shape computation and analysis, which have been shown to provide additional discriminant power and understanding of the disease process as compared to volume alone (Csernansky et al., 2004).
Template-based global whole-brain mapping methods often require the computation of an invertible transformation (e.g., large-deformation diffeomorphic maps). Although they provide smooth substructure representations, these methods are prone to becoming trapped in local minima because of inaccurate initial whole-brain alignments. Restricting the mapping to a region of interest (ROI) therefore improves the likelihood of computing an invertible transformation from an anatomical atlas image. Thus far, locating and extracting such an ROI has been achieved by expert raters manually placing homologous landmark points in the scans (Haller et al., 1997; Hogan et al., 2000; Du et al., 2001).
For the past decade, our group has applied a landmark-initialized large-deformation high-dimensional brain mapping (LMK + HDBM-LD) procedure for mapping the hippocampus and deep brain nuclei (Csernansky et al., 2004; Wang et al., 2007). This was a variant of the “greedy” algorithm implemented by Christensen et al. (1994). In this procedure, 12 global landmarks were placed at the anterior and posterior commissures and at points on the periphery of the cerebrum; 22 additional, local landmarks were placed at pre-selected points on the surface of the hippocampus along its anterior–posterior axis. Initial landmark-based registration (Joshi et al., 1995; Haller et al., 1997) served to adjust the orientation and size of the head (based on global landmarks) and the hippocampus (based on local landmarks) and therefore provided the initial alignment for the HDBM-LD methods (Christensen et al., 1994), which then proceeded independently of further user input. Even though this procedure has been shown to be superior in reliability to manual outlining (Haller et al., 1997), it is not entirely automatic, and the expertise of the neuroanatomist plays a key role. The landmarking procedure depends on individual expert knowledge of the MR scans and, when applied to large-sample clinical imaging studies, may bias the initial landmark-driven transformations. It is also vulnerable to rater drift and errors unless a sufficient amount of effort is spent on quality assurance, training, and retraining. A fully-automated procedure that does not involve such expert interactions is needed.
FreeSurfer-initialized large-deformation diffeomorphic metric mapping (FS + LDDMM), a fully automated mapping pipeline for small brain substructures including the hippocampus and other deep nuclei, has been validated in small samples against expert manual segmentations (Khan et al., 2008). It has demonstrated improved accuracy over both FreeSurfer alone and the above LMK + HDBM-LD approach.
In this study, we sought to demonstrate that the new FS + LDDMM pipeline, used as a fully-automated replacement for the semi-automated approach of our previously published study (Wang et al., 2006), yields similar statistical outcomes when applied to a clinical imaging study.
A total of 89 subjects with a Clinical Dementia Rating (CDR) of 0.5 and a diagnosis of dementia of the Alzheimer type (DAT), and 125 non-demented CDR 0 subjects, were included in this study. The CDR (Morris, 1993) was used to assess the severity of dementia symptoms. The CDR rates the presence or absence of cognitive impairment on a 5-point scale: 0 indicates no dementia and 0.5, 1, 2, and 3 indicate very mild, mild, moderate, and severe dementia, respectively. CDR assessments have been shown to have high inter-rater reliability (Morris et al., 1997; Berg et al., 1998). Subject characteristics are summarized in Table 1.
MPRAGE scans were collected on a 1.5-Tesla Vision system (Siemens Medical Systems) for all subjects, using a standard head coil with the following parameters: TR = 9.7 ms, TE = 4.0 ms, flip angle = 10°, voxel resolution = 1 mm × 1 mm × 1.25 mm, matrix = 256 × 256, scan time = 6.5 min. Multiple (2–4) MPRAGE image volumes were collected in sequential order for each subject, and were aligned with the first scan and averaged to create a low-noise image volume (Buckner et al., 2004).
The MR image from a non-demented 69-year-old male CDR 0 subject was used as the template scan. This subject was drawn from the same source as the other subjects in the study, but was not otherwise included in the data analysis. The left and right hippocampi in this template scan had been manually segmented (Csernansky et al., 2000). This was the same template as used in Wang et al. (2006).
All MPRAGE images were processed through the FS + LDDMM pipeline, which consisted of the following three stages: (1) FreeSurfer labeling, (2) initial alignment with intensity normalization, and (3) LDDMM-based diffeomorphic transformation. The details can be found in Khan et al. (2008), and we describe them briefly here.
The first stage in the FS + LDDMM pipeline was an initial, automated segmentation of the template and each target image using FreeSurfer 3.0.5 (autorecon1 and autorecon2), which generated 37 structural labels of the brain (Fischl et al., 2002). The labels of cerebrospinal fluid (CSF) and the hippocampus were subsequently used by FS + LDDMM to map the hippocampus.
LDDMM required a coarse alignment between the target and template ROIs surrounding the hippocampus. This alignment was achieved by an affine transformation between the FS hippocampal labels in the target and the template images. An ROI subvolume was then generated in the template image and in each template-aligned target image, centered on the FreeSurfer label of the hippocampus.
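The ROI extraction step can be illustrated with a minimal sketch. It assumes the target has already been affinely aligned to the template and that a binary hippocampal label mask is available as a NumPy array; the function name `crop_roi_around_label` and the `half_size` parameter are hypothetical, since the ROI dimensions are not stated here.

```python
import numpy as np

def crop_roi_around_label(volume, label_mask, half_size):
    """Crop a box centered on the centroid of a label mask.

    half_size is a hypothetical parameter: the number of voxels kept on
    each side of the centroid along every axis. The box is clipped to the
    volume bounds.
    """
    coords = np.argwhere(label_mask)
    if coords.size == 0:
        raise ValueError("label mask is empty")
    center = np.round(coords.mean(axis=0)).astype(int)
    half = np.asarray(half_size, dtype=int)
    lo = np.maximum(center - half, 0)
    hi = np.minimum(center + half + 1, volume.shape)
    slices = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    return volume[slices], slices

# Toy example: a 10 x 10 x 10 volume with a 3 x 3 x 3 label block.
volume = np.arange(1000, dtype=float).reshape(10, 10, 10)
mask = np.zeros((10, 10, 10), dtype=bool)
mask[4:7, 4:7, 4:7] = True
roi, roi_slices = crop_roi_around_label(volume, mask, (2, 2, 2))
```

Returning the slices alongside the subvolume makes it straightforward to map results computed in the ROI back to whole-brain coordinates.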
Inside the ROI, we performed a variant of histogram matching to ensure homogeneity of corresponding tissue type intensities between the images. This step was a specialization of the intensity scale standardization by Nyul et al. (2000) in which the tissue intensity distributions were assumed to be known. Before this step, the MR image intensities were rescaled by linearly mapping the range between the 0.5 and 99.5 percentile intensities to the full image intensity range. Then, the median image intensities in the FreeSurfer CSF, gray matter, and white matter histograms were located in the ROI. These intensities were aligned using a piece-wise linear intensity transformation.
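The two normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the output range of 0 to 255, the clipping of out-of-range values, the label coding (1 = CSF, 2 = gray matter, 3 = white matter), and the anchoring of the piece-wise linear map at the range endpoints are all choices made here for the example.

```python
import numpy as np

def rescale_percentiles(image, low_pct=0.5, high_pct=99.5, out_max=255.0):
    """Linearly map the [low_pct, high_pct] percentile range to [0, out_max].

    Values outside the percentile range are clipped (an assumption).
    """
    lo, hi = np.percentile(image, [low_pct, high_pct])
    scaled = (image - lo) / (hi - lo) * out_max
    return np.clip(scaled, 0.0, out_max)

def match_tissue_medians(target, target_labels, template_medians, out_max=255.0):
    """Piece-wise linear map sending target tissue medians to template medians.

    Assumed label coding: 1 = CSF, 2 = gray matter, 3 = white matter.
    """
    src = [float(np.median(target[target_labels == k])) for k in (1, 2, 3)]
    xp = np.concatenate(([0.0], src, [out_max]))
    fp = np.concatenate(([0.0], np.asarray(template_medians, float), [out_max]))
    if np.any(np.diff(xp) <= 0):
        raise ValueError("tissue medians must be strictly increasing within the range")
    return np.interp(target, xp, fp)

# Toy example: three tissue classes with distinct intensities.
target = np.array([10.0, 10, 10, 50, 50, 50, 90, 90, 90])
labels = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])
matched = match_tissue_medians(target, labels, [20.0, 100.0, 180.0])

ramp = np.linspace(0.0, 1000.0, 1001)
rescaled = rescale_percentiles(ramp)
```

The strict-ordering check reflects the expectation that, in T1-weighted images, CSF is darker than gray matter, which is darker than white matter.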
LDDMM (Beg et al., 2005) generates a diffeomorphic transformation that is smooth and has a smooth inverse; thus, anatomy is mapped consistently, without fusions or tears, while preserving smoothness of anatomical features. A three-step procedure for computing the optimal diffeomorphic transformation was developed in a multi-resolution coarse-to-fine strategy. At each step, LDDMM was initialized with the optimal velocity vector field and map computed at the previous step, with additional anatomical information added into the optimization process. This strategy was designed to guide the optimization away from potential local minima. In the first step, LDDMM was performed using the FreeSurfer CSF labels, or equivalently, the portion of the ventricles in the ROI. In the second step, the ROI was smoothed [convolution with a 3 × 3 × 3 voxel Gaussian mask of 0.5-voxel standard deviation (SD)] and LDDMM was then performed on the ROI. In the third step, a second LDDMM was performed with the smoothing removed. After the above multi-step mapping, the template surface in the atlas space was propagated to the target ROI space using the final LDDMM transformation, followed by the inverse affine transformation to the target’s whole brain space, thus generating the final hippocampus surface in the target.
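The smoothing used in the second step can be illustrated with a short sketch. It covers only the 3 × 3 × 3 Gaussian mask with a 0.5-voxel SD, not LDDMM itself. The discretization (sampling the Gaussian at voxel centers and normalizing to unit sum) and the edge-replicating boundary handling are assumptions made for this example.

```python
import numpy as np

def gaussian_kernel_3d(size=3, sigma=0.5):
    """Separable Gaussian mask sampled at voxel centers, normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g1 = np.exp(-0.5 * (ax / sigma) ** 2)
    g1 /= g1.sum()
    return g1[:, None, None] * g1[None, :, None] * g1[None, None, :]

def smooth_roi(roi, kernel):
    """Convolve a 3-D array with a small symmetric kernel.

    Borders are handled by replicating edge voxels (an assumption).
    Because the kernel is symmetric, correlation equals convolution.
    """
    r = kernel.shape[0] // 2
    padded = np.pad(np.asarray(roi, dtype=float), r, mode="edge")
    n0, n1, n2 = roi.shape
    out = np.zeros(roi.shape, dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            for k in range(kernel.shape[2]):
                out += kernel[i, j, k] * padded[i:i + n0, j:j + n1, k:k + n2]
    return out

kernel = gaussian_kernel_3d(size=3, sigma=0.5)

# A single bright voxel spreads into its neighbors according to the mask.
impulse = np.zeros((5, 5, 5))
impulse[2, 2, 2] = 1.0
smoothed = smooth_roi(impulse, kernel)
```

With an SD of 0.5 voxel, most of the weight stays on the central voxel, so the smoothing is mild.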
Left and right hippocampal volumes in each subject were calculated as the volumes enclosed by the hippocampal surfaces. An average hippocampal surface constructed from 86 healthy subjects from Wang et al. (2006) was used as a reference surface, from which linear displacements of each subject’s hippocampal surface were calculated at each surface vertex. For each subject, deformations were averaged within surface zones that represent surface deformations for CA1, subiculum, and remainder [CA2, 3, 4, dentate gyrus (DG)] subfields (Csernansky et al., 2005; Wang et al., 2006). Negative values for these measures represented inward variation of the surface, while positive values represented outward variation of the surface. Note that the zones referred to as lateral, inferior-medial, and superior in Wang et al. (2006) are referred to as CA1, subiculum, and remainder subfields, respectively, in the present study.
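As an illustration of these two measures, the sketch below computes the volume enclosed by a closed triangulated surface using the divergence theorem, and averages per-vertex signed displacements within surface zones. It assumes a consistently oriented triangle mesh and precomputed signed displacements; the function names and the integer zone coding are hypothetical.

```python
import numpy as np

def enclosed_volume(vertices, faces):
    """Volume enclosed by a closed, consistently oriented triangle mesh.

    Sums the signed volumes of the tetrahedra formed by each triangle and
    the origin: V = |sum(v0 . (v1 x v2))| / 6.
    """
    v = np.asarray(vertices, dtype=float)
    f = np.asarray(faces, dtype=int)
    v0, v1, v2 = v[f[:, 0]], v[f[:, 1]], v[f[:, 2]]
    signed = np.einsum("ij,ij->i", v0, np.cross(v1, v2)).sum() / 6.0
    return abs(float(signed))

def zone_means(displacements, zone_ids):
    """Average signed per-vertex displacements within each surface zone."""
    d = np.asarray(displacements, dtype=float)
    z = np.asarray(zone_ids)
    return {int(k): float(d[z == k].mean()) for k in np.unique(z)}

# Unit right tetrahedron with outward-facing triangles; its volume is 1/6.
tet_vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tet_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
tet_volume = enclosed_volume(tet_vertices, tet_faces)

# Negative values denote inward displacement, positive values outward.
means = zone_means([-0.2, -0.4, 0.1, 0.3], [0, 0, 1, 1])
```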
All statistical analyses were performed in SAS (SAS Institute, 2000). For volume comparison, we included left and right hippocampal volumes in a repeated-measures analysis of variance (ANOVA) with CDR status as the main effect and hemisphere as the repeated factor. For subfield deformation comparisons, deformation values from each surface zone were entered into a repeated-measures ANOVA, with CDR status as the main effect and with hemisphere and surface zone deformations as repeated measurements. Surface zone deformations were listed as “identity” in the repeated statement in SAS. Gender was used as a covariate throughout. Effect sizes (Cohen’s d) were also calculated for each zone per hemisphere.
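For reference, an effect size of this kind can be computed as in the sketch below. It uses the common pooled-standard-deviation form of Cohen's d, which is an assumption, since the exact variant used in the analysis is not specified here.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant).

    A negative value means group_a has the smaller mean.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))

# Toy example: the first group is shifted down by one pooled SD.
d = cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With the patient group passed first, smaller volumes or more inward deformation in patients yield a negative d.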
Finally, to determine post hoc whether hippocampal volume and surface zone deformations could be used for discriminant purposes, logistic regression procedures were used to determine odds ratios, significance (95% confidence limits), and the C-statistics for each variable (i.e., volume and surface zone deformation). The odds ratio is the exponentiated regression coefficient, whereas the C-statistic is equivalent to the area under the ROC curve, i.e., the probability that a randomly chosen DAT subject is assigned a higher predicted risk than a randomly chosen control subject. An alpha of 0.05 was maintained for all analyses.
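The relationship between a logistic regression coefficient, the odds ratio for a given change in the predictor, and the C-statistic can be sketched as follows. The coefficient value is made up for illustration, and the function names are hypothetical; the model fitting itself (done in SAS in this study) is not reproduced.

```python
import numpy as np

def odds_ratio(beta, delta):
    """Odds ratio for a change of `delta` in a predictor with coefficient `beta`."""
    return float(np.exp(beta * delta))

def c_statistic(scores, is_case):
    """Fraction of (case, control) pairs in which the case has the higher score.

    Tied pairs count as one half. This equals the area under the ROC curve.
    """
    s = np.asarray(scores, dtype=float)
    y = np.asarray(is_case, dtype=bool)
    diff = s[y][:, None] - s[~y][None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Illustrative coefficient (per mm of deformation); not taken from the study.
beta = -1.3
or_inward = odds_ratio(beta, -0.1)   # odds ratio for a 0.1 mm inward change
or_outward = odds_ratio(beta, 0.1)   # odds ratio for a 0.1 mm outward change

perfect = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
chance = c_statistic([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
```

The odds ratios for equal and opposite changes are reciprocals of one another, which is the property that allows an odds ratio to be restated with either group as the reference.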
For evidence of replication, the results from the above statistical tests were compared with the results from Wang et al. (2006), where LMK + HDBM-LD was used to generate the same kinds of hippocampal volume and surface subfield deformation measures as in the current study. The caveat, however, was that only 79 (27 CDR 0.5, 52 CDR 0) subjects were common to both studies. This was because FLASH scans were used in Wang et al. (2006), whereas MPRAGE scans were used in the present study, and only a portion of the subjects from our previous study also had MPRAGE images. In addition, some subjects included in the present study had not been enrolled in the previous study. We therefore could not make a complete comparison with the published data. However, for these overlapping subjects, we correlated the hippocampal data from 2006 with the hippocampal data derived from the FS + LDDMM pipeline. Spearman correlations were calculated between the two methods. Significance was reported without correction for multiple comparisons.
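A Spearman correlation between paired measurements from two methods can be computed as in this minimal sketch, which ranks each variable (assigning tied values their average rank) and then takes the Pearson correlation of the ranks. The data are toy values, not study data.

```python
import numpy as np

def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        # Positions i..j (0-based) are tied; their average 1-based rank:
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Monotone but non-linear relationship: rho is 1 even though Pearson r is not.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 6, 8, 100])
tied = average_ranks([10, 20, 20, 30])
```

Because it depends only on ranks, this statistic is insensitive to a monotone difference in scale between the two methods.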
The difference between the DAT group and the non-demented control group is visualized in Figure 1, with the boundaries between the three zones of the hippocampal surface (i.e., CA1, subiculum, and remainder) drawn in black. Areas of hippocampus showing the greatest group differences (as marked by the blue colors) are concentrated in the CA1 and subiculum surface zones. This pattern of deformation resembles our previous findings (Wang et al., 2006). It should also be noted that most of the lateral aspects of the hippocampal surface (also in the CA1 subfield zone) showed much of the inward deformation, a pattern that we have previously reported in non-overlapping subjects (Csernansky et al., 2000; Wang et al., 2003).
Mean hippocampal volume and surface zone deformation with respect to the healthy, reference group mean from Wang et al. (2006) are listed in Table 2 for the DAT and control groups. There was a significant group effect on volume (F = 43.0, df = 1,210, P < 0.0001) and subfield surface deformations (F = 12.1, df = 3,208, P < 0.0001). When the DAT group was compared with the control group, the effect sizes were negative, indicating that the DAT subjects had smaller hippocampal volumes and more inward surface zone deformations. As in the previous study, we found larger effects in the CA1 and subiculum surface zone deformations than in the remainder zone.
Finally, each measure of volume and surface zone deformation was entered into a separate logistic regression procedure to examine its ability to discriminate the two groups, with the healthy non-demented subjects as the reference group; these results are summarized in Table 3. A significant odds ratio of 1.14 for a 0.1 mm decrement of the left CA1 can be interpreted as follows: for an inward displacement of 0.1 mm (i.e., negative when compared to the mean of the healthy comparison subjects; an inward displacement of 0.1 mm is equivalent to a volume reduction of 68 mm³), the odds of a subject being DAT are 1.14 times the odds of that subject being a healthy comparison subject [In the logistic regression procedure, the healthy comparison group is the reference group; the stricter interpretation of the statistical output is that for a given subject, when the CA1 zone is deformed outward by 0.1 mm (positive increase), the odds of this subject being a healthy comparison subject are 1.13 times the odds of this subject being a DAT subject. Since diagnosing healthy subjects is not intuitive, we turned the interpretation around, which the reciprocity property of the odds ratio allows]. When the confidence limits for a particular variable do not include 1, the odds ratio is significant. When left and right hippocampal volume, CA1, subiculum, or left remainder surface zone deformation was increased by 0.1 mm, a significant odds ratio was obtained (range, 1.14–1.27). However, the odds ratio for the right remainder surface zone deformation was non-significant.
For the 79 subjects who were common to both studies, the hippocampal maps computed via the FS + LDDMM pipeline in the current study and the hippocampal maps computed via the LMK + HDBM-LD pipeline in the previous study were used to compute correlations. The correlations across surface vertices are summarized in Table 5 and visualized on the template surface in Figure 2, where the significant correlations are painted as a flame scale onto each surface point. Surface points for which correlations were not significant are painted yellow–green. Scatter plots (correlation and Bland-Altman) of hippocampal volume and surface deformation measures derived from the FS + LDDMM pipeline versus measures derived from the LMK + HDBM-LD pipeline are shown in Figure 3, and correlations among the hippocampal variables are listed in Table 4. It is not surprising that the remainder surface zone, which we previously reported to have little impact on DAT discrimination, had the lowest correlations between the two methods.
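The quantities behind a Bland-Altman plot can be computed as in the sketch below: the per-subject mean of the two methods, their difference, the mean difference (bias), and limits of agreement. The conventional bias ± 1.96 SD limits are an assumption, as the limits used in Figure 3 are not described here, and the values are toy data.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Return per-subject means and differences, the bias, and limits of agreement.

    Limits are bias +/- 1.96 * SD of the differences (conventional choice,
    assumed here).
    """
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    means = (a + b) / 2.0
    diffs = a - b
    bias = float(diffs.mean())
    sd = float(diffs.std(ddof=1))
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Toy paired measurements from two hypothetical methods.
means, diffs, bias, limits = bland_altman([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

Plotting `diffs` against `means` with horizontal lines at `bias` and at the two limits gives the usual Bland-Altman display, which shows agreement rather than mere correlation.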
The overall deformation pattern of DAT subjects versus the control subjects was visually similar to that reported in our previous study using the LMK + HDBM-LD pipeline (Wang et al., 2006), with CA1 and subiculum surface zones showing the majority of the differences between these two subject groups.
With the exception of the left remainder surface zone, all the other surface zones showed group effects similar to those in the previous study: compared with CDR 0 subjects, CDR 0.5 subjects had significantly smaller hippocampal volumes and more inward deformation in the CA1 and subiculum subfield surface zones. The CDR 0.5 subjects showed similar mean (SD) surface zone deformations. However, the CDR 0 subjects exhibited greater surface zone deformations compared with our previous study. For example, in the current study, the mean (SD) of the CA1 zone was −0.33 (0.46) mm, while the previous study showed a mean (SD) of 0 (0.40) (unpublished, reference data). The phenomenon is similar for the other subfield zones, where the SD (variability) was similar between the two studies and the mean was 0 in the previous study (the mean was zero because the average of the CDR 0 subjects was used as the reference).
The effect sizes of hippocampal surface zone deformations in this study were smaller than previously reported. This was probably due to the fact that the reference groups were different. Although the DAT group exhibited similar means and standard deviations and the control group had similar standard deviations, the control group in this study showed an appreciable amount of inward deformation in the CA1 (left −0.33, right −0.22) while the healthy reference group from Wang et al. (2006) had an average of 0 (used as reference). This difference could be due to the fact that 20 healthy subjects from a study of schizophrenia were included in the reference group of the previous study. The subjects included in this study were somewhat older, and aging may be related to hippocampal volume loss and shape deformities.
The odds ratios and C-statistics also reflected the above trend: the values in the current study were similar to, but slightly lower than, those reported in the previous study, primarily because the current CDR 0 subjects had more inward deformations and smaller hippocampal volumes. For example, for every 68 mm³ decrease in left hippocampal volume, the odds of a DAT diagnosis increased by 16%, whereas this relationship was non-significant in the right remainder subfield zone.
We could not make a complete comparison between the two studies because only 79 (27 out of 89 CDR 0.5, 52 out of 125 CDR 0) subjects included in this study overlapped with Wang et al. (2006). We did, however, report correlations for the shared subjects, as a further validation of the fully-automated FS + LDDMM pipeline. In addition, the averaged MPRAGE scans used in the present study were of considerably higher quality [visually higher signal-to-noise (S/N) and contrast-to-noise ratios] than the FLASH scans used in the previous study, which led to better segmentation quality (by visual inspection).
The influence of segmentation accuracy on the surface measurements is worth commenting on. In Khan et al. (2008), we reported a mean surface distance of 1.23 mm for CDR 0.5 subjects and 1.52 mm for CDR 0 subjects when compared with manually delineated hippocampi. This group-dependent difference was similar for both the FreeSurfer-only and FS + LDDMM approaches. This suggests that any algorithm may tend to overestimate the hippocampus in one group relative to the other. Furthermore, the 10 subjects used in Khan et al. (2008) were not included in this study.
Limitations of this approach include the dependence on a single-subject template, which can be alleviated in the future by using average templates computed from populations. Another limitation is the lack of testing on scans collected on higher-field (e.g., 3-Tesla) scanners. The overall statistical comparisons indicate that the fully-automated FS + LDDMM pipeline could be used in place of the semi-automated LMK + HDBM-LD pipeline that depended upon manual placement of anatomic landmarks.
Grant sponsor: PHS; Grant numbers: P01 AG026276, P01 AG03991, P50 AG05681, R01 AG025824, R01 MH60883, P50 MH71616, P41 RR15241, NSERC 31–611387, CHRP 751115; Grant sponsor: Pacific Alzheimer Research Foundation; Grant number: 869294.
Ali Khan was supported by an NSERC PGS-M scholarship.