In this study, we demonstrate that an automated method for the measurement of cortical thickness from MRI data is remarkably reliable for the detection and quantification of regional cortical thickness correlates of cognitive performance in normal older adults. In a small sample of subjects, dissociated effects were detected between the performance of two different cognitive tests and cortical thickness in two different brain regions. Verbal memory performance was associated with left medial temporal cortical thickness, while visuomotor speed/set-shifting was associated with thickness of a region of the lateral parietal cortex. These effects were highly reliable (not only in terms of spatial localization but also in terms of the magnitude of absolute cortical thickness measurements) across four different scan sessions, including 1.5T test and re-test sessions, a 1.5T session at a different site using a scanner made by a different manufacturer, and a session on a higher field strength (3.0T) scanner, each of which was separated by a two-week interval.
With respect to the localized regions of the cerebral cortex where thickness correlates with cognitive performance, the findings of the present study are consistent with previous data. Delayed recall of recently learned information depends critically on the integrity of the medial temporal lobe memory system (
Squire et al., 2004). The rostral medial temporal cortex, encompassing entorhinal and perirhinal cortices, is known from neurophysiologic studies to be important for delayed memory performance in rats (
Young et al., 1997), monkeys (
Suzuki et al., 1997), and humans (
Fried et al., 1997). Although some volumetric imaging studies of entorhinal cortex suggest that it is important for immediate recall (
de Toledo-Morrell et al., 2000), others demonstrate its importance in delayed recall (
Rodrigue and Raz, 2004;
Dickerson et al., 2005). The TMT is usually thought to be a test in which performance is compromised by frontal cortical lesions, particularly those in the dorsolateral prefrontal cortex (
Stuss et al., 2001b). Yet functional imaging studies have shown that performance of this task recruits parietal cortex (
Asari et al., 2005), likely because it involves visuospatial attentional processing and set-shifting. A functional MRI study of an analogue of the TMT reported that this task engages, in addition to dorsolateral prefrontal and supplementary motor cortex, cortical regions within the lateral parietal cortex in the intraparietal sulci (
Moll et al., 2002).
Manual, operator-generated ROI measurements from MRI data have been used in hypothesis-driven studies to identify specific relationships between the volume of particular cortical regions and neuropsychological test performance measures. For example, in patients with AD, delayed verbal free recall was best predicted by left hippocampal volume, while delayed spatial free recall was best predicted by right hippocampal volume (
de Toledo-Morrell et al., 2000). In a mixed group of AD, frontotemporal dementia, and semantic dementia patients, hippocampal volume was the best predictor of delayed recall while frontal cortical volume was the best predictor of semantic clustering (
Kramer et al., 2005). However, because of their labor-intensive nature, manual ROI-based approaches generally lend themselves best to studies in which
a priori hypotheses have enabled the selection of particular brain regions for measurement. For example, entorhinal volume was not measured in either of the aforementioned studies, so it is unclear whether it would have been related to the memory variables.
In contrast, voxel-based morphometry (VBM) techniques have been used in exploratory whole-brain analyses to identify regionally-specific relationships between cortical grey matter density and neuropsychological test performance (
Gale et al., 2005). Yet VBM involves the voxel-based transformation and smoothing of individual MRI data into common coordinate space, which may remove the precise features of interest for studies of cortical anatomy (because transformations are performed based on voxel intensity without regard to gyral or sulcal anatomic features of the cortex). Furthermore, it is difficult to interpret the grey matter density measure employed by VBM with respect to measurable biologic properties of brain tissue. Thus, it may be useful to interpret the results of an exploratory VBM analysis in terms of the generation of a hypothesis, which then could be tested to identify the particular biologic property of brain tissue that accounts for the effect (e.g., volume, thickness, or surface area of a particular neuroanatomic structure).
The automated tools used in the present study enable both efficient exploratory whole-cortex analyses for the generation of hypotheses and the derivation of ROIs that can serve as a priori regions in the subsequent testing of these hypotheses in, for example, a separate subject sample. The measure, cortical thickness, is a morphometric property of the cortex that is interpretable in an individual subject and may relate, at least in part, to neuronal or synaptic numbers within the cortex (
Gomez-Isla et al., 1996;
Regeur, 2000). The present analysis is limited, however, to the cerebral cortex, and a separate but related set of automated tools is required to generate measurements of the hippocampal formation, amygdala, basal ganglia, and other subcortical structures (
Fischl et al., 2002b). Another important limitation of this study is the small sample size. Because of this, the particular cortical regions identified in this study should be interpreted cautiously and subjected to replication in larger samples; they are highlighted here primarily to illustrate the reliability of thickness measures. Furthermore, it is not clear why there may be inverse correlations between performance and thickness, as was observed in the paracentral/cingulate sulcus region, in which thinner cortex was associated with better performance on the CVLT. Issues such as these deserve further investigation in larger samples of subjects.
Given the growth of multicenter imaging studies that seek to identify quantitative measures of brain structure or function that relate to disease (
Jack et al., 2003;
Mueller et al., 2005;
Murphy et al., 2006;
Belmonte et al., 2007), it is surprising that there has been fairly little study of the influence of MRI instrument-related factors on the reliability of such putative imaging biomarkers (
Han et al., 2006;
Jovicich et al., 2006). Since measures of cortical thinning are sensitive (
Lerch et al., 2006) and reasonably specific (
Du et al., 2007), at least in the context of particular neurodegenerative diseases, these measures are promising candidate MRI biomarkers. Knowledge of the degree to which different MRI instrument-related factors (such as field strength, scanner manufacturer, and scanner software and hardware upgrades) affect the reliability of cortical thickness measures is essential for the interpretation of these measures in basic and clinical neuroscientific studies. This knowledge is critical if cortical thickness measures are to find applications as biomarkers in clinical trials of putative treatments for neurodegenerative or other neuropsychiatric diseases. Reliability of putative MRI biomarkers should be assessed not only using statistical reliability measures (
Han et al., 2006), such as intraclass correlation, but also by determining how well “real world” effects of interest can be detected. The present study indicates that automated measures of cortical thickness are highly reliable within scanner systems and across manufacturers and field strengths for the localization and quantification of cortical thickness correlates of cognitive test performance. Further study of the influence of instrument-related factors on quantitative MRI-derived measures of brain anatomy will be critical if these measures are to be successfully translated into useful imaging biomarkers of disease or performance ability (
Dickerson and Sperling, 2005).
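As an illustration of the kind of statistical reliability measure mentioned above, the following is a minimal sketch of one common form of the intraclass correlation: the two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). It is not the analysis performed in this study. The function name `icc_2_1`, the choice of this particular ICC form, and the thickness values in the usage example are all assumptions made purely for illustration.

```python
from typing import Sequence


def icc_2_1(data: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `data` holds one row per subject and one column per scan session
    (hypothetical layout chosen for this sketch).
    """
    n = len(data)  # number of subjects
    k = len(data[0]) if n else 0  # number of sessions
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a rectangular table with >= 2 subjects and >= 2 sessions")

    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    session_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication within a cell).
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_sessions = n * sum((m - grand_mean) ** 2 for m in session_means)
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_sessions = ss_sessions / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic session offsets (ms_sessions) lower this absolute-agreement ICC.
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_sessions - ms_error) / n
    )


# Made-up regional thickness values (mm) for 3 subjects x 2 sessions.
identical_sessions = [[2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
offset_sessions = [[2.0, 3.0], [3.0, 4.0], [4.0, 5.0]]  # session 2 reads 1 mm thicker
print(icc_2_1(identical_sessions))
print(icc_2_1(offset_sessions))
```

In this toy example the second table preserves the rank order of subjects perfectly, yet its ICC is below 1 because the absolute-agreement form penalizes the constant offset between sessions. That property is relevant when the magnitude of absolute thickness measurements, and not only their ordering across subjects, must agree across scanners.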