A significant body of existing literature (
Johnson et al., 2006;
Whitwell et al., 2007;
Reiman et al., 1996;
Canu et al., 2010;
Thompson and Apostolova, 2007) suggests that pathological manifestations of Alzheimer’s disease begin many years before the patient becomes
symptomatic – which is typically when cognitive tests can be used to make a diagnosis (
Albert et al., 2001). Unfortunately, by this time significant neurodegeneration has already occurred. In an effort to identify AD-related changes early, a promising direction of ongoing research is focused on exploiting advanced imaging-based techniques to characterize prominent neurodegenerative patterns during the prodromal stages of the disease, when only mild symptoms of the disease are evident. A set of recent papers (
Davatzikos et al., 2008a,
b;
Fan et al., 2008b;
Vemuri et al., 2008) including work from our group (
Hinrichs et al., 2009a,
b) have demonstrated that this is indeed feasible by leveraging and extending state-of-the-art methods from Statistical Machine Learning and Computer Vision. Good discrimination (in identifying whether an image corresponds to a control or AD subject) has been obtained on classification tasks making use of MR
or FDG-PET images (
i.e.,
one type of image data) (
Davatzikos et al., 2008a,
b;
Fan et al., 2008b;
Vemuri et al., 2008;
Hinrichs et al., 2009a). A natural question then is whether we can exploit data from multiple modalities and biological measures (if available)
in conjunction to (1) obtain improved accuracy, and (2) identify more subtle class differences (
e.g., sub-groups within MCI). This paper considers exactly this problem –
i.e., methods for systematic combination of multiple imaging modalities and clinical data for classification (
i.e., class prediction) at the level of individual subjects.
Recently, we have seen evidence that various aspects of AD-related neurodegeneration such as structural atrophy (
Jack Jr. et al., 2005;
deToledo-Morrell et al., 2004;
Thompson et al., 2001), decreased blood perfusion (
Ramírez et al., 2009), and decreased glucose metabolism (
Hoffman et al., 2000;
Matsuda, 2001;
Minoshima et al., 1994) can be identified (in structural and functional images) in subjects with Mild Cognitive Impairment (MCI) and AD, as well as in at-risk individuals (
Small et al., 2000;
Querbes et al., 2009;
Davatzikos et al., 2009). A number of groups have made significant progress by adapting well-known machine learning tools to the problem, including Support Vector Machines (SVMs), logistic regression, boosting, and other classification mechanisms. In the usual classification setting, a number of image acquisitions (training examples) are provided for which the subjects’ clinical diagnosis is as certain as diagnostically possible. The objective is to choose a discriminating function which optimizes a statistical measure of the likelihood of correctly labeling ‘future’ examples. Such measures may be based on certain brain regions (
e.g., the hippocampus or posterior cingulate cortex). The function’s output can then be used as a targeted disease marker in individuals that are not part of the training cohort. In the remainder of this section, we briefly review several interesting AD classification-focused research efforts, and lay the groundwork for introducing our contributions (
i.e., truly multi-modal analysis).
The machine learning, or classification, approach has been used to provide markers for various neurological disorders, including Alzheimer’s disease (
Davatzikos et al., 2008b;
Klöppel et al., 2008;
Vemuri et al., 2008;
Duchesne et al., 2008;
Arimura et al., 2008;
Soriano-Mas et al., 2007;
Shen et al., 2003;
Demirci et al., 2008). These efforts have primarily utilized brain
images, though some have also used other available biological measures. In (
Fan et al., 2008b,
a;
Davatzikos et al., 2008a,
b), the authors implemented a classification/pattern recognition technique using structural (sMR) images provided by the Baltimore Longitudinal Study of Aging (BLSA) dataset (
Shock et al., 1984). The proposed methodology was to first segment the images into different tissue types, and then perform a non-linear warp to a common template space to allow voxel-wise comparisons. Next, voxels were selected to serve as “features” (using statistical measures of (clinical) group differences), and these features were used to train a linear Support Vector Machine (SVM) (
Bishop, 2006). The reported accuracy was quite encouraging. The authors of (
Klöppel et al., 2008) also used linear SVMs to distinguish AD subjects from controls using whole-brain MR images. An additional focus of their research was to separate AD cases from Frontotemporal Lobar Degeneration (FTLD). The authors reported high accuracy (> 90%) on confirmed AD patients, and lower accuracy where post-mortem diagnosis was unavailable. In related work, Vemuri
et al. (
Vemuri et al., 2008) demonstrated a slightly different method of applying linear SVMs on another dataset, obtaining 88–90% classification accuracy. More recently, the methods in (
Fan et al., 2008a;
Misra et al., 2008;
Hinrichs et al., 2009a) have been applied to the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset (
http://www.loni.ucla.edu/ADNI/Data/) (
Mueller et al., 2005), consisting of a large set of Magnetic Resonance (MR) and 18-fluorodeoxyglucose Positron Emission Tomography (FDG-PET) images, giving accuracy measures similar to those reported in (
Fan et al., 2008b,
a;
Davatzikos et al., 2008a,
b). In (
Hinrichs et al., 2009a), we proposed a combination of ℓ1 sparsity and spatial smoothness bias, implemented via augmentation of the linear program used in training. The spatial bias led to an increase in accuracy, and made the resulting images more interpretable. Steady increases in the levels of accuracy on this problem,
i.e., separating AD subjects from controls, have led some researchers in the field to move towards the more challenging problem of making similar classifications on MCI subjects, with the expectation of extending such methods for identifying signs of the disease in its earlier stages. We provide a brief review of some preliminary efforts in this direction next.
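The general recipe reviewed above (voxel selection by a group-difference statistic, followed by a linear classifier) can be illustrated with a minimal sketch. It is not the implementation of any of the cited studies. The data are synthetic, the Welch t-statistic is one assumed choice of "statistical measure of group differences", and a plain subgradient-descent hinge-loss learner stands in for a linear SVM solver. All function names and parameter values are hypothetical.

```python
import numpy as np

def welch_t(X, y):
    """Per-feature Welch t-statistic between the y == +1 and y == -1 groups."""
    a, b = X[y == 1], X[y == -1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def train_linear_hinge(X, y, lam=0.01, lr=0.1, epochs=300):
    """Full-batch subgradient descent on the L2-regularised hinge loss
    (an assumed stand-in for a linear SVM solver)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1            # examples violating the margin
        grad_w = lam * w - X[viol].T @ y[viol] / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def make_data(rng, n_per_class, n_features, n_informative):
    """Synthetic 'voxels': only the first n_informative differ between groups."""
    y = np.repeat([1, -1], n_per_class)
    X = rng.normal(size=(2 * n_per_class, n_features))
    X[:, :n_informative] += y[:, None]        # group means at +1 and -1
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 50, 200, 5)
X_test, y_test = make_data(rng, 100, 200, 5)

# Select features on the training split only, so that no test labels leak
# into the choice of voxels.
t = welch_t(X_train, y_train)
selected = np.argsort(-np.abs(t))[:5]

w, b = train_linear_hinge(X_train[:, selected], y_train)
pred = np.where(X_test[:, selected] @ w + b >= 0, 1, -1)
accuracy = float((pred == y_test).mean())
```

Restricting the selection step to the training split matters in practice: choosing voxels using all subjects and then reporting accuracy on some of them would overstate how well the marker generalizes to unseen individuals.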
Several recent studies (
Schroeter et al., 2009;
deToledo-Morrell et al., 2004;
Dickerson et al., 2001;
Hua et al., 2008) have shown that certain markers are significantly associated with conversion from MCI to AD. In (
deToledo-Morrell et al., 2004;
Dickerson et al., 2001), the authors show that traced volumes of the hippocampus and entorhinal cortex show significant group-level differences between converting and non-converting MCI subjects. We note that these studies show (in a
post-hoc manner) that certain brain regions are correlated with AD histopathology; what we seek to do instead is to evaluate such markers in terms of their ability to classify novel examples. In (
Hua et al., 2008), a large number of ADNI subjects were tracked longitudinally using Tensor-Based Morphometry (TBM). The authors compared conversion from MCI to AD over 1 year with atrophy in various regions, but the discussion of predictive accuracy was relatively limited (
i.e., the authors reported
p-values of 0.02 between converters and non-converters). In (
Davatzikos et al., 2009), the authors applied statistical techniques to both ADNI and BLSA subjects (
Shock et al., 1984). A classifier was trained using ADNI subjects, and applied to MCI and control subjects (in the BLSA cohort) to provide a SPARE-AD disease marker. This procedure could successfully separate MCI and control subjects with high confidence (AUC of 0.885), and it was demonstrated that the MCI group had a larger increase in SPARE-AD scores longitudinally. However, the main focus in (
Davatzikos et al., 2009) was
not on predicting which MCI subjects would progress to AD, but rather on finding a marker for MCI itself. In (
Querbes et al., 2009), cortical thickness measures were used on a large set of ADNI subjects to characterize disease progression in AD and MCI subjects. Freely available tools (FreeSurfer) were used to calculate cortical thickness values at points on the surface of each subject’s brain (after warping to MNI template space) and then the thickness measures were agglomerated into 22 Regions of Interest (ROI), which the authors used as features (
i.e., covariates) in a logistic regression framework. Using age as a covariate, a set of AD and control subjects were used to train a logistic regression classifier for each subject, yielding a Normalized Thickness Index (NTI). It was found that this NTI was able to give 85% accuracy in separating AD subjects vs. controls, and had 73% accuracy (0.76 AUC) in predicting which MCI subjects would progress to full AD within 3 years. The latter objective is of special interest in the context of the techniques presented in this paper.
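Several of the studies above summarize a marker's discriminative ability by the area under the ROC curve (AUC). As a reminder of what this number measures, the following minimal sketch computes the AUC as the fraction of (positive, negative) subject pairs that the marker orders correctly, which is the Mann-Whitney formulation. The scores and labels are made-up values for illustration only, not data from any cited study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one (ties count as 0.5).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical marker values for three converters (1) and three non-converters (0).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)  # 8 of the 9 pairs are ordered correctly
```

Unlike accuracy, this quantity does not depend on a particular decision threshold, which is one reason it is reported alongside accuracy for disease markers.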
A common trend in the studies mentioned above is their focus on using a single scanning modality and processing pipeline. For instance, in a recent study (
Schroeter et al., 2009), the authors surveyed 62 original research papers in a meta-analysis aimed at identifying which brain regions might make the most useful markers of AD-related atrophy, in a variety of different scanning modalities. A fundamental assumption is that the studies use only one scanning modality and analysis method in isolation, rather than combining the several available modalities into a single disease marker. However, each scanning modality and processing method can reveal information about different aspects of the underlying pathology. For instance, structural MR images may reveal patterns of gray matter atrophy, while FDG-PET images may reveal reduced glucose metabolism (
Ishii et al., 2005), PIB imaging highlights the level of amyloid burden in brain tissue (
Klunk et al., 2004), and SPECT imaging can allow an examination of cerebral blood flow (
Ramírez et al., 2009); similarly, Voxel-Based Morphometry (VBM) shows gray matter density at baseline, while Tensor-Based Morphometry (TBM) shows longitudinal patterns of change (
Hua et al., 2008). Another important issue one must consider is that as new types of biologically relevant imaging modalities become available (
e.g., new tracers for use in PET scanners, or new pulse sequences in MRI scanners), it is desirable for the diagnostic process to incorporate such advances seamlessly. Further, since AD pathology is known to be heterogeneous (
Thompson et al., 2001), it may be advantageous to include multiple scanning modalities in a single classification framework. Indeed, a wide variety of markers may be available, and it is desirable to make the best use of
all such information in a predictive setting. The main difficulty is that as the number of available input features grows, many machine learning algorithms may lose their ability to generalize to unseen examples, due to the disparity between the sample size and the increased dimensionality. To address this problem, we propose to employ a recent development in the machine learning literature, called Multi-Kernel Learning (MKL), which is designed to deal with multiple data sources while controlling model complexity. We have evaluated this method’s performance on subjects from the ADNI data set, and report these results below. We have also applied the multi-modal classifier to MCI subjects, showing a promising ability to predict which subjects will convert from MCI to full AD in the ADNI sample.
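To make the idea of combining data sources concrete: in kernel-based formulations, each modality m contributes a kernel (similarity) matrix K_m over subjects, and the combined kernel is a non-negative weighted sum K = sum_m beta_m K_m. The sketch below illustrates only this combination step, under loud assumptions. The two "modalities" are synthetic, and the weights are set by kernel-target alignment, a simple heuristic swapped in for illustration. MKL proper learns the weights jointly with the classifier, so this is not the method evaluated in this paper. All names are hypothetical.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of a linear kernel, rescaled to unit average diagonal."""
    K = X @ X.T
    return K / (np.trace(K) / len(K))

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F ||yy^T||_F).

    Assumes labels in {+1, -1}, so that ||yy^T||_F equals len(y).
    """
    return float(y @ K @ y) / (np.linalg.norm(K) * len(y))

def combine_kernels(kernels, y):
    """Convex combination K = sum_m beta_m K_m with alignment-based weights
    (a heuristic stand-in for the jointly learned MKL weights)."""
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    beta = a / a.sum()
    K = sum(b * Km for b, Km in zip(beta, kernels))
    return K, beta

rng = np.random.default_rng(0)
y = np.repeat([1, -1], 30)

# Synthetic stand-ins for two modalities measured on the same 60 subjects:
# the first carries group information, the second is pure noise.
X_informative = rng.normal(size=(60, 10)) + y[:, None]
X_noise = rng.normal(size=(60, 10))

kernels = [linear_kernel(X_informative), linear_kernel(X_noise)]
K, beta = combine_kernels(kernels, y)
```

Because each K_m is positive semidefinite and the weights are non-negative, the combined matrix K is itself a valid kernel and can be passed to any kernel classifier. The number of free combination parameters grows with the number of modalities rather than with the number of voxels, which is the sense in which this family of methods controls model complexity.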
The principal
contributions of this paper are:
(1) We propose a new application of Multi-Kernel Learning (MKL) to the task of classifying AD, MCI, and control subjects, which permits seamless incorporation of tens of imaging modalities, clinical measures, and cognitive status markers into a single predictive framework. The main ideas behind MKL are presented in Section 2.2;
(2) We have conducted an extensive set of experiments using ADNI subjects, aimed at providing a rigorous evaluation of the method’s ability to predict disease progression under conditions designed to match a clinical setting. We present these results in Section 4;
(3) We employ our method to produce a Multi-Modality Disease Marker (MMDM) for MCI subjects, and present an analysis of its predictive value on rates of conversion from MCI to AD in Section 4.3. A discussion of our results is given in Section 5.