There is an unmet medical need to identify neuroimaging biomarkers that are able to accurately diagnose and monitor Alzheimer's disease (AD) at very early stages and to assess the response to AD-modifying therapies. To a certain extent, volumetric and functional magnetic resonance imaging (fMRI) studies can detect changes in structure, cerebral blood flow, and blood oxygenation that distinguish AD and mild cognitive impairment (MCI) subjects from normal controls. However, it has been challenging to use fully automated MRI analytic methods to identify potential AD neuroimaging biomarkers. We have thus proposed a method based on independent component analysis (ICA) for studying potential AD-related MR image features, coupled with the use of a support vector machine (SVM) for classifying scans into categories of AD, MCI, and normal control (NC) subjects. The MRI data were selected from the Open Access Series of Imaging Studies (OASIS) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) databases. The experimental results showed that our ICA-based method can differentiate AD and MCI subjects from normal controls, although further improvement of the analytic method and inclusion of additional variables may be required for optimal classification.
Alzheimer's disease (AD) is by far the most common cause of dementia associated with aging. Post-mortem studies of AD have shown three typical lesions in AD brains: intraneuronal neurofibrillary tangles (NFTs), extracellular deposits of Aβ amyloid plaques, and the loss of neurons [1, 2]. To clinically diagnose AD patients at an early stage, many biomedical imaging techniques have been used, such as structural magnetic resonance imaging (sMRI), functional magnetic resonance imaging (fMRI), and positron emission tomography (PET).
sMRI promises to aid in the diagnosis and treatment monitoring of MCI and AD through the facile detection of surrogate biomarkers for disease progression. Studies that analyze sMRI brain scans can be generally categorized into two classes: region-of-interest (ROI) analysis and whole brain analysis [7, 8]. ROI analysis focuses on specific brain regions, especially the hippocampus and the entorhinal cortex [9, 10], which show histopathological changes at early stages of AD. ROI analysis of the brain structure is considered a gold standard, but it has some drawbacks: it is operator-dependent, labor- and time-intensive, and requires an a priori choice of regions for investigation. To overcome these shortcomings, some automated methods of measuring whole brain atrophy have been developed, such as voxel-based morphometry (VBM), tensor-based morphometry, and source-based morphometry. However, sMRI studies often differ in their choice of image analysis methods: volume statistics [6, 9], cortex shape analysis, and blind signal separation/machine learning techniques [5, 7, 15, 16].
Independent component analysis (ICA), an important blind signal separation technique, has proved to be a powerful method for analyzing neuroimaging data [16, 17]. It is a multivariate, data-driven technique that enables an exploratory analysis of MRI datasets to extract useful information about the relationships among voxels in local brain substructures. For the diagnosis or classification of AD and MCI patients, the support vector machine (SVM), a machine learning technique, has also received much attention [7, 15].
In the current study, we have applied an ICA-based method coupled with the SVM technique to selected MRI data from both the OASIS and ADNI databases. The ICA-based method has three steps. First, all MRI scans are aligned and normalized by statistical parametric mapping (SPM). Then, ICA is applied to the images to extract specific neuroimaging components as potential classifying features. Finally, the separated independent component coefficients are fed into a classifier that discriminates among AD, MCI, and control subjects. Our results indicate that the proposed method is able to classify AD and MCI patients and normal control subjects with reasonable accuracy. To our knowledge, the application of an ICA-based method coupled with the SVM technique to anatomical MRI data analysis for the AD brain has not been reported before.
In this study, we used MRI data from two image datasets: the Open Access Series of Imaging Studies (OASIS) and the Alzheimer's Disease Neuroimaging Initiative (ADNI).
OASIS (http://www.oasis-brains.org/) is made available by the Washington University Alzheimer's Disease Research Center, Dr. Randy Buckner at the Howard Hughes Medical Institute (HHMI) at Harvard University, the Neuroinformatics Research Group (NRG) at Washington University School of Medicine, and the Biomedical Informatics Research Network (BIRN). Their aim is to make MRI datasets of the brain freely available to the scientific community. The dataset consists of a cross-sectional collection of 416 subjects aged 18 to 96 years, including 218 subjects aged 18 to 59 years and 198 subjects aged 60 to 96 years. Of the older subjects, 98 had a Clinical Dementia Rating (CDR) score of 0, indicating no dementia, and 100 had a CDR score greater than zero (70 with CDR=0.5, 28 with CDR=1, 2 with CDR=2), indicating a diagnosis of very mild to moderate AD. Detailed statistics of the dataset have been described in the literature.
Since its launch in 2003, the primary goal of the ADNI (http://www.loni.ucla.edu/ADNI/) has been to test whether serial MRI, PET, fluid biological markers in cerebrospinal fluid (CSF) and blood, and clinical and neuropsychological assessments can be combined to measure the progression of MCI and early AD. The Principal Investigator of this initiative is Dr. Michael W. Weiner, of the Veterans Administration Medical Center and the University of California-San Francisco. Subjects have been recruited from approximately 50 sites across the United States and Canada. The ADNI has recruited over 800 adults aged 55 to 90 years to participate in the research. These include approximately 200 cognitively normal individuals who are followed for 3 years, 400 subjects with MCI who are followed for 3 years, and 200 patients with early AD who are followed for 2 years.
To alleviate our computing burden, we only used the processed image data in the OASIS database, that is, the atlas-registered gain field-corrected and brain-masked images in the processed directory.
For MRI images in the ADNI database, we pre-processed them and then normalized the MRI of each subject into a standard space defined by the template image T1.nii supplied with the SPM8 toolbox. The detailed configuration included source image smoothing of 8 mm, affine regularization to the ICBM space template, a nonlinear frequency cutoff of 25, 16 nonlinear iterations, and trilinear interpolation. Finally, we extracted the sub-volumes within the bounding box (−79~80, −112~79, −74~85, in mm) relative to the anterior commissure in the space described in the atlas of Talairach and Tournoux. As a result, all MRI images were normalized into volumes of 160×192×160 voxels.
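The relation between the bounding box and the stated grid size can be checked with a few lines of arithmetic. The sketch below assumes 1 mm isotropic voxels and inclusive box limits; neither is stated explicitly in the text, and the variable names are illustrative only.

```python
# Voxel counts implied by the bounding box, assuming 1 mm isotropic voxels
# and inclusive limits (both are assumptions, not stated in the text).
bounding_box_mm = {
    "x": (-79, 80),
    "y": (-112, 79),
    "z": (-74, 85),
}
voxel_size_mm = 1  # assumed


def voxels_along(lo, hi, step=voxel_size_mm):
    """Number of voxel centers from lo to hi inclusive at the given spacing."""
    return (hi - lo) // step + 1


grid_shape = tuple(voxels_along(lo, hi) for lo, hi in bounding_box_mm.values())
voxels_per_image = grid_shape[0] * grid_shape[1] * grid_shape[2]
print(grid_shape, voxels_per_image)
```

Under these assumptions the box yields the 160×192×160 grid quoted above, so each scan contributes roughly 4.9 million voxel values to the data matrix.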
To further analyze the MRI images, we segmented the whole brain images in the ADNI database into three parts: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). The three parts were then processed by ICA.
The framework of our proposed ICA-based method is shown in Fig. 1. First, all MRI data are normalized to a template as described in the preprocessing section. Next, the normalized brain images are decomposed into MRI basis functions and the corresponding coefficients using the FastICA algorithm. Finally, the separated coefficients are fed into an SVM-based classifier for diagnosis of individuals with or without AD.
A key technique in the framework is ICA, whose basic goal is to solve the blind signal separation problem by expressing a set of random variables (observations) as linear combinations of statistically independent component variables (source signals). Depending on whether the sources are assumed to be independent over space or over time, ICA can be further described as spatial ICA (sICA) or temporal ICA (tICA). sICA seeks a set of mutually independent component (IC) source images and a corresponding set of unconstrained time courses. By contrast, tICA seeks a set of IC source time courses and a corresponding set of unconstrained images. For sMRI data, sICA embodies the assumption that each image in X is composed of a linear combination of spatially and statistically independent images.
The spatial ICA model for MRI images is shown in Fig. 2, where the data matrix X holds the voxels of all MRI images, with the voxels of each MRI image arranged into one row of X. S and A are learned from X in an unsupervised manner. Each row in A denotes a base, also called a basis function or a feature. In this study, these bases are also considered as potential neuroimaging biomarkers.
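One way to write this model out explicitly is given below. Fig. 2 is not reproduced here, so the factor order is an assumption chosen to agree with the statement that each row of A is a base; the dimensions n (number of images), v (voxels per image), and k (number of independent components) are symbols introduced only for this sketch.

```latex
X = S A, \qquad
X \in \mathbb{R}^{n \times v}, \quad
S \in \mathbb{R}^{n \times k}, \quad
A \in \mathbb{R}^{k \times v}

x_i = \sum_{j=1}^{k} s_{ij}\, a_j , \qquad i = 1, \dots, n
```

Here x_i is the i-th image (row of X), a_j is the j-th base (row of A), and s_{ij} is the coefficient with which base j contributes to image i. Under this reading, the rows of S are the per-image coefficient vectors.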
In the study, the MRI images have been processed by the FastICA algorithm, as described in Table 1. After ICA computation, any MRI image can be reconstructed by linearly combining the set of basis functions and the corresponding coefficients, for example, as shown in Fig. 3.
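The decomposition and reconstruction can be illustrated on synthetic data. The sketch below uses the FastICA implementation in scikit-learn, which is not necessarily the implementation used in this study; the matrix sizes, the Laplace-distributed sources, and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic stand-in for the data matrix: one flattened image per row.
n_images, n_voxels, n_components = 8, 2000, 3
true_bases = rng.laplace(size=(n_components, n_voxels))  # non-Gaussian spatial sources
true_coeffs = rng.uniform(0.5, 2.0, size=(n_images, n_components))
X = true_coeffs @ true_bases + 0.01 * rng.normal(size=(n_images, n_voxels))

# Spatial ICA: treat voxels as samples so that the recovered sources are
# independent over space. scikit-learn expects (samples, features),
# hence the transpose.
ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
sources = ica.fit_transform(X.T)  # shape (n_voxels, n_components)

bases = sources.T            # one spatial basis function per row
coefficients = ica.mixing_   # shape (n_images, n_components), one row per image

# Each image is the per-image mean plus a linear combination of the bases.
X_reconstructed = coefficients @ bases + ica.mean_[:, None]
relative_error = np.linalg.norm(X - X_reconstructed) / np.linalg.norm(X)
print(bases.shape, coefficients.shape, relative_error)
```

In this sketch the rows of `coefficients` are the per-image feature vectors that would be passed to a classifier, and the small reconstruction error reflects only the added noise.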
SVM, a very popular classifier and one of the machine learning methods based on statistical learning theory, has recently been used to help distinguish AD subjects from elderly control subjects using anatomical MRI images. Conceptually, SVM maps the input vectors, linearly or nonlinearly, into a very high-dimensional feature space. In this feature space, a linear separation surface is constructed to separate the training data by maximizing the margin between the vectors of the two classes. We used LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm) as the classifier to distinguish AD or MCI subjects from normal controls.
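A minimal two-class sketch of this classification step follows. It uses scikit-learn's `SVC`, which is built on LIBSVM, rather than the LIBSVM command-line tools; the RBF kernel, the hyperparameters, and the synthetic features are assumptions, since the kernel and parameter settings are not given in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC  # scikit-learn's SVC is built on LIBSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for IC coefficients: two classes with shifted means.
n_per_class, n_features = 100, 5
controls = rng.normal(loc=0.0, size=(n_per_class, n_features))
patients = rng.normal(loc=1.5, size=(n_per_class, n_features))
features = np.vstack([controls, patients])
labels = np.array([0] * n_per_class + [1] * n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.5, stratify=labels, random_state=0
)

# Kernel and hyperparameters are illustrative assumptions.
classifier = SVC(kernel="rbf", C=1.0, gamma="scale")
classifier.fit(X_train, y_train)
test_accuracy = classifier.score(X_test, y_test)
print(test_accuracy)
```

The RBF kernel corresponds to the nonlinear mapping case described above; replacing it with `kernel="linear"` gives the linear case.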
We have performed several analyses using the MRI images from the OASIS and ADNI databases to verify the performance of the proposed method in feature extraction, representation, and discrimination of AD and MCI subjects from normal controls. On the MRI images in the OASIS database, we applied the FastICA algorithm to extract features from MRI images and to show the ability of manifold representation in the ICA subspace. The promising classification accuracy obtained by the SVM-based classifier suggests that the proposed method is useful for analyzing the structure of MRI images. To further verify the proposed method, we also applied it to the gray matter of the MRI images in the ADNI database.
Using the FastICA algorithm to decompose the brain images, we obtain MRI image basis functions as shown in Fig. 4. It can be seen that each base encodes only a local part of the brain MRI images, and that different bases encode different parts of the brain. If the coefficient corresponding to a base is large for an individual MRI scan, we can conclude that the base is more important in that scan.
After using the FastICA tool to separate all MRI images of AD and normal subjects, the dimensionality of the independent components is reduced using principal component analysis (PCA). Next, we project all MRI scan samples onto the ICA feature subspace. For ease of viewing, the three leading principal components are shown in three-dimensional space in Fig. 5.
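The reduction to three principal components for plotting can be sketched as follows. The coefficient matrix here is random placeholder data with hypothetical dimensions (60 scans, 20 IC coefficients), and scikit-learn's `PCA` is an assumed stand-in for whichever PCA implementation was used.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical matrix of IC coefficients: one row per scan.
ic_coefficients = rng.normal(size=(60, 20))

# Keep the three leading principal components for a 3D scatter plot.
pca = PCA(n_components=3)
scores = pca.fit_transform(ic_coefficients)  # shape (60, 3): PC1, PC2, PC3 per scan
explained = pca.explained_variance_ratio_    # variance fraction of each component
print(scores.shape, explained)
```

The columns of `scores` are ordered by decreasing explained variance, so the first three columns are the axes that would be plotted.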
As one application of the ICA model, the whole data set in the OASIS database is divided into four groups: Group 1 (100 subjects with a CDR score greater than 0), and Groups 2, 3, and 4, which include 116, 100, and 100 subjects, respectively, divided from the 316 control subjects without dementia in the descending order. Ages in Group 2 are in the range from 18 to 19, with an average of 18.44±0.56; Group 3 ages are from 25 to 59, with an average of 42.16±17.16; and Group 4 ages are from 59 to 94, with an average of 75.58±18.42.
The left panel of Fig. 5 shows the distribution of MRI data in the 3D subspace spanned by age, principal component 1 (PC1), and PC2. Here, PC1 and PC2 are obtained from the decomposed components using PCA. The right panel shows the distribution in the 3D subspace spanned by PC1, PC2, and PC3, defined in the same way. The two panels indicate that brains differ and change gradually with age. The results are consistent with those presented by Marcus et al.
After feature extraction and representation in the independent component subspace, all MRI data are classified into two groups, AD and normal controls, using a classifier based on SVM. We have conducted two experiments with the data. One is a conventional training and testing method, and the other is a leave-one-out method. The first method randomly selects half of the data as the training set and uses the rest as the testing set. The leave-one-out method holds out one data point as the testing data and uses the rest as the training data. The classification accuracy is shown in Tables 2 and 3.
In Table 2, the mean accuracy was obtained after we repeated the experiments over 100 times using the conventional training and testing method. In the same way, the mean accuracy in Table 3 was achieved after each sample was tested. From these tables, it is noted that the differences between Group 1 and other groups are more remarkable as the age gap increases. In Group 1, there are 100 subjects over age 60, and in Groups 2, 3, 4, the average ages are 18.44, 42.16, and 75.58, respectively.
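The two evaluation protocols can be sketched with scikit-learn's cross-validation utilities. The data are synthetic, the linear kernel is an assumption, and `ShuffleSplit` with 100 repetitions is used as a stand-in for the repeated random half split described above.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic two-class features standing in for IC coefficients.
features = np.vstack([
    rng.normal(0.0, 1.0, size=(40, 4)),
    rng.normal(1.5, 1.0, size=(40, 4)),
])
labels = np.array([0] * 40 + [1] * 40)
classifier = SVC(kernel="linear")  # kernel choice is an assumption

# Protocol 1: random half for training, the other half for testing, 100 repeats.
half_split = ShuffleSplit(n_splits=100, test_size=0.5, random_state=0)
half_split_scores = cross_val_score(classifier, features, labels, cv=half_split)

# Protocol 2: leave-one-out, where each sample is the test set exactly once.
loo_scores = cross_val_score(classifier, features, labels, cv=LeaveOneOut())

print(half_split_scores.mean(), loo_scores.mean())
```

The mean of `half_split_scores` corresponds to the kind of average reported for the repeated half split, and the mean of `loo_scores` to the leave-one-out accuracy after every sample has been tested once.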
After image normalization using SPM8 and feature extraction using ICA, the corresponding independent component coefficients are considered as the representations of MRI images in the subspace spanned by the independent components, and they can be classified in two binary tasks, AD vs. NC and MCI vs. NC, using an SVM classifier.
As the SVM classifier is based on statistical learning theory, we divided all MRI image data (202 AD, 410 MCI, and 236 NC) into two data sets: a training set and a testing set. To take into account the influence of the number of training samples on classification accuracy, two splits were compared: 75% training vs. 25% testing, and 90% training vs. 10% testing. All training sets were randomly selected from the ADNI MRI data of AD, MCI, and NC subjects. The best classification accuracy is shown in Table 4.
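Because the three classes are of unequal size, it helps to know the accuracy that a trivial classifier would reach by always predicting the larger class. The short calculation below derives that baseline from the subject counts given above, assuming that each binary task uses all subjects of the two classes involved; that assumption and the function name are illustrative.

```python
# Subject counts as stated in the text.
counts = {"AD": 202, "MCI": 410, "NC": 236}
total_subjects = sum(counts.values())


def majority_baseline(class_a, class_b):
    """Accuracy of always predicting the larger of two classes."""
    n_a, n_b = counts[class_a], counts[class_b]
    return max(n_a, n_b) / (n_a + n_b)


baseline_ad_vs_nc = majority_baseline("AD", "NC")    # 236 / 438
baseline_mci_vs_nc = majority_baseline("MCI", "NC")  # 410 / 646
print(total_subjects, round(baseline_ad_vs_nc, 3), round(baseline_mci_vs_nc, 3))
```

Under this assumption the baselines are about 54% for AD vs. NC and about 63% for MCI vs. NC, which gives a reference point for reading the reported accuracies.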
Only the gray matter of the brain was analyzed to check its degree of significance in discriminating AD and MCI from NC. First, the whole brain MRI image was segmented by the segmentation module in SPM8 into three parts: gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). In this study, we mainly analyzed the GM images. Then, the segmented GM images were decomposed by the FastICA algorithm described above. The decomposed independent component coefficients were viewed as the representations in the ICA subspace and fed into the SVM classifier for discriminating AD and MCI from NC. The experimental methods were the same as for the whole-brain MRI images. The classification results are shown in Table 5.
Herein we have demonstrated that the fully automatic method based on ICA for classification of MRI images is very useful in discriminating among AD, MCI, and NC subjects. This study is comparable with other related works.
ICA is a data-driven, multivariate, and unsupervised method with the advantage of using no a priori information. It has become an increasingly popular biomedical data-mining technique as well as a processing method for functional and structural MRI data. To our knowledge, there are few reports on the application of ICA to structural MRI data from AD patients. However, ICA might also be a useful tool for early AD diagnosis based on sMRI data, because it has shown its usefulness in processing sMRI data from schizophrenia patients. Therefore, in this study, we have applied ICA to the analysis of AD-related sMRI data. Experimental results on MRI data from the OASIS and ADNI databases indicate that the proposed ICA-based method is a useful tool for classifying AD, MCI, and NC data.
Marcus et al. used the FAST program in the FSL software suite (www.fmrib.ox.ac.uk/fsl) to compute normalized whole-brain volume (nWBV) and plotted the nWBV distribution across the adult life span in the OASIS database. Our results shown in Fig. 5 are similar to those presented in their paper, indicating that the ICA-based feature extraction method preserves the information that changes with age and with the development of AD. Therefore, similar manifolds appear in both the original statistical space (nWBV vs. age) and the ICA subspace.
Garcia-Sebastian et al. studied feature extraction processes based on VBM analysis to classify MRI volumes of AD patients and normal subjects. They applied SVM to classify the MRI volumes of 98 females and reported accuracies of 80.6~87.5%. Savio et al. applied four different models of artificial neural networks (ANNs) to the same dataset and reported a classification accuracy of 83%. Zhou et al. proposed a framework to analyze the hippocampal shape difference between AD and NC within the OASIS dataset and obtained a classification accuracy of 52.6~61.5%. In contrast to these studies, which used smaller samples, we processed a total of 416 MR images from 100 AD and 316 control subjects, which makes our results statistically more reliable. We achieved an average accuracy of 92.4% when we classified all AD and control subjects.
Our results from the ADNI database are also comparable with those presented by Chupin et al. and Gutman et al. Gutman et al. used an SVM classifier based on spherical harmonics to classify 49 AD patients and 63 controls with 75.5% sensitivity, 87.3% specificity, and 82.1% overall correctness. Chupin et al. proposed a fully automatic method using probabilistic and anatomical priors for hippocampus segmentation. They further used the obtained hippocampal volumes to automatically discriminate among AD patients, MCI patients, and elderly controls. They obtained an 82% classification rate, 75% sensitivity, and 89% specificity for classification among 29 AD and 30 NC subjects. In their experiments on 67 AD, 143 MCI, and 123 NC subjects aged between 70 and 80 years, the classification rates were 76–80% for AD vs. NC and 61–63% for MCI vs. NC.
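The three figures quoted for Gutman et al. are mutually consistent, which can be verified from the group sizes. In the sketch below the function name is illustrative, and the whole-subject counts are inferred from the rounded percentages rather than taken from the cited paper.

```python
def overall_accuracy(sensitivity, specificity, n_patients, n_controls):
    """Overall accuracy implied by sensitivity, specificity, and group sizes."""
    correct = sensitivity * n_patients + specificity * n_controls
    return correct / (n_patients + n_controls)


# Figures quoted in the text for Gutman et al.: 49 AD patients, 63 controls.
gutman_accuracy = overall_accuracy(0.755, 0.873, n_patients=49, n_controls=63)

# Whole-subject counts closest to the reported rates (inferred, not stated).
true_positives = round(0.755 * 49)
true_negatives = round(0.873 * 63)
accuracy_from_counts = (true_positives + true_negatives) / (49 + 63)
print(round(gutman_accuracy * 100, 1), true_positives, true_negatives)
```

Both routes give an overall accuracy of about 82.1%, matching the reported value.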
In conclusion, our study suggests that the proposed ICA-based method may be useful for classifying AD and MCI subjects from normal controls. However, the achieved classification accuracy is still not optimal due to several factors. First, the ADNI is a multicenter database (approximately 50 centers using different voxel sizes and acquisition parameters), and we did not take scanner or center effects into account. Second, potential brain vascular lesions in the subjects may be a confounding factor. Third, due to the large number of images, we have not manually validated the normalization quality of these MRI images. Finally, the factors of age and gender have not been taken into account, and we have not yet examined these data in the context of amyloid burden in these subjects, using CSF markers. All of these aspects may influence the final classification accuracy.
Our main goal in the current study was to verify the performance of the proposed ICA-based method. Much work lies ahead, however. We have presented the basis functions, or features, but several questions have not yet been addressed: What is the exact meaning of these features? Which features are most important? How many features are related to AD? What are the effects of age, gender, and amyloid pathology on the features? Our future work will focus on answering these questions. Further, we would like to compare our results with those from other published semi-automated methods. Moreover, the ADNI provides follow-up MRI data, and applying the proposed method to the longitudinal analysis of MRI images is our next step.
The authors would like to express their gratitude for the support from the Shanghai Maritime University (Grant No. 20090175) and the research funds from the Radiology Department of Brigham and Women's Hospital (BWH). We are thankful for the manuscript editing by Ms. Kimberly Lawson of the BWH Radiology Department.
ADNI data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Abbott, AstraZeneca AB, Bayer Schering Pharma AG, Bristol-Myers Squibb, Eisai Global Clinical Development, Elan Corporation, Genentech, GE Healthcare, GlaxoSmithKline, Innogenetics, Johnson and Johnson, Eli Lilly and Co., Medpace, Inc., Merck and Co., Inc., Novartis AG, Pfizer Inc, F. Hoffman-La Roche, Schering-Plough, Synarc, Inc., and Wyeth, as well as non-profit partners the Alzheimer's Association and Alzheimer's Drug Discovery Foundation, with participation from the U.S. Food and Drug Administration. Private sector contributions to ADNI are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles. This research was also supported by NIH grants P30 AG010129, K01 AG030514, and the Dana Foundation.
Some of the data used in the preparation of this article were obtained from the ADNI database (www.loni.ucla.edu/ADNI). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators is available at: http://www.loni.ucla.edu/ADNI/Collaboration/ADNI_Manuscript_Citations.pdf.
The authors declare no potential conflicts. Authors' disclosures available online (http://www.jalz.com/disclosures/view.php?id=).