Alzheimer's disease (AD) is a progressive and eventually fatal disease of the brain, characterized by memory failure and degeneration of other cognitive functions. Pathology may begin long before the patient experiences any symptoms and often leads to structural changes in brain anatomy. With the aid of medical imaging techniques, it is now possible to study in vivo the relationship between brain structural changes and the mental disorder, providing a diagnostic tool for early detection of AD. Current studies focus on mild cognitive impairment (MCI), a transitional state between normal aging and AD. Subjects with MCI suffer from memory impairment that is greater than expected for their age, but retain sufficient general cognitive function to maintain daily living. Identifying MCI subjects is important, especially those who will eventually convert to AD (referred to as Progressive-MCI, or P-MCI for short), because they may benefit from therapies that could slow disease progression.
Although T1-weighted MRI, as a diagnostic tool, is relatively well studied, it continues to receive the attention of researchers because it is more readily available in clinical settings than task-based functional imaging. Commonly used measurements can be categorized into three groups: regional brain volumes, cortical thickness, and hippocampal volume and shape. Volumetric measurements can be further divided into two groups according to feature type: voxel-based features or region-based features. In this paper, we focus on region-based volumetric measurements of the whole brain for the following reasons. Firstly, the abnormalities caused by the disease are not restricted to the cortex: as shown by pathological studies, AD-related atrophy begins in the medial temporal lobe (MTL), which includes subcortical structures such as the hippocampus and the amygdala. Secondly, a whole-brain analysis is preferred to one restricted to the hippocampus, because early-stage AD pathology is not confined to the hippocampus; the entorhinal cortex, the amygdala, the limbic system, and the neocortical areas are also affected. As has been pointed out in several studies, although the analysis of the earliest-affected structures, such as the hippocampus and the entorhinal cortex, can increase the sensitivity of MCI prediction, the inclusion of the later-affected temporal neocortex may increase the prediction specificity, and hence improve the overall classification accuracy. Thirdly, we focus on region-based volumetric features because voxel-based features are highly redundant, which may reduce their discriminative power.
The determination of the region of interest (ROI) is the key step in region-based analysis methods. Once ROIs have been determined, either by pre-definition or by adaptive parcellation, the mean tissue densities of gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in each ROI are usually used as features for classification. Disease-induced brain structural changes may occur not at isolated spots, but in several inter-related regions. Therefore, for a more accurate characterization of the pathology, feature correlation between ROIs has to be taken into account. Measurement of such correlations may provide potential biomarkers associated with the pathology, and hence is of great research interest. However, in most existing approaches, the dependencies among features are not explicitly modelled in the feature extraction procedure, but only implicitly considered by some classifiers, such as support vector machines (SVMs), during the classification process. For example, a linear SVM classifier models the dependency (inner product) of feature vectors between two subjects, rather than the interaction of two ROIs (via volumetric features) within a specific subject. These implicitly encoded feature dependencies become more difficult to interpret when a nonlinear SVM classifier is used. Based on this observation, we propose in this paper a new type of feature derived from regional volumetric measurements, which directly takes into account the pairwise ROI interactions within a subject. To achieve this, each ROI is first characterized by a vector that consists of the volumetric ratios of GM, WM and CSF in this ROI. Then, the interaction between two ROIs within the same subject is computed as the Pearson correlation of the corresponding volumetric elements. This gives us an anatomical brain network, with each node denoting an ROI and each edge characterizing the pairwise connection.
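The network construction described above can be illustrated with a minimal sketch. It assumes that each ROI is given as a three-element vector of (GM, WM, CSF) volume ratios; the function names `pearson` and `roi_network`, the toy values, and the convention of returning 0.0 when a correlation is undefined are all illustrative assumptions, not details taken from the method itself.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        # Assumption: a constant vector has no defined correlation;
        # treat it as "no connection" rather than raising an error.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom

def roi_network(roi_ratios):
    """Symmetric ROI-by-ROI correlation matrix for a single subject."""
    n = len(roi_ratios)
    return [[pearson(roi_ratios[i], roi_ratios[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical (GM, WM, CSF) volume ratios for three ROIs of one subject.
subject = [
    (0.60, 0.30, 0.10),
    (0.55, 0.35, 0.10),
    (0.20, 0.30, 0.50),
]
network = roi_network(subject)  # network[i][j] is the edge between ROI i and ROI j
```

In this toy subject, the first two ROIs have similar tissue compositions and are therefore strongly positively correlated, whereas the CSF-dominated third ROI is negatively correlated with both.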
The correlation value measures the similarity of the tissue compositions between a pair of brain regions. When a patient is affected by MCI, the correlation values of a particular brain region with other regions may be affected, possibly due to factors such as tissue atrophy. These correlation changes are captured by the classifier and used for MCI prediction. An earlier version of this work was presented at a conference. It is worth noting that by computing the pairwise correlation between ROIs, our approach provides a second-order measurement of the ROI volumes, in contrast to conventional approaches that employ only first-order volumetric measurements. As higher-order measurements, our new features may be more descriptive, but also more sensitive to noise. For instance, the influence of a small ROI registration error may be exaggerated by the proposed network features, which may reduce their discriminative power. To overcome this problem, a hierarchy of multi-resolution ROIs is used to increase the robustness of classification. Effectively, the correlations are considered at different scales of regions, thus providing different levels of noise suppression and discriminative information, which can be sieved by a feature selection mechanism, as discussed below, to guide the classification. Additionally, we consider the correlations both within and between different resolution scales, because the optimal scale is often not known a priori. We will demonstrate the effectiveness of the proposed approach with empirical evidence. In this study, we consider a fully connected anatomical network, and the features extracted from it form a space of intractably high dimensionality. As a remedy, a supervised dimensionality reduction method is employed to embed the original network features into a new feature space with a much lower dimensionality.
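The idea of correlating ROIs within and between resolution scales can be sketched as follows. This is only an illustration under stated assumptions: the grouping of fine ROIs into coarser ones (`groups`), the three-level layout, and all function names are hypothetical, and coarser ROIs are formed here simply by summing the tissue volumes of their children. The feature selection and supervised dimensionality reduction steps are not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; returns 0.0 when undefined (an assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return 0.0 if denom == 0.0 else sum(a * b for a, b in zip(dx, dy)) / denom

def tissue_ratios(volumes):
    """Convert (GM, WM, CSF) volumes of one ROI into volume ratios."""
    total = sum(volumes)
    return [v / total for v in volumes]

def merge(rois):
    """Form a coarser ROI by summing the tissue volumes of its children."""
    return [sum(column) for column in zip(*rois)]

def hierarchical_features(leaf_volumes, groups):
    """Correlate every pair of ROIs across all levels of the hierarchy.

    leaf_volumes: (GM, WM, CSF) volumes of the finest-scale ROIs.
    groups: index lists defining the mid-level ROIs (hypothetical grouping).
    """
    whole_brain = merge(leaf_volumes)  # single ROI at the top of the hierarchy
    mid_level = [merge([leaf_volumes[i] for i in g]) for g in groups]
    nodes = [tissue_ratios(v) for v in [whole_brain] + mid_level + list(leaf_volumes)]
    # Pairs include both within-scale and between-scale connections.
    return [pearson(nodes[i], nodes[j])
            for i, j in combinations(range(len(nodes)), 2)]

# Toy subject: four fine ROIs grouped into two mid-level ROIs.
leaf_volumes = [(6.0, 3.0, 1.0), (5.5, 3.5, 1.0), (2.0, 3.0, 5.0), (4.0, 4.0, 2.0)]
groups = [[0, 1], [2, 3]]
features = hierarchical_features(leaf_volumes, groups)
```

With 1 + 2 + 4 = 7 nodes, the fully connected network already yields 21 pairwise features for this toy example, which illustrates why the feature dimensionality grows quickly with the number of ROIs.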
Without requiring any new information in addition to the baseline T1-weighted images, the proposed approach improves the prediction accuracy of MCI from
(of conventional volumetric features) to
(of hierarchical network features), evaluated on data sets randomly drawn from the ADNI dataset. Our study shows that this improvement comes from the use of the network features obtained from hierarchical brain networks. To investigate the generalizability of the proposed approach, experiments are repeated on different random partitions of the data into training and test sets, using different partition ratios. The average classification accuracy estimated in this way tends to be more conservative than that of the conventional leave-one-out approach. Additionally, although the proposed approach can be easily generalized to incorporate regional similarity measurements other than Pearson correlation, the experimental results support the choice of Pearson correlation for our application over some other commonly used similarity metrics.
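The repeated random-partition evaluation can be sketched as below. The nearest-centroid classifier is only a stand-in chosen to keep the example self-contained; it is not the classifier used in this work. The function names, the toy one-dimensional data, the partition ratios, and the number of repetitions are likewise illustrative assumptions.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier: per-class mean feature vector."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def repeated_holdout(X, y, train_ratio, n_repeats, seed=0):
    """Average test accuracy over repeated random train/test partitions."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        cut = int(round(train_ratio * len(indices)))
        train, test = indices[:cut], indices[cut:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

# Toy data: two well-separated classes with one feature each.
X = [[0.1 * i] for i in range(10)] + [[5.0 + 0.1 * i] for i in range(10)]
y = [0] * 10 + [1] * 10

# Evaluate with two hypothetical partition ratios.
results = {ratio: repeated_holdout(X, y, ratio, n_repeats=20)
           for ratio in (0.6, 0.75)}
```

Unlike leave-one-out, each repetition here holds out a sizeable fraction of the subjects, so the averaged accuracy reflects performance when less training data is available.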
Before introducing our proposed approach, it is worth highlighting the advantages of the hierarchical brain network-based approach over conventional volume-based approaches. Firstly, as mentioned above, our proposed method utilizes a second-order volumetric measurement that is more descriptive than the conventional first-order volumetric measurement. Secondly, compared with conventional volumetric measurements that consider only local volume changes, our proposed hierarchical brain network considers global information by pairing ROIs that may be spatially far apart. Thirdly, our proposed method seamlessly incorporates both local volume features and global network features for classification by introducing a whole-brain ROI at the top of the hierarchy. By correlating with the whole-brain ROI, each ROI can provide a first-order measurement of local volume. Fourthly, although our current approach uses Pearson correlation, it can be easily generalized to any other metric that is capable of measuring the similarity between features of ROI pairs. Fifthly, the proposed method involves only linear methods, which makes the classification results easy to interpret. Finally, for the first time, we investigate the relative speeds of disease progression in different regions, providing a pathological perspective complementary to spatial atrophy patterns.