|Home | About | Journals | Submit | Contact Us | Français|
Recently, resting-state functional magnetic resonance imaging (R-fMRI) has emerged as a powerful tool for investigating functional brain organization changes in a variety of neurological and psychiatric disorders. However, the current techniques may need further development to better define the reference brain networks for quantifying the functional connectivity differences between normal and diseased subject groups. In this study, we introduced a new clustering-based method that can clearly define the reference clusters. By employing group difference information to guide the clustering, the voxels within the reference clusters will have homogeneous functional connectivity changes above predefined levels. This method identified functional clusters that were significantly different between the amnestic mild cognitively impaired (aMCI) and age-matched cognitively normal (CN) subjects. The results indicated that the distribution of the clusters and their functionally disconnected regions resembled the altered memory network regions previously identified in task fMRI studies. In conclusion, the new clustering method provides an advanced approach for studying functional brain organization changes associated with brain diseases.
Recently, resting-state functional magnetic resonance imaging (R-fMRI) has emerged as a powerful tool for investigating normal human brain functional organization (Biswal et al., 2010; Buckner and Vincent, 2007) and its changes in a variety of neurological and psychiatric disorders (Chen et al., 2011; Fox and Greicius, 2010; Fox and Raichle, 2007; Li et al., 2012; Rosen and Napadow, 2011; Xie et al., 2011a, b). Despite these advances, extracting valuable information hidden in the gigabytes of four dimensional images of hundreds to thousands of subjects remains a challenge (Biswal et al., 2010; The ADHD-200 Global Competition 2011).
R-fMRI data analyses are commonly categorized into preprocessing and postprocessing steps. While numerous studies have been conducted to optimize R-fMRI preprocessing steps (Chang and Glover, 2009; Chen et al., 2012; Fox et al., 2009; Weissenbacher et al., 2009), developing the optimum techniques for R-fMRI postprocessing data analysis remains a challenge (for a recent review of the procedure and limitations of the current techniques see: Margulies et al., 2010). For example, in independent component analysis (ICA) or conventional clustering approaches in R-fMRI studies, independent components (ICs) or clusters identified using subject groups with different brain statuses, are not identical (Tyszka et al., 2011; Schrouff et al., 2011; Helekar et al., 2010). The spatial distribution and even the total numbers of ICs or clusters are usually not exactly matched. As a result, it is not clear how the unmatched ICs or clusters should be chosen as reference ICs or clusters to quantify the functional connectivity differences. There is a way to define a single set of reference ICs or clusters, using data that combine normal and diseased subjects. However, if a normal IC or cluster has disappeared in the diseased group because of the effect of the disease, it is possible that this IC or cluster will not be identified using the combined data. As a result, this approach may not be sensitive in detecting functional connectivity differences. There is a need to develop the current techniques to better detect functional connectivity differences between groups.
In this study, we introduced a new clustering-based method that clearly defines the clusters to quantify the functional connectivity differences. The new method first produces group difference information, and then, uses the information to guide the clustering. As a result, there only will be one set of clearly defined reference clusters for the entire cohort. In addition, using the group difference information to guide the clustering ensures that the voxels within the reference clusters will have homogeneous functional connectivity changes above predefined levels.
In practice, this new method can be applied to various research interests. The clustering can be guided by the functional connectivity differences between healthy and diseased states, as demonstrated below. It also can be guided by the correlation between functional connectivity and behavior assessments or functional connectivity improvement and an outcome measure responding to a treatment, etc.
Seventeen amnestic mild cognitively impaired (aMCI) and 22 cognitively normal (CN) subjects were recruited through the Memory Disorders Clinic at the Medical College of Wisconsin (MCW). The study was conducted with MCW Institutional Review Board (IRB) approval and in compliance with Health Insurance Portability and Accountability Act (HIPAA) regulations. Written informed consent was obtained from each participant or his or her caregiver. The detailed inclusion and exclusion criteria for the two subject groups (aMCI and CN) have been described previously (Goveas et al., 2011; Xie et al., 2011a). Detailed demographic information related to these subjects is listed in Table I.
Imaging was performed using a whole-body 3T Signa GE scanner with a standard transmit-receive head coil. During the resting-state acquisitions, no specific cognitive tasks were performed, and the study participants were instructed to close their eyes and relax inside the scanner. Sagittal resting-state functional MRI (fMRI) datasets of the whole brain were obtained in eight minutes with a single-shot gradient echo-planar imaging (EPI) pulse sequence. The fMRI imaging parameters were: TE of 25 ms, TR of 2 s, flip angle of 90°; 36 slices were obtained without gap; slice thickness was 4 mm with a matrix size of 64×64 and field of view of 24×24 cm. High-resolution SPGR 3D axial images were acquired for anatomical reference. The parameters were: TE/TR/TI of 4/10/450 ms, flip angle of 12°, number of slices of 144, slice thickness of 1 mm, matrix size of 256×192. A pulse oxymeter and a respiratory belt measured physiological noise sources to minimize the potential artifacts in the low-frequency spectrum (Birn et al., 2006; Glover et al., 2000).
A series of preprocessing steps common to most fMRI analysis was conducted, using Analysis of Functional NeuroImages (AFNI) software (http://afni.nimh.nih.gov/afni/), SPM8 (Wellcome Trust, London, UK) and Matlab (Mathworks, Natick, MA). The preprocessing allows for T1-equilibration (removal of the first five volumes of fMRI data). It also involves cardiac and respiratory artifacts removal (3dretroicor, Birn et al., 2006; Glover et al., 2000); slice-acquisition-dependent time shift correction (3dTshift); motion correction (3dvolreg); detrending (3dDetrend); despiking (3dDespike); segmentation (SPM8) of white matter and CSF components from each subject’s SPGR image; white matter and CSF mask creation (3dcalc and 3dfractionize); averaged white matter and CSF signal retrieval (3dROIstats); white matter, CSF signal and motion effect removal (3dDeconvolve); low-frequency band-pass filtering (3dFourier); and spatial normalization (original space to Talairach space, adwarp; the anat parent dataset in Talairach space used in adwarp was obtained using @auto_tlrc and the SPGR image in original space). The result was the preprocessed R-fMRI time courses.
Functional connectivity difference information between the aMCI and CN groups was used to produce voxelwise clusters. The analysis excluded motor regions and occipital cortex regions, and was applied to the remaining 10,988 voxels of the cortical and subcortical brain regions. The functional connectivity of each voxel pair was calculated from the corresponding preprocessed R-fMRI time courses, using Pearson product-moment correlation coefficient (r). The result was a 10988 x 10988 r-matrix for each subject (Figure 1A, left). The Fisher transformation (m=0.5ln[(1+r)/(1-r)], Zar, 1996), which yielded variants of approximately normal distribution, was applied to the individual r-matrix to generate the m-matrix. The functional connectivity difference was calculated, using the m-matrices and one-tailed two-sample t-test. Specifically, let mij(aMCI1,…,aMCIn) and mij(CN1,…,CNn) be the (i,j)-th element of the m-matrices for aMCI and CN subjects. A one-tailed two-sample t-test was conducted to examine whether mij of the aMCI group was significantly less than that of the CN group. Such a test is repeated for all elements of the m-matrix, yielding 10988 x 10988 p-values. The one-tailed test was employed because we only considered the reduced connectivity in this study to demonstrate the method. The p-values were transformed to Z-values, using the inverse of the normal cumulative distribution function. The result was a 10988 x 10988 Z-matrix. The Z-value was then thresholded at Z<−1.96 (p<0.025, uncorrected for multiple comparisons) to keep the pairs that showed reduced connectivity when the aMCI group is compared to the CN group. Because of the problems of executing multiple comparisons, each voxel is expected to have a reduced connectivity to 2.5% (10988 x 2.5% ≈ 275) of voxels by chance. To reduce computation cost and the chance of false discovery, voxels with reduced connectivity to less than 275 voxels were removed from further analysis. As a result, 4,442 voxels were retained. The retained thresholded Z-matrix is shown in Figure 1A (right). Note that each voxel has 4,442 thresholded Z-values, corresponding to a particular row in the Z-matrix. We call each row of 4,442 thresholded Z-values the “Z-vector.”
The hierarchical relationship of the 4,442 voxels (Figure 1B) was obtained based on the similarity of their Z-vectors. The clustering analysis was implemented, using the functions “linkage” and “dendrogram” in the MATLAB Statistics Toolbox. The 4,442 voxels’ Z-vectors were sent to the “linkage” function as input. The “correlation” method was used in the “linkage” function to calculate the distance (d) between voxels. Specifically,
where, rij is the correlation coefficient between the Z-vectors belonging to voxel i and voxel j. Obviously, the more similar the Z-vectors are, the less distance voxels i and j will have. A detailed description of the hierarchical clustering method can be found in the reference (Friedman et al., 2009).
Each level of the hierarchy represents a particular grouping of the voxels into a unique cluster (Figure 1C). To determine the functional homogeneity of a cluster, a homogeneity index was defined:
where rij is the correlation coefficient between voxel i’s Z-vector and the cluster’s Z-vector (defined as the average Z-vector of all the voxels in the cluster). Figure 1D (left) shows the posterior cingulate cluster with homogeneity threshold of 0.2, for an example. The colored pixels in Figure 1D (right) shows where the Z values are less than −1.96 (p<0.025, uncorrected).
To account for false-positive findings, a list of nine homogeneity thresholds (0.1, 0.2, …, 0.9) and nine corresponding cluster size thresholds (minimum number of voxels in a true-positive cluster) were used. To compensate for the multiple comparisons of the nine thresholds, a false-discovery rate of p<0.05 (corrected) was employed (FDR, Benjamini et al., 2006; Benjamini and Hochberg, 1995), which required a p<0.047 for an individual cluster size threshold (uncorrected). The individual cluster size threshold was set so that there was less than a 4.7% chance (uncorrected) to identify a cluster that satisfied all of the following conditions in randomly grouped subjects: 1) the voxels in the cluster were all spatially connected; 2) the number of voxels in the cluster was more than the cluster size threshold; 3) the homogeneity index of the cluster was above the corresponding homogeneity threshold.
To obtain the cluster size thresholds, a subject permutation test was performed. Specifically, the clustering procedure was applied to 100 iterations of randomly grouped subjects. Since there were 17 aMCI subjects and 22 CN subjects, we assigned 17 subjects in pseudo group A and 22 subjects in pseudo group B. To minimize the difference between pseudo groups, the CN to aMCI subject number ratio (22/17≈1.3) was kept in both pseudo groups. Clearly, the numbers of subjects have to be integers (i.e., we cannot assign 9.6 CN / 7.4 aMCI to pseudo group A and assign 12.4 CN / 9.6 aMCI to pseudo group B). To achieve the desired ratio on average, the pseudo groups were created, as described below. In 59 iterations, 10 randomly selected CN and seven randomly selected aMCI subjects were assigned to pseudo group A, and the remaining 12 CN and 10 aMCI subjects were assigned to pseudo group B. In the remaining 41 iterations, nine randomly selected CN and eight randomly selected aMCI subjects were assigned to pseudo group A, and the remaining 13 CN and nine aMCI subjects were assigned to pseudo group B. As a result, the average ratio of CN to aMCI subject number is ((10/7)*59+(9/8)*41)/100 ≈ 1.3 in pseudo group A. The average ratio of CN to aMCI subject number is ((12/10)*59+(13/9)*41)/100 ≈ 1.3 in pseudo group B. Any spatially connected cluster identified, using the pseudo groups and the homogeneity thresholds, was considered a false-positive finding. The false discovery rate was calculated at voxel size threshold 1, 2, …, 400. The cluster size threshold that yielded 4.7% false-positive findings was selected for each homogeneity threshold by linear interpolation. The subject permutation test was repeated four times. The average cluster size thresholds were employed to define the reference cluster. It is possible that a bigger reference cluster may contain smaller subclusters that also satisfy the cluster criteria. In such cases, the smaller subclusters are considered part of the bigger cluster. The proposed method discussed above is fairly complex. A method flowchart was created to aid readers. Please see Supplementary Figure 1 for details.
A connectivity index was defined to quantify the decreased connections in each subject. To avoid double-dipping or bias from circularity, we performed the leave-one-out procedure. Specifically, the clusters were determined by using all subjects except the one to be evaluated. The thresholded cluster Z-matrices (Figure 1D, right, for example) were used to create decreased functional connectivity masks for all Z-values < −1.96 (p<0.025). Then, all m-values in the masks of the excluded subject, were averaged. The averaged m-value was defined as the subject’s connectivity index. This entire process was repeated for each subject, thereby yielding one connectivity index per subject for cross-validation.
Figure 2 shows the cluster size threshold (mean ± standard deviation) at 5% false positive discovery rate obtained from the permutation tests. The curve was used to determine spatially connected clusters with both larger cluster size and higher homogeneity index than the values on the curve.
Based on the false-positive discovery rate curve in Figure 2 and the functional connectivity difference information between aMCI and CN groups, three clusters above the curve were identified, as demonstrated in Figure 3. The three clusters were identified by their anatomical locations. Specifically, they are the posterior cingulate cluster (Figure 3A), the thalamus cluster (B) and the retrosplenial cingulate cluster (C). The voxels in each of the three clusters are all spatially connected. The homogeneity indices of these clusters are 0.221, 0.202 and 0.258, respectively. The cluster sizes (number of voxels in a cluster) of the three clusters are 93, 74 and 71, respectively.
To avoid potential circularity (define the reference clusters, using group difference information, and then, detect group difference, using the same clusters as “seed” regions), we did not use the clusters as seeds for further group comparisons. Rather, we directly employed the group difference in Figure 1D, for example, to visualize the decreased connectivity maps. Figure 4 shows the decreased connectivity patterns for each of the three clusters. The color bar indicates the disconnection percentage. For example, light blue indicates that a voxel is disconnected from a corresponding cluster by 75% to 100%. The percentage was obtained by the number of voxels whose Z-values were less than −1.96, divided by total number of voxels in the cluster. Most regions showing lower connectivity to the three clusters resembled the altered memory network regions (Machulda et al., 2009).
Figure 5 shows the connectivity indices of the 17 aMCI and 22 CN subjects. These were calculated using the leave-one-out procedure. The connectivity indices of the aMCI and CN group were compared using a one-tailed two-sample t-test. The connectivity indices of the aMCI group are significantly lower than those of the CN group (p<0.013).
We introduced a new clustering-based method that can clearly define the reference clusters. Using this method, clusters with homogeneous functional connectivity changes between the aMCI and CN groups were identified. The distribution of these clusters, as well as their disconnected regions (the areas that had decreased connectivity to these clusters), resembled the altered memory network regions identified in task-fMRI studies. Since intrinsic functional architecture may predict task-induced BOLD activity (Mennes et al., 2010; Vincent et al., 2007), the results may provide a strong physiological basis that can explain the memory task performance differences in aMCI and CN subjects.
The clustering method can advance our knowledge about the complex relationship among diseases, behavior symptoms and the neural networks. As noted, the functional connectivity difference between healthy and diseased states can guide clustering, as demonstrated here. It also can be guided by the correlation between functional connectivity and behavior assessments or the correlation between functional connectivity improvement and an outcome measure responding to a treatment, etc. Furthermore, the method may be generalized to multiple groups to detect functional connectivity differences among subjects with Alzheimer’s disease, aMCI and CN status.
A list of nine homogeneity thresholds (0.1, 0.2, …, 0.9) were used to define clusters that were functionally homogeneous above each threshold to avoid potential bias by choosing a single-homogeneity threshold. The threshold-free cluster enhancement (TFCE, Smith and Nichols, 2009) method may prevent the problem of choosing thresholds. Although the TFCE method requires height and extent parameters optimization, it may provide a way to simplify the clustering method in future studies.
To detect group differences, the Fisher transformation (Zar, 1996) and the two-sample t-tests were employed. The approach is only optimal when the data are perfectly Gaussian after the transformation. If they are not, other tests (e.g., the rank sum test) may yield higher power. To further validate the findings of the current study, we performed the rank sum test on the whole data set. The identified clusters were the same. However, the computation time was 25 times longer. Therefore, we did not use the rank sum test in the leave-one-out procedure or in the subject permutation test.
The proposed method has limitations. First, it increases the computation cost. In ICA or conventional clustering methods using time course similarity, the data size is proportional to nV (#voxels) x nT (#time points). In this proposed method, the data size to be analyzed is proportional to nV x nV. This can easily increase the computation cost by the order of hundreds (nV / nT). With the ever-increasing computation power, this difficulty can be alleviated. Still, it may affect the feasibility of the method. Second, the method may only be sensitive to pairwise-type correlations. The connectivity differences may be too small to be detected at this level. They may be more discernible if one uses the full multivariate information, as in the case with the ICA. However, the number of detectable ICs also depends on the data (Smith et al., 2009). Similarly, different data-driven approaches may fail to detect connectivity differences at some level. These are important considerations that deserve a full-length study in the future.
In conclusion, this clustering-based method clearly defines the reference clusters for detecting functional connectivity differences between aMCI and CN subjects. The functional connectivity changes of all voxels within the reference clusters are guaranteed to be homogeneous above desired levels, which is not warranted in the current available methods. The distribution of the reference clusters, as well as their disconnected regions, resembled the altered memory network regions identified in task-fMRI studies. The connectivity indices obtained from the leave-one-out analysis were significantly different between aMCI and CN subjects. The method has the potential to identify brain connectivity biomarkers, which can be used to classify disease status, predict individual behavioral performance, and monitor the efficacy of disease-modifying therapies.
The authors thank Carrie M. O’Connor, M.A., for editorial assistance, Judi Zaferos-Pylant, B.S.M., and Yu Liu, M.S., for MRI technical support. This work was supported by National Institutes of Health grants: NIH R01 AG20279, NIH R01 DA10214, NIH-NCRR CTSA program grant 1UL1RR031973.
Disclosure Statement: No authors of this paper have reported any possible conflict of interests.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.