To develop a multivariate machine learning classification-based cerebral blood flow (CBF) quantification method for arterial spin labeling (ASL) perfusion MRI.
The label and control images of ASL MRI were separated using a machine-learning algorithm, the support vector machine (SVM). The perfusion-weighted image was subsequently extracted from the multivariate (all-voxel) SVM classifier. Using the same preprocessing steps, the proposed method was compared to the standard ASL CBF quantification method using synthetic data and in-vivo ASL images.
Compared to the conventional univariate approach, the proposed ASL CBF quantification method significantly improved the spatial signal-to-noise ratio (SNR) and image appearance of ASL CBF maps.
The multivariate machine learning-based classification is useful for ASL CBF quantification.
Arterial spin labeled (ASL) perfusion MRI is a noninvasive technology that allows absolute quantification of blood flow (1,2). In ASL, inflowing arterial blood water is magnetically labeled proximal to the tissue of interest, and perfusion is determined by pair-wise comparison with separate images acquired with control labeling, using various subtraction approaches (3–5). While ASL MRI has been widely used for assessing baseline CBF by averaging the perfusion signal acquired in a series of ASL acquisitions, it has also been used to assess dynamic brain function, especially in experiments with long-duration designs for which standard blood-oxygen-level-dependent (BOLD) contrast-based fMRI is hampered by low-frequency MRI signal drift (3,6). However, ASL MRI has a relatively low signal-to-noise ratio (SNR) because only a small fraction of tissue water is labeled (7), posing a major challenge for postprocessing. Several signal processing approaches have been proposed to improve either the spatial SNR or the temporal stability of the ASL perfusion signal using filtering (5,8–11), noise regression or deconvolution (12–16), or wavelet denoising (11,17). Regardless of which method is used, ASL CBF quantification still relies on pairwise subtraction of the spin-labeled images and control images at each voxel in a univariate manner. By treating each voxel as an independent unit, however, the univariate approach ignores the abundant spatial correlations existing in ASL MRI due to the systematic labeling and control labeling (L/C) signal intensity modulation. Those correlations can be exploited to improve CBF quantification (ASLQ) quality. For example, spatial smoothing (9) can partly take the spatial correlation into account, but at the expense of decreased spatial resolution, which may also suppress small activations. To better exploit the spatial correlations, a multivariate method is desired.
Another unique ASL MRI feature that can be exploited in postprocessing is that control-versus-label signal difference-based CBF quantification mimics a standard two-category data classification problem that can be solved with a state-of-the-art automatic classification algorithm.
Based on the two features described above, we proposed to apply a multivariate machine-learning algorithm, the support vector machine (SVM) (18), to ASL CBF quantification (SVMASLQ). SVM is a powerful classification algorithm (18) that has been increasingly used in fMRI data analysis to search for spatial patterns that can maximally differentiate the compared brain states or the compared populations (19–27). In this study, SVM was used to derive an optimal classifier for differentiating the L/C (label and control) images and subsequently calculate ASL CBF difference from the optimal separating hyperplane of the trained classifier.
The MRI scan protocol was approved by the local IRB. Thirteen young healthy subjects (mean age = 25.04±3.92 years, 7 male) were scanned after providing signed written consent. Two patients' data were identified from our existing database.
MR imaging was conducted on a Siemens 3T Trio Tim whole-body scanner (Siemens Medical Systems, Erlangen, Germany). High-resolution structural images were acquired for spatial brain normalization using a 3D MPRAGE sequence (TR/TE/TI = 1620/3/950 ms). ASL images were acquired using an amplitude-modulated continuous ASL (CASL) perfusion imaging sequence optimized for 3.0 T (28) with a standard transmit/receive (Tx/Rx) head coil (Bruker BioSpin, USA). The head coil and foam pads were positioned carefully to reduce movement. Acquisition parameters were: TR = 3.8 s, TE = 17 ms, FOV = 220×220 mm², matrix = 64×64×12, slice thickness = 7 mm, inter-slice gap = 2.35 mm, labeling time = 2 s, post-labeling delay = 1 s, bandwidth = 3 kHz/pixel, flip angle = 90°. Fifty label/control image pairs were acquired for each subject. Participants were asked to lie still in the scanner at rest and to keep their eyes open.
All ASL data processing was performed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm)-based batch scripts (29). Head motion was corrected using an ASL MRI-customized motion correction routine (14). ASL images were then smoothed with an isotropic Gaussian filter (FWHM = 5 mm) and filtered with a high-pass Butterworth filter (cutoff frequency = 0.01 Hz). An image mask was generated by thresholding the mean image at 20 percent of its maximum. This mask was used to remove most extracranial voxels for SVM data classification and to calculate the global signal. Residual motion and the global signal were regressed out from the ASL image series (14). The component-based physiological nuisance correction (CompCor) method (12) was used to suppress physiological noise estimated from cerebrospinal fluid (CSF). A temporal standard deviation (TSTD) map was calculated from the temporally filtered and spatially smoothed data. Image data from voxels within the top 2% of TSTD were grouped into a matrix and decomposed using singular value decomposition (30). The first 6 principal components were projected out from the raw ASL images using a linear regression model. To avoid removing ASL signal, the 6 nuisance components were orthogonalized to the zig-zagged spin labeling function before entering the regression model.
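The orthogonalization and nuisance-regression steps above can be sketched as follows. This is an illustrative numpy reimplementation, not the SPM8 batch code used in the study; function names and shapes are ours:

```python
import numpy as np

def orthogonalize_to_labeling(nuisance, labeling):
    """Remove from each nuisance regressor the component collinear with the
    zig-zag spin labeling function, so that the subsequent nuisance
    regression cannot remove perfusion (label/control) signal.
    nuisance: (n_timepoints, n_components); labeling: (n_timepoints,)."""
    u = labeling / np.linalg.norm(labeling)
    return nuisance - np.outer(u, u @ nuisance)

def regress_out(data, nuisance):
    """Project the nuisance time courses out of every voxel time course
    via least squares.  data: (n_timepoints, n_voxels)."""
    beta, *_ = np.linalg.lstsq(nuisance, data, rcond=None)
    return data - nuisance @ beta

# toy demo: 100 frames alternating label (-1) / control (+1), 6 components
labeling = np.tile([-1.0, 1.0], 50)
nuisance = np.random.default_rng(1).normal(size=(100, 6))
nuisance_orth = orthogonalize_to_labeling(nuisance, labeling)
data = 100.0 + np.random.default_rng(2).normal(size=(100, 4))
cleaned = regress_out(data, nuisance_orth)
```

After orthogonalization, the nuisance regressors carry no component along the labeling function, so regressing them out leaves the alternating perfusion modulation untouched.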
After preprocessing, CBF was derived using both the standard univariate subtraction-based method and SVMASLQ. Denoting the label images by L and the control images by C, univariate voxel-wise subtraction-based ASL CBF quantification is equivalent to finding the mean of all L signals (the “mL” spot in Fig. 1) and the mean of all C signals (the “mC” spot in Fig. 1) at each voxel dimension and then taking their difference. In SVMASLQ, the perfusion difference signal is derived from the separating hyperplane, which is determined by all dimensions of the support vectors. The direction of the hyperplane normal is not necessarily parallel to the one obtained by the univariate approach, but is rather the one that maximally represents the global L/C difference. Voxels with weak L/C difference are partly enhanced by this separating-vector direction tuning process. For example, if the L/C difference along the horizontal direction in Fig. 1 is close to 0, the direction of the univariate L/C difference image vector will be determined mainly by the L/C difference along the vertical direction. Using SVM, the direction of the L/C difference vector will be partly tilted toward the horizontal axis, meaning that the projection of the L/C difference vector on the horizontal axis will be increased.
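For concreteness, the univariate baseline amounts to a per-voxel mean difference; a minimal numpy sketch (array shapes and variable names are ours):

```python
import numpy as np

def univariate_perfusion(control_imgs, label_imgs):
    """Standard univariate ASL quantification: at each voxel, the
    perfusion-weighted difference is mean(control) - mean(label)
    (the mC and mL spots of Fig. 1), with every voxel treated
    independently.  Inputs: (n_pairs, n_voxels) arrays."""
    return control_imgs.mean(axis=0) - label_imgs.mean(axis=0)

# toy demo: 50 pairs, 4 voxels, true perfusion-weighted difference = 1.0
rng = np.random.default_rng(0)
control = 100.0 + rng.normal(0.0, 0.5, size=(50, 4))
label = control - 1.0 + rng.normal(0.0, 0.5, size=(50, 4))
dm = univariate_perfusion(control, label)
```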
We used a linear SVM for SVMASLQ in this paper. Similar to the discussion given in (23), ASL images are most probably aligned along a certain direction in the hyperspace (a hyperplane parallel to the first primary eigenvector of the image samples). Given the systematic labeling/control-labeling-induced signal changes, the primary direction of the label images (a hyperplane parallel to their first primary eigenvector) within the hyperspace should be similar to that of the control images. Because there are usually only a limited number of images acquired in ASL MRI, these two primary hyperplanes can be approximately treated as parallel, and a linear hyperplane can then be found to separate the two groups of data with high confidence. In the case of highly varied spin labeling or large baseline signal variations that cannot be completely corrected in preprocessing, a nonlinear SVM might be required to classify the ASL images, but the inverse transform from the high-dimensional feature space back to the original image space is usually intractable.
As illustrated in Fig. 1, a linear SVM for ASLQ derives a hyperplane that maximally separates the L and C images. The separating vector w (see Fig. 1) of the hyperplane captures the maximal discrimination between the control and label images, which is proportional to the underlying perfusion difference signal but is estimated in a multivariate way, since each L or C image is manipulated as a single data entity during the process of finding the optimal separating hyperplane. By fitting w into the CBF calculation model, we obtain the final SVM-derived CBF map. In this paper, SVM classifications were performed using the SVMlight package (31). The SVM-based ASL data classification procedures were directly adopted from previous work on fMRI data analysis (23,24).
Fig. 2 illustrates the entire SVMASLQ process. For each subject, the intracranial voxels of each preprocessed ASL image were stacked into a column vector, and the whole time series was formed into a large 2-dimensional data array. The column vectors were subsequently zero-meaned and projected into the space spanned by all eigenvectors associated with nonzero eigenvalues, generating the coefficient vectors that form the columns of the coefficient array. The coefficient vectors of the L and C images (L means labeling, C means control labeling) were marked with −1 or 1, respectively. Similar to the inversion process in (23), the separating weight W of the SVM classifier was transformed back to the native image space and used as the surrogate perfusion map ΔM = EW (E is the PCA decomposition matrix). For CASL data, CBF was calculated with
f = λ·ΔM·R1a / {2α·M0·[exp(−ω·R1a) − exp(−(τ+ω)·R1a)]}

where f is CBF, ΔM is the surrogate perfusion difference map, M0 is the equilibrium magnetization image, R1a is the longitudinal relaxation rate of blood, τ is the labeling time, ω is the post-labeling delay time, α is the labeling efficiency, and λ is the blood/tissue water partition coefficient (6).
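Putting the classification and quantification steps together, a compact sketch is given below. Scikit-learn's linear SVC stands in for the SVMlight package actually used, and the CBF parameter defaults are illustrative values from the general CASL literature, not necessarily those of this study:

```python
import numpy as np
from sklearn.svm import SVC

def svm_aslq(images, labels, C=1.0):
    """SVMASLQ sketch: zero-mean and PCA-reduce the ASL image series, train
    a linear SVM to separate label (-1) from control (+1) images, then
    back-project the separating weight W to image space as the surrogate
    perfusion map dM = E @ W (E: PCA eigenvector matrix).
    images: (n_images, n_voxels); labels: (n_images,) in {-1, +1}."""
    X = images - images.mean(axis=0)
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    E = Vt[s > 1e-10 * s[0]].T          # keep nonzero-eigenvalue components
    clf = SVC(kernel="linear", C=C).fit(X @ E, labels)
    return E @ clf.coef_.ravel()        # dM = E @ W

def casl_cbf(dm, m0, r1a=0.601, tau=2.0, omega=1.0, alpha=0.68, lam=0.9):
    """Single-compartment CASL model:
    f = lam*dm*r1a / (2*alpha*m0*(exp(-omega*r1a) - exp(-(tau+omega)*r1a))).
    All default parameter values are assumptions for illustration."""
    scale = 2.0 * alpha * m0 * (np.exp(-omega * r1a)
                                - np.exp(-(tau + omega) * r1a))
    return lam * dm * r1a / scale

# toy demo: 40 images, 50 voxels, perfusion-weighted signal in first 10 voxels
rng = np.random.default_rng(0)
dm_true = np.zeros(50)
dm_true[:10] = 1.0
y = np.tile([1.0, -1.0], 20)            # +1 = control, -1 = label
imgs = (100.0 + rng.normal(0.0, 0.2, size=(40, 50))
        + np.outer((y + 1.0) / 2.0, dm_true))
dm_map = svm_aslq(imgs, y)
cbf_map = casl_cbf(dm_map, m0=imgs.mean(axis=0))
```

Because the SVM weight vector has arbitrary scale, the recovered map should be read as a perfusion-weighted direction; fitting it into the CBF model as in the text yields the final map up to that scaling.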
Synthetic ASL data were generated using the baseline MR image in Fig. 3A and the synthetic perfusion difference image in Fig. 3B. Both images were first replicated 50 times. Temporal MR signal fluctuations were simulated with 1/f noise and added to the 50 baseline MR images to create the pseudo control images. Both 1/f noise and Gaussian noise were added to the 50 perfusion image series, which were subsequently subtracted from the pseudo control images to create the pseudo label image series. The label and control images were then combined to form a series of 100 ASL L/C images.
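The simulation scheme can be sketched as follows; this is a simplified stand-in (our 1/f generator shapes white noise in the frequency domain, and all names and noise levels are illustrative):

```python
import numpy as np

def one_over_f_noise(n, rng):
    """Approximate 1/f noise: shape white noise by 1/sqrt(f) in the
    frequency domain, then zero-mean and unit-variance normalize."""
    spec = np.fft.rfft(rng.normal(size=n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                  # avoid division by zero at DC
    series = np.fft.irfft(spec / np.sqrt(freqs), n=n)
    series -= series.mean()
    return series / series.std()

def synth_asl_series(baseline, perfusion, n_pairs=50, sigma=0.5, seed=0):
    """Pseudo control = baseline + temporal 1/f fluctuation; pseudo label =
    control - (perfusion + 1/f noise + Gaussian noise), following the
    simulation description.  baseline, perfusion: (n_voxels,) images."""
    rng = np.random.default_rng(seed)
    n_vox = baseline.size
    drift_c = np.stack([one_over_f_noise(n_pairs, rng)
                        for _ in range(n_vox)], axis=1)
    drift_p = np.stack([one_over_f_noise(n_pairs, rng)
                        for _ in range(n_vox)], axis=1)
    control = baseline[None, :] + drift_c
    perf = perfusion[None, :] + drift_p + rng.normal(0.0, sigma,
                                                     (n_pairs, n_vox))
    label = control - perf
    return control, label

# toy demo: flat 8-voxel baseline of 100 with unit perfusion difference
control, label = synth_asl_series(100.0 * np.ones(8), np.ones(8))
```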
SVMASLQ was compared to the standard univariate quantification method using both synthetic and in-vivo data. SNR was calculated as the ratio of the mean to the standard deviation of the intracranial perfusion difference signal. Two clinical ASL datasets were also selected from an existing database to visually demonstrate the quality improvement achieved by SVMASLQ. Patient 1 had Moyamoya syndrome and a stroke in the right hemisphere. The post-labeling delay in that ASL acquisition was 2 s, which is long enough for normal brain tissue but might not be sufficient for regions affected by stroke, further reducing SNR and perfusion image quality. Patient 2 had Alzheimer's disease, which is associated with reduced CBF compared to normal subjects, again resulting in reduced SNR and image quality.
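The SNR metric used for the comparison is simply the spatial mean over intracranial voxels divided by their standard deviation; a short helper for clarity (names and numbers are ours):

```python
import numpy as np

def spatial_snr(perf_map, mask):
    """Spatial SNR of a perfusion map: mean / standard deviation over the
    intracranial (mask == True) voxels."""
    vals = perf_map[mask]
    return vals.mean() / vals.std()

# toy demo: three intracranial voxels around 10, one extracranial voxel
snr = spatial_snr(np.array([10.0, 10.5, 9.5, 0.0]),
                  np.array([True, True, True, False]))
```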
Fig. 3 shows the CBF quantification results for the synthetic data. The CBF map produced by SVMASLQ (Fig. 3C) showed less noise (SNR = 3.42) than that produced by the standard univariate pairwise subtraction approach (Fig. 3D, SNR = 3.32) in both grey matter and white matter (more obviously in white matter).
Fig. 4 shows the SNR of the perfusion maps derived using the conventional univariate pairwise subtraction approach and SVMASLQ. For both sessions, SVMASLQ significantly improved the SNR of the whole-brain perfusion signal (two-tailed paired t-test, p = 0.00041 and 0.00072 for sessions 1 and 2, respectively). Fig. 5 shows CBF maps of the two patients. As shown in the 1st row of Fig. 5, the standard univariate method failed to estimate CBF in large portions of the stroke patient's right hemisphere, whereas SVMASLQ recovered perfusion signal in those regions as well as in other areas marked by green ovals (2nd row of Fig. 5). As shown in the 3rd and 4th rows of Fig. 5, SVMASLQ also improved CBF quantification quality in several brain regions marked by green ovals.
This paper introduced a novel ASL CBF quantification method based on multivariate machine learning. Rather than deriving the perfusion signal by pairwise L/C image subtraction, the proposed SVMASLQ determines the ASL perfusion signal of the entire brain from spin labeling paradigm-guided image classification. Both simulations and in-vivo data were used to evaluate the proposed method. Compared to the standard voxel-wise L/C subtraction-based approach, SVMASLQ improved spatial SNR for both the synthetic ASL data and the CBF maps from two scan sessions obtained 2 months apart. In two clinical ASL datasets, SVMASLQ showed superior CBF quantification quality compared to the standard univariate approach. The performance enhancement obtained with SVMASLQ likely reflects the automatic incorporation of spatial correlations during the whole-image-based SVM classification. Rather than calculating the perfusion signal at each voxel separately as in the standard univariate approach, SVMASLQ derives the perfusion signal of all voxels simultaneously. In other words, the perfusion-weighted SVM separating hyperplane is determined by the overall image difference between the L and C images across all voxels, which in turn introduces a global guidance for perfusion calculation at each local voxel, so that voxels with lower SNR are enhanced by those with higher SNR.
Several denoising methods have been proposed for ASL CBF quantification, including spatial smoothing (9), temporal filtering (3,10,12,14), wavelet denoising (17), independent component analysis (ICA) (11), and PCA (32). Spatial smoothing and temporal filtering are not mutually exclusive with SVMASLQ and were used as preprocessing steps in this paper. The latter three are multivariate. Wavelet denoising could also be applied to the mean CBF map to further reduce noise, although we did not do so in this report. ICA denoising requires visual inspection to determine the noise components, which may vary substantially across subjects and is not practical for routine use. Moreover, the perfusion signal might be split across different components depending on the number of components specified a priori in the algorithm, and those components might subsequently be treated as noise. PCA has been used for ASL denoising by our group (32), but it faces a similar problem in identifying the noisy components. Nevertheless, it would be interesting to compare various multivariate methods for ASL CBF quantification in future studies.
While SVMASLQ was used here to improve the quality of the mean CBF map, it can indeed be extended to ASL fMRI analysis, especially resting-state fMRI analysis. As we have shown in (33), the distance of each L or C image to the separating hyperplane reflects the spin labeling fluctuations. After being orthogonalized to the boxcar spin labeling function (regressing out the boxcar function), the residuals can be regressed out from the L/C time series to temporally stabilize the perfusion time courses and subsequently improve the temporal SNR.
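A sketch of this extension is given below, assuming the per-image hyperplane distances have already been computed; the helper and the toy numbers are hypothetical:

```python
import numpy as np

def stabilize_series(images, distances, boxcar):
    """Regress residual labeling fluctuations out of the ASL time series:
    the per-image distances to the SVM separating hyperplane are first
    orthogonalized to the ideal boxcar labeling function, and the residual
    is then regressed out of every voxel time course.
    images: (n_timepoints, n_voxels); distances, boxcar: (n_timepoints,)."""
    b = boxcar / np.linalg.norm(boxcar)
    resid = distances - b * (b @ distances)   # labeling fluctuation residual
    beta, *_ = np.linalg.lstsq(resid[:, None], images, rcond=None)
    return images - resid[:, None] @ beta

# toy demo: a slow drift contaminates both the distances and the images
boxcar = np.tile([1.0, -1.0], 25)
drift = np.linspace(-1.0, 1.0, 50)
distances = 2.0 * boxcar + drift
images = 100.0 + np.outer(drift, np.ones(3))
stabilized = stabilize_series(images, distances, boxcar)
```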
Two caveats need to be considered in SVMASLQ. First, the separating hyperplane may represent only part of the information contained in the original data, because the number of support vectors is generally much smaller than the number of data points (34). As discussed above, ASL adds a strong signal intensity modulation to the entire brain tissue. Therefore, the label and control images present a systematic global difference and most likely align with the two parallel boundary hyperplanes, yielding more support vectors than in other classification settings. In our data, the ratio of the number of support vectors to the total number of images was greater than 30%. Moreover, we used a PCA decomposition before SVM classification; the eigenvectors of the original data capture information from all samples. As a result, the final surrogate perfusion signal after back-projection to the original image space does contain information from all data samples. Second, the parameter selection for SVM can affect CBF quantification. Linear SVM uses a parameter C to control the trade-off between classifier complexity and prediction accuracy, and accuracy would decrease for small C (35). Although the data presented in this paper were based on C = 1, we tested a range of C values from 0.01 to 100 and found very similar results (not shown). One reason for this apparent insensitivity to parameter selection could be that spin labeling represents a global signal modulation, which causes a systematic global signal difference between the label and control images. That systematic difference is easily captured by the multivariate SVM classification, so the classification accuracy for the L/C images is generally high (>90% for the data tested in this study).
In summary, machine learning-based ASL image classification was demonstrated to be useful for ASL CBF quantification, though these benefits need to be further confirmed with larger samples.
This research was supported by NIH grants: R21DC011074, R03DA023496, and 8P41-EB015893. The author thanks Dr. Ronald Wolf for providing the stroke patient’s ASL data and thanks Dr. John A. Detre for commenting on the method and manuscript.