|Home | About | Journals | Submit | Contact Us | Français|
To develop a bootstrap method to assess the quality of High Angular Resolution Diffusion Imaging (HARDI) data using Q-Ball imaging (QBI) reconstruction.
HARDI data were re-shuffled using regular bootstrap with jackknife sampling. For each bootstrap dataset, the diffusion orientation distribution function (ODF) was estimated voxel-wise using QBI reconstruction based on spherical harmonics functions. The reproducibility of the ODF was assessed using the Jensen-Shannon divergence (JSD) and the angular confidence interval was derived for the first and the second ODF maxima. The sensitivity of the bootstrap method was evaluated on a human subject by adding synthetic noise to the data, by acquiring a map of image signal-to-noise ratio (SNR) and by varying the echo time and the b-value.
The JSD was directly linked to the image SNR. The impact of echo times and b-values was reflected by both the JSD and the angular confidence interval, proving the usefulness of the bootstrap method to evaluate specific features of HARDI data.
The bootstrap method can effectively assess the quality of HARDI data and can be used to evaluate new hardware and pulse sequences, perform multi-fiber probabilistic tractography, and provide reliability metrics to support clinical studies.
Magnetic resonance diffusion-weighted imaging enables the mapping of axonal architecture in the central nervous system by exploiting the anisotropic properties of water diffusion in the white matter (1). Recently, High Angular Resolution Diffusion Imaging (HARDI) has been suggested as a means to overcome some limitations imposed by diffusion tensor imaging (DTI), notably in presence of crossing fibers (2). HARDI-based reconstruction methods such as Q-Ball imaging (QBI) have proven efficient in the detection of subtle axonal connections in the cerebrum (2) and spinal cord (3).
Whilst diffusion measurements convey a variety of useful information, they are also hampered by a very low signal-to-noise ratio (SNR). This especially holds true when going to HARDI techniques, where larger b-values are needed to depict low angle crossing fibers (4–6). The low sensitivity of diffusion measurements therefore motivates a more thorough evaluation of their reproducibility. In particular, having a reliability assessment for the orientation distribution function (ODF, the probability of water molecules to diffuse as a function of orientation) and its maxima is of major interest. In addition to aiding the optimization of acquisition methods, a measure of statistical confidence on the reconstructed ODF is useful for supporting clinical findings and for performing probabilistic tractography.
Multiple DTI-based methods have been proposed to assess the reproducibility of diffusion measurements (7–12). Most of these methods rely on bootstrap; a nonparametric technique that randomly re-samples the acquired data to generate new sets of data. This provides a distribution of measurements, which can be used to assess the statistical errors of a given metric with minimal assumptions about the noise in the original data. The basics of bootstrap applied to diffusion-weighted MRI are to acquire M datasets with the same parameters. Then, N new datasets are generated by randomly selecting subsets of data from each of the M original datasets (12). DTI reconstruction is then applied to each of the N bootstrap datasets, yielding N tensors with their derived metrics (e.g., fractional anisotropy, mean diffusivity, axial/radial diffusivities). Out of these N sets of processed bootstrap data, one is able to generate a distribution of a given metric and compute statistics on each of these distributions (e.g., mean, median, variance). If the quantity of interest is a vector (such as the principal diffusion direction), angular uncertainty can be assigned based on the confidence interval of the angular dispersion (7). Recent developments have extended the bootstrap method to estimate the uncertainty of the ODF first maximum (13–15), or used estimates from a Bayesian framework (16). While authors introduced original frameworks, these methods focus on the directionality of a single principal ODF maxima for direct application to probabilistic tractography.
Due to the relatively low number of original data used in repetition bootstrap (~5–10), there is a downward bias when estimating the angular uncertainty (17). To overcome this issue, Chung et al. have introduced the “repetition bootknife”, which combines repetition bootstrap with jackknife sampling (18). Given M repetitions, the key concept is to randomly omit one sample from the original dataset, and perform bootstrap on the remaining (M-1) samples.
In the present study, we extended the work of bootstrap applied to HARDI with QBI reconstruction to provide a more generic measure of ODF reproducibility which, in addition to being useful for voxel-wise assessment of complex fiber regions, is potentially valuable for supporting HARDI-based metrics such as the Generalized Fractional Anisotropy (GFA) (2). The sensitivity and robustness of this method were evaluated using simulation studies where variable amount of noise was added to real data. Original contributions include 1) a confidence interval measure for secondary diffusion directions and 2) the study of the impact of acquisition parameters on the reproducibility of HARDI measurements, namely the echo time (TE) and the diffusion sensitization (b-values).
One healthy volunteer was scanned at 3 Tesla (TIM Trio, Siemens Medical Systems) using a 32-channel RF head coil (19). High angular resolution diffusion images were acquired using a twice refocusing EPI sequence with the following parameters: Axial orientation, Field of view = 207 mm, Number of slices = 18, covering the corpus callosum and aligned with the anterior commissure / posterior commissure (AC-PC), Slice thickness = 1.5 mm (no gap), TR/TE = 3000/121 ms, In-plane resolution = 1.5×1.5 mm2, Parallel imaging (R = 2, GRAPPA reconstruction with 24 reference lines), Phase-encoding direction = antero-posterior, Bandwidth = 1510 Hz/pixel, echo spacing = 0.76 ms, b-value = 3000 s/mm2, Number of diffusion-encoding directions = 60 (plus three b=0 images), Acquisition time = 3 min 12 s.
The acquisition was repeated 8 times, yielding a total acquisition time of ~25 min. The diffusion-gradient encoding set was built from the electrostatic repulsion algorithm that provided an evenly distributed diffusion-gradient scheme (20). Gradient vectors are listed in Table 1. Here we used 60 directions as a compromise between the minimum number of directions required to robustly reconstruct the Q-Ball ODF (21) and the total acquisition time acceptable for a human subject. Also, 60 directions were used by Whitcher et al. for performing wild bootstrap on Q-Ball ODF (22). In order to estimate subject motion we used interspersed b=0 images every 20 diffusion-weighted volumes. Therefore, three b=0 images were acquired per repetition (24 in total). No cardiac gating was used.
To demonstrate the efficiency of bootstrap metrics in assessing the quality of HARDI measurements, additional datasets were acquired with different imaging parameters. Since the MR signal decreases by a factor of , increasing the TE decreases the SNR of diffusion-weighted data and therefore the ability to reconstruct the ODF robustly. Another parameter affecting ODF reconstruction is the b-value (4–6). Increased b-value yields lower SNR but higher contrast-to-noise ratio for diffusion weighting. We therefore acquired two additional datasets with the exact same parameters and slice prescription as before, one with a TE of 121 ms and the other with a TE of 91 ms, both with a b-value of 1000 s/mm2. Number of measurements was 6 for both additional datasets.
To study the relationship between reproducibility metrics and the thermal SNR in HARDI measurements, we acquired an SNR map at the same slice location used to generate bootstrap metrics. The SNR map was obtained by measuring the coil sensitivity and noise covariance to reconstruct the optimum SNR coil combination (23). The optimum SNR map was then corrected on a pixel-by-pixel basis to account for the SNR bias introduced by the magnitude detection. This correction for magnitude bias in array data was described by Constantinides et al. (24) and is dependent on both the SNR and the number of array channels. We extended the analysis of Constantinides et al. to include 32-channel data and applied the correction as a look-up table to the SNR maps.
One sensitive aspect of the long acquisition time imposed by regular bootstrap is subject motion. One has to correct for subject motion before processing the data, otherwise the measured reproducibility will likely reflect subject motion rather than other parameters of interest. Standard methods exist to register diffusion-weighted images on a target image, which could either be the averaged diffusion-weighted image or the b=0 image (25,26). However, these motion correction techniques are no longer robust when going to large b-value (>3000 s/mm2), where the SNR in individual diffusion-weighted images is lowered. This especially holds true in data acquired with small voxel size, as done in this study (1.5 mm isotropic).
For these reasons we relied on b=0 images regularly interspersed within each diffusion-weighted dataset in order to robustly estimate subject motion. Motion correction was conducted in two steps using FLIRT, part of the FSL package (27). The first step consisted of inter-repetition registration and the second step consisted of intra-repetition registration. First, all b=0 images have been registered to the first b=0 image of the first repetition (rigid-body registration with 6 degrees of freedom). To limit the amount of image interpolations, only the transformation matrix has been saved and used in the following step. Second, each of the three b=0 images of a given repetition was co-registered to the first b=0 image of that repetition (rigid-body registration with 6 degrees of freedom). Each transformation matrix was then multiplied by the respective inter-repetition transformation matrix (translation matrices were added and rotation matrices were multiplied). The resulting transformation matrices were then applied to diffusion-weighted images.
Figure 1 shows a workflow of the overall bootstrap procedure with metric estimation. Here we used repetition bootknife and introduced two reproducibility metrics. One metric assessed the reproducibility of the overall shape of the ODF using the Jensen-Shannon divergence, while the other provided an angular confidence interval associated with the directions of the local maxima of the ODF; a role analogous to the angular confidence intervals of the DTI fiber direction in a conventional DTI bootstrap analysis.
The bootstrap implementation followed three main steps: 1) generation of bootstrap data, 2) reconstruction of Q-Ball ODFs and 3) estimation of bootstrap metrics. The method has been implemented in Matlab (The MathWorks, USA) using compiled C-functions for faster computation. On a 30×30 pixels ROI, it takes about 30 minutes to run the whole pipeline on a MacBook Pro, CPU 2.4 GHz with 4GB RAM. The developed software is platform-independent and is freely available upon request.
Out of the 8 original datasets, 500 bootstrap datasets were created by randomly picking a value among one of the original datasets using jackknife sampling. This was achieved on a voxel-by-voxel basis.
For each bootstrap dataset, Q-Ball diffusion ODFs were estimated using the method described in (6). The HARDI signal was expressed as a spherical harmonic (SH) series of order L and the Funk-Radon transform was solved using the Funk-Hecke theorem. The final ODF reconstruction, Ψ, in direction (θ,) can be written as
where Ykm denote SH of order k and degree m, ckm are the SH coefficients describing the input HARDI signal and Pk is a Legendre polynomial of order k. We imposed a Laplace-Beltrami regularization criterion while estimating the SH coefficients ckm, as done in (28). In this study we used spherical harmonic decomposition of order 4 and a regularization parameter of 0.006 (28). In the literature the ODF has also been referred to as the “fiber ODF”, which is a sharp version of the diffusion ODF that could be obtained using deconvolution methods (29–31). In this study we only used the diffusion ODF, although the methodology could, in principle, be used with any other HARDI reconstruction technique. Therefore in the rest of the manuscript, the term ODF will refer to the diffusion ODF from QBI.
From the reconstructed ODF, we computed the generalized fractional anisotropy (GFA) (2), which is a HARDI anisotropy measure similar to the popular DTI fractional anisotropy (FA) (32). As an extension of the FA, the GFA is defined as the standard deviation divided by the root mean square of the ODF. Hence, it is also a measurement of the anisotropy but it is generalized throughout more than three eigenvalues. Thus we have
is the mean of the ODF.
In this paper we extended the bootstrap method from the evaluation of uncertainty in scalars (such as FA and eigenvalues) or vectors (such as the direction of the principal eigenvector), to comparing distributions such as the ODF itself. Since the ODF summarizes all physical findings of the single-shell HARDI diffusion measurement it is desirable to have a quantitative assessment of its reproducibility. In general, several distance measures have been used for comparing the distributions relevant to diffusion data. For example, the root mean sum-of-squares (6,28,33,34), the Kullback-Leibler divergence (2,4,35) or the Jensen-Shannon divergence (36) have been used to compare the ODFs generated from different reconstruction methods or to quantify fiber complexity. A recent study supported the use of Euclidean metrics to compute distances between diffusion tensors (37).
Here we used the Jensen-Shannon divergence (JSD) to quantify the similarity between the distribution of each bootstrap ODF and the distribution of the mean ODF. The mean ODF was generated by first converting the estimated Q-Ball ODF from the spherical harmonics into samples on a sphere. From the 60 direction measurements, we estimated our spherical functions with an SH series of order 4, which has 15 coefficients. Then, we can project these 15 SH basis coefficients to any number of spherical values on sphere. Here, we chose 724 spherical values as a compromise between the desired angular precision and computational time, in agreement with the literature (13,14,38). The 724 samples have been distributed equally on the sphere using the algorithm proposed in (39). Then, ODF from each bootstrap sample was averaged on a sample-by-sample basis. Note that this procedure was equivalent than computing the ODF from the averaged data, due to the linearity of the transformation.
Similarly to the root mean sum-of-square, the JSD measures the distance between two probability distributions. In this case, the probability distribution is the ODF as defined by discrete sampling on the sphere. The JSD is defined as:
where P(i) and Q(i) are the distributions of the discrete ODF reconstructed from HARDI data, i is the index of sampling of the ODF (i.e., i=1..724), and DKL(•,•) is the Kullback-Leibler divergence, defined as:
The notable advantage of using the JSD over the Kullback-Leibler divergence is that the former metric is symmetric, i.e., JSD(P,Q) = JSD(Q,P).
One goal of diffusion MRI is to map brain connections using fiber tractography. It is therefore desired to estimate the reliability of the reconstructed fiber pathways. This could be achieved by computing uncertainty measures on ODF maxima, similarly to what have been proposed for DTI (8). Only here, the difficulty was the potential presence of more than one diffusion direction, as opposed to a single principal direction in DTI. One original contribution of this paper was to derive maps of angular uncertainty, not only for the first maximum but also for the 2nd and eventually the 3rd maximum – providing they existed. In order to achieve this, one solution is to first sort the maxima of each bootstrap data based on their value on the ODF, and then compute the angular error on a rank-wise basis. Considering maxima of rank i, the angular error θi,j (in degree) would be computed between the maximum vector i of the mean ODF and the maximum vector vi,j of the jth bootstrap ODF:
However, achieving this is problematic since variability of the bootstrap samples can potentially lead to switching between ranks of ODF maxima. In the case where the first and second maxima are similar in size, variability would cause them to be ranked differently across bootstraps and the angular uncertainty of each maximum would be overestimated (just their labels would be reversed). Such situation would happen in regions containing crossing fibers with similar number and density. In the latter case, angular uncertainty would be overestimated – especially when fibers cross at ~90°, as supported by our simulations (data not shown). To overcome this undesirable bias of the metric, we did not sort the maxima on each bootstrap ODF, but only on the mean ODF. Then, the rank of each maximum was defined so that the sum of the angles between each maximum of the mean ODF and each maximum of the bootstrap ODF was minimized. Hence, the rank was obtained by finding the permutation function f that minimized the sum of the dot product:
where n is the number of maxima for the jth bootstrap ODF. This approach certainly induces a downward correction in the estimation of the angular dispersion, but better supports the use of CI measures to derive cones of uncertainty for tractography applications (7) when all maxima of the ODF – no matter what their ranks are – are used in probabilistic tractography. In that application the rank of extracted maxima is not pertinent. Of greater interest is to compute CI of all principal directions within each voxel, to be used as inputs for multiple-fiber probabilistic tractography algorithms. According to this rationale, the angular error θi,j was then defined as:
where i,j = vf (i),j is the maximum vector of the jth bootstrap ODF as defined by Equation . Note that if i,j did not exist, its value was omitted and the angular error was computed from a reduced number of bootstrap samples. A 95% CI was then calculated using the percentile method, as done for DTI (7). The i angular errors were ordered and the angle corresponding to 95% of the total number of samples were selected to obtain a statistical threshold of P=0.05. In the present study, CI maps were computed for the first and second maxima of the ODF.
Extraction of ODF maxima was performed via an algorithm that finds local maxima on the discrete ODF. Local maxima were identified by testing each sample with its neighbors. The number of neighbors was set to s/30, s being the number of samples on the ODF. For a sampling of 724 points around the sphere, the number of neighbors was therefore 24. Here, Q-Ball ODFs were reconstructed using up to the 4th order of spherical harmonics, which produced relatively smooth ODF. Based on preliminary results, finding local maxima on smoothed ODF using an angular threshold of 12° (corresponding to 24 neighbors) has been found to robustly detect local maxima.
Due to the axial symmetry of the ODF, most maxima had their antipodal correspondent. This was taken into account to estimate the “true” number of maxima reflecting the number of crossing fibers at a given voxel. Hence, after finding local maxima, antipodal pairs were identified based on the angle between them (~180°). Following this step, the bisection between two antipodal maxima was defined as one maximum diffusion direction. Using this method with a spherical harmonic order of 4, each voxel usually produced one to three maxima. Among identified maxima, those within 45° of a local maximum were discarded, the local maximum being defined as having the highest value on the ODF. This was done to limit the impact of ODF peaks due to noise, given that spherical harmonic order of 4 does not allow to resolve crossing fibers at less than 65° (33). Also, the identified local maxima with magnitudes lower than 33% of the maximum ODF value were discarded, as done in (13). This threshold was empirically set to avoid picking maxima out of noise. Maxima extraction was performed voxel-wise on each bootstrap dataset and on the mean bootstrap dataset.
Maxima of the mean ODF were then sorted with respect to the probability of diffusion direction. This step consisted of labeling the first, second, third and subsequent diffusion directions, based on the likelihood of water diffusion given by the ODF. As mentioned previously, this ordering step was not required for computing CI maps. Therefore, it was not applied to each bootstrap ODF – only to the mean ODF. After maxima extraction, angular error was computed between the maxima of the mean ODF and the maxima of each bootstrap ODF. CI maps were then generated for the first and for the second maxima of the ODF.
Variable levels of Rician noise were added to the original dataset (TE = 121 ms, b-value = 3000 s/mm2) to assess the sensitivity of the bootstrap metrics to image noise levels. First, SNR was measured on a single b=0 image by dividing the mean signal on a 10×10 region of interest (ROI) in the white matter, by the standard deviation of another 10×10 ROI in the background. The latter standard deviation was corrected for Rayleigh distribution, dividing its value by . The resulting SNR was 7.9. Note that due to the use of a multichannel array, this method did not properly estimate the true SNR. However in that case, the objective was only to get an approximate starting value used as a reference for the noise simulations. Then, fictive Rician distributed noise was added to the data with an amplitude of 50% 100% or 200% of the original background standard deviation as in (11). These datasets with increased noise levels were then analyzed by the bootstrap method.
Robustness of the bootstrap method was evaluated as a function of the number of original datasets and the number of bootstrap datasets generated. The analysis was rerun using between two and eight of the original datasets to generate 200 bootstraps. In a separate assessment, the eight original datasets were used to generate between 50 to 1000 bootstrap datasets. These evaluations were repeated for the various levels of added noise: 0%, 50%, 100% and 200%. The number of original or bootstrap data required to reach a steady state was evaluated by computing the root mean square error (RMSE) between the i+1 and the i iteration. This computation was done on a 30×30 axial ROI, positioned on the posterior-left region and covering part of the ventricle, grey matter and white matter with single and crossing fibers (see Figure 2b).
The bootstrap metrics were evaluated for two acquisition parameters: TE and b-value. To evaluate the effect of TE, data were acquired at TE of 91 ms and 121 ms, both at b-value of 1000 s/mm2. To evaluate the effect of b-value, data were acquired at b-value of 3000 s/mm2 and 1000 s/mm2, both at TE of 121 ms. Each comparison was done using six original datasets. Due to potential subject motion between datasets, a third step was added to the motion correction pipeline, which consisted of co-registering the three datasets together (using the first b=0 image in each dataset). JSD and CI maps were then estimated on a single slice for each dataset as described previously in the methods.
Note that we focused on a single slice rather than on a specific tract, because tractography was not the only application of our method, which could also be used voxel-wise for supporting the reliability of HARDI-based metrics in the context of clinical studies. Also we did not want to limit the applicability of our findings to a particular tractography method and therefore focused on the ODF (which forms an input to most tractography methods).
Q-Ball ODFs were successfully reconstructed and maxima extracted from each bootstrap dataset. To illustrate the variability of Q-Ball estimate across bootstraps, Figure 2 shows maps of ODFs and extracted maxima on a region that includes single and crossing fibers (see FA Colormap on Figure 2d for anatomical landmarks). Variability in both the shape and orientation of the ODF is noticeable across bootstrap (Figure 2c). Extracted maxima are also plotted for ten bootstrap samples and overlaid on the GFA map for the same region (Figure 2e). Large maxima dispersion is noticeable in isotropic regions such as in the ventricles and the grey matter. Dispersion of the first maximum (green) is the lowest in the corpus callosum, with a less coherent dispersion of the second maximum (red) attributed to noise. In a region of crossing fibers, both the first and second maxima exhibit low dispersion.
Figure 3b shows JSD map for an axial slice. The JSD is higher in the ventricles and in the inter-hemispheric fissure, as expected due to the isotropic nature of water diffusion in the cerebrospinal fluid. In the parenchyma, JSD values are the lowest with no clear distinction between the gray and the white matter. Interestingly, JSD values tend to increase toward the center of the brain, possibly reflecting the lower sensitivity of array coils in deep tissues, as illustrated by the map of thermal SNR (Figure 3c). Similarly mirroring the SNR map, patterns of high JSD value extend toward the frontal and sensori-motor regions of the left hemisphere. This comparison suggests a direct relationship between the reproducibility of ODF reconstruction and the SNR, a hypothesis further confirmed by the added-noise simulation studies. Maps of CI for the first and second maxima are shown in Figure 3e–f. The location of the white matter is clearly visible on the CI map of the first maxima, where 95% angular confidence is achieved with a CI of 0–10°, whereas angular confidence in the CSF or grey matter typically reach values of about 70–90°. In the CI map of the second maxima (where it exists), most values range from 70° to 90°, with no clear delineation of anatomical structure such as the cortical grey matter, which was expected to exhibit better confidence intervals in secondary directions when compared to purely isotropic regions such as the cerebrospinal fluid. However, reduced confidence interval of the second maxima is noticeable in known regions of crossing fibers, such as at the intersection between the corpus callosum and the posterior corona radiata (Figure 3g–i). The zoomed panel is a good representation of possible cases of single and crossing fibers. ODFs on the outer columns mostly contain the corpus callosum (left) and longitudinal tracts (right), therefore exhibit low uncertainty for the first maximum and either no (left) or very high uncertainty (right) for the second maximum. The middle column however exhibits ODFs with two distinct maxima, as reflected by the low uncertainty of the first and second maxima. Note that red lines (indicating the secondary maxima) exist for the voxel on the left bottom corner of Figure 3g while its corresponding voxel is represented by the black color on Figure 3i (indicating no secondary maximum). The reason is that the number of maxima attributed to each voxel was based on the mean ODF (averaged across 500 bootstraps), whereas the maxima that are shown on Figure 3g arise from 10 individual bootstrap samples. Therefore it is likely that for some bootstrap samples, a 2nd maximum was discovered, although there was no 2nd maximum on the mean ODF.
Sensitivity of the bootstrap method was evaluated using realistic simulations with variable amount of added noise (0%, 50%, 100%) (Figure 4). In this figure, ODFs and maxima dispersion show the impact of noise on the Q-Ball reconstruction. Results of the simulation show a direct relationship between the level of SNR and JSD values, as suggested by the spatial similarity between the JSD and image SNR maps. Similarly, higher noise levels yield higher uncertainty in the CI maps of the first maxima.
Figure 5 shows the robustness of the bootstrap method as a function of the number of original data (2,3,6 or 8 original data sets and 500 bootstraps) for various levels of added noise. The region of interest is the same as that one shown in Figure 2b. Results suggest that JSD and CI estimation is inaccurate when insufficient number (<4) of original data is used. Figure 6 illustrates how the number of bootstraps (50 to 1000 from 8 original datasets) impacts JSD and CI estimates. Similarly to the previous simulation, estimation of JSD and CI is inaccurate with low number of bootstraps, although the differences between the results with low or high number of bootstraps is relatively small. Figure 7 shows the RMSE between the i and the i-1 maps, where i is either the number of original datasets or the number of bootstraps. With more than 5 datasets, change in RMSE for JSD and CI exhibits a relatively small monotonic decay, meaning that there is no substantial gain in accuracy in the estimate of JSD and CI maps when adding more than 5 datasets. With >500 numbers of bootstraps, RMSE seems to reach a plateau, therefore suggesting that 500 bootstraps achieve a robust estimate of JSD and CI metrics.
Note that the trend observed on the CI is somewhat counterintuitive, as for a given number of dataset or for a given number of bootstrap, the RSME of CI decreases when the amount of noise increases. What is showed on this plot is not the actual CI value, but rather the difference in CI between two conditions (with an incremental dataset or number of bootstraps). Results suggest that for highly noisy data (e.g. 200%), the overall CI (i.e. CI averaged within an ROI encompassing white matter, gray matter and CSF) does not change much when additional data or bootstrap samples are provided. This is due to the fact that for noisy data, the CI is already high (>80°), no matter how many data or bootstrap samples are provided.
The sensitivity of the bootstrap method to pulse sequence parameters was evaluated by acquiring HARDI data at two different TEs (91 ms and 121 ms), where other parameters were kept the same (b-value was 1000 s/mm2). The SNR ratio between the two datasets could be evaluated using the following relationship:
Assuming a T2 of 74 ms in the white matter at 3T (40), the TE=91 ms dataset has about 50% more SNR when compared to the TE=121 ms dataset. Due to this increase in SNR at the lower TE, the added-noise simulations suggest that improved reproducibility of ODF estimate is expected for the low TE dataset. Results of JSD and CI maps are shown in Figure 8 verifying a higher ODF reproducibility and lower angular uncertainty at TE=91 ms versus TE=121 ms.
To better characterize differences between the two TE measurements, a white matter mask was created from one dataset and used to generate white matter histograms of JSD and CI measures for both conditions (Figure 8c). The mask was generated using an intensity threshold of 0.12 on the GFA axial map from the TE=91 ms dataset. The JSD and CI histograms were distinct for the two different TE values as determined by a 2-tailed, 160 degrees of freedom Student T-test performed on the assumed Gaussian data. Results of the T-test support unequal means between the two distributions for the JSD (T=9.66, P<10−12) and for the CI (T=6.08, P<10−8). Although these distributions exhibit a slight positive skew, Gaussian fit was used in order to perform statistical comparison between means, as well as to compare the shape of either distribution.
Figure 9 shows analysis of the JSD and CI metrics applied to HARDI data acquired at two different b-values (3000 s/mm2 and 1000 s/mm2). Due to higher SNR of diffusion-weighted images at lower b-value, reproducibility of ODF estimate was expected to be higher. This hypothesis is verified in the JSD map, which exhibits a higher overall reproducibility at b=1000 versus b=3000. The contrast of the GFA map is also higher at low b-value, providing better delineation of the white matter structure.
A trade-off for using low b-value however was the lower angular resolution for detecting multiple fiber orientations. Again, this hypothesis is verified in the CI maps, which exhibits lower uncertainty in white matter regions at higher b-value. This observation is especially marked in regions of crossing fibers, as illustrated by the zoomed panel of Figure 9, which focuses on the intersection of the corpus callosum, the anterior corona radiata and longitudinal tracts (black circle). Interestingly, CI values in gray matter and CSF are higher at b=3000 s/mm2. Thus, the higher contrast in CI value between white matter and non-white matter at high b-value suggests an improved ability for probabilistic tractography. This result highlights the relevance of CI maps both inside and outside the white matter for assessing acquisition parameters.
The white matter mask of Figure 8 was used to compute histograms of JSD and CI for both b-values (Figure 9c). Histograms were fitted to a Gaussian distribution and Student T-test was performed (2-tailed, 190 degrees of freedom). Results of the T-test support unequal means between the two distributions for the JSD (T=9.94, P<10−12) and for the CI (T=2.62, P<0.01). Unlike the TE case, the shape (i.e., standard deviation of the fitted Gaussian function) of JSD histograms is very similar between the b-value conditions and only the mean was significantly upward shifted for the high b-value condition. This observation suggests that the shape of the JSD histogram may reflect differences in the biophysical properties of the measurement (beyond simple SNR).
In this paper we presented a repetition bootstrap method to assess the quality of HARDI data. The proposed metrics targeted two different levels of quality assurance: the JSD metric measures the reproducibility of ODF shape and the angular CI measures the uncertainty of ODF maxima. Results from brain data showed a direct relationship between bootstrap metrics and thermal SNR. Acquisitions at variable TEs and b-values demonstrated the efficiency of the proposed method for characterizing the reproducibility and accuracy of the ODF and its derived maxima. This section will first address some key features of the bootstrap method, including its robustness and limitations. Then, a possible extension of the method is proposed regarding the type of bootstrap method. Finally, various applications of the method are proposed.
Evaluation of bootstrap metrics was mainly focused on the impact of noise as assessed by adding noise to the data, by acquiring data at two different TEs and by comparing JSD and SNR maps. In the latter case, we found that JSD maps reproduced the approximate spatial pattern of the sensitivity profile of the 32-channel array coil used here, i.e., lower sensitivity in central regions relative to peripheral regions (19). This finding is supported by previous studies that aimed at measuring the reproducibility of diffusion-weighted measurements in DTI using multichannel array coils (37,41). Similar findings were observed using the Kullback-Leibler divergence to measure the accuracy of the Q-Ball ODF at various levels of noise (38).
Another interesting finding is that JSD histograms exhibited very different shape between acquisitions at different TEs and b-values. Comparing two different TEs, the mean and standard deviation of the low-TE histogram were lower (Figure 8c). However, comparing two different b-values, the mean was decreased (lower b-value) but the standard deviation was approximately unchanged (Figure 9c). In both cases (longer TE and higher b value) the noise was increased. These changes in histogram shape highlighted differences in ODF reproducibility derived from differences in the acquisition beyond just the SNR; for example the reproducibility associated with forming an ODF from the different biophysical sensitivity associated with different b-value acquisitions. It should be mentioned that T2 relaxation in the white matter can significantly vary across regions (42). Therefore, a change in the TE had a different impact on the SNR throughout the whole brain.
Noise also increased the angular uncertainty of ODF maxima as measured in the CI maps when noise was added or when the TE was increased (producing noisier images). When the b-value was increased from 1000 to 3000 however, the CI histograms and maps were improved (i.e., lower angular uncertainty), even though the higher b-value produced noisier images. Measuring CI of FA in DTI, Heim et al. found similar differences between histogram distributions at various levels of noise and smoothing (11).
The achievable angular resolution in HARDI depends both on the diffusion gradient scheme and the b-value. While other studies suggested the use of large b-value for better accuracy in depicting diffusion maxima (2,5), the present study demonstrated that larger b-value also provides better angular confidence interval of the ODF maxima (Figure 9). One side effect of using higher b-value is that it increases the minimally achievable TE, therefore further reducing the SNR. Metrics provided in this study may help optimizing the trade-off between the TE and the b-value.
One of the original contributions in this study was the introduction of confidence interval maps of secondary diffusion directions. While CI maps of first maxima provided very consistent white matter maps, the relevance of CI maps of second maxima was more difficult to assess. Most values exhibited very large confidence intervals (>70°) with no clear delineation of anatomical structure. However, a better confidence interval was noticeable in known regions of crossing fibers, such as at the intersection between the corpus callosum and the posterior corona radiata (Figure 3i). These findings suggest that the CI of the second maxima could be derived from HARDI measurements, providing consistent measure of uncertainty in regions with more than one principal diffusion direction.
CI maps of secondary maxima may find applications in multiple-fiber probabilistic tractography (13,14,16). Although they would not dramatically change the outcome of tractography results in dominant white matter pathways, CI measurements of secondary maxima may be of interest in regions with higher complexity. Future investigations are needed to evaluate the pertinence of this index using other reconstruction methods that provide sharper versions of the ODF, therefore better discrepancies between maxima.
Rigid registration in multiple directions diffusion-weighted imaging should be, in general, accompanied with an appropriate rotation of the B-matrix (43). For the within-dataset case, motion correction was performed between repetitions (i.e., between the first and the eighth repetitions). Therefore, we could not easily have applied a rotation of the B-matrix as for a given voxel, the data were randomly picked from one of the eight repetitions, and a single B-matrix was then applied to the whole volume. In other terms, each voxel of the bootstrap-generated volume experienced a different motion, making the B-matrix correction approach less easy to apply. Although we acknowledge the fact that motion might have biased the estimation of ODF, the amount of rotations as assessed by the motion correction parameters given by FLIRT was marginal (less than 2°).
For the between-dataset case, the registration was performed between one dataset (e.g. b=1000, TE=91ms) and another dataset (e.g. b=3000, TE=91). Therefore, correcting for the global registration by applying the same rotation to each row of the b-matrix wouldn’t have changed our results, as the present study focused on the individual voxel, and not on the neighboring (e.g., tractography). Therefore, the derived JSD and CI metrics were not affected by potential rotations between datasets.
We evaluated the number of datasets needed to obtain reliable estimate of bootstrap metrics. Given the potential dependence of this parameter on the SNR of the measured data, we used simulations with variable amount of noise. Results suggest that a minimum of five datasets is required to reach robust estimate of reproducibility metrics. Similarly, for DTI using regular bootstrap, O’Gorman and Jones suggested at least five repetitions in order to have a robust estimate of reproducibility metrics (44). However it should be noted that the number of directions in HARDI acquisition impacts the quality of the ODF reconstruction for each bootstrap sample (35). O’Gorman and Jones investigated the impact of the acquisition scheme for DTI reconstruction (44). For 8, 16, 32 and 64-direction schemes, they found the minimum number of repetitions required to obtain robust estimate of the standard deviation of the trace of the tensor to be about five. Keeping in mind that HARDI Q-Ball and DTI are very different reconstruction techniques and that bootstrap-based evaluation of reproducibility underlies different assumptions about the noise, results of our simulations and those from O’Gorman and Jones therefore suggest that the proposed bootstrap method requires at least five repetitions to ensure accuracy in computation of bootstrap metrics. Regarding the number of bootstraps requires to derive reliable metrics, our results suggest to use at least 500 bootstraps. In contrast, 100–200 bootstraps were suggested for DTI (11,44).
Here we used 60 directions for our HARDI acquisition protocol. We expect a bias in the estimate of the reproducibility of the Q-Ball reconstruction due to the selection of the diffusion-encoding set, given that increasing the number of directions will increase the accuracy of the ODF reconstruction (35), with a plateau after ~100 directions (45). Also, previous studies showed how the number of diffusion-encoding directions affect bootstrap estimates reproducibility (18,34). Therefore, any comparison should be conducted using the same encoding scheme.
Previous studies have also shown that angular uncertainty and ODF reproducibility depend on the b-value (38) as well as on the SNR (11,13,14,41). These parameters are therefore expected to significantly impact the required number of data for bootstrap (34).
In this study we used regular bootstrap with jackknife sampling. We showed that at least 5 repeated measurements were required, leading to 16 min of acquisition for covering a portion of the brain. If covering the whole brain (usually 60–70 slices for a thickness of 2 mm), TR would be at least much longer and the scan time would be at least three times longer. Unless fast imaging technique is used, such as compressed sensing or simultaneous multislice imaging (46), it is hard to apply the repetition bootstrap for real human data in a routinely basis. In contrast, model-based bootstrap techniques, such as residual bootstrap (13,14,18) and wild bootstrap (22,25,47), can provide equivalently robust estimations of measurement uncertainty as those from repetition bootstrap while only one complete DTI/HARDI data is normally required.
One advantage of regular bootstrap is its applicability to model-free approaches such as diffusion spectrum imaging (DSI). Also, acquiring independent HARDI measurements without any noise assumption potentially provides a less biased measure of the reproducibility of HARDI measurements. It should be mentioned however that previous studies showed similar results between residual and repetition bootstrap (14,18,22,34,47). Lastly, it should be noted that the metrics proposed in this study could also be derived using residual or wild bootstrap techniques.
As researchers aim to push the limits of diffusion-weighted imaging with higher b-values and smaller voxels, there is an increasing need to evaluate the reproducibility of these measurements. Recently, the proposed method has been applied to evaluate the benefit of a head gradient for diffusion imaging (48). The lower JSD reflected the higher SNR achievable by lowering the TE while keeping the same b-value. The proposed method has also been applied to evaluate a new diffusion pulse sequence based on simultaneous multislice imaging (46). By providing quantitative metrics to evaluate the reproducibility of the ODF, the authors demonstrated the equivalence in the quality of HARDI data, with and without enabling the multislice acquisition.
A second application domain for the proposed method concerns ODF reconstruction techniques. As previously mentioned, the JSD and CI metrics only need the discrete ODF to be generated, therefore other methods than QBI could be evaluated and compared for their sensitivity to noise and ability to produce maxima with low angular uncertainty. The measure of angular uncertainty derived from the bootstrap method could also be used as input for HARDI-based probabilistic tractography (13,31). Note that the reduced coverage used here might be limited for some tractography applications. The main reason for not covering the entire brain was to reduce the acquisition time to a reasonable 3 minutes per dataset, allowing multiple acquisitions possible for the bootstrap validation. Moreover the goal here was to demonstrate the utility of a bootstrap method to assess the reproducibility of the ODF. Researchers can adapt the proposed protocol by modifying the coverage according to their needs.
Finally, clinical applications will benefit from a statistical assessment on the reliability of HARDI datasets. Indeed, ODF-based metrics such as the GFA not only depend on the integrity of the white matter but also on the SNR and diffusion-gradient scheme (b-value and number of directions) of diffusion-weighted data (7,25). Therefore, reporting indices of reliability along with quantitative metrics is a possible avenue for researchers to gain confidence in HARDI studies. Also, indices of reproducibility can be used as quality insurance in longitudinal or multisite studies and to guide power analysis in clinical trials.
In conclusion, we presented a bootstrap method to assess the reproducibility of the ODF and angular uncertainty of the first and second ODF maxima derived from HARDI data. Quality assurance metrics were derived from the orientation distribution function, and proved to be efficient in relating bootstrap metrics with the SNR and b-value of HARDI data. This method may find applications in evaluating new hardware and pulse sequence in diffusion MRI, comparing ODF reconstruction techniques, developing probabilistic tractography and reporting reliability indices on quantitative metrics used for clinical studies.
National Institute of Health (P41RR14075, U01MH093765), Association pour la Recherche sur la Sclérose en Plaques (ARSEP).