|Home | About | Journals | Submit | Contact Us | Français|
Optimal interpretation of magnetic resonance image content often requires an estimate of the underlying image noise, which is typically realized as a spatially invariant estimate of the noise distribution. This is not an ideal practice in diffusion tensor imaging because the noise distribution is usually spatially varying due to the use of fast imaging and noise suppression techniques. A new estimation approach for spatially varying noise fields is proposed in this article. The approach is based on a noise invariance property in scenarios in which more than one image, each with potentially different signal levels, are acquired on each slice, as in diffusion weighted MRI. This technique leads to improved noise field estimates in simulations, phantom experiments and in vivo studies when compared to traditional noise field estimators that use regional variability or background intensity histograms. The proposed method reduces the noise field estimation error by a factor of 100 in simulations, shows a strong linear correlation (R2 = 0.99) between theoretical and estimated noise changes in phantoms, and demonstrates consistent (<5% variability) noise field estimates in vivo. The advantages of spatially varying noise field estimation are demonstrated for power analysis, outlier detection, and tensor estimation.
Consideration of how noise affects the true signal is essential to proper interpretation and analysis of magnetic resonance (MR) images (1). Many methods have been developed for determining the noise characteristics in scalar MR images. Yet these estimation methods are suboptimal for DTI because the noise structure in DTI is spatially varying and correlated due to fast imaging and noise suppression techniques. Observed signals in MRI can be considered to be a combination of two sources: a deterministic “noise-free mean” component (which would be the signal observed at infinite SNR) and a stochastic noise component. Since the noise can be spatially variable, the stochastic component is a random field, which we call the noise field (NF). In DTI, accurate modeling of the NF has been shown to play a crucial role in estimating the reproducibility of measured quantities like diffusivity and anisotropy (2-4), determining outliers (5), and fitting likelihood-based tensor models (6). In this article, we present and evaluate a new estimation approach based on a noise invariance property in diffusion weighting (6) and demonstrate that this technique leads to improved NF estimates in simulations, phantom studies, and in vivo acquisitions. The new approach substantially improves analyses for DTI reliability, outlier detection, and tensor fitting by producing robust, spatially varying NF models.
Noise in magnitude MR images is well known to exhibit a signal dependent Rician distribution (7). A Rician random variable (RV) can be derived as the absolute value of a complex-valued RV where the real and imaginary components are independently and identically distributed (i.i.d.) as Gaussian RVs. The variance of a Rician RV depends on both the NF and the underlying noise-free mean. In the remainder of this article, the NF of a Rician RV refers to the NF of the underlying Gaussian RV. At low signal-to-noise ratios (SNRs), the mean of Rician distributed observations is substantially biased relative to a noise-free mean. Previous approaches to estimate and correct for “Rician bias” have focused in three areas: 1) constant or signal-free regions, 2) image histograms, and 3) difference images from repeated acquisitions (1,8). To briefly review, the limit of a Rician distribution in the absence of signal is a Rayleigh distribution, from which the NF can be estimated directly (9). If the noise is i.i.d. with a constant level across the image, then voxels in the background can be pooled to estimate the NF (7,10). Similarly, if a region has a constant true intensity then the observed signal can be pooled to estimate the noise parameters. Histogram methods pool information without spatial consideration and seek to identify Rayleigh distributed voxels (8). When two images are available, double image methods may use the difference image (11,12) or the joint histogram (13). If the original complex data are available, improved NF estimation is possible (14). Alternatively, additional scans may be performed to compute the spatially varying NF (15,16).
Prior approaches to NF estimation and bias correction in DTI have used scalar image intensity correction schemes and spatial regularization (17-20) and have mostly neglected the particular challenges of estimating the NF in DTI data. DTI requires many images with different diffusion sensitization, so noise suppression and fast imaging techniques are essential to achieving sufficient SNR with clinically feasible sequences. These approaches introduce spatial variability and correlation into the noise structure, which violate the underlying assumptions of many traditional noise estimators. The NF cannot be directly computed from background regions when using background suppression (e.g., Constant Level Appearance, CLEAR, Philips Medical Systems, Best, The Netherlands) because the signal from the background is actively suppressed. Parallel imaging utilizes multiple coils with variable spatial sensitivity, and observations are combined in a spatially variable manner to form a single image. Hence, methods that assume a constant NF are inappropriate. Additionally, acquisitions are commonly up-sampled by zero-padding the Fourier coefficient images to mitigate frequency artifacts from the magnitude operator, and the local noise structure is highly correlated, which complicates estimates based on local regions.
Our proposed NF estimation procedure takes into account the complexities that are associated with spatially correlated noise. A significant contribution of the proposed method is that it estimates a spatially varying NF; this has not been widely studied before. It improves estimation accuracy, does not depend on spatially uncorrelated noise or the existence of a background region, and is robust against background suppression. The method makes use of repeated acquisitions (i.e., a complete duplicate set of reference and diffusion weighted DW images) which are commonly acquired in clinical scans to improve the SNR. To demonstrate the limitations of methods that produce a single NF estimate for a given image, we present three common analysis tasks and compare the results of carrying out each task using first classical constant NF estimate and then using our spatially varying NF estimate. With the widespread use of parallel imaging methods, our NF estimator—while specifically developed for tensor estimation—should have far wider utility beyond DTI.
This article is organized as follows. We first review noise propagation in MR imaging and then discuss several existing methods for estimating the NF. Next, we describe our method for estimating a spatially variable NF and present experiments comparing the methods in simulation, in a phantom, and in vivo. We conclude by demonstrating the importance of spatially varying NF estimation by presenting simulations that show how an understanding of the spatial variability of the NF could improve statistical power analysis, outlier rejection, and tensor estimation.
MR noise in the observed complex-valued signals (i.e., the observed k-space coefficients) that are obtained using single coil imaging protocols generally exhibit i.i.d. Gaussian distributions (7). To increase apparent resolution and mitigate artifacts, acquired k-space coefficients from each coil are zero-padded and reconstructed into the complex image domain (interpolating image intensities with a sinc kernel), which introduces spatial correlation. To merge signals from multiple coils, the images are normalized and combined according to each coil's spatial sensitivity, which introduces spatial variability and additional correlation (21). Finally, the magnitude operator discards the phase information and results in a real-valued image, which transforms the voxel-wise noise into a Rician distribution (1),
where S is the observed signal, ν is the true signal, σ2 is the NF squared in the complex-valued images, and I0 is the modified Bessel function of zeroth order. The true signal and NF are anatomically dependent, spatially varying, and, in general, unknown.
For DTI, the Stejskal-Tanner equation gives the true signal as a function of the reference intensity S0),
where diffusion weighting is applied along the unit direction g with a b-value of b (a control parameter), and D is the local diffusion tensor. Neglecting potential eddy current effects and other hardware imperfections, the noise and underlying signal of the observed Fourier coefficients are independent. Thus, while the noise-free signal intensity is dependent on both spatial location and diffusion weighting, the NF is only dependent on spatial location.
The following sections describe several NF estimation methods along with implementation details for our experiments. Table 1 reviews the assumptions underlying each method.
Perhaps the most straightforward manner to compute noise is to identify a region of interest (ROI) with a constant true signal intensity and take the standard deviation of pixel intensities (10),
where R is the set of voxels in the ROI and Si is the observed signal at voxel i To generalize this method to multiple repetitions and multiple diffusion weighted directions, the signals within ROIs are pooled across repetitions and the SNR estimates for each direction are averaged. For results reported in this article, constant signal regions were manually delineated to include all signal regions in the simulations and phantoms and a section of the corpus callosum for the in vivo acquisitions.
When a repeated acquisition is available, the impact of a spatially variable signal in the SDU may be mitigated by considering the standard deviation of the difference image (22),
where di is the difference between repeated acquisition at voxel i. To generalize this method to multiple DW directions, the signals within ROIs are pooled for all DW images so that all data are simultaneously considered. For results reported in this article, the ROIs were manually drawn to include all signal regions (i.e., non-background).
By definition, there is no signal in the background, so background observations follow a Rayleigh distribution. If the NF is constant throughout the background, then it can be estimated by (7),
where B is the set of background voxels and Si is the observed signal intensity at voxel i. To maximize the fairness of comparison and to generalize this method for the repetitions and multiple DW directions present in this study, background regions are chosen to span all images and all background voxels are simultaneously considered. For results reported in this article, the background regions were manually delineated.
MR images tend to exhibit substantial intensity differences between background (no signal) and objects (high signal). Thus, the image histograms tend to have a large contribution of low intensity observations (which resemble a Rayleigh distribution) on the left part of a histogram and variable object content to the right part of a histogram. To avoid manual identification of background regions, Brummer et al. proposed to fit a Rayleigh distribution to the low intensity portion of the histogram in order to identify the NF (23),
where w is the bin width, fc is the cutoff threshold, and K is a normalizing constant. The cutoff threshold is initialized to twice the SDD NF estimate, and w is chosen to result in visually smooth histograms of signal intensities for the simulation, phantom, and in vivo studies (128 bins). All data were simultaneously considered.
When many samples are available, it is possible to connect the empirical moments with the theoretical moments to estimate the NF in a method of moments (MOM) approach. Briefly, for a Rician distribution (Eq. (1)), the second moment is E[x2] = 2σ2 + ν2 and the fourth moment is E[x4] = 8σ4 + 8σ2ν2 + ν4. To estimate the NF, we replace the expected values of each moment with the empirically observed moment. Solving for the NF, we find
where · denotes the mean and S are the observations at individual voxels. Bias was empirically evaluated and found to be less than 3% for an SNR greater than 3:1 with 15 observations. However, it is possible for the above computation to yield a complex result; when this happens, the estimate is ignored in subsequent analyses. In Monte Carlo simulations, the prevalence of complex NF estimates decreased with SNR and occurred with a frequency of 0.001% for an SNR of 3:1 with 15 observations. With fewer than 15 i.i.d. observations or when SNR is lower than 3:1, both the failure frequency and the bias increase, which makes this estimator largely unusable in practice. However, in some of the experiments conducted in this article, we use the MOM estimator with 15 observations as a voxel-wise “ground truth”—i.e., a stand-in for the true NF by which to estimate and compare the relative errors of the other (more practical) estimators.
Consider a pair of DTI acquisitions, including any repeated observations for the reference and each of N DW directions. As SNR increases, Rician distributions become well approximated by Gaussian distributions. In particular, Wood and Johnson (24) suggest that Rician data are approximately Gaussian at a minimum SNR of 5:1. We now turn our attention to regions of high signal intensity (and therefore high SNR). We form difference images between each repeated image pair (for a total of N+1 difference images). The difference of two i.i.d. Gaussian RVs has a zero-mean Gaussian distribution with twice the variance of the original distributions. Since the distribution of the difference of Gaussian RVs does not depend on the original mean, the distribution of the resulting set of differences depends only on the NF (for locations with sufficient SNR). These properties are not generally true for Rician distributed RVs.
In regions with sufficient SNR, a Gaussian approximation is appropriate and each difference image can be treated as an independent Gaussian observation. If all images are observed with the same number of averages, then the set of differences images will be distributed with the same standard deviation. Otherwise, if different numbers of averages are used for the references and DW images, then the distributions of the differences must be matched by scaling the differences by the square root of the number of k-space averages in each image pair. Given this framework, the NF can be estimated with a scale estimator applied to the set of difference images. The sample standard deviation is one such scale estimator. Here, we use a robust scale estimator, Qn (25), to estimate the NF () at each spatial location,
where ξi is the number of k-space averages used to construct the ith DW image, di is the difference between the ith pair of images. For notational convenience, i=0 is defined as the reference image. Qn is an unbiased scale estimator for Gaussian distributed data and is insensitive to the presence of outliers (50% breakdown point). Qn is the normalized first quartile of all pairwise distances between two data points,
where k is a normalization constant to make Qn unbiased when X is normally distributed. Note that the parenthetic subscript indicates the Nth smallest element of the set. See Rousseeuw and Croux for a complete theoretical exposition and implementation details (25).
The Gaussian approximation is not appropriate at low SNR, so observations that are likely to have been observed with low SNR are excluded from use in Eq.(8). For the purposes of this study, a minimum estimated SNR of 4:1 is used because fewer than 3.5% of observations from a RV with an SNR of 2:1 would be included. The bias in mean intensity for a RV with SNR of 2:1 is approximately 13%. To obtain the initial NF estimates, a heuristic, intensity based threshold is initially used and refined with a second pass. An initial value of 1/10th maximum was chosen for these DTI experiments because typical SNR is between 20:1 and 30:1, so a threshold of 1/10th maximum would typically fall within 2:1 and 3:1. For both iterations, the SNR threshold is based on a spatially varying NF estimate.
Introducing a threshold necessarily introduces bias because the threshold cannot distinguish between random variations caused by a low mean signal intensity and those arising from noise associated with a higher mean signal intensity. Our threshold is heuristically selected to achieve a balance between excluding non-Gaussian distributed data and avoiding bias. Additional iterations could be used to ensure convergence, but these were not typically necessary for the DTI data studied here and were thus avoided to reduce computation time. The impact for NF estimation at very low SNR is severe—a true SNR of 5:1 would be misestimated by 32.2% on average. However, this bias rapidly decreases at increasing SNR. At an SNR of 8:1, the bias is less than 1.4%.
Estimates of , Eq.(8), may exhibit variations that are inconsistent with spatial variability from coil sensitivity profiles. For example, regions surroundings vessels are prone to pulsation artifacts, which may manifest as abnormally bright regions. Such variability is not well described by a Rician distribution and is more appropriately addressed using an artifact model. NF tends to smoothly vary in accordance with coil sensitivity (21). To extrapolate the NF to low SNR voxels and to improve robustness against artifacts, we regularize the initial NF from high SNR voxels using a coil sensitivity model based on Chebyshev polynomial regression with a third degree two-dimensional polynomial. Although other regularizers are possible, Chebyshev polynomials were selected both for their numeric stability properties and for their ability to represent two-dimensional smooth functions, as in their previous use in estimating coil sensitivity profiles (26,27). The regularized NF ( is estimated for all voxels by,
where Pxy is the matrix of Chebyshev polynomials evaluated at the spatial location of each voxel and is the pseudo-inverse of the matrix of Chebyshev polynomials evaluated at the spatial location of voxels with an estimated SNR greater than an appropriate threshold, t. In Eq. (10), NF estimate images are represented as vectors indexed by spatial location. In this manuscript,t is chosen to be 4:1 to ensure sufficient applicability of the Gaussian approximation. The Chebyshev polynomial coefficients are recursively defined by,
The spatial coordinates are scaled from mm units to ±1 to match the domain over which the Chebyshev polynomials form a complete basis. The resulting matrix, Pxy, can be rather large for a typical slice (e.g., 60,000×9). However, this does not pose numerical problems as long as there are at least nine linearly independent rows and the pseudo-inversion process solves for only 9 independent coefficients. Computation of 60,000×9 a pseudo-inverse takes less than 100 ms in Matlab (Mathworks Inc., Natick, Massachusetts) on a 1.6 GHz notebook computer. The resulting fields were denoted Spatially Regularized Robust (SRR) estimates.
In summary, the complete algorithm is:
The proposed NF estimation method was evaluated in simulations, phantom experiments, and in vivo acquisitions. Simulation studies evaluated the proposed method relative to commonly used methods. Table 1 summarizes the assumptions of each method. Phantom studies assessed the correlation between detected changes in SNR associated with the number of acquired k-space averages. An in vivo study investigated the reliability of NF estimation in a clinical setting. Analyses were performed using custom Matlab software and the LIBRA robust statistics package (http://wis.kuleuven.be/stat/robust.html). When comparing SRR against estimators that produce a single NF estimate, the median NF over the high SNR regions was reported.
The performance of each NF estimator (i.e., SRR, SDU, SDD, BE, and BH) was evaluated in two simulated situations. First, idealized performance was explored with constant, i.i.d. Rician noise. A brain/background phantom was simulated with an FOV of 64×64 pixels. A background region had zero signal intensity while a brain region had unit reference intensity and consisted of a single prolate tensor (λ1=1 mm2/s, λ2=λ3=0.1 mm2/s). Diffusion simulations were constructed using 30 directions described by Jones et al. (28) with a b-value of 1000 s/mm2. The performance of all the algorithms was evaluated on 100 Monte Carlo iterations at 36 SNR's linearly spanning 5:1 to 40:1 in terms of the average mean squared error between the estimated NF and the true NF for all 3,600 iterations. Second, a more realistic situation was carried out by introducing 1) spatially varying NF through a linear gradient image resulting in 50% reduced SNR at one corner and 50% increased SNR at an opposite corner, 2) background signal suppression by a factor of two, and 3) an artifact model where 5% of observations were randomly scaled by either a factor of five or one fifth. Again, the 100 iterations were performance at 36 SNR levels and the performance of each method was evaluated in terms of average mean squared error.
When the assumptions underlying all methods were met as in the first simulations (Figure 1A), the background estimator generated more accurate NF estimates with less variability than the other methods. In the second more realistic scenario (Figure 1B), SRR resulted in dramatically lower error (by a factor of 100 to 1000 in terms of mean squared error). Reliability of SRR was nearly constant at SNR's from 10:1 to 40:1 and slightly degraded near 5:1. This low SNR behavior is expected because the Gaussian approximation for a Rician distribution becomes more tenuous at low SNR.
A physiological phantom (pH 7.1, T1/T2=1500/80 ms) was imaged using a 3T MR scanner (Intera, Philips Medical Systems, The Netherlands) with body coil excitation and an eight channel phased array SENSE head-coil for reception. A series of five repeated datasets was acquired with 1, 2, 4, 8, and 16 k-space averages, respectively. Apart from the number of averages, the protocol for all datasets was identical. A multi-slice, single-shot EPI (SENSE=2.5), spin echo sequence (TR/TE=2000/69 ms) was used to obtain 15 transverse with no slice gap and 2 mm nominal isotropic resolution (FOV=240×240 mm, matrix=96×96, reconstruction=256×256). Diffusion weighting was applied along 15 distinct directions with a b-factor of 700 s/mm2. One minimally weighted image (S0) was acquired as part of each DTI dataset. The total scan time to acquire one k-space average of one DTI dataset (15 DW images and one S0 image) was 29s. Images were motion corrected with a 2D rigid body model to compensate for eddy current effects and physical motion using CATNAP (Coregistration, Adjustment, and Tensor Solving – a Nicely Automated Program) (29). Estimated NF with each method was correlated with the number of k-space averages.
NF estimates in the phantom were better explained by the number of signal averages with the proposed method (SRR) than the other methods (Figure 2). The NF estimates from SDD and SRR were in close agreement, with the SDD resulting in slightly higher (median +1.3%) and more variable (mean standard deviation +10%). To evaluate the goodness of fit, the empirical NF estimates were individually fit to , where L is the NF from each method, L0 is a reference NF (a free parameter), and N is the number of averages. The R2 for this model was 0.99, −0.8, 0.95, −0.93, and −0.85 for SRR, SDU, SDD, BE, and BH, respectively.
The accuracy and precision of NF estimation was evaluated in vivo using two sessions of repeated DTI scans from a single subject. The data were acquired from the Biomedical Informatics Research Network (BIRN) repository (http://www.nbirn.net). Each dataset consists of 15 DTI scans of a healthy 24 year old male volunteer, acquired using a 1.5T MR unit (Intera, Philips Medical Systems, Best, The Netherlands) with body coil excitation and a six channel phased array SENSE head-coil for reception. For each DTI scan, a multi-slice, single-shot EPI (SENSE=2.0), spin echo sequence (90° flip angle, TR/TE=3632/100 ms) was used to acquire 25 transverse slices parallel to the line connecting the anterior and posterior commissures, with no slice gap and 2.5 mm nominal isotropic resolution (FOV=240×240, matrix=96×96, reconstruction=256×256). A slice at the level of the corpus callosum was selected for comparative processing. Diffusion weighting was applied along 30 directions described by Jones et al. (28) with a b-value of 1000 s/mm2. Five S0 images were acquired and averaged in k-space on the scanner as part of each DTI dataset. The SNR on the resulting S0 images was approximately 23:1 within the corpus callosum (white matter). The total scan time to acquire one DTI dataset was 2 min 18 s. DTI data were corrected for subject motion with CATNAP (29).
For each session, “ground truth” NF estimates were established by evaluating the MOM estimator for each set of 15 observations for each DW image individually and averaging the estimated NF across all 30 DW images. Reliability of the other approaches was assessed by evaluating each estimator independently on all 105 combinations of two datasets (i.e., 15 choose 2). Accuracy of the methods was compared to the mean of the MOM estimate within the brain.
The NF estimate SSR detects low spatial frequency variability, as would be expected from coil sensitivity profiles, along with high frequency variations that are more readily attributed to noise (Eq. (8), (Figure 3A). Spatial regularization preserves the low frequency structure in the SRR NF estimates while mitigating high frequency variations (Eq. (10), Figure 3B).
The SRR estimates are in close agreement with the “ground truth” MOM estimate (Figure 4). The coefficient of variation (C.V., standard deviation of the estimate divided by the mean) of the SRR NF is consistent (C.V.<5%) across the brain. The mean absolute difference between the SRR estimated NF and MOM is within 5% over the majority of the brain. The median of the SRR was more accurate than the other scalar NF estimators (Table 2).
The advantages of estimating and exploiting a spatially varying NF have not been widely studied, in part due to the additional scans that previous approaches required to obtain these fields (15,16). The proposed method estimates NF for common protocols that involve acquiring repeated acquisitions without requiring either additional acquisitions or preservation of internal scanner data (e.g., reconstruction parameters and field maps) that may be unavailable with a standard clinical sequence. However, based on the existing literature it is not clear that such estimates are important or useful for routine analysis tasks. In this section, we demonstrate that accounting for spatially varying NF is relevant and important by comparing three common analysis tasks.
Rather than comparing specific estimation methods, as was done in the experiments section, here we investigate how data would be interpreted differently when using a spatially variable understanding of the NF as opposed to using a classical, constant NF when the true NF is spatially varying. Thus, the NFs considered in this section are simulated NFs, and no estimator was used to obtain the NF values. First, we examine a typical power analysis calculation for a change in FA. Second, we explore the impact of spatially variable noise on outlier detection. Third, we investigate the impact of NF on likelihood based tensor fitting.
To perform realistic tensor simulations, all available data (450 diffusion weighted and 15 minimally weighted volumes) from one session of the in vivo study were simultaneously used to construct a tensor field, minimally weighted signal intensities and a realistic NF. The tensor field was determined with traditional log-linear fitting, and the minimally weighted signal intensities were simply the average intensity observed along each diffusion weighting direction. The realistic NF was constructed with MOM (Eq. (7)). For the remainder of this section, these three fields were considered to be the ground truth. Simulated acquisitions were based on Eq. (2) by adding Gaussian noise in quadrature.
When designing a new study, one must establish an appropriate sample size to ensure sufficient probability of success while minimizing resource utilization and the burden on volunteers. This is typically accomplished through an analysis of power (30). The power of a study is the probability that the experiment will detect an expected effect (true detection rate (TDR)) with an acceptable probability of falsely detecting an effect when no effect exists (false detection rate (FDR)).
In this example, we determine the minimum sample size so that the study will have a 95% chance of detecting a change (TDR) in FA of 0.05 with a FDR of 0.01. Three ROIs were considered (Figure 5). First, we performed the analysis by assuming that there was a constant NF in each of three ROIs where the constant NF was the mean of the true, simulated NF within each ROI (30). Next, we used the true NF on a voxel-wise basis to calculate the required sample size.
Table 3 presents the results of both analyses: Three ROIs were considered (first column). If a constant NF is assumed in a power analysis, the median NF over the whole brain is used for all ROIs (second column). Alternatively, one could use the voxel-wise NF to perform a power analysis (third column). The truth model leads to a mean FA in each region (forth column). Based on the intensity NF (either “Constant NF” or “Voxel-wise NF”), the variability of FA was simulated (fifth and sixth columns, respectively). The ROIs exhibit substantial differences in FA variability, which is due both to the local NF and to the underlying tensors. Finally, standard power analyses are performed for each calculated FA variability that results in a minimum sample size to detect the specified effect if assuming a constant “median” NF (seventh column) or using the empirical voxel-wise NF (eighth column).
These results demonstrate that use of a constant noise field for power analyses can result in misleading predictions both in terms of artificially small minimum sample sizes (e.g., as shown for FA in the corpus callosum and internal capsule) or larger minimum sample sizes (e.g., as shown for FA in the frontal white matter). Neglecting spatial variability of noise can lead to either falsely low or false high impressions of experimental power.
Artifacts corrupt MR data with signal intensities that are not well modeled by Rician distributions. As artifact influences may come from a variety of sources, it is difficult to create representative models of their impact on intensity and, in turn, to predict the TDR versus FDR tradeoffs for outlier exclusion. To reject outliers one typically accepts a tolerable FDR and seeks to maximize the TDR. The rejection threshold is determined from the tolerable FDR based on a model of the signal and the noise. When the NF specified is incorrectly low, too much data is excluded and the error rate suffers. Conversely, when the NF is specified incorrectly high, artifacts are not excluded and the error rate also suffers. Spatially variable FDR causes uneven outlier rejection and could complicate subsequent reliability analyses.
To model this phenomenon, 5000 voxels were randomly selected in the high signal regions in the ground truth simulated dataset described above. First, true FDR was calculated based on the simulated spatially varying NF. Then, an outlier rejection threshold was chosen to give an FDR based on the assumption of a constant NF across the brain, where the constant value was the mean simulated NF over the brain regions. The spatially variable NF does not substantially impact the expected value of the resulting FDR given an intended FDR; however, it does increase the spatial variability of the FDR (Figure 6). A very modest FDR of 0.05 (excluding 5% of good data, indicated by the gray arrow), the expected difference between intended and actual FDR is 78% (with 95% range: 0.0056-0.186). If one used a single NF for outlier rejection, in some regions almost 20% of the non-outlier voxels would be excluded while in other regions less than 0.6% of the non-outlier voxels would be excluded. The differences between intended and actual FDR are especially pronounced in the low FDR regime, which is exactly where one would be expecting to perform outlier rejection.
An accurate understanding of MR noise properties is becoming more important as sophisticated tensor estimation techniques are developed. The traditional, log-linear tensor estimation method assumes i.i.d. Gaussian noise and fits a tensor model using linear regression; the NF is not relevant to this estimation process (31). More sophisticated methods introduce weighting terms to compensate for some of the differences between Gaussian and Rician distributions (3). A recently presented method, Diffusion Tensor Estimation by Maximizing Rician Likelihood (DTEMRL) (6), employs a robust maximum likelihood model based on the joint distribution of all observations.
To demonstrate the importance of compensating for spatially varying noise, we compare tensor estimation using DTEMRL and two assumptions: 1) the noise is correctly assigned to its spatially varying value (obtained from the simulated NF) and 2) the noise is set to the mean value of the simulated NF over brain regions. In DTEMRL, the diffusion tensor ) is chosen to maximize the joint likelihood given the set of observations under an augmented tensor model. In a noise-free setting, the diffusion tensor relates a baseline intensity Sref to the intensity observed along each diffusion weighting direction. The likelihood function can be constructed based on a set of Rician distributed observations (Eq. (1)) with the true signal ν determined by the tensor model (Eq. (2)) and noise fixed for each voxel,
The NF at each voxel σ) is critical for determining the shape of the likelihood function, and thus has a substantial impact on the tensor estimation process.
Ten thousand tensors were randomly drawn from the ground truth model over the brain region. For each tensor, a simulated acquisition was generated using the 30 direction diffusion weighting scheme at a b-value of 1000 s/mm2 with Rician noise corresponding to the ground truth level at the same location as the simulated tensor added in quadrature. For each simulated data set (10,000 tensors), DTEMRL was used to estimate a tensor first using the simulated ground truth NF and second with the median of the ground truth NF.
The estimates with the median NF were significantly less accurate than those estimated with the true NF (Figure 7). The mean squared error on fractional anisotropy was 8.7% higher (5.42±7.36 versus 4.98±6.58, p<0.001), and the mean angular difference for white matter (FA>0.25) (e.g., “cone of uncertainty” (29)) was 1.7% larger (16.0°±16.6° versus 15.7°±16.3°, p<0.001) with the median NF relative to the true NF.
We propose an NF estimator that is stable, accurate, and does not depend on the existence of a background region. It handles spatial correlations, which are common in DTI due to up-sampling and interpolation. The method is fully automated and robustly improves NF estimation leading to improved power analysis, outlier identification and tensor estimation.
The degree of spatial inhomogeneity in NF depends strongly on the acquisition (including coil selection, SENSE factor, scanner hardware, anatomy of interest, etc.). While NF has been typically assumed to be constant across an image, both the phantom and in vivo experiments demonstrate that the NF may exhibit substantial spatial heterogeneity. In the phantom experiments, the difference between the 75th quantile and the 25th quantile was 27% of the median PRR SDC estimate in the region of the phantom bottle. In the in vivo experiments, the difference between the 75th quantile and the 25th quantile was 58% of the median MOM estimate over the head. Therefore, spatial variations in excess of 50% would appear commonplace using acquisitions similar to the standard DTI protocols used in this study.
The application case studies demonstrate that spatial variations in NF can significantly change the results of common analysis tasks. Power analyses in DTI are known to be problematic because the reliability is dependent on the underlying tensor and the relationship to the reference frame (29); Table 3 shows that reliability is further complicated by sensitivity to the magnitudes of spatially variability present in a clinical DTI scan. Adjustment for the spatial NF variability is shown to be crucial for ensuring consistent FDR in outlier rejection (Figure 6). Finally, accurate NF specification is shown to improve likelihood-based tensor estimation.
SRR produces post hoc estimates of background suppressed regions and acquisition sensitivity maps, which are quantities systematically derived within scanner software. If empirical coil sensitivity profiles (e.g., SENSE maps) from the scanner were directly exploited, then one may improve NF estimates. Furthermore, one may augment the NF estimate by using background suppression parameters (if available). While these parameters can be accessed in specialized research settings, the information is not commonly available with clinical sequences or in every day imaging research. The potential tradeoffs between improved performance through additional scans and/or preservation of scanner data and the decreased applicability to clinical sequences and legacy data should be carefully weighed. SRR can be immediately used with clinical scans without sequence modification or additional data storage.
The approach and motivation of the SRR method may be readily generalized to other DW and multi-scan studies. The primary requirement is that images of the same anatomy be acquired with sequences that have the same theoretical noise properties. Differences in contrast are not a problem because the NF is accessible from differences of images with the same contrast (Eq. (8)). For example, the proposed approach is well suited for q-space MRI (32), in which DW images are acquired at a large number of b-values for a single direction, or DSI, q-ball, and other HARDI techniques (33), in which DW images are acquired for multiple directions and/or multiple b-values. One could also apply this approach to other modalities that have consistent noise properties across scans, such as multiple flip angle T1 studies or functional MRI studies with a repeating structure. In the case of a single scan, a model may be fit to each voxel and the residuals may be used in place of the difference images. For multiple repetitions, the order of scale and location estimation may be interchanged: a scale estimator can estimate the NF at each position and a location estimator (e.g., mean, median) can combine information across images.
This work was made possible by NIH support from R01 NS056307 and a National Defense Science and Engineering Graduate Fellowship (Department of Defense, Office of Naval Research).
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.