Q-Ball Imaging Using Spherical Harmonic Reconstruction
According to the q
-space formalism, the wavevector that describes diffusion encoding in a pulsed-gradient spin-echo experiment is defined as q
represents the gyromagnetic ratio, δ
is the diffusion gradient duration, and g
is the diffusion gradient vector. HARDI acquisition schemes measure samples from an underlying diffusion-attenuated signal E
) at a finite set of points on the sphere that are relatively uniform in their angular distribution. The choice of the spherical sampling radius in q
, depends on the desired angular resolution, the available SNR, and the gradient performance specifications. In practice, b
-values of 3000 s/mm2
or greater are typically required in order to distinguish among different intravoxel fiber populations at high angular resolution (Hess et al., 2006
To simplify the numerical solution, both q
are discretized to reflect finite sampling of m
measurements and reconstruction over a fixed number of points n.
Because both the ODF and the diffusion signal are defined on the domain of the sphere, it is convenient to normalize spherical points to unit magnitude and adopt a spherical coordinate system q
) and u
), where θ
denote elevation and azimuth, respectively.
Linear HARDI reconstruction algorithms such as QBI and spherical deconvolution have in common that each point of the reconstruction is computed as a linear combination of the diffusion measurements. Enumerating points on the sphere to construct a vector representation, this relationship can be expressed as
denote n ×
1 and m
× 1 column vectors composed of the estimated values of the ODF and the diffusion measurements, respectively. Here ZQ
is the matrix of spherical harmonics evaluated at measurement points Q, and ZU
is the matrix of spherical harmonics evaluated at reconstruction points U. Diagonal matrix P
] contains the Legendre polynomials of order l
= 0, 2,…, L.
Details of notation and formulation are contained in Hess et al., (2006)
. The conventional way of obtaining an estimate of f
is the pseudoinverse, which happens to be the maximum likelihood estimator in the presence of Gaussian additive noise:
Depending on the reconstruction algorithm employed, the n
reconstruction matrix A
is constructed using the spherical sampling geometry and the assumed relationship between the diffusion space and the ODF. The ODF was obtained by Hess et al., (2006)
in two ways – first by direct pseudoinverse of A
as above, and also by adding a standard Tikhonov regularization term. Tikhonov regularization (Press et al., 1992
) is performed to reduce the noise sensitivity of the transform. The harmonic q
-ball reconstruction in both cases is a linear transform of the diffusion measurements. Because the spherical harmonics form a complete spherical basis, the representation is capable of describing any bounded, finite-energy ODF given a sufficiently large order L
A New Bayesian ODF Reconstruction Algorithm Exploiting Spatial Priors
We now use a Bayesian estimation framework to introduce spatial smoothness constraints during the reconstruction. Our motivation is that voxels in diffusion images are expected to be coherent, both along spatial dimensions and in diffusion orientation space. Except at the boundaries between tracts, the density, orientation and anisotropy of fiber populations are expected to change smoothly from one voxel to the next, particularly along the dominant fiber orientation. For instance, the circled fibers in can be seen to belong to the same bundle, and hence among the voxels that contain fibers from this tract, there is a strong spatial coherence. These constraints can be exploited to improve the numerical conditioning of the reconstruction, and thereby decrease the sensitivity to noise. Current techniques do not make use of these constraints because they perform the ODF reconstruction independently for each voxel.
Figure 1 Spatial and angular coherence in HARDI. Except at the boundaries between separate tracts, fiber orientation in neighboring voxels varies smoothly, and as a result adjacent voxels are likely to have similar ODFs. By imposing spatial constraints among individual (more ...)
To incorporate smoothness constraints and to jointly estimate the ODFs of each voxel within the entire volume, we cast the problem of estimating the parameters of all HARDI ODFs together as one of minimization of a cost function that incorporates the probabilistic spatial constraints.
, ZU, P
etc are defined as before, and let us capture, in vector η
all the spherical harmonic coefficients of the entire spatial field, and in vector f
, the corresponding ODF field. For voxels enumerated as 1, …, p
, …, N
, this defines a collection of individual voxel-wise coefficients and ODFs:
The task is then to estimate the value of η
that simultaneously (1) fits both the diffusion measurements within each voxel (data consistency), and (2) minimizes the smoothness cost function that encodes inter-voxel spatial constraints. Since the estimation is jointly over all voxels, it is convenient mathematically to concatenate matrices ZQ
diagonally, yielding matrices that operate on all the voxels, as follows:
Assuming uncorrelated Gaussian additive noise, the likelihood function is given by
The constant α
is inversely related to the power of additive noise and is assumed to be an unknown quantity to be estimated empirically. For spatial constraints, we employ the following probability distribution model:
where subscript R is shorthand for vector norm with respect to some matrix
. Scalars β1
are unknown constants, called hyperparameters of the prior distribution (2
), which are empirically determined as described in Methods section, matrix D
is a first difference operator, and W
) is a matrix that represents a neighborhood weighting function. Here we have omitted the partition function for tractability. We will allow the weight matrix to depend on the configuration of ODFs, hence it is shown as parameterized by η
The motivation behind this prior model is discussed below. We note that recently so-called robust priors involving the L1
norm have been proposed in related fields (Raj et. al, 2007
) but we eschew that approach for reasons of computational complexity.
In Bayesian methods, the task is to find the unknown configuration η, given observations e and knowledge of likelihood and prior probability distributions. Here we use the maximum a posteriori (MAP) estimator, which selects the η that maximizes the posterior probability
which follows from Bayes’ Theorem. Combining the above equation with the negative log of Eqns (1)
, the MAP estimate can be obtained by minimization of the cost function:
are unknown constants, which can be interpreted as regularization factors whose heuristic determination is described in Methods section. From the MAP estimate of the spherical harmonic coefficients η
, the ODF profile is computed voxelwise according to
This Bayesian formulation of ODF reconstruction has similarities with regularized reconstruction methods proposed earlier, especially Tikhonov regularization. The first term in Eq. (4)
can be understood to enforce data consistency and exactly reproduces the least squares fitting approach. The last term enforces Tikhonov regularization (when R=I
) by adding a stabilizing term to the least squares fitting and favors solutions which have a low L2
norm. As mentioned above, there exists an explicit formula for computing this estimate using the Moore-Penrose pseudoinverse. Although we show the Tikhonov regularization term, our work is independent of which regularization scheme is used – whether Tikhonov, Laplace-Beltrami or otherwise. For brevity these other terms are not shown, but we assume that they can be captured by an appropriate norm induced by matrix R.
The middle term is new, and introduces the spatial smoothness cost discussed above. When W(η) = I,
the solution that results is maximally smooth under the data consistency constraint, because other potential non-smooth solutions will have a large first derivative norm and cause the smoothness cost to increase.
While global smoothness is frequently used in multi-dimensional imaging applications, its utility is limited in situations where global smoothness may not apply. For instance, the ODF field in typical MR data displays smoothness along the orientations of individual fiber bundles but may not display smoothness perpendicular to the fiber, or around voxels with crossing fibers. Global smoothness in these situations may lead to an indiscriminate blurring of the ODFs. To prevent this from occurring, we use a spatial weight matrix W(η). This matrix determines the strength of the constraint to place on the similarity of the spherical harmonics between neighboring voxels. Because the weight between two neighboring voxels should generally depend on their ODFs and hence on the joint configuration η, we denote an explicit dependence on η. Computationally, the nonlinearity that the spatial weighting matrix introduces into the function renders the optimization sensitive to starting estimates and prone getting stuck in local minima. Ideally the estimation of η should reflect the non-linear presence of W; however the resulting estimation problem would then become intractable and difficult to minimize.
Here we propose a simple iterative scheme whereby both W(η) and η are updated alternately. The process is summarized below:
- Begin with η = η0. Then for k = 1 to K, repeat:
- Finally, reconstruct ODF: = UK̄
refers to the 3D neighborhood system corresponding to the 26 nearest neighbors of every voxel. This iterative update algorithm alternately estimates the spatial weight matrix based on the current estimate of the ODF field, and then applies this weight matrix to solve for the Bayesian estimator. The initial estimate is based on the unconstrained ODF estimate obtained for each voxel.
The ODF field estimate is iteratively refined by minimizing the cost function in Eq. 5
, using a least squares algorithm called LSQR (Paige, et. al., 1982
), which is based on conjugate gradient descent. Because the cost function passed to LSQR is a quadratic function, it can be easily calculated. We employed the LSQR implementation in MATLAB version 7.8.0. LSQR was chosen for this purpose for several reasons. First, in each iteration, step 3 above is a linear least squares problem, which is preferable to solving an unconstrained general optimization problem. Second, LSQR does not involve computing the normal equations (Press et al., 1992
) which are known to have much worse conditioning than LSQR, which is based on conjugate gradients (Paige, et. al., 1982
). The MATLAB implementation of LSQR is particularly attractive because it does not require explicit computation or storage of
, which is a very large matrix. Instead, a user-defined function can be passed which simply returns the product
, which is easily computed in 0
) time, where N
is the number of voxels in the brain volume, and P1
is the number of diffusion directions and P2
is the number of ODF directions. In order to speed up processing, we update the weight matrix at each iteration, which means that the outer loop involving weight updates can be dispensed with. Therefore the computational complexity of the algorithm is Q
), where KLSQR
is the number of LSQR iterations, which in practice is between 10 and 20 till full convergence. Note that the cost is linear in N
, just like traditional ODF methods.
The iterative approach above has a well-tested pedigree in both alternate projections onto convex sets (Gerchberg et al., 1972
) and in Expectation Maximization (EM) theory (Zhang et al., 2001
). Several theoretical and practical results from these fields point to guaranteed convergence of these methods. The form of W
) above was chosen based on computational and tractability considerations. Intuitively, this iterative scheme weighs neighboring voxels in inverse proportion to the similarity of their coefficients, as shown in . This avoids including voxels with vastly different fiber profiles in the difference operator, and limits the smoothness constraint to voxels with similar fiber populations. This prevents blurring that would otherwise result from enforcing the global smoothness constraint.
Schematic of properties of the spatial weight function. When neighboring ODFs are similar in shape, they should have a