Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC3677778

Formats

Article sections

Authors

Related links

Neuroimage. Author manuscript; available in PMC 2013 October 15.

Published in final edited form as:

Published online 2012 June 23. doi: 10.1016/j.neuroimage.2012.06.027

PMCID: PMC3677778

NIHMSID: NIHMS389388

Zhaowei Hua,^{a} David B. Dunson,^{e} John H. Gilmore,^{c} Martin A. Styner,^{b,}^{c} and Hongtu Zhu^{a,}^{d,}^{*}

The publisher's final edited version of this article is available at Neuroimage

See other articles in PMC that cite the published article.

We propose a semiparametric Bayesian local functional model (BFM) for the analysis of multiple diffusion properties (e.g., fractional anisotropy) along white matter fiber bundles with a set of covariates of interest, such as age and gender. BFM accounts for heterogeneity in the shape of the fiber bundle diffusion properties among subjects, while allowing the impact of the covariates to vary across subjects. A nonparametric Bayesian LPP2 prior facilitates global and local borrowings of information among subjects, while an infinite factor model flexibly represents low-dimensional structure. Local hypothesis testing and credible bands are developed to identify fiber segments, along which multiple diffusion properties are significantly associated with covariates of interest, while controlling for multiple comparisons. Moreover, BFM naturally group subjects into more homogeneous clusters. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFM. We apply BFM to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment in new born infants.

Diffusion tensor imaging (DTI) is a magnetic resonance imaging (MRI) technique that measures the diffusion orientation of water molecules in tissue. The mobility of water molecules is affected by the tissue properties of white matter fiber tracts, such as the density of the fibers, the average fiber diameter, and the directionality of the fibers. In turn, the information collected on the diffusion properties of water molecules provides the structural organization of the white matter fiber tracts (Kubicki et al., 2007). The water diffusion directions and magnitudes at each voxel in the brain can be described by a 3 × 3 symmetric positive matrix, called a diffusion tensor (DT) (Basser et al., 1994a, 1994b). The degree of diffusivity can be quantified by the three eigenvalue–eigenvector pairs of DT, and its related parameters, such as fractional anisotropy (FA) (Hasan and Narayana, 2003; Hasan et al., 2001; Pierpaoli and Basser, 1996; Zhu et al., 2006). A rich literature in neuroimaging has been developed to analyze white matter fiber tract maturation and integrity via a set of water diffusion parameters, such as FA and mean diffusivity (MD), used as biomarkers (Cascio et al., 2007; Moseley, 2002; Mukherjee and McKinstry, 2006; Rollins, 2007).

Three major analytic approaches for grouping analysis of DTI data have been explored including analyses based on region-of-interest (ROI), voxels, and fiber tracts (O’Donnell et al., 2009; Smith et al., 2006; Snook et al., 2007; Stieltjes et al., 2001; Xue et al., 1999). The region-of-interest (ROI) method (Bonekamp et al., 2008; Gilmore et al., 2008) suffers from difficulties in identifying meaningful ROIs, particularly the long curved structures common in fiber tracts, instability of statistical results, and the partial volume effect in relatively large ROIs (Snook et al., 2007). Voxel based analysis (Camara et al., 2007; Chen et al., 2009; Focke et al., 2008; Snook et al., 2005) suffers from issues of alignment quality and the arbitrary choice of smoothing extent (Ashburner and Friston, 2000; Jones et al., 2005; Smith et al., 2006; Van Hecke et al., 2009).

There is an extensive interest in the DTI literature in developing fiber tract based analysis of diffusion properties (Goodlett et al., 2009; Goldsmith et al., 2011; O’Donnell et al., 2009; Smith et al., 2006; Stieltjes et al., 2001; Xue et al., 1999; Yushkevich et al., 2008; Zhu et al., 2010, 2011). Early tract based analyses simply average fiber tract profiles over multiple subjects (Stieltjes et al., 2001; Xue et al., 1999). Recent tract based analyses consist of DTI atlas building and a follow-up statistical analysis (Goodlett et al., 2009; Smith et al., 2006; Zhu et al., 2010). DTI atlas building is primarily to establish DTI correspondence across all DTI datasets from different subjects and to extract a set of individual DTI tracts (or skeleton) with the same corresponding geometry but varying DT and diffusion properties. For instance, in Smith et al. (2006), they developed a *voxel based analysis* framework to construct local diffusion properties along the white matter skeleton, fitted pointwise linear regression models, and performed pointwise hypothesis tests on the skeleton. This method essentially ignores the functional nature of diffusion properties along the white matter skeleton, and thus suffers from low statistical power in detecting interesting features and exploring variability in functional data (Zhu et al., 2011). In Goodlett et al. (2009), they used functional principal component analysis (fPCA) coupled with the Hotelling *T*^{2} statistic to compare a univariate diffusion property, such as fractional anisotropy, across two (or more) populations for a single hypothesis test per tract. Zhu et al. (2010, 2011) proposed a multivariate varying coefficient model based on fPCA for the analysis of fiber bundle multiple diffusion properties and their association with a set of covariates of interest such as age. In Greven et al. (2010), they developed a functional mixed effects model as a generalization of mixed effects models and fPCA for the analysis of longitudinal DTI fiber tract data. So far, frequentist inference is the primary approach for making statistical inferences in these statistical models for fiber tract based analysis of diffusion properties.

Bayesian nonparametric models incorporate infinitely many parameters to flexibly model uncertainty in an unknown random effects distribution *P*. A sparseness favoring approach can be accomplished through the intrinsic Bayesian penalty for model complexity (Jeffreys and Berger, 1992), which is aided through centering on a base Bayesian parametric model. The Dirichlet process (DP) (Ferguson, 1973, 1974) provides a routinely used nonparametric prior, allowing *P* to have an unknown discrete form that leads to global clustering of subjects. This global clustering in which two subjects are either in the same cluster for all their random effects or in completely distinct clusters ignores local differences among subjects, motivating local partition process (LPP) priors (Dunson, 2009) that accommodate simultaneous global and local clustering of subjects.

In this paper, we propose a new semiparametric Bayes approach to model the association between multiple fiber bundle diffusion properties and covariates of interest. A multivariate random coefficient model is developed to characterize heterogeneity in the shape of the fiber bundle diffusion properties among subjects, while allowing the impact of covariates to vary across subjects. We assume a nonparametric Bayesian LPP2 prior (Dunson, 2009) on the distribution of the random coefficients to facilitate global and local borrowings of information among subjects. We consider a sparse latent factor model to more flexibly capture the within-subject correlation structure and assume a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors (Bhattacharya and Dunson, 2011). We propose Bayesian local hypothesis testing to identify fiber segments, where multiple diffusion properties are significantly associated with covariates of interest, while controlling for multiple comparisons. We propose a Bayesian credible band for the average effect of each covariate. Finally, we use the nonparametric LPP2 prior to randomly cluster subjects via global and local clustering. Posterior computation proceeds via an efficient Markov chain Monte Carlo (MCMC) algorithm using slice sampling (Kalli et al., 2011; Papaspiliopoulos, 2008).

This paper focuses on developing a semiparametric Bayesian local functional model pipeline, deemed BFM, to assess the association between fiber bundle diffusion properties and a set of covariates of interest (e.g., age). Before BFM, we use DTI atlas building followed by atlas fiber tractography and fiber parametrization as described in Goodlett et al. (2009) to extract DTI fibers and establish DTI fiber correspondence across all DTI datasets from different subjects. We skip its description here for the sake of simplicity (Goodlett et al., 2009; Zhu et al., 2010). After performing the DTI atlas building step, we obtain a set of individual fiber tracts (or skeleton) with the same corresponding geometry but varying DTs and diffusion properties. Subsequently, we run the analysis pipeline BFM including a multivariate random varying coefficient model, a semiparametric Bayesian estimation method, a Bayesian approach to construct credible bands, a Bayesian local hypothesis testing procedure, and a Bayesian posterior cluster analysis procedure (Fig. 1). The computational algorithm for BFM is developed by using Matlab.

A schematic overview of BFM: a multivariate random varying coefficient model for multiple diffusion properties, an MCMC posterior estimation method for estimating the coefficient functions, a construction of Bayesian credible bands of the mean covariate **...**

Assume *n* subjects are measured along a fiber bundle over a grid of *T* points for *M* diffusion properties (e.g. FA), denoted by {*Y*_{i}(*d _{t}*):

(1)

where * X_{i}* = is the design vector for the ith subject with

We model the random coefficient function (*d _{t}*) by using a linear combination of cubic B-spline basis functions

(2)

where **b**(*d _{t}*) = (

(3)

where ’ and are (*K* + 1)*p* × 1 vectors. Heterogeneity in the fiber bundle diffusion properties is then controlled by the variation of , which are usually treated as random effects and follow a specific parametric distribution. There is a serious concern about the sensitivity of associated inferences to the choice of the random effects distribution on the basis coefficients.

Let be an *M*(*K* + 1)*p* × 1 vector, * B_{i}* = (

(4)

where * N_{MT}*(

Given the massive dimensionality of the random effect vector * η_{i}*, it is important to favor lower dimensional representations of the dependence structure to address the curse of dimensionality. Instead of prespecifying a restrictive dependence structure, we follow the approach of using a Bayesian factor model, which relates the random effects

(5)

where * A* is a (

We obtain the within-subject correlation structure through projecting * Y_{i}* as a linear combination of the underlying

(6)

where * B_{i}*(

Let *Y _{if}* and be, respectively, the

Specifically, conditional on * A* and

where (**I**_{M}* B_{i}A*)

A constraint is usually specified on * A* to define a unique model free from identification problems, since Ω is invariant under the transformations

Although FRATS (Zhu et al., 2010) and FADTTS (Zhu et al., 2011) have been proposed to analyze the DTI fiber bundle datasets of multiple diffusion measures with a set of covariates such as gender and gestational age, they assume identical effects of the covariates on the diffusion properties for all subjects. Our BFM developed here relaxes the assumption to allow the subject specific functional covariate effects on the multiple diffusion outcome functions. In addition, FRATS and FADTTS assume a covariate-free correlation structure among the multiple diffusion properties. We introduce the random factors underlying the subject specific covariate effects to allow the correlation structure among the multiple diffusion measures to vary with the levels of the covariates along the fiber tracts. Moreover, we specify an LPP2 prior on the distributions of the subject specific random factors to produce a global and local clustering structure of the multiple diffusion trajectories across the subjects.

We choose independent priors for ** A**,

(7)

where *δ _{l}*,

In modeling the multiple trajectories of diffusion measures over subjects, we are interested in identifying the latent cluster structure of * Y_{i}*s among subjects, which can be induced by specifying a nonparametric prior on the unknown distribution of the random factors

After truncating the loadings matrix ** A** to

(8)

where *δ*_{χ} denotes a degenerate distribution with all its mass at χ, *π*_{ψ1,…,ψh*} is the probability of *θ _{i}* = Θ

The indicator *z _{j}*~Ber(

Let *π _{h}* denote Pr(

The allocation probability of * θ_{i}* = Θ

The specification is completed by choosing the hyperpriors

where *γ* controls the overall weight on the local family, and *α* controls the overall number of clusters. The LPP2 prior on the random effects distribution *P* is denoted by *P* ~ LPP2(*α*,*γ*,*P*_{0}).

The LPP2 prior specification (8) can also be viewed as a hybrid mixture model of infinitely many components drawn from *P*_{0} via 2-stage clustering. Stage 1 determines the membership of global or local clustering for each of the *h*^{*} components. Those components allocated to the global clustering membership will be clustered together to an atom in the global family at stage 2, while those allocated to the local family will be allocated individually to their own clusters. The joint cluster membership probability at stage 1 corresponds to Pr(*z*_{1},…,*z**h ^{*}*) = . The joint cluster allocation weight at stage 2 conditional on stage 1 corresponds to Pr(ψ

We are interested in making pointwise inference for the covariate effects along a fiber bundle. The local null and alternative hypotheses for the *k*th covariate effect on the *m*th diffusion property specific to a location *d _{t}*[0,

where (*d _{t}*) represents the mean of the subject-specific random coefficients for the

To conduct local hypotheses testing, we use the Bayesian decision rule for multiple testing proposed by Müller et al. (2004). Following the implementation in Wang and Dunson (2010), our strategy is to reject if the posterior alternative hypothesis probability |*Data*)≥*r* for any *t*[0,*T*], with *r* as a common threshold for all the local hypotheses. *r* is chosen to minimize the posterior expected false negative rate (FNR) under the constraint of the posterior expected false discovery rate (FDR) being no greater than *α _{T}*, where

where *t*_{1},…,*t _{W}* is a fine grid of points equally spaced along [0,

We construct a Bayesian simultaneous credible band for the mean coefficient curve , *k* = 1,…,*K*, *m* = 1,…,*M*, from its posterior MCMC samples. Assuming there is a collection of posterior sampled curves , *s* = 1,…,*S* indexing posterior iterations after burn-in, our goal is to compute a simultaneous credible band for . The principle in constructing a Bayesian credible band is to search for a region *R _{α}* = {

where *α* is a pre-specified significance level.

The strategy is based on pointwise measures of uncertainty. We used the method proposed by Crainiceanu et al. (2007) which assumes approximate posterior normality at each grid point and derives the 1–*α* sample percentile *c*_{1–α} of

where (*t _{l}*) is the posterior mean at time

In our implementation, we replaced *c*_{1–α} by *c _{b}* which is calculated by

We propose a Bayesian clustering approach to identify the number of clusters and to probabilistically assign each individual to the identified clusters. The nonparametric LPP2(*α*,*γ*,*P _{0}*) prior has the feature of randomly clustering subjects via global and local clustering. Given a truncated

The set of probability weights assignment for both global clustering and local clustering are formulated by a sticking-breaking process:

where ϕ_{ij} indexes the assigned cluster for *θ _{ij}*, with

In practical posterior computation, we develop a posterior cluster analysis procedure on the clustering of the subject specific latent random factors {* θ_{i}*:

We conducted a set of Monte Carlo simulations to evaluate the false discovery rate (FDR) and the power of the local hypothesis testing approach. We simulated the two diffusion measures FA and MD along the right internal capsule tract obtained from a clinical study (Fig. 2(a)) according to the following model:

(9)

where *d _{t}*[0,

Simulation study: power of the local hypothesis tests for the effects of gestational age on FA and MD along the right internal capsule tract, under VBA and BFM, evaluated at five different values of c and three different values of for BFM, for sample **...**

We analyzed each simulation scenario by using 100 repeated simulations. We set *n* = 32 and 64. We randomly chose 16 males and 16 females for * n = 32 (or 32 males and 32 females for n* = 64) from our clinical data and used their values of gender and gestational age to simulate the values of FA and MD along the right internal capsule tract. We assigned Ga(1,1) hyper priors for the hyperparameters

We conducted the local hypothesis testing of for each point *d _{t}* along the right internal capsule tract and the

The DTI fiber tract data come from a clinical study approved by the Institute Review Board of the University of North Carolina at Chapel Hill. This large study was designed to investigate early brain development. Healthy infants less than 1-year old were recruited with written informed consents obtained from their parents before imaging acquisition. In our study, a total of 128 healthy full-term infants (75 males and 53 females) are included, whose mean gestational age at MR scanning is 298±17.6days (range: 262 to 433days). All infants were placed with efforts such that they slept comfortably inside the MR scanner and none of them was sedated during the imaging acquisition. They were fed and calmed to sleep on a warm blanket with suitable ear protection.

The device used to acquire all images is a 3T Allegra head only MR system (Siemens Medical Inc., Erlangen, Germany), featuring 40mT/m for the maximal gradient strength and 400mT/(m.ms) for the maximal slew rate. Every subject was scanned for contiguous slices with slice thickness of 2mm to cover the whole brain. Each slice was repeated 5 times such that an average was obtained to improve the signal-to-noise ratio. These acquired DTI images are obtained via a single shot EPI DTI sequence (TR/TE=5400/73ms) with eddy current compensation. Diffusion gradients were collected at six non-colinear directions: (1,0,1), (−1,0,1), (0,1,1), (0,1,−1), (1,1,0), and (−1,1,0), at the *b*-value level of 1000s/mm^{2}. The reference scan at the *b*-value of 0 was implemented to construct diffusion tensor matrices. The voxel resolution was set at isotropic 2mm, and the in-place field of view was set at 256mm in both directions.

The diffusion tensors were constructed using a weighted least square estimation method (Basser et al., 1994b; Zhu et al., 2007). We then processed all 128 DTI datasets using the image processing steps in the DTI atlas building and computed diffusion properties along all fiber tracts of interest. We focused on analyzing two tracts of interest including the splenium of the corpus callosum tract and the right internal capsule tract (Figs. 3(a) and 8(a)). We computed fractional anistropy (FA), mean diffusivity (MD), and the three eigenvalues of the diffusion tensors, denoted by λ_{1}≥λ_{2}≥λ_{3}, at each grid point on both tracts for each of the 128 subjects. FA denotes the inhomogeneous extent of local barriers to water diffusion, and MD measures the averaged magnitude of local water diffusion. The three eigenvalues of the diffusion tensor reflect the magnitude of water diffusivity along and perpendicular to the long axis of white matter fibers (Song et al., 2003).

Right internal capsule tract description: (a) the right internal capsule tract extracted from the tensor atlas with color presenting mean FA value; diffusion properties of FA in panel (b), MD (mm^{2}/s) in panel (c), λ_{1} (mm^{2}/s) in panel (e), λ **...**

Splenium tract description: (a) the splenium capsule tract extracted from the tensor atlas with color presenting mean FA value; diffusion properties of FA in panel (b), MD (mm^{2}/s) in panel (c), λ (mm^{2} 1 /s) in panel (e), λ (mm^{2} 2 /s) in **...**

We focused on addressing four goals in the analysis of the multiple diffusion properties along the right internal capsule tract and the splenium tract. (i) Our first goal was to determine if there is a difference in the multiple diffusion properties along the fiber bundles between male and female infants. (ii) Our second goal was to investigate the development of the diffusion properties along the fiber bundles. (iii) Our third goal was to accomplish a parsimonious representation of the multiple diffusion trajectories across subjects. (iv) Our fourth goal was to examine covariate-dependent correlation patterns among the multiple diffusion properties along the fiber bundles.

We applied BFM (9) to model the smoothed FA and MD functions (M = 2) along the right internal capsule and the splenium capsule with **χ**_{i}=(1,*G _{i}*,

Results of the gender effects on FA and MD along the right internal capsule tract: the estimated coefficient curves (–) and the corresponding 95% credible bands (- -) for FA in panel (a) and MD in panel (b); (c) the posterior probability curves **...**

We assigned Ga(1, 1) hyper priors for the hyperparameters *α* and *γ*, a Ga(0.1, 0.1) prior for the precision parameters , and Ga(2, 1) priors on *a*_{1} and *a*_{2}. We specified *ν* = 3 and a default choice of 5*log*((*K*+ 1 )*pM*) as the initialized number of factors. We recommend these settings as default values in other analyses of DTI fiber tract data. We summarized the 1000 posterior samples from the MCMC output after the burn-in of 4000 iterations. Additionally, to perform the local hypothesis testing, we collected 80,000 posterior samples after the burn-in of 20,000 samples. Using multiple chains with widely-distributed points, the proposed MCMC algorithm exhibits good rates of convergence and mixing and takes 519s per 100 iterations in Matlab 2010b on a Lenovo X61 laptop. We repeated the same MCMC algorithm for different hyper-parameter values. For instance, we multiplied the mean and variance of *α* and *γ* by 2 and 0.5, and multiplied the variance of *P*_{0} by 2 and 0.5. We used *ν* = 3.5,4, and 5 and varied *b _{σ}* between 0.1 and 0.5. We also used different starting numbers of factors between 3 and 10. The posterior results were found to be robust to the varying hyper-parameter values.

In the null case, FDR equals one based on its definition. Table 1 indicates that FDR is accurate for the sample size of 32 and 64. Consistent with our expectation, the statistical power for rejecting the local null hypotheses increases with the sample size and decreases with the zero neighborhood size ε. Inspecting Table 1 and Fig. 2 reveals that BFM is much more powerful than VBA. BFM has an attractive feature in explicitly modeling the structured inter-subject variability. In contrast, GLM used in VBA ignores the correlation among the tract data at different grid points, and thus the larger covariance estimates obtained from GLM decreases its power in signal detection.

Fig. 4(c) shows that the degrees of both FA and MD vary significantly between male and female groups along sub-regions of the right internal capsule (RIC) tract. Inspecting the posterior mean effect curves of gender on FA and MD and their 95% credible bands Figs. 4(a) and (b) reveal that in a large subregion of the RIC tract, FAs in female infants are smaller than those in male infants, whereas female infants have larger MD than male infants in half of the RIC tract, but such differences diminish in the other half of the RIC tract. The difference in MD between male and female groups was also observed in the sample average MD values (Fig. 4(d)). Such region-dependent differences between males and females were not observed by using FRATS and FADTTS. BFM has an advantage in modeling structured subject specific effects of gender, while FRATS and FADTTS assume identical gender effects for all subjects. The covariance estimates of FRATS and FADTTS are larger than the posterior variability of the posterior gender effect estimates of BFM. Therefore, FRATS and FADTTS are less powerful than BFM in detecting the gender effects which vary on tract regions.

Similar to FRATS and FADTTS, Fig. 5(b) reveals uniformly significant gestational age effects on FA and MD values along the RIC tract. We also observed that FA increases with gestational age (Figs. 5(a) and (c)), while MD decreases with gestational age (Fig. 5(d)). This finding is consistent with the existing findings in other infant groups (McGraw et al., 2002).

Results of the gestational age effects on FA and MD along the right internal capsule tract: (a) 3D surf plot of FA along the right internal capsule tract for visualizing the gestational age effects; (b) the posterior probability curves (–) in **...**

Therefore, BFM provides a more advanced analytic approach to study the spatiotemporal trend in anisotropic development in new born infants. *Latent clusters for FA and MD trajectories*. We examined latent clusters in the FA and MD measures along the RIC tract across the 128 infants. The estimated value of the hyperparameter *γ*, controlling allocation weight on the local family, equals to 0.51 with 95% credible interval of [0.12, 1.20]. It implies that the posterior proportion of local clustering is 1/3, indicating that the data favors global clustering of the FA and MD trajectories across infants. The estimated value of the hyper-parameter *α* = 5.34, with 95% credible interval of [2.59, 9.65]. These values suggest few clusters. This conclusion is further supported by the estimated values of 4.45, 3.94, and 4.42 for the numbers of overall, global, and local clusters, respectively (Fig. 3(d)). This result agrees with the FA and MD trajectories in Figs. 5(b) and (c). For instance, although most MD trajectories cluster together, some ‘outlying’ MD trajectories exist with extremely smaller/larger MD values.

Along the RIC tract, we further computed the posterior mean curves of FA and MD (Fig. 6(a)) for male infants at the mean gestational age and estimated their posterior covariance matrices (Figs. 6(b) and (c)). We observed the pattern of oscillating between positive and negative correlations with negative correlations at around 60% of grid points (Fig. 6(d)). It agrees with the gender effects on FA and MD trajectories in Fig. 4(d). We also computed the posterior covariance matrices for the gestational age effects on FA and MD values (Figs. 6(f) and (g)) and derived the grid pointwise correlation between the gestational age effects on FA and MD. Fig. 6(h) reveals negative correlations at most grid points. It agrees with the ‘opposite’ development patterns of FA and MD trajectories with gestational age (Figs. 5(a), (c), and(d)).

In the analysis of the three eigenvalues, a similar locally significant pattern of gender effect was observed for all three eigenvalues (Fig. 7(b)). Gestational age effect was found uniformly significant for λ_{2} and λ_{3}, while locally head-and-tail significant for λ_{1} (Fig. 7(c)). We observed positive correlations between λ_{2} and λ_{3} for both male and female groups at the mean gestational age (Figs. 7(d) and (e)), and positive correlation between the gestational age effects on λ_{2} and λ_{3} (Fig. 7(f)). Overall, these correlations between λ_{2} and λ_{3} are larger than those between λ_{1} and λ_{3} or λ_{1} and λ_{2}. It agrees with the small differences between λ_{2} and λ_{3} and the relatively large differences between λ_{1} and λ_{3} or λ_{1} and λ_{2} in the middle of the RIC tract (Figs. 3(e), (f) and (g)). Finally, the cluster analysis of the three eigenvalues along the RIC tract across all infants reveals small numbers of overall, global, and local clusters (Fig. 3(h)).

Fig. 9(c) shows that gender is only slightly significant for MD development in the head subregion of the splenium tract. The posterior estimates for the mean effects of gender and the corresponding 95% credible bands agree with the testing results. Figs. 9(d)–(e) display the uniformly increasing effects of gestational age on FA development and decreasing effects on MD development at a major subregion of the splenium tract. The local hypothesis testing results (Fig. 9(f)) support that gestational age is uniformly significant along the splenium tract for the developmental change of FA, but locally significant for MD.

Fig. 8(d) shows the posterior sampling distribution of the number of clusters among the 128 infants. The posterior mean estimates of the numbers of overall, global, and local clusters are 3.42, 3.07, and 3.40, respectively, indicating that the 128 infants concentrate on a few clusters. The weight parameter *γ* for allocation on the local family was estimated to be 0.38 with 95% credible interval of [0.08, 0.98]. The estimated posterior mean of the proportion of local clustering is 0.28, which favors a global clustering of all infants. The estimated posterior mean of α equals to 6.46, with 95% credible interval of [2.94, 11.80]. This result agrees with the sparse clusters identified in the data (Fig. 8(b), (c), (e), (f), and (g)). For instance, variation in FA and MD trajectories is relatively high with some ‘outlying’ trajectories.

Figs. 10(a)–(c) present the posterior estimation of the mean curves of FA and MD for male infants at the mean gestational age and the posterior estimates of their covariance matrices. Inspecting Fig. 10(d) reveals that male infants at the mean gestational age have positive correlations between FA and MD at a major subregion of the splenium tract. Furthermore, Figs. 10(e)–(g) present the estimated curves associated with gestational age effects and their corresponding covariance matrices. We observed that the gestational age effects on FA and MD are negatively correlated almost along the whole splenium tract (Fig. 10(h)), which is consistent with the opposite trend of gestational age on FA and MD values along the splenium tract (Fig. 10(e)).

We also observed that both gender and gestational age are locally significantly associated with the developmental patterns of the three eigenvalues along the splenium tract (Figs. 11(b) and (c)). We further observed large correlation between λ_{2} and λ_{3} and relatively weak correlation between λ_{1} and λ_{2} or λ_{1} and λ_{3} for both male and female groups at the mean gestational age. This agrees with the small difference between λ_{2} and λ_{3}, but relatively large difference between λ_{1} and λ_{3} in the first half subregion of the splenium tract (Figs. 8(e), (f) and (g)). We did not observe noticeable difference among the pairwise correlations of the gestational age effects on the three eigenvalues (Fig. 11(f)).

A prior study is particularly relevant to the present study, because it examined the anisotropic trend across three age groups in infants (McGraw et al., 2002). Our results on the association between anisotropic values and gestational age agree with their findings with a few key differences. We identified the association of anisotropic values with gestational age changing over the brain regions. The region-dependent association is evaluated on the basis of integrative multivariate diffusion properties rather than univariate anisotropic outcome in McGraw et al. (2002). The region-dependent association reveals the complex inhomogeneous spatio-temporal maturation patterns of fiber bundle diffusion properties in infant nerve development. Hence, BFM presents a novel robust analytic approach to capture complex spatio-temporal pattern of diffusion properties in infant nerve structure.

From the statistical perspective, we have developed a new Bayesian functional analysis pipeline BFM for delineating the structure of the variability of multiple diffusion properties along major white matter fiber bundles and their association with a set of covariates of interest. The BFM pipeline integrates five advanced Bayesian statistical tools from the statistical literature. Compared with the existing literature (Zhu et al., 2010, 2011), our BFM explicitly models subject-specific functional covariate effects on the multiple diffusion outcome functions and the covariate-specific correlation structure among the multiple diffusion measures. From the application perspective, we have demonstrated BFM in a clinical study of neurodevelopment. Our results have revealed the complex inhomogeneous spatio-temporal maturation patterns of fiber bundle diffusion properties. Our BFM not only boosts the statistical power in detecting signal of interest, but also improves prediction accuracy, which is very important for real applications.

Several limitations need to be addressed in future research. First, since the existing fiber tract based methods including BFM require DTI fiber correspondence across subjects, BFM cannot be used to investigate some scenarios in practice. For instance, it is possible that the centroid of the localization of white matter lesion may vary across time and subjects. In this case, one cannot use both ROI-based methods and tract based methods. Secondly, although we have used the B-spline basis functions in BFM, it is interesting to consider other basis functions, such as wavelets. Thirdly, since diffusion properties may be non-normally distributed, it is interesting to consider other distributions for the measurement error. Fourthly, BFM is limited to diffusion properties along fiber bundle, but it is interesting to consider full DTs and other representations based on high angular resolution diffusion image (HARDI) (Lenglet et al., 2009; Lepore et al., 2008; Schwartzman; Schwartzman et al., 2005; Tuch et al., 2002; Whitcher et al., 2007; Zhu et al., 2009). Fifthly, BFM can be readily extended to more complex brain structures, such as the medial manifolds of fiber tracts (Yushkevich et al., 2008), functional neuroimaging data (DuBois Bowman et al., 2008; Gössl et al., 2001; Woolrich et al., 2004; Xu et al., 2009), and group analysis of neuroimaging data (Penny et al., 2007; Rosa et al., 2010).

We develop an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior computation. After truncating the loadings matrix ** A** to

Allowing data to inform information about the hyperparameters involved in the model through Eqs. (1)–(7), we specify *α*~*Ga*(*a _{α}*,

where * Y_{i}* is a outcome vector for the M measures of the ith subject,

Starting from the initiation step, the Gibbs sampler at the truncated level *h*^{*} proceeds as follows:

- Update the
*h*th column of the factor loadings matrix**A**_{h*}, denoted by, from its conditional distribution**A**_{h}where with , - Update ϕ
_{gh}from its conditional distribution - Update
*δ*_{1}from its conditional distributionand update*δ*_{1},*l*≥2 from its conditional distributionwhere - Update ,
*f*= 1,…,*MT*, from its conditional distributionwhere (**I**_{M})**B**_{i}_{f}denotes the*f*th row of (**I**_{M}).**B**_{i} - Update
*u*from its conditional distribution_{ih} - Update the latent
*z*from its conditional distribution_{ih}where*θ*_{i(zih = j)}refers to the current value of*θ*with inserting_{i}*θ*_{0ϕi0h}to the*h*th component for*j*= 1, and*θ*_{0ϕi0h}for*j*= 0. - Update the stick-breaking weights from its conditional distributionfor
*q*≤ψ^{*}with ψ^{*}=*max*{ψ_{ih}:*i*= 1, …,*n**h*= 0.1,…,*h*^{*}], for*q*≤ψ^{*}, sample from ~ Beta(1,*α*) - Update the allocation index ψ
, from its conditional distribution_{ih}where*C*(_{π}*u*) = {_{ih}*q*:*π*>_{q}*u*}{1,2,…,_{ih}*∞*}, and is sampled for*q*= 1, …, such thatwhere*u*^{*}=*min*{*u*:_{ih}*i*=1,…*n*,*h*= 0,1,…,*h****}. - Update
*θ*from its conditional distribution_{jq}in whichrefers to the contribution for**Y**_{jq}from subjects with**θ**_{jq}*z*= 1 –_{ih}*j*and ϕ=_{ih}*q*. In addition,=**W**_{jq}**diag**(,…,**w**_{1})**w**_{n}and*Z*Φ*R*= diag(**Y**_{jq}*Y*_{1},…,)**Y**_{n}**Z**_{Y}**Φ**_{Y}**1**, where_{n}=**w**_{i}**I**_{M},**B**_{i}A= Blkdiag(*Z**Z*_{1},…,) with**Z**_{n}= diag(1(**Z**_{i}*z*_{i1}= 1–*j*),…,1(*z*= 1–_{ih*}*j*)),**Φ**= Blkdiag(**Φ**_{1},…,**Φ**) with_{n}**Φ**= diag(1(ψ_{i}_{i1}=*q*),…,1(ψ* =_{ih}*q*)),=**R****1**_{n}*,**I**_{h}= Blkdiag(**Z**_{Y}*Z*_{Y1},…,*Z*) with ,_{Yn}**Φ**_{Y}= Blkdiag(**Φ**_{Y1},…,**Φ**) with ,_{Yn}- and
**1**_{n}is a*n*× 1 vector of entries of 1.

- Update
*ζ*from its conditional distribution Ga(0.5+h*/2,0.5+_{jq}*θ*/2)._{jq}θ_{jq} - Update the hyperparameter
*γ*from its conditional distribution Ga(). - Update the hyperparameter
*α*from its conditional distribution Ga().

To choose the effective number of factors, we use an adaptation approach (Bhattacharya and Dunson, 2011) to update the truncated number of factors *h*^{*} across iterations. Starting with a conservative guess *h*^{*} of *h*^{*}, we adapt with probability *p*(*t*)=exp(*α*_{0}+*α*_{1}*t*) at the tth iteration. We specify *α*_{0} and *α*_{1} such that the chain comes with adaptation around every 10 iterations at the beginning and exponentially fast decreasing in adaptation frequency. In detail, at the tth iteration, we monitor the columns in the factor loadings by comparing *u _{t}* with

^{}This work was partially supported by NIH grants R01ES17240, MH091645, U54 EB005149, P30 HD03110, RR025747-01, P01CA142538-01, MH086633, AG033387, MH064065, HD053000, and MH070890.

- Ashburner J, Friston KJ. Voxel-based morphometry: the methods. Neuroimage. 2000;11:805–821. [PubMed]
- Basser PJ, Mattiello J, LeBihan D. Estimation of the effective self-diffusion tensor from the NMR spin echo. J. Magn. Reson. 1994a;103:247–254. [PubMed]
- Basser PJ, Mattiello J, LeBihan D. MR diffusion tensor spectroscopy and imaging. Biophys. J. 1994b;66:259–267. [PubMed]
- Bhattacharya A, Dunson DB. Sparse Bayesian infinite factor models. Biometrika. 2011;98(2):291–306. [PMC free article] [PubMed]
- Bonekamp D, Nagae LM, Degaonkar M, Matson M, Abdalla WM, Barker PB, Mori S, Horská A. Diffusion tensor imaging in children and adolescents: reproducibility, hemispheric, and age-related differences. Neuroimage. 2008;34:733–742. [PMC free article] [PubMed]
- Camara E, Bodammer N, Rodriguez-Fornells A, Tempelmann C. Age-related water diffusion changes in human brain: a voxel-based approach. Neuroimage. 2007;34:1588–1599. [PubMed]
- Carvalho CM, Chang J, Lucas JE, Nevins JR, Wang Q, West M. High-dimensional sparse factor modeling: applications in gene expression genomics. J. Am. Stat. Assoc. 2008;103(484):1438–1456. [PMC free article] [PubMed]
- Cascio CJ, Gerig G, Piven J. Diffusion tensor imaging: application to the study of the developing brain. J. Am. Acad. Child Adolesc. Psychiatry. 2007;46:213–223. [PubMed]
- Chen YS, An HY, Zhu HT, Stone T, Smith JK, Hall C, Bullitt E, Shen DG, Lin WL. White matter abnormalities revealed by diffusion tensor imaging in non-demented and demented HIV+ patients. Neuroimage. 2009;47:1154–1162. [PubMed]
- Crainiceanu CM, Ruppert D, Carroll RJ, Joshi A, Goodner B. Spatially adaptive Bayesian penalized splines with heteroscedastic errors. J. Comput. Graph. Stat. 2007;16(2):265–288.
- DuBois Bowman F, Caffo B, Bassett SS, Kilts C. A Bayesian hierarchical framework for spatial modeling of fMRI data. Neuroimage. 2008;39(1):146–156. [PMC free article] [PubMed]
- Dunson DB. Nonparametric Bayes local partition models for random effects. Biometrika. 2009;96:249–262. [PMC free article] [PubMed]
- Ferguson TS. A Bayesian analysis of some nonparametric problems. Ann. Stat. 1973;1(2):209–230.
- Ferguson TS. Prior distributions on spaces of probability measures. Ann. Stat. 1974;2(4):615–629.
- Focke NK, Yogarajah M, Bonelli SB, Bartlett PA, Symms MR, Duncan JS. Voxel-based diffusion tensor imaging in patients with mesial temporal lobe epilepsy and hippocampal sclerosis. Neuroimage. 2008;40:728–737. [PubMed]
- Geweke J, Zhou G. Measuring the pricing error of the arbitrage pricing theory. Rev. Financ. Stud. 1996;9(2):557–587.
- Gilmore JH, Smith LC, Wolfe HM, Hertzberg BS, Smith JK, Chescheir NC, Evans DD, Kang C, Hamer RM, Lin W, Gerig G. Prenatal mild ventriculomegaly predicts abnormal development of the neonatal brain. Biol. Psychiatry. 2008;64:1069–1076. [PMC free article] [PubMed]
- Goldsmith A, Feder J, Crainiceanu C, Caffo B, Reich D. Penalized functional regression. J. Comput. Graph. Stat. 2011;20:830–851. [PMC free article] [PubMed]
- Goodlett CB, Fletcher PT, Gilmore JH, Gerig G. Group analysis of DTI fiber tract statistics with application to neurodevelopment. Neuroimage. 2009;45:S133–S142. [PMC free article] [PubMed]
- Gössl C, Auer DP, Fahrmeir L. Bayesian spatiotemporal inference in functional magnetic resonance imaging. Biometrics. 2001;57(2):554–562. [PubMed]
- Greven S, Crainiceanu C, Caffo B, Reich D. Longitudinal functional principal component analysis. Electron. J. Stat. 2010;4:1022–1054. [PMC free article] [PubMed]
- Hasan KM, Basser PJ, Parker DL, Alexander AL. Analytical computation of the eigenvalues and eigenvectors in DT-MRI. J. Magn. Reson. 2001;152:41–47. [PubMed]
- Hasan KM, Narayana PA. Computation of the fractional anisotropy and mean diffusivity maps without tensor decoding and diagonalization: theoretical analysis and validation. Magn. Reson. Med. 2003;50:589–598. [PubMed]
- Jeffreys W, Berger J. Ockham’s razor and Bayesian analysis. Am. Stat. 1992;80:64–72.
- Jones DK, Symms MR, Cercignani M, Howard RJ. The effect of filter size on VBM analyses of DT-MRI data. Neuroimage. 2005;26:546–554. [PubMed]
- Kalli M, Griffin J, Walker S. Slice sampling mixture models. Stat. Comput. 2011;21:93–105.
- Kubicki M, McCarley R, Westin CF, Park HJ, Maier S, Kikinis R, Jolesz FA, Shenton ME. A review of diffusion tensor imaging studies in schizophrenia. J. Psychiatr. Res. 2007;41(1–2):15–30. [PMC free article] [PubMed]
- Lenglet C, Campbell JSW, Descoteaux M, Haro G, Savadjiev P, Wassermann D, Anwander A, Deriche R, Pike GB, Sapiro G, Siddiqi K, Thompson PM. Mathematical methods for diffusion MRI processing. Neuroimage. 2009;45:S111–S122. [PMC free article] [PubMed]
- Lepore N, Brun CA, Chou Y, Chiang M, Dutton RA, Hayashi KM, Luders E, Lopez OL, Aizenstein HJ, Toga AW, Becker JT, Thompson PM. Generalized tensor-based morphometry of HIV/AIDS using multivariate statistics on deformation tensors. IEEE Trans. Med. Imaging. 2008;27:129–141. [PMC free article] [PubMed]
- McGraw P, Liang L, Provenzale J. Evaluation of normal age-related changes in anisotropy during infancy and childhood as shown by diffusion tensor imaging. Am. J. Roentgenol. 2002;179:1515–1522. [PubMed]
- Moseley M. Diffusion tensor imaging and aging—a review. NMR Biomed. 2002;15:553–560. [PubMed]
- Mukherjee P, McKinstry RC. Diffusion tensor imaging and tractography of human brain development. Neuroimaging Clin. N. Am. 2006;16:19–43. [PubMed]
- Müller P, Parmigiani G, Robert C, Rousseau J. Optimal sample size for multiple testing: the case of gene expression microarrays. J. Am. Stat. Assoc. 2004;99:990–1001.
- O’Donnell L, Westin CF, Golby A. Tract-based morphometry for white matter group analysis. Neuroimage. 2009;45:832–844. [PMC free article] [PubMed]
- Papaspiliopoulos O. A note on posterior sampling from Dirichlet mixture models. Tech. rep. Universitat Pompeu Fabra. 2008
- Papaspiliopoulos O, Roberts G. Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika. 2008;95:169–186.
- Penny W, Kilner J, Blankenburg F. Robust Bayesian general linear models. Neuroimage. 2007;36:661–671. [PubMed]
- Pierpaoli C, Basser PJ. Toward a quantitative assessment of diffusion anisotropy. Magn. Reson. Med. 1996;36:893–906. [PubMed]
- Rollins NK. Clinical applications of diffusion tensor imaging and tractography in children. Pediatr. Radiol. 2007;37:769–780. [PubMed]
- Rosa MJ, Bestmann S, Harrison L, Penny W. Bayesian model selection maps for group studies. Neuroimage. 2010;49(1):217–224. [PMC free article] [PubMed]
- Schwartzman A. Ph.D. thesis. Stanford University; Random ellipsoids and false discovery rates: statistics for diffusion tensor imaging data.
- Schwartzman A, Dougherty RF, Taylor JE. Cross-subject comparison of principal diffusion direction maps. Magn. Reson. Med. 2005;53:1423–1431. [PubMed]
- Smith SM, Jenkinson M, Johansen-Berg H, Rueckert D, Nichols TE, Mackay CE, Watkins KE, Ciccarelli O, Cader M, Matthews P, Behrens TE. Tractbased spatial statistics: voxelwise analysis of multi-subject diffusion data. Neuroimage. 2006;31:1487–1505. [PubMed]
- Snook L, Paulson LA, Roy D, Phillips L, Beaulieu C. Diffusion tensor imaging of neurodevelopment in children and young adults. Neuroimage. 2005;26:1164–1173. [PubMed]
- Snook L, Plewes C, Beaulieu C. Voxel based versus region of interest analysis in diffusion tensor imaging of neurodevelopment. Neuroimage. 2007;34:243–252. [PubMed]
- Song SK, Sun SW, Ju WK, Lin SJ, Cross AH, Neufeld AH. Diffusion tensor imaging detects and differentiates axon and myelin degeneration in mouse optic nerve after retinal ischemia. Neuroimage. 2003;20:1714–1722. [PubMed]
- Stieltjes B, Kaufmann WE, van Zijl PC, Fredericksen K, Pearlson GD, Solaiyappan M, Mori S. Diffusion tensor imaging and axonal tracking in the human brainstem. Neuroimage. 2001;14(3):723–735. [PubMed]
- Tuch DS, Reese TG, Wiegell MR, Makris N, Belliveau JW, Wedeen VJ. High angular resolution diffusion imaging reveals intravoxel white matter fiber heterogeneity. Magn. Reson. Med. 2002;48:577–582. [PubMed]
- Van Hecke W, Sijbers J, De Backer S, Poot D, Parizel PM, Leemans A. On the construction of a ground truth framework for evaluating voxel-based diffusion tensor MRI analysis methods. Neuroimage. 2009;46:692–707. [PubMed]
- Walker SG. Sampling the Dirichlet mixture model with slices. Commun. Stat.— Simul. Comput. 2007;36:45–54.
- Wang L, Dunson D. Semiparametric Bayes multiple testing: applications to tumor data. Biometrics. 2010;66:493–501. [PubMed]
- Whitcher B, Wisco JJ, Hadjikhani N, Tuch DS. Statistical group comparison of diffusion tensors via multivariate hypothesis testing. Magn. Reson. Med. 2007;57:1065–1074. [PMC free article] [PubMed]
- Woolrich M, Jenkinson M, Brady J, Smith S. Fully Bayesian spatiotemporal modeling of fMRI data. IEEE Trans. Med. Imaging. 2004;23(2):213–231. [PubMed]
- Xu L, Johnson TD, Nichols TE, Nee DE. Modeling inter-subject variability in fMRI activation location: a Bayesian hierarchical spatial model. Biometrics. 2009;65(4):1041–1051. [PMC free article] [PubMed]
- Xue R, van Zijl PC, Crain BJ, Solaiyappan M, Mori S. In vivo three-dimensional reconstruction of rat brain axonal projections by diffusion tensor imaging. Magn. Reson. Med. 1999;42:1123–1127. [PubMed]
- Yushkevich PA, Zhang H, Simon T, Gee JC. Structure-specific statistical mapping of white matter tracts. Neuroimage. 2008;41:448–461. [PMC free article] [PubMed]
- Zhu HT, Xu D, Amir R, Hao X, Zhang H, Alayar K, Ravi B, Peterson B. A statistical framework for the classification of tensor morphologies in diffusion tensor images. Magn. Reson. Imaging. 2006;24:569–582. [PMC free article] [PubMed]
- Zhu HT, Zhang HP, Ibrahim JG, Peterson BG. Statistical analysis of diffusion tensors in diffusion-weighted magnetic resonance image data (with discussion) J. Am. Stat. Assoc. 2007;102:1085–1102.
- Zhu HT, Cheng YS, Ibrahim JG, Li YM, Hall C, Lin WL. Intrinsic regression models for positive definitive matrices with applications in diffusion tensor images. J. Am. Stat. Assoc. 2009;104:1203–1212. [PMC free article] [PubMed]
- Zhu HT, Styner M, Tang NS, Liu ZX, Lin WL, Gilmore J. FRATS: functional regression analysis of DTI tract statistics. IEEE Trans. Med. Imaging. 2010;29:1039–1049. [PMC free article] [PubMed]
- Zhu H, Kong L, Li R, Styner M, Gerig G, Lin W, Gilmore JH. FADTTS: functional analysis of diffusion tensor tract statistics. Neuroimage. 2011;56:1412–1425. [PMC free article] [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's Canada Institute for Scientific and Technical Information in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |