Diffusion tensor tractography offers a unique perspective of white matter anatomy, but proper delineation of white matter tracts of interest generally requires the active involvement of an expert neuroanatomist. Here we describe the implementation of an automated tractography method requiring no user input, and we compare its results to user-driven tractography.
Fourteen healthy volunteers underwent diffusion tensor imaging at 3T. Images were registered to a standard template and predefined seed regions containing tract termini were transformed into subject space for use in unsupervised probabilistic tractography. The output was compared to the results of user-driven tractography performed on the same subjects.
After selection of suitable smoothing kernels and thresholds, the results of automated tractography closely approximated those of user-driven tractography. The main bodies of the cingulum, inferior fronto-occipital fasciculus, and inferior longitudinal fasciculus were depicted equally well by both methods. Discrepancies mainly arose at the periphery of these tracts, where anatomic uncertainty tends to be greatest.
Automated tractography can be used to depict white matter anatomy without need for user intervention, particularly if the main body of the tract is of greatest interest.
Diffusion tensor tractography is a recently developed imaging modality that can demonstrate the anatomy of white matter tracts in vivo.1 It is based on the observation that the diffusion of water in the brain occurs preferentially in directions parallel to axon bundles, which can be measured with an appropriate set of directionally-encoded diffusion-weighted images.2 Because axons extend over multiple voxels, white matter tracts can be parceled as regions of directionally coherent diffusion tensors.
Tractography has found growing use in several clinical settings. However, several factors have constrained the use of tractography in routine practice. Among these is the operator-dependence of its output. Typically, tractography results are produced through an interactive process that relies on an experienced user to identify a “seed”, a likely location of each white matter tract. Tractography fibers are then inspected by the user and rejected if they do not conform to known white matter tract anatomy. Obtaining accurate and reproducible depictions of white matter can be time-consuming, and user-introduced bias may complicate a population-based analysis.3–5 Semi-automated methods have been proposed to reduce operator-dependence by using information in the neighborhood of a minimal seed region to improve seed reproducibility.6, 7
More recently, several methods have been developed to automate tractography completely. Some of these classify tractography fibers by clustering them according to their morphologic similarity to each other or to a reference standard, encapsulating the process by which a user decides whether to include a fiber in the tract of interest.8, 9 Other automated tractography methods are based on the principle that most white matter structures in the central nervous system serve to connect two regions of gray matter. Therefore, white matter fibers can be identified by their termini rather than by an explicit comparison of fiber morphology to known neuroanatomical structures.10–15 The latter methods frequently make use of probabilistic tractography, which estimates the likelihood that a fiber passes through each voxel, instead of deterministic tractography, which estimates a single fiber trajectory from each seed voxel but does not address how likely that trajectory is.11, 14–17 Probabilistic tractography is often chosen for this approach because, unlike deterministic tractography, it can be seeded in gray matter.
At first glance, the outputs of deterministic and probabilistic tractography do not appear to be directly comparable. However, in practice their end results frequently take the same form: volumes of interest used for metrics such as total volume or mean fractional anisotropy (FA). The geometric details of streamlines produced by deterministic tractography are often discarded (for example, in recent elegant work by Whitford et al.),18 leaving only a depiction of the volume that was crossed by streamlines. Likewise, analysis of probabilistic connectivity maps often begins by applying a probability threshold to produce a binary map, which similarly results in a volume of interest.16 Thus, despite the divergent theoretical underpinnings of each method, the two methods are often intentionally forced to converge when quantification becomes necessary. This raises the possibility that quantitative automated probabilistic tractography could be an acceptable substitute for quantitative user-driven deterministic tractography. Yet although automated probabilistic tractography is increasingly used in investigations of white matter, to our knowledge its output has never been compared to the output of user-driven tractography.
Here, we describe an automated tractography method that uses atlas-selected regions of white matter termini to seed unsupervised probabilistic tractography.11, 17 We compare the results of this method to the results of user-driven tractography, which relies on an experienced user to recognize white matter tracts. Our hypothesis is that depiction of white matter tracts with automated tractography is consistent with the results of user-driven tractography.
Written informed consent was obtained from 14 healthy volunteers (9 males, 5 females, mean age 31.7 +/− 7.9 years), after which spin-echo echo-planar DTI was acquired at 3T (Siemens Trio, TR = 6500 ms, TE = 99 ms, FOV = 22 cm, matrix = 128×128, b = 800 s/mm², slice thickness = 3 mm, 40 slices) using 12 noncollinear motion probing gradients, which we have previously found to be adequate for user-driven and automated tractography. Images were also acquired using b = 0 s/mm². Further processing was performed offline.
Tractography was performed by an experienced neuroradiologist using the FACT algorithm for deterministic tractography as implemented by Volume One.19, 20 This algorithm was chosen because it is commonly employed in clinical software packages, and for the purposes of this study three tracts were chosen as examples of the typical workflow of a tractographer: the cingulum, inferior fronto-occipital fasciculus (IFOF), and inferior longitudinal fasciculus (ILF). Tractography was performed only in the left cerebral hemisphere, based on previous work that did not demonstrate substantial left-right asymmetry in these white matter tracts.11
After eddy current correction, the IFOF and ILF were isolated using two regions of interest (ROIs). For the IFOF, seeds were placed in the expected location of the tract in the inferior frontal white matter and the occipital white matter. For the ILF, seeds were placed manually in the expected location of the tract in the anterior temporal stem and the occipital white matter. For the cingulum, tractography seeds were placed manually in the expected midpoint of the tract. A second ROI was not used as it did not improve the results (data not shown). Tract reconstructions were terminated in voxels with FA < 0.18. All tractography output was visually inspected to ensure anatomic validity.21 When necessary, spurious fibers were eliminated using additional ROIs.22 The fibers were then converted into binary masks for further analysis.
After brain extraction and eddy current correction, seeds representing tract termini were prepared for the left cingulum, left IFOF, and the left ILF using a method described in Figure 1.23 Without user interaction, Hierarchical Attribute Matching Mechanism for Elastic Registration (HAMMER) was used to register the b=0 images to a high-resolution single-subject T1-weighted anatomic atlas (Montreal Neurological Institute) and a coregistered set of 116 standardized anatomic labels.24, 25 The deformation fields were then reversed and used to import preselected labels into subject space for use as anterior and posterior terminal seeds for each white matter tract. Specifically, anterior and posterior cingulate regions were used to seed the cingulum, occipital and inferior frontal regions were used to seed the IFOF, and occipital and anterior temporal regions were used to seed the ILF. It should be noted that these regions all contained both gray matter and local subcortical white matter.
Automated tractography was performed using a freely available probabilistic tractography program (FDT, FMRIB Centre, University of Oxford).17 For each subject, 5000 sample fibers were estimated at every seed voxel. Sample fibers were retained if they passed through at least one anterior and one posterior seed voxel. In the resulting automated tractography maps, the value of each voxel was the peak-normalized count of the number of sample fibers passing through it. This value reflects the likelihood of the tract existing within the voxel as well as the fraction of the entire seed volume that is connected to the voxel. To compare the automated tractography maps with the user-driven tractography masks, the maps were smoothed and then converted into binary masks by applying a threshold. The same smoothing kernel and threshold value, determined as follows, were applied automatically to all individuals.
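The retention and normalization rule described above can be illustrated with a minimal sketch. It is not the FDT implementation: the function name `connectivity_map`, the representation of fibers as lists of voxel indices, and the choice to count each voxel at most once per sample fiber are all assumptions made for illustration.

```python
from collections import Counter


def connectivity_map(sample_fibers, anterior_seed, posterior_seed):
    """Build a peak-normalized visitation map from sample fibers.

    sample_fibers: iterable of fibers, each a sequence of (x, y, z) voxel indices.
    anterior_seed, posterior_seed: sets of (x, y, z) voxel indices.
    Returns a dict mapping each visited voxel to a value in (0, 1].
    """
    counts = Counter()
    for fiber in sample_fibers:
        voxels = set(fiber)  # assumption: a voxel counts once per fiber
        # Retain the fiber only if it touches both terminal seed regions.
        if voxels & anterior_seed and voxels & posterior_seed:
            counts.update(voxels)
    if not counts:
        return {}
    peak = max(counts.values())
    return {voxel: n / peak for voxel, n in counts.items()}


# Toy data (hypothetical): three sample fibers, one of which misses the posterior seed.
anterior = {(0, 0, 0)}
posterior = {(3, 0, 0)}
fibers = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)],  # connects both seeds
    [(0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 0, 0)],  # connects both seeds
    [(0, 0, 0), (1, 1, 0), (2, 2, 0)],             # rejected: no posterior seed voxel
]
tract_map = connectivity_map(fibers, anterior, posterior)
```

In this toy case the voxels shared by both retained fibers take the peak value of 1, whereas voxels visited by only one of them take a value of 0.5, and voxels visited only by the rejected fiber do not appear in the map.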
Four randomly-selected members of the study population contributed all of their tractography images to form an “optimization group” that was used to refine the automated tractography results. Automated tractography maps first underwent spatial filtering using Gaussian smoothing kernels of 3, 5 or 10 mm FWHM. Unsmoothed automated tractography maps were also evaluated. Next, automated tractography maps were converted into binary masks by applying thresholds ranging from 0 to 0.9. Then, optimized postprocessing criteria were chosen to maximize the consistency between each automated tractography mask and its corresponding user-driven tractography mask, as quantified by the Jaccard similarity coefficient.
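The parameter search described above amounts to an exhaustive grid search, which might be sketched as follows. The sketch assumes that the smoothed maps have already been computed and are supplied as dictionaries keyed by kernel width; the names `best_parameters` and `binarize`, the use of a strict "greater than" comparison at the threshold, and the averaging of the Jaccard coefficient across cases are illustrative assumptions rather than details taken from our pipeline.

```python
def jaccard(a, b):
    """Jaccard similarity of two voxel sets (defined as 1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def binarize(prob_map, threshold):
    """Return the set of voxels whose value exceeds the threshold (assumed strict)."""
    return {voxel for voxel, value in prob_map.items() if value > threshold}


def best_parameters(cases, thresholds):
    """Grid search over smoothing kernels and thresholds.

    cases: list of (smoothed_maps, manual_mask) pairs, where smoothed_maps maps a
           kernel FWHM in mm (0 = unsmoothed) to a {voxel: value} dict and
           manual_mask is the user-driven mask as a set of voxels.
    Returns (mean_jaccard, fwhm, threshold) for the best-scoring combination.
    """
    best = None
    for fwhm in cases[0][0]:
        for threshold in thresholds:
            scores = [jaccard(binarize(maps[fwhm], threshold), manual)
                      for maps, manual in cases]
            mean_score = sum(scores) / len(scores)
            if best is None or mean_score > best[0]:
                best = (mean_score, fwhm, threshold)
    return best


# Toy data (hypothetical): one tract, two kernels, three candidate thresholds.
maps = {
    0: {(0, 0, 0): 1.0, (1, 0, 0): 0.6, (2, 0, 0): 0.05},
    3: {(0, 0, 0): 0.8, (1, 0, 0): 0.7, (2, 0, 0): 0.3},
}
manual = {(0, 0, 0), (1, 0, 0)}
best = best_parameters([(maps, manual)], thresholds=[0.0, 0.1, 0.5])
```

If the smoothing step were added, the standard deviation of a Gaussian kernel follows from its FWHM as FWHM / (2 √(2 ln 2)). Ties are resolved here in favor of the first combination encountered, which is an arbitrary choice.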
The Jaccard similarity coefficient is the number of voxels shared by two masks divided by the number of voxels in their union. It can also be expressed as TP/(TP + FN + FP), where TP denotes true positives, FN denotes false negatives, and FP denotes false positives. The Jaccard coefficient was used rather than sensitivity and specificity for two reasons. First, it accounts for both false negatives and false positives, unlike sensitivity (TP/(TP + FN)). Second, it ignores true negatives (which are generally over-represented in applications like tractography), unlike specificity (TN/(TN + FP), where TN denotes true negatives).23
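The contrast among these three metrics can be made concrete with a short sketch. The function name `overlap_metrics` and the toy masks are hypothetical; the total voxel count is taken from the 128×128 matrix and 40 slices of the acquisition, and both masks are assumed to be non-empty.

```python
def overlap_metrics(auto_mask, manual_mask, n_voxels):
    """Compare an automated mask against a user-driven reference mask.

    auto_mask, manual_mask: non-empty sets of voxel identifiers.
    n_voxels: total number of voxels in the image volume.
    """
    tp = len(auto_mask & manual_mask)   # in both masks
    fp = len(auto_mask - manual_mask)   # automated only
    fn = len(manual_mask - auto_mask)   # user-driven only
    tn = n_voxels - tp - fp - fn        # in neither mask
    return {
        "jaccard": tp / (tp + fn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy masks (hypothetical): two 1000-voxel tracts that overlap by half.
manual = set(range(0, 1000))
auto = set(range(500, 1500))
metrics = overlap_metrics(auto, manual, n_voxels=128 * 128 * 40)
```

In this toy case the Jaccard coefficient is one third and sensitivity is one half, yet specificity exceeds 0.999 because nearly every voxel in the volume is a true negative. This is the reason specificity is uninformative for sparse structures such as white matter tracts.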
The ten subjects who did not belong to the optimization group (7 males, 3 females, mean age 32.8 +/− 7.7 years) contributed all of their images to form a “test group” that was used to assess the reliability of the automated tractography. Using the postprocessing criteria determined during the optimization step, binary masks were produced for automated tractography. Finally, the degree of overlap with binary masks produced by user-driven tractography was determined.
All study protocols were approved by the local institutional review board.
Qualitatively, the results of automated tractography were consistent with the results of user-driven tractography, as shown in Figure 2. The complex architecture of each white matter tract was well depicted by both methods. Areas of the brain that had a high density of fibers on user-driven tractography typically corresponded to areas demonstrating higher probability values on automated tractography. Automated and user-driven tractography were both able to successfully distinguish white matter tracts even in regions of potential overlap. For example, the IFOF and ILF were successfully isolated by both methods despite substantial overlap in the posterior temporal and occipital white matter.
As expected, we found user-driven tractography to be a labor-intensive process. Initial placement of seed ROIs sometimes did not produce the expected results, and the output therefore had to be inspected and adjusted to remove spurious fibers.26 Fibers extending into the contralateral hemisphere, for example, were individually identified and excluded. In contrast, when probabilistic tractography results contradicted anatomical boundaries, they were excluded automatically during the thresholding step because they were of very low likelihood. Thus, no post-hoc adjustments were applied to any of the automated tractography maps.
Automated tractography maps (containing likelihood estimates in each voxel) were converted to masks (depicting a tract present or not present in each voxel) through a variety of smoothing kernels and thresholds and compared to user-driven tractography results. Twelve pairs of results were examined, comprising the cingulum, IFOF, and ILF from each of four subjects. The greatest consistency between automated tractography and user-driven tractography was achieved with a 3 mm smoothing kernel and a threshold of 0.02, as shown in Figure 3. With these parameters, the Jaccard similarity coefficient was 0.32. As the size of the smoothing kernel was increased, the threshold that maximized similarity to the user-driven results also increased.
Thirty automated tractography maps (the cingulum, IFOF, and ILF from each of ten subjects) were used for a quantitative evaluation. None of the maps in this group had been used in the preceding optimization step. Final automated tractography masks were produced for all thirty maps according to the parameters determined by the optimization step. The mean total volume of the cingulum was 6.2 +/− 1.6 mL, the mean volume of the IFOF was 15.4 +/− 3.0 mL, and the mean volume of the ILF was 15.6 +/− 6.0 mL. The mean FA of the cingulum was 0.328 +/− 0.021, the mean FA of the IFOF was 0.402 +/− 0.021, and the mean FA of the ILF was 0.381 +/− 0.027.
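Whole-tract volume and mean FA follow directly from a binary mask, as in the sketch below. The function name `tract_summary` and the toy values are hypothetical. The voxel volume is derived from the acquisition parameters (22 cm FOV, 128×128 matrix, 3 mm slices) under the assumption that masks are evaluated in native space without resampling.

```python
def tract_summary(mask, fa_map, voxel_volume_mm3):
    """Return (volume in mL, mean FA) for a non-empty binary tract mask.

    mask: set of voxel indices belonging to the tract.
    fa_map: dict mapping voxel index -> fractional anisotropy.
    voxel_volume_mm3: volume of one voxel in cubic millimetres.
    """
    volume_ml = len(mask) * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3
    mean_fa = sum(fa_map[voxel] for voxel in mask) / len(mask)
    return volume_ml, mean_fa


# Assumed native voxel size: 220 mm / 128 in-plane, 3 mm through-plane.
voxel_volume = (220.0 / 128) ** 2 * 3.0

# Toy data (hypothetical): a four-voxel mask inside a five-voxel FA map.
fa = {(0, 0, 0): 0.3, (1, 0, 0): 0.4, (2, 0, 0): 0.5, (3, 0, 0): 0.4, (4, 0, 0): 0.1}
mask = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
volume_ml, mean_fa = tract_summary(mask, fa, voxel_volume)
```

Because the mean is taken over every voxel in the mask, low-FA voxels at the periphery of a tract are included in the average.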
A voxelwise comparison of automated tractography masks to user-driven tractography masks from the same subjects was also performed. A representative comparison of these two methods in the same individual is presented in Figure 4. Both methods identified a similar set of voxels as the main body of the tract. Discrepancies between the two methods most often occurred at the periphery of the white matter tracts, where the degree of uncertainty was greatest. Sensitivity (the ratio of true positives to the sum of true positives and false negatives) was 43% +/− 15% in our test group, which was fairly high given that we did not explicitly attempt to optimize this metric.
As tractography finds increasing use in the clinical evaluation of white matter abnormalities, there is growing interest in reducing its operator-dependence. To the clinician, automated tractography methods offer the possibility of a more efficient clinical workflow. To the researcher, automated tractography methods offer the possibility of reducing bias and improving sensitivity in group comparisons.
However, the problem of validation has long challenged tractographers, and automated tractography methods compound this problem. Very few studies have directly compared user-driven tractography to automated methods. Zhang et al. developed a standardized set of seed regions consisting of uniform white matter located in the body of white matter tracts of interest, similar to those chosen in typical user-driven tractography. These seeds were applied to registered subjects and used for automated deterministic tractography.27, 28 Spatial normalization to a labeled white matter atlas has also been used as an alternative to tractography.29, 30 Both of these methods compared favorably to user-driven seed placement. Nevertheless, these methods may be susceptible to misregistration error if there is anatomic variability or pathologic distortion of the tract of interest.
In contrast, the large size of tract termini relative to the body of the tract may improve robustness in the setting of minor displacements occurring as a result of pathology. This approach has been used in automated tractography in diseases with unpredictable white matter anatomy (e.g. schizophrenia and Huntington’s disease) as well as complicated anatomy (e.g. Meyer’s loop in optic neuritis).11, 14, 15 It is important to note that automated tractography in these pathologic states has not yet been validated against a reference standard, and future studies will be needed to extend the scope of our understanding of white matter pathology.
We chose tract termini from regions defined in an established brain atlas. Although effective for our tracts of interest, this technique may be less suitable for tracts that project outside the brain (such as the corticospinal tract) or commissural tracts (such as the corpus callosum). However, any labeling scheme could be used, including one making use of advances in white matter anatomy. As tractography increasingly informs our knowledge of anatomy, this approach raises the possibility of directly correlating standardized cortical templates with evolving white matter atlases.10, 31, 32
Because we chose white matter termini containing both gray matter and subjacent white matter, we used a probabilistic tractography algorithm that performs more robustly in voxels with low FA. With this algorithm, however, we lost some of the advantages of deterministic tractography. For example, probabilistic tractography does not provide a voxel-by-voxel depiction of the course of the fiber, as commonly found in deterministic depictions. Instead, it aggregates the voxels traversed by multiple fibers into a directionless probability score. We expect such maps to be suitable for evaluating clinical issues regarding white matter displacement or destruction, but specific questions regarding fiber directionality may require user-driven tractography.
In automated tractography, thresholding criteria are used as a substitute for expert visual inspection to reject output not belonging to the tract of interest. We found that tractography output varies widely with small changes in thresholding and smoothing criteria. In this work, postprocessing criteria were standardized over all tracts of interest. This might be problematic if each white matter tract had substantially different features, but we did not find this to be the case in an earlier study. When postprocessing criteria were determined independently for each of these tracts, the optimal values for the tracts we evaluated were found to be identical.23 Naturally this uniformity may not extend to all white matter tracts, but comparable values have been described for automated tractography of the optic radiations.33
The mean FA values reported by our method are slightly less than those described elsewhere in the literature.34–36 However, FA is typically measured over a small segment of the tract rather than averaged over its entirety. If the most prominent segments of a tract also have the highest FA, then we would expect whole-tract measurements to be lower. Using geometric methods to measure mean FA over the entire cingulum, O’Donnell et al. have also reported slightly lower FA values than other groups.37
Morris et al. have noted that probabilistic tractography can under-represent tract termini and overrepresent areas of a white matter tract that are close to the seed.38 This effect was mitigated in our work by choosing terminal seeds, so that the distance of a tract segment from one seed was inversely related to its distance from the other seed. In addition, threshold masks were used to reduce the effect of small variations in probability over the course of a tract. As a result, we did not observe any gross “flare” in our results.
Finally, we note that the relative advantages of automated tractography may depend on the tract of interest. In a careful series of measurements, Wakana et al. found that inter-rater reproducibility of manual tractography of the cingulum could achieve a kappa value as high as 0.97 when raters were from the same institution.5 However, the kappa values varied considerably for other tracts: the corresponding value was 0.89 for the inferior fronto-occipital fasciculus and 0.69 for the inferior longitudinal fasciculus. In this context, automated tractography is perhaps best viewed as one of many tools that can be used in the study of white matter, each of which is best suited to a different purpose.
With appropriately chosen postprocessing criteria, automated tractography can be performed reliably for multiple white matter tracts. This work suggests that automated tractography may be a reasonable substitute for user-driven tractography when quantifying the main body of certain white matter tracts in healthy subjects, thus raising the possibility of more widespread use of tractography in research endeavors.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.