
Inf Process Med Imaging. Author manuscript; available in PMC 2010 May 5.

Published in final edited form as:

Inf Process Med Imaging. 2009; 21: 239–251.

PMCID: PMC2864489

NIHMSID: NIHMS197638


This paper presents a method for detection of cerebral white matter hyperintensities (WMH) based on run-time PD-, T1-, and T2-weighted structural magnetic resonance (MR) images of the brain along with labeled training examples. Unlike most prior approaches, the method is able to reliably detect WMHs in elderly brains in the absence of fluid-attenuated inversion recovery (FLAIR) images. Its success is due to the learning of probabilistic models of WMH spatial distribution and neighborhood dependencies from ground-truth examples of FLAIR-based WMH detections. These models are combined with a probabilistic model of the PD, T1, and T2 intensities of WMHs in a Markov Random Field (MRF) framework that provides the machinery for inferring the positions of WMHs in novel test images. The method is shown to accurately detect WMHs in a set of 114 elderly subjects from an academic dementia clinic. Experiments show that standard off-the-shelf MRF training and inference methods provide robust results, and that increasing the complexity of neighborhood dependency models does not necessarily help performance. The method is also shown to perform well when training and test data are drawn from distinct scanners and subject pools.

White matter foci that are hyperintense on FLAIR images of the human brain are indicative of focal dysfunction of underlying axonal tracts. Common in a variety of clinical conditions, including multiple sclerosis, cerebrovascular disease, and depression, WMHs are important clinical measures in the elderly because their prevalence is strongly associated with cognitive function, longevity, disease progression, and the effects of disease-modifying treatments [1][2][3]. Because semi-quantitative manual grading of WMH severity is time-consuming and variable due to human subjectivity [4], a variety of fully automated methods have been developed to detect WMHs on FLAIR images in a robust, efficient, and objective manner [5][6].

However, while FLAIR images provide optimal contrast between WMHs and all other tissues, the detection of WMHs when no FLAIR is available is an increasingly important problem. Large-scale imaging studies are under pressure to collect a wide range of MR imaging sequences, including T1, T2, proton density (PD), diffusion tensor, functional, and perfusion MR, to capture the broadest possible range of biological phenomena in the brains of participants. Simultaneously, the studies are under pressure to scan each subject for the shortest amount of time possible due to scanner resource costs and the increases in head motion and subject discomfort that occur over the course of the scan session. Therefore, a growing list of large-scale imaging studies that have a strong interest in white matter dysfunction have nonetheless chosen to forgo FLAIR acquisition [3][7][8].

Because T1-weighted and double echo PD/T2-weighted acquisitions are nearly ubiquitous in large-scale imaging studies, we focus on WMH detection based solely on T1, T2, and PD input images. We use FLAIR exclusively for training data and the validation of automated methods (Fig. 1). WMHs are hyperintense on PD and T2, and hypointense on T1, but none of these modalities provide sufficient contrast between normal white matter (WM) and WMHs (Fig. 1). Therefore, we combine image intensity information with prior anatomical knowledge about where WMHs are known to occur in the brain and how they progress over time from one part of the brain to another. In particular, we employ a **spatial prior**– the prior probability of a WMH occurring at a given pixel, irrespective of imaging data– and a **contextual prior**– the conditional probability of a WMH occurring at a given pixel, given that WMHs have occurred at neighboring pixels. In elderly subjects, the spatial and contextual priors are highly structured and capture a characteristic spatial distribution of WMH occurrence and progression; specifically, WMHs in Alzheimer’s Disease and healthy aging tend to begin in periventricular zones and spread upward and outward (see Fig. 2 and [9]). The prior models that capture this progression are learned from FLAIR-based ground-truth WMH detections in a training phase, and are combined with intensity information at run-time in an MRF framework to detect WMHs in novel sets of coregistered (PD, T1, T2) test image sets.

A representative axial slice from the input images used for detecting WMHs at run time (left) and ground-truth data used for training the WMH detection method and validating the results (right).

Few papers to date have dealt with the problem of automated WMH detection in the absence of FLAIR images, each using a comparatively simple model of WMH spatial distribution. One such method detected WMHs using an MRF system with a 2D, spatially invariant isotropic smoothing prior [10]. In another, the authors detected WMHs as outliers to models of other tissue classes instead of modeling them explicitly [11]. One method used boosted classifiers and Support Vector Machines to perform detection from PD and T1 images using spatially invariant isotropic smoothing and radial distance from center as a spatial prior; it also required separate training sets for mild, moderate, and severe WMH cases [12]. Finally, another used several run-time steps, including segmentation of grey matter, white matter, and CSF; segmentation of the thalamic nuclei; morphological post-processing to fix segmentation problems; and separation of WMHs into sub-classes based on image contrast [13]. The key difference between these methods and the current one is that the current method uses training data to directly capture the anatomical distribution and progression of WMHs in a model that allows spatial dependencies in WMH occurrence to vary arbitrarily across the image. Our method leverages this additional prior knowledge to directly model WMHs with a relatively straightforward run-time procedure that requires few steps or arbitrary parameter settings, since it only fits parameters to a 3D intensity distribution and runs an existing, widely available MRF solver. Additionally, we focus on the elderly brain, whose morphological characteristics can be highly heterogeneous across a population due to diverse aging-related biological phenomena; this heterogeneity poses challenges to WMH detection that may differ from those associated with multiple sclerosis [10][11].

While little attention has been paid to WMH detection in the absence of FLAIR, several methods have used neighborhood information during FLAIR-based WMH detection (*e.g.*, [5][6]). Usually the use of contextual information amounts to fully-isotropic smoothing– that is, WMHs are considered more likely at a given pixel if they occur at neighboring pixels, regardless of their absolute positions or the directions in which neighboring WMHs do or do not occur. We extend these prior contextual methods by allowing the associations between neighboring WMH detections to vary with pixel position and direction of neighbors. As suggested above, the spatially- and directionally-variable nature of associations between neighboring WMHs in our contextual model allows us to more accurately capture the neurobiological course of spreading WMHs over the course of brain aging.

We tested our method on a diverse pool of 114 elderly individuals who received a full clinical workup and structural MR scans including T1-weighted, double-echo PD/T2 weighted, and FLAIR scans at their times of enrollment into the University of California, Davis Alzheimer’s Disease Center (ADC). Subjects were 70–90 years of age; the subject pool included individuals with normal cognition, mild cognitive impairment, and dementia.

All scans were pre-processed through a standardized pipeline. T1, T2, PD, and FLAIR were rigidly coregistered using cross-correlation as a similarity measure and previously-presented optimization methods [14]. Nonbrain tissues were manually separated from the brain on all scans. A strongly-validated, semi-automated method was used to detect WMHs based solely on the FLAIR scans and human input [15]. The skull-stripped T1-weighted image was then nonlinearly aligned to a minimum deformation template (MDT) based on moving control points in a multi-scale grid and using cubic spline interpolation to move image pixels between the control points [16][17]. The warp is constrained such that no region is permitted to collapse entirely. The T1, T2, PD, FLAIR, and map of ground-truth FLAIR-based WMH pixels were then warped to the space of the MDT image using the nonlinear alignment.

We take a Bayesian MRF approach to WMH detection. Let *y _{i}* denote a vector of three image intensities– PD, T1, and T2– associated with image pixel *i*, and let *x _{i}* ∈ {0, 1} denote the corresponding label, with *x _{i}* = 1 indicating a WMH at pixel *i*. Given the set of intensity vectors *Y*, we seek the label field *X* that maximizes the posterior *P*(*X*|*Y*) ∝ *Π*(*X*)*L*(*Y*|*X*), where *Π*(*X*) is the MRF label prior and *L*(*Y*|*X*) is the image likelihood.

The MRF label prior involves spatial and contextual prior models whose parameters are learned from training data. We write the MRF prior as a Gibbs field:

$$\mathrm{\Pi}(X)={Z}^{-1}\ast \mathit{exp}(-H(X))$$

where *Z*, the partition function, is the sum of *exp*(−*H*(*X*)) over all possible labelings, and *H*(*X*) is an energy function that takes on lower values when the label field *X* is more probable *a priori*. This Gibbs prior is equivalent to an MRF prior under straightforward technical restrictions [18]. The energy function is a sum of terms that represent energies from the spatial and contextual priors: *H*(*X*) = *H _{s}*(*X*) + *H _{c}*(*X*). The spatial prior energy is

$${H}_{s}(X)=\sum _{{x}_{i}\in X}{\alpha}_{i}{x}_{i}+(1-{\alpha}_{i})(1-{x}_{i})$$
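As a concrete illustration, the spatial prior energy above can be evaluated directly from a label field and the learned *α* values. This is a minimal sketch, assuming `alpha` and `x` are flat per-pixel sequences (names are illustrative, not from the authors' implementation):

```python
# Sketch of the spatial prior energy H_s as written in the equation above,
# assuming alpha[i] is the learned per-pixel parameter and x[i] in {0, 1}
# is the WMH label at pixel i.
def spatial_energy(alpha, x):
    """H_s(X) = sum_i alpha_i * x_i + (1 - alpha_i) * (1 - x_i)."""
    return sum(a * xi + (1.0 - a) * (1.0 - xi) for a, xi in zip(alpha, x))
```

The term is linear in each label, so it factorizes over pixels and is cheap to evaluate inside any MRF solver.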

The contextual prior *H _{c}*(*X*) sums a learned parameter over all neighborhood groups *δ* in the set of groups Δ, where *X ^{δ}* denotes the configuration of labels within group *δ*:

$${H}_{c}(X)=\sum _{\delta \in \mathrm{\Delta}}{\beta}_{{X}^{\delta}}$$

This is a Potts model in which neighboring labels within a group *δ* incur a fixed penalty of *β*_{Xδ} [19]. Generally, these *β* parameters encourage neighboring pixels to have the same label, but in some locations of the brain, they may actually be encouraged to be different. These *β* parameters are calculated from the training data (Sec. 2).
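The contextual energy can be sketched the same way. Here `beta` is assumed to be a lookup from a neighborhood group and its label configuration to the learned penalty *β*_{Xδ}; the data layout is illustrative, not the authors' implementation:

```python
# Sketch of the contextual (Potts-style) energy H_c: each neighborhood group
# delta contributes the learned penalty for its current label configuration.
def contextual_energy(groups, labels, beta):
    """Sum beta_{X^delta} over all neighborhood groups delta."""
    total = 0.0
    for g, pixel_ids in enumerate(groups):
        config = tuple(labels[i] for i in pixel_ids)  # X^delta for this group
        total += beta[(g, config)]
    return total
```

Because `beta` is indexed by group as well as configuration, the penalty can vary with position and direction, as the text describes; negative values actively encourage a configuration.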

The likelihood of a given set of image intensity vectors, given the underlying labels, comes from a tissue mixture model with one lognormal distribution for WM and one for WMH:

$$\begin{array}{c}L(Y\mid X)=\mathit{exp}({H}_{L}(Y\mid X))\\ {H}_{L}(Y\mid X)=\sum _{{x}_{i}\in X}\frac{{\pi}_{{x}_{i}}f({y}_{i};{\mu}_{{x}_{i}},{\mathrm{\Sigma}}_{{x}_{i}})}{{\sum}_{x\in \{0,1\}}{\pi}_{x}f({y}_{i};{\mu}_{x},{\mathrm{\Sigma}}_{x})}\\ f(y;\mu ,\mathrm{\Sigma})=\frac{1}{C}\mathit{exp}(-0.5{(\mathit{log}(y)-\mu )}^{T}{\mathrm{\Sigma}}^{-1}(\mathit{log}(y)-\mu ))\\ C={\mid \mathrm{\Sigma}\mid}^{0.5}{(2\pi )}^{1.5}\mid \mathit{log}(y)\mid \end{array}$$

where *π*_{0} and *π*_{1} are mixture coefficients for non-WMH and WMH respectively, with *π*_{0} + *π*_{1} = 1, and *log*(*y*) is the component-wise log of vector *y*. We estimate *π*_{1} by taking the proportion of pixels in *Y _{H}* that are inliers to the distribution found for the pixels in *Y _{H}*, and set *π*_{0} = 1 − *π*_{1}.

A lognormal mixture model was chosen because the distributions of 3D intensity vectors for WM and WMH empirically followed asymmetric, “comet-like” patterns (Fig. 3); a Gaussian mixture model was tried initially without success, motivating this choice. As we explain below, the *μ* and *Σ* parameters are estimated at run time by an unsupervised method that fits the two lognormal distributions to (PD, T1, T2) triples sampled from a large number of pixels.
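A class-conditional density of this form can be sketched as follows. Note that we use the standard multivariate lognormal normalizer (with the product of the components of *y*); the constant *C* as typeset above may have been garbled in conversion, so this normalization is an assumption:

```python
import numpy as np

# Sketch of the lognormal class-conditional density f(y; mu, Sigma) used in
# the mixture model, with the standard multivariate lognormal normalizer.
def lognormal_pdf(y, mu, sigma):
    """Density of a multivariate lognormal at intensity vector y (all y_j > 0)."""
    y = np.asarray(y, dtype=float)
    d = y.size
    diff = np.log(y) - mu
    quad = diff @ np.linalg.solve(sigma, diff)  # Mahalanobis distance in log space
    norm = ((2 * np.pi) ** (d / 2)) * np.sqrt(np.linalg.det(sigma)) * np.prod(y)
    return float(np.exp(-0.5 * quad) / norm)
```

Working in log-intensity space lets a single Gaussian in log coordinates capture the long, asymmetric “comet” tail seen in the raw intensities.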

The intensity distributions of WM and WMH intensities empirically follow “comet-like” patterns. Pictured: WMH intensities and the fit distribution for them in one ADC subject.

Combining the equations for *Π* and *L* and taking the log, we have

$$\mathit{log}(P(X\mid Y))\propto -{H}_{s}(X)-{H}_{c}(X)+{H}_{L}(Y\mid X)$$

In the following sections, we describe the Training phase that determines the values of *α* and *β*, followed by the Inference phase where the best set of labels *X* is determined for an input image *Y*.

In the training phase, the parameters *α _{i}* and *β*_{Xδ} are estimated from the ground-truth FLAIR-based WMH maps of the training subjects, which have been warped to the common MDT space (Sec. 2). Each *α _{i}* is set to the proportion of training subjects in which a WMH occurred at pixel *i*.

The *β*_{Xδ} values are calculated from the same training data as the *α _{i}* values using Iterative Proportional Fitting (IPF) [20]. For each possible label configuration *X ^{δ}* of each neighborhood group, IPF iteratively updates the corresponding *β* parameter until the marginal probabilities under the model match the empirical marginals observed in the training data:

$${\beta}_{{X}^{\delta}}^{n}={\beta}_{{X}^{\delta}}^{n-1}\times R(\frac{{M}_{{X}^{\delta}}^{e}}{{M}_{{X}^{\delta}}^{m}})$$

where ${\beta}_{{X}^{\delta}}^{n}$ is the value of *β*_{Xδ} at the *n*th iteration of IPF, ${M}_{{X}^{\delta}}^{e}$ is the empirical marginal probability of the configuration *X ^{δ}*, calculated as the proportion of the training data in which that label configuration occurred in group *δ*, and ${M}_{{X}^{\delta}}^{m}$ is the corresponding marginal probability under the model with the current *β* values, computed by Belief Propagation (see below).
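One IPF sweep for a single pairwise group can be sketched by brute-force enumeration. Two simplifying assumptions here: the model marginal is computed exactly rather than by Belief Propagation, and the function *R* in the update is taken as the identity:

```python
import itertools
import math

# Toy sketch of the IPF update on one pairwise group: multiply each beta by
# the ratio of empirical to model marginal for that label configuration.
def ipf_step(beta, empirical, n_pixels=2):
    """One IPF sweep: beta_new = beta_old * (M_emp / M_model) per configuration."""
    configs = list(itertools.product([0, 1], repeat=n_pixels))
    # Model distribution induced by the current beta values (Gibbs form).
    weights = {c: math.exp(-beta[c]) for c in configs}
    z = sum(weights.values())
    model = {c: w / z for c, w in weights.items()}
    return {c: beta[c] * (empirical[c] / model[c]) for c in configs}
```

Configurations that are more common in the training data than under the model have their parameters scaled up, and vice versa; iterating drives the model marginals toward the empirical ones.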

Run-time processing of a novel image set begins by using an MLESAC-based procedure to robustly estimate the means and covariances of the lognormal distributions associated with the WM and WMH classes [21]. Specifically, we generate *k* random samples of 10 pixels each from among those pixels that are most likely to contain WMHs *a priori*, *i.e.* from among the 5% of pixels *i* with the highest *α _{i}*. Similarly, we generate *k* random samples of 10 pixels each from among the 5% of pixels with the lowest *α _{i}*, which are least likely to contain WMHs *a priori*. Lognormal parameters are fit to each candidate pair of samples (*X _{H}*, *X _{L}*) and scored according to

$$\sum _{X\in \{{X}_{L},{X}_{H}\}}\sum _{{x}_{i}\in X}\delta (i)f({y}_{i};{\mu}_{{x}_{i}},{\mathrm{\sum}}_{{x}_{i}})+(1-\delta (i))\nu $$

where *ν* is a fixed penalty for outliers and *δ*(*i*) indicates whether *y _{i}* is an inlier to the fit distribution. The candidate pair whose fit maximizes this score is retained, yielding the run-time estimates of (*μ*, *Σ*) for each class.
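The MLESAC-style scoring of a candidate fit can be sketched as below, assuming `pdf(y)` evaluates the fit density and `nu` is the fixed outlier score; the inlier test (a density threshold) and all names are illustrative assumptions:

```python
# Sketch of MLESAC-style robust scoring: inliers contribute their density
# under the candidate fit, outliers contribute the fixed score nu.
def mlesac_score(points, pdf, nu, threshold):
    """Score a candidate distribution fit over a set of intensity samples."""
    score = 0.0
    for y in points:
        d = pdf(y)
        score += d if d >= threshold else nu
    return score
```

Capping each outlier's contribution at `nu` keeps a few grossly mislabeled pixels from dominating the fit, which is the point of using MLESAC rather than a plain maximum-likelihood fit.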

In Belief Propagation (BP) [22][23], inference is performed by propagating local evidence (beliefs) as messages. Here, we use the Factor Graph formulation of BP in order to simplify notation. Factor Graphs represent undirected graphs in a bipartite fashion with two types of nodes: factor nodes and variable nodes. In our method, variable nodes directly correspond with pixel labels *x _{i}* and factor nodes each correspond to a neighborhood group *δ* ∈ Δ. Messages are passed iteratively between each variable node and the factor nodes of the groups that contain it:

$$\begin{array}{c}{m}_{{x}_{i}\to \delta}(x)=O(i,x)\prod _{\alpha \in {\mathrm{\Delta}}_{i}\backslash \{\delta \}}{m}_{\alpha \to {x}_{i}}(x)\\ {m}_{\delta \to {x}_{i}}(x)=\underset{{X}^{\delta}:{x}_{i}=x}{max}C({X}^{\delta})\prod _{{x}_{m}\in \delta \backslash \{{x}_{i}\}}{m}_{{x}_{m}\to \delta}({x}_{m})\end{array}$$

where *x* is a candidate label for *x _{i}*, Δ_{i} is the set of neighborhood groups that contain pixel *i*, the observation term *O*(*i*, *x*) combines the spatial prior and the intensity likelihood at pixel *i*:

$$O(i,x)=[x{\alpha}_{i}+(1-x)(1-{\alpha}_{i})][L({x}_{i}=x\mid {y}_{i})],$$

and the compatibility term *C*(*X ^{δ}*) = *exp*(−*β*_{Xδ}) encodes the contextual prior for the label configuration *X ^{δ}*. During training, the model marginals required by IPF are computed from the converged messages as

$${M}_{{X}^{\delta}}^{m}=C({X}^{\delta})\prod _{{x}_{i}\in \delta}{m}_{{x}_{i}\to \delta}({x}_{i})$$

for each possible configuration of labels *X ^{δ}* for the given group *δ*. At run time, the same message-passing procedure is used for inference, and each pixel is assigned the label that maximizes its final belief.
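To make the message updates concrete, here is a toy max-product round on a two-pixel graph with a single pairwise factor. `O` and `C` follow the definitions above, but the tiny graph and the dictionary layout are illustrative; with only one factor per variable, the variable-to-factor message reduces to the local evidence, so a single round is exact:

```python
# Toy max-product Belief Propagation on two pixels joined by one factor.
# O[i][x] is the evidence term O(i, x); C[(x0, x1)] is the factor
# compatibility exp(-beta) for that label configuration.
def bp_two_pixels(O, C):
    # Variable-to-factor messages: local evidence only (no other factors).
    m_to_f = [dict(O[0]), dict(O[1])]
    # Factor-to-variable messages: maximize over the other pixel's label.
    m_f0 = {x: max(C[(x, x1)] * m_to_f[1][x1] for x1 in (0, 1)) for x in (0, 1)}
    m_f1 = {x: max(C[(x0, x)] * m_to_f[0][x0] for x0 in (0, 1)) for x in (0, 1)}
    # Beliefs are evidence times incoming message; pick the MAP label of each.
    b0 = {x: O[0][x] * m_f0[x] for x in (0, 1)}
    b1 = {x: O[1][x] * m_f1[x] for x in (0, 1)}
    return max(b0, key=b0.get), max(b1, key=b1.get)
```

In the test below, pixel 1 weakly prefers label 1 on its own evidence, but a smoothing-type compatibility pulls it to agree with its confident neighbor.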

In this section, we test the method’s performance under varying training/inference conditions, training set sizes, neighborhood connectivity, and training data sources.

In these tests, we use leave-one-out cross-validation to evaluate MRF-based WMH detection on the ADC data set; for each subject, we estimate the *α* and *β* parameters from the remainder of the subjects and use them to detect WMHs on the left-out subject. Agreement between the ground-truth WMH volumes and our computed volumes is evaluated using the intraclass correlation coefficient (ICC). We compute these ICC values for our method under each of these conditions: in the **No MRF** method, we do not use an MRF-based system and instead simply threshold the posterior probabilities deduced from the *H _{s}* and *H _{L}* terms alone; in the **untrained MRF** method, the learned contextual parameters are replaced by a spatially invariant isotropic smoothing prior; and in the full, trained method, we use the learned *α* and *β* values as described above.
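The agreement statistic can be sketched as below. The article does not specify which ICC variant was used, so the one-way random-effects formulation ICC(1,1) for two measurements per subject is an assumption:

```python
# Sketch of an intraclass correlation between ground-truth and estimated WMH
# volumes, using the one-way random-effects formulation ICC(1,1) with two
# measurements (a[j], b[j]) per subject.
def icc_1_1(a, b):
    n = len(a)
    pairs = list(zip(a, b))
    grand = sum(x + y for x, y in pairs) / (2 * n)
    subj_means = [(x + y) / 2 for x, y in pairs]
    msb = 2 * sum((m - grand) ** 2 for m in subj_means) / (n - 1)  # between-subject
    msw = sum((x - m) ** 2 + (y - m) ** 2
              for (x, y), m in zip(pairs, subj_means)) / n          # within-subject
    return (msb - msw) / (msb + msw)
```

Unlike a Pearson correlation, this statistic penalizes systematic over- or under-estimation of volume, not just poor ranking, which is why it is the natural agreement measure here.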

Comparison of WMH detection results for a selected brain region (see green box, left, and ground-truth). Detected WMHs are shown in yellow.

Intraclass correlation coefficients (ICCs) between ground-truth WMH volume and WMH volume estimated by our method on the ADC data set with several variations.

In our experiments, our MRF-based method outperforms the No-MRF and untrained MRF versions.

One variable parameter of the method is the connectivity of its contextual prior, *i.e.*, the size of the groupings of neighboring pixels that influence each other in the MRF system. Higher connectivity allows the system to model more complex spatial patterns. In 2D images, the choice is generally whether or not diagonal pixels are considered neighbors. In 3D, connectivity ranges from 6, *i.e.*, a pixel’s 4 nearest neighbors within the plane and its 2 nearest neighbors in the Z direction, to 26, *i.e.*, all of a pixel’s neighbors in the 3×3×3 pixel box around it. Results of testing our method under varying connectivities are presented in Table 2. For these tests, as in the previous ones, we used the ADC dataset and leave-one-out cross-validation.
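The two connectivities discussed can be generated mechanically; this small sketch enumerates the neighbors of a voxel for either choice:

```python
import itertools

# Sketch of 3D neighborhood generation: 6-connected keeps only axis-aligned
# offsets, 26-connected keeps every offset in the surrounding 3x3x3 box.
def neighbors(voxel, connectivity=6):
    x, y, z = voxel
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=3)
               if o != (0, 0, 0)]
    if connectivity == 6:
        offsets = [o for o in offsets if sum(abs(c) for c in o) == 1]
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in offsets]
```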

Intraclass correlation coefficients (ICCs) between ground-truth WMH volume and WMH volume estimated by our method using various degrees of spatial prior directional connectivity for the ADC data set.

In these experiments, we found that our method performs best using 6-connected neighborhoods, the smallest logical size within 3D space.

One important property of any training-based classification method is the amount of training data it requires to give good results on test data. To test this property, we trained upon three different randomly selected subsets of the ADC dataset for each size: 10, 20, 40, 60, 80, and 100 subjects. We then ran the method to classify the dataset using these subsets as training data (Fig. 4).

Plotted mean *μ* and *μ* ± *σ* of ICC values between ground-truth WMH volume and WMH volume estimated by our method using differently-sized random subsets of the training set in cross validation. Note that these values are absolute, **...**

For this dataset our method performs better when using more training data up until about 60 images, after which there is little improvement.

To test our method’s performance on a completely different dataset from that upon which it was trained, we employed ground-truth WMH map data of 51 subjects from the Chicago Health and Aging Project (CHAP), a longitudinal epidemiological study of individuals with risk factors for Alzheimer’s Disease [24]. These images were preprocessed in the same fashion as the ADC data (Sec. 2) and used for training. We then classified our dataset of 114 ADC subjects using this training data (with 6-connected neighborhoods and standard training/inference) and obtained an ICC of **0.841**, demonstrating the method’s ability to perform reasonably when classifying images from an entirely different MRI scanner, study type (epidemiological vs. clinic-based cohorts), and population.

Our method performs robust WMH detection with no FLAIR when using at least 60 training images and standard MRF training/inference, including when the sources of training and testing data differ significantly. While our method performs strongly in these experiments, there exist several routes through which it can be improved in the future. Specifically, we discuss why the method performed worse using higher connectivities and possible new applications such as longitudinal WMH detection and multi-class segmentation.

Increased connectivity can model more complex spatial dependencies among WMHs, but did not improve performance in our experiments (Sec. 3). This drop in performance can be explained by a combination of factors. Higher connectivities subdivide the training data among a larger set of parameters, requiring a larger amount of training data. Additionally, it is possible that higher connectivities result in overfitting to the training data. Finally, BP, used here in both training and inference, is technically not guaranteed to perform well in loopy graphs, although it empirically does for 4-connected 2D lattices. As the connectivity of our model increases, so does the proportion of loops in the graph, which may decrease performance. Future work should determine which combination of factors causes the decrease.

In addition to improving the method itself, future work will test and extend it for use in other applications. Simply by using appropriate training data, it could be applied to other diseases and modalities. It could also be extended to classify multiple tissue types at once to create an overall brain tissue segmentation system. Another possibility would be to detect WMHs on longitudinal series of MRIs. With this change, our method could not only improve each detection using the additional information (*e.g.*, encouraging pixels labeled WMH at time 1 to remain WMH at time 2) but also generate models of disease progression.

1. Taylor WD, Steffens DC, et al. White Matter Hyperintensity progression and late-life depression outcomes. Arch Gen Psychiatry. 2003 November;60(11):1090–1096. [PubMed]

2. Au R, Massaro JM, et al. Association of White Matter Hyperintensity volume with decreased cognitive functioning: the Framingham Heart Study. Arch Neurol. (In Press) [PubMed]

3. Dufouil C, de Kersaint-Gilly A, et al. Longitudinal study of blood pressure and White Matter Hyperintensities: The EVA MRI cohort. Neurology. 2001 April;56(7) [PubMed]

4. van Straaten E, Fazekas F, et al. Impact of White Matter Hyperintensities scoring method on correlations with clinical data: The LADIS study. Stroke. 2006;37(3) [PubMed]

5. Admiraal-Behloul F, van den Heuvel D, et al. Fully automatic segmentation of White Matter Hyperintensities in MR images of the elderly. Neuroimage. 2005 November;28(3):607–617. [PubMed]

6. Wu M, Rosano C, et al. A fully automated method for quantifying and localizing White Matter Hyperintensities on MR images. Psychiatry Research: Neuroimag. 2006 December;148(2–3):133–142. [PMC free article] [PubMed]

7. Longstreth W, Manolio TA, et al. Clinical correlates of White Matter findings on cranial magnetic resonance imaging of 3301 elderly people: The Cardiovascular Health Study. Stroke. 1996 August;27:1274–1282. [PubMed]

8. Mueller S, Weiner M, et al. Ways toward an early diagnosis in Alzheimer’s disease: The Alzheimer’s disease neuroimaging initiative (ADNI) Alzheimers Dement. 2005 July;1(1):55–66. [PMC free article] [PubMed]

9. Yoshita M, Fletcher E, et al. Extent and distribution of White Matter Hyperintensities in normal aging, mci, and ad. Neurology. 2006 December;67(12):2192–8. [PubMed]

10. Leemput KV, Maes F, Bello F, Vandermeulen D, Colchester ACF, Suetens P. Automated segmentation of MS lesions from multi-channel MR images. 1999:11–21.

11. Van Leemput K, Maes F, Vandermeulen D, Colchester A, Suetens P. Automated segmentation of Multiple Sclerosis lesions by model outlier detection. Medical Imaging, IEEE Transactions. 2001;20(8):677–688. [PubMed]

12. Quddus A, Fieguth P, Basir O. Adaboost and support vector machines for White Matter Lesion segmentation in MR images. Engineering in Medicine and Biology Society. 2005. [PubMed]

13. Maillard P, Delcroix N, et al. An automated procedure for the assessment of White Matter Hyperintensities by multispectral (T1, T2, PD) MRI and an evaluation of its between-centre reproducibility based on two large community databases. Neuroradiology. 2008 January;50(1):31–42. [PubMed]

14. Maes F, Collignon A, et al. Multimodality image registration by maximization of mutual information. IEEE Trans Med Imaging. 1997;16:187–198. [PubMed]

15. Yoshita M, Fletcher E, et al. Current concepts of analysis of cerebral White Matter Hyperintensities on Magnetic Resonance Imaging. Top Magn Reson Imaging. 2005 December;16(6):399–407. [PubMed]

16. Kochunov P, Lancaster J, et al. Regional spatial normalization: toward an optimal target. J Comp Assist Tomog. 2001 Sep–Oct;25(5):805–16. [PubMed]

17. Otte M. Elastic registration of fMRI data using bezier-spline transformations. IEEE Trans Med Imaging. 2001;20:193–206. [PubMed]

18. Hammersley J, Clifford P. Markov fields on finite graphs and lattices. 1971.

19. Besag J. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society Series B. 1986;48(3):259–302.

20. Jirousek R, Preucil S. On the effective implementation of the iterative proportional fitting procedure. Computational Statistics & Data Analysis. 1995 February;19(2):177–189.

21. Torr P, Zisserman A. MLESAC: A new robust estimator with application to estimating image geometry. CVIU. 2000 April;78(1):138–156.

22. Freeman WT, Pasztor EC, Carmichael OT. Learning low-level vision. International Journal of Computer Vision. 2000;40(1):25–47.

23. Bishop CM. Pattern Recognition and Machine Learning. Springer Science+Business Media; 2006.

24. Bienias JL, Beckett LA, Bennett DA, Wilson RS, Evans DA. Design of the Chicago Health and Aging Project (CHAP). J Alzheimers Dis. 2003 Oct;5:349–355. [PubMed]
