Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Magn Reson Imaging. Author manuscript; available in PMC 2011 May 1.
Published in final edited form as:
PMCID: PMC2863100

Brain MRI Tissue Classification Based on Local Markov Random Fields


A new method for tissue classification of brain magnetic resonance images (MRI) of the brain is proposed. The method is based on local image models where each models the image content in a subset of the image domain. With this local modeling approach, the assumption that tissue types have the same characteristics over the brain needs not to be evoked. This is important because tissue type characteristics, such as T1 and T2 relaxation times and proton density, vary across the individual brain and the proposed method offers improved protection against intensity non-uniformity artifacts that can hamper automatic tissue classification methods in brain MRI. A framework in which local models for tissue intensities and Markov Random Field priors are combined into a global probabilistic image model is introduced. This global model will be an inhomogeneous Markov Random Field and it can be solved by standard algorithms such as iterative conditional modes. The division of the whole image domain into local brain regions possibly having different intensity statistics is realized via sub-volume probabilistic atlases. Finally, the parameters for the local intensity models are obtained without supervision by maximizing the weighted likelihood of a certain finite mixture model. For the maximization task, a novel genetic algorithm almost free of initialization dependency is applied. The algorithm is tested on both simulated and real brain MR images. The experiments confirm that the new method offers a useful improvement of the tissue classification accuracy when the basic tissue characteristics vary across the brain and the noise level of the images is reasonable. The method also offers better protection against intensity non-uniformity artifact than the corresponding method based on a global (whole image) modeling scheme.

Keywords: segmentation, magnetic resonance, brain imaging, sub volume probabilistic atlas, classifier combination

1. Introduction

The automatic segmentation of three dimensional (3-D) magnetic resonance images (MRI) of the brain is increasingly important for neuroscientific studies. Studies concerning morphological measurements of different brain structures can benefit from the cost-effectiveness, speed and good reproducibility of automatic image segmentation [1]. The basic form of the 3-D brain MRI segmentation involves the classification of the voxels in the MR image into the three primary tissue types: white matter (WM), gray matter (GM) and cerebro-spinal fluid (CSF). This tissue classification has various important applications. For example, several cortical surface extraction algorithms depend on tissue classification [2, 3, 4, 5, 6, 7].

The research problems in tissue classification methodology include how to best cope with the intensity non-uniformity, partial volume effect, and scanner specificity of MRI as well as how to best apply the existing neuroanatomical knowledge and the spatial dependency of the labels of close-by image voxels to increase the quality of tissue classification. Recently, information and/or classifier fusion for the tissue classification, or more generally for MRI segmentation, have drawn an increasing amount of attention. For example, an interesting line of research has focused on the simultaneous registration and tissue classification [8, 9, 10, 11]. These approaches can be viewed as information fusion methods where the information from the brain atlases is combined with the information on the tissue composition of the individual image in a non-trivial manner. Similarly, label propagation techniques for the brain MRI segmentation can use information from multiple templates to better adapt to the anatomical variability of human brain leading to decision fusion approaches [12].

In this work, we propose a different kind of the information/classifier fusion approach targeted to the tissue classification problem. Below, we argue that the MR brain images are more accurately modeled locally than globally. We therefore divide the image domain into local subdomains that are described by their own local image model and derive a framework in which image models with local spatial support can be combined to yield a global, statistically based image model. Both local and global image models will be posed in a Markov Random Field (MRF) framework allowing the incorporation of the spatial information to improve the quality of the voxel classification. We apply a Sub Volume Probabilistic Atlas (SVPA) to define the local support regions so that the division of the image domain into sub-domains is anatomically informed. The model-fusion framework is complemented by computational algorithms for solving the parameters of the model and performing the tissue classification in a completely automated manner. Particularly, we apply a global optimization approach based on a genetic algorithm for unsupervised estimation of the tissue type parameters for each local model [13]. We note that the approach presented here is in many senses complementary to the above referenced frameworks for joint image segmentation and registration. This is because, although we start from local image models, the final image model is a global generative model that could then be unified with a joint segmentation and registration approach.

As already mentioned, we argue that local image models can be more accurate than global image models. There are at least two important reasons for this. First, MR images contain low frequency intensity variations that are often described by the terms intensity non-uniformities or bias fields. In this work, we refer to this phenomenon as instrumentation non-uniformity. These low frequency variations cause non-stationarity of the parameters for the class conditional probability density functions (pdfs) of individual tissue classes. This has a negative effect on the quality of automatic tissue classifications that are based on the assumption that tissue classes have the same statistical properties throughout the image. Several methods for the correction of this low frequency noise in images have been proposed and a recent review can be found in [14]. The methods for the retrospective bias field correction typically assume that the bias field is multiplicative and spatially smooth. Some of them, including [15, 16, 17, 18] are stand-alone methods, while others perform the tissue classification jointly with non-uniformity correction [19, 20, 21]. With modern MR imaging devices, state-of-the-art methods for non-uniformity correction can be considered to be largely successful. However, these methods are not able to remove all of the low frequency noise in all parts of the image. In addition, there is evidence that the assumption of the spatial smoothness of the bias field may not be completely valid for T1 and T2 weighted scans [22]; furthermore Marroquin [23] presented a case against multiplicative nature of the bias fields. The framework we propose in this paper can reduce the unfavorable effects of unsuccessful intensity non-uniformity correction.

Second, the intrinsic properties of tissue types, such as T1 and T2 relaxation times and proton density (PD), vary across an individual brain. Since the MRI signal is a function of these intrinsic tissue characteristics, their variation across the brain causes variation in the MRI signal values as demonstrated in [24, 25]. This phenomenon - termed here as biological non-uniformity - has been confirmed in various studies. The variability of GM T1 relaxation time across the cortex is well documented [9, 26, 27] as is the T1 relaxation time difference between cortical GM and various sub-cortical GM structures such as caudate, putamen, thalamus, amygdala and hippocampus [9, 28]. Differences in T2 between various GM and WM structures have been reported [27, 29]. There is evidence demonstrating a slight hemispheric asymmetry of PD as well as variability of PD in different WM structures [30]. All the above findings are about healthy adult populations with the exception of [29] where also schizophrenic subjects were studied. The biological non-uniformity can be even greater in pathology or in aging or d Since image intensities from a tissue class have changing statistical characteristics across the brain due to biological non-uniformity that may not be accounted for by methods designed to correct for instrumentation non-uniformity, the assumption of global models for tissue types has a negative effect on the quality of tissue classification and subsequent analysis.

Grabowski [31] and Kovacevic [32] have proposed tissue classification strategies based on similar arguments to those we present here, considering both biological and instrumentation non-uniformity. They first estimated parameters of a global mixture of Gaussians (MoG) based intensity model. Thereafter, the image domain was divided into smaller sub-domains whose intensities were modeled by individually by local MoGs. The parameters for these were estimated by a local optimization method starting from the global image parameter estimates. These previous approaches differ from our method in three important aspects. As opposed to our framework, these approaches did not include a spatial MRF prior and they did not lead to a generative, global image model. Moreover, partial volume effect (PVE) was not modeled in [32] and in [31] it was assumed that the PVE could be modeled using 50:50 mixtures, e.g., voxels containing 50% WM and 50% GM were included in the intensity model. That assumption is clearly overly simplistic.

Many methods for simultaneous instrumentation non-uniformity correction and tissue classification are similar to our method in that they consider tissue class parameters to be spatially varying over the image domain. These methods can include also an MRF prior for tissue classifications, and some of them incorporate statistically based modeling for PVE, e.g.,[33]. The general difference between these methods and our framework is that in these the spatial variation of tissue class parameters is strongly parametrized whereas our method is nonparametric in this respect. Note that we do not argue that our method should be used to completely replace existing non-uniformity correction methods, but rather to complement them. The adaptive K-means algorithm [34], modified and extended to MRI application in [35, 36] and the Adaptive MAP algorithm [37] are perhaps closest to our framework. The adaptive K-means algorithm accounts for local intensity variations by an iterative procedure involving averaging over a sliding window whose size decreases as the algorithm progresses. These variations may be tissue class specific. The algorithm also includes an MRF prior. The adaptive MAP algorithm can be considered as an extension of the adaptive K-means, where also the variance of the tissue class distribution is allowed to vary over the image. Both of these methods rely on the standard K-means clustering for their initialization, and as local optimization methods, their performance is dependent on this initialization. In contrast to our method, these methods also assume a single, global MRF prior and do not account for PVE. We will also provide experimental results clearly demonstrating the superiority of the proposed approach to the Adaptive MAP algorithm [37].

The organization of the paper is as follows. Section 2 describes the combination of the local MRFs defined on image sub-domains into a single global MRF-based image model. Section 3 describes the segmentation method rooted in the image model of Section 2. In Section 4, we present experimental validation of the method by comparing it quantitatively to other recent tissue classification schemes as well as the corresponding tissue classification method based on a traditional MRF-based image model. This validation is performed using both synthetic and real MR brain images. Section 5 summarizes the approach and discusses it in relation to the other approaches.

2. Theory: Combining local Markov Random Fields

2.1. Markov Random Fields, Gibbs distributions and image segmentation

Markov Random Fields [38] are widely applied for modeling the spatial dependency of nearby voxel tissue classes in tissue classification within brain MRI. The purpose of this sub-section is to briefly introduce the notation and give the required background information on inhomogeneous MRFs. It is not our intention to give a comprehensive review on MRFs within brain MRI segmentation.

The image voxels, or sites, are denoted by si = (six, siy, siz), i = 1, …, m, where m is the number of voxels in the image. The set of all image voxels is called the image domain and denoted by S. The set of neighboring voxels adjacent to the voxel si is denoted by Ni. In this paper, the 26-neighborhood of the image voxels is assumed. This set of voxels together with the assumed adjacency relations forms a neighborhood system (S, {Ni}) [39, 38]. Every voxel si is associated with an observed intensity value xi [set membership] R and X = {xi: i = 1, …, m} is the observed image. We have a set of tissue classes L = {lk: k = 1, …, K}. A tissue classified image, or classification, is denoted by C = {ci [set membership] L: i = 1, …, m}. Our task is then to find the best tissue class ci*[set membership]L for each voxel given the observed image intensities. The best classification is denoted by C*={ci*:i=1,,m}.

The quality of a tissue classification is measured using its posterior probability given the observed image intensities. This leads to a maximum-aposteriori (MAP) optimization problem:

C*=argmaxC[set membership]LmP(C[mid ]X),

where P(C|X) is the probability of the classification C given the image X. We assume that the probability of observing the intensity xi is dependent only on the tissue class ci and the site si. We write

P(C[mid ]X)=P(X[mid ]C)P(C)p(X)=1Z([product]i=1mp(xi[mid ]ci,si))P(C),

where p(X|C) is the probability of X when the classification is C, p(xi|ci, si) is the probability density of observing xi when the class at si is ci, P(C) is the prior probability of the classified image C and Z = p(X) is a normalization constant that is independent of C. By denoting the likelihood term by p(xi|ci, si), we emphasize that the likelihood term can have a different parametric form for different voxels.

We define the prior probabilities by the Gibbs distributions:

P(C)[proportional, variant]exp(U(C));U(C)=QUQ(ci:si[set membership]Q)

where U(·) is termed the energy function and Q [subset or is implied by] S denotes a clique (a single site or any subset of sites such that each site in the subset is contained in the neighborhood of every other location in the subset) in the graph (S, {Ni}). With certain assumptions (see [40]), the Gibbs distributions are equivalent to MRFs, and particularly, they satisfy the Markovian property of the MRFs stating that the prior probability of ci depends only on cj, sj [set membership] Ni. Denoting C[partial differential](i) = {c1, …, ci−1, ci+1, …, cm} and CNi = {cj: sj [set membership] Ni}, we can write

P(C)=P(ci[mid ]si,C[partial differential](i))P(C[partial differential](i))=P(ci[mid ]si,CNi)P(C[partial differential](i)).

We emphasize that P(ci|si, CNi) can be different for different si. These kinds of MRFs are termed inhomogeneous [38]. Strictly speaking, P(C[partial differential](i)) should also be conditioned on sites, but we omit this to avoid unnecessarily confusing notation. By combining (2) and (4) we get [39]

P(ci[mid ]si,X,C[partial differential](i))=1Zp(xi[mid ]ci,si)P(ci[mid ]si,CNi),

where Z′ is a normalization constant. The decompositions (4), (5) form a basis for widely applied techniques for solving the maximization problem in (1), namely iterated conditional modes (ICM) algorithm [39] and simulated annealing based techniques such as the Gibbs sampler [41]. Note that the decomposition (4) does not directly help in generating a suitable P(·). If one took an arbitrary P(·|si, CNi), for each i, then there might not exist a joint probability P(·): LmR generating these conditionals. Thus, MRFs are defined through Gibbs distributions and not through conditionals P(ci|si, CNi).

In this paper, we consider second order MRFs for simplicity although the concepts can be extended to higher order MRFs as well. The focus on the second order MRFs is because tissue classification methods within MRI typically rely on second order MRFs. The energy of a second order Gibbs distribution is

U(C)=si[set membership]S(G(ci[mid ]si)+12sj[set membership]NiH(ci,cj[mid ]si,sj)),

where singleton potentials G(·|si) and doubleton potentials H(·, ·|si, sj) can be freely selected. However, it is assumed that H(ci, cj|si, sj) = H(cj, ci|sj, si) for all i, j. Note that above we sum over sites rather than interactions. Approximately, G(·|si) defines the probability for an individual tissue class occurring at the site si and H(·, ·|si, sj) defines the probabilities of tissue classes at a site given the tissue classes at neighboring sites. With the energy given in (6) and using the relation 12(H(ci,cj[mid ]si,sj)+H(cj,ci[mid ]sj,si))=H(ci,cj[mid ]si,sj), the conditional is

P(ci[mid ]CNi)=exp(G(ci[mid ]si)sj[set membership]NiH(ci,cj[mid ]si,sj))ck[set membership]Lexp(G(ck[mid ]sk)sj[set membership]NkH(ck,cj[mid ]sk,sj)).

2.2. Sub volume probabilistic atlases

We apply Sub Volume Probabilistic Atlas (SVPA) to divide the image domain S into a set of sub-domains. It is important to note that, in our setting, the set of sub-domains does not coincide with the set of the tissue classes L. SVPAs provide information on the probability that a certain voxel in the MR image in the stereotactic space is part of a certain brain structure [42]. The structures can include any brain regions having a clear anatomical interpretation. There can also exist multiple types of divisions of the image domain. Here, we need to distinguish between two of them. The simplest type of subdivisions - tissue probability maps (TPMs) - give the probability that a certain voxel is of a certain tissue type (CSF, GM or WM) and this has been used for automatic tissue classification in a variety of settings (see e.g. [10, 23, 43]). We do not use this information to divide the image domain into sub-domains, however it is possible to derive G(ci|si) terms in (6) and (7) based on this information. The second level of sub-divisions in SVPAs provides information whether a certain voxel belongs to certain sub-structure. We use this information to divide the image domain into sub-domains. In our experiments, we divide the brain into 12 local domains: cerebellum, a sub-cortical domain (consisting of basal ganglia, thalamus, and ventricles), frontal lobe, occipital lobe, parietal lobe and temporal lobe, all divided into left and right hemispheres. This division is based on the ICBM (International Consortium for Brain Mapping) SVPA for young healthy adults [44]. For each voxel si [set membership] S and region (or domain) Rb, b = 1, …, B, the SVPA gives a probability ri(b) = P(si [set membership] Rb) that the voxel i belongs to the region Rb. We assume that b=1Bri(b)=1 for all i.

While the ICBM-SVPA divides GM into distinct sub-domains, more specifically into 20 cortical, sub-cortical or cerebellar GM structures, the ICBM-SVPA atlas does not provide similar divisions of WM and CSF. For example, in the ICBM-SVPA no distinction is made between WM in the temporal lobe and WM in the parietal lobe. Thus, we can extract ri(b) only for those voxels which are dominantly GM voxels in the atlas according to first-level (tissue type) subdivisions. For our purposes, it is imperative that we have ri(b) for every voxel, independent on its tissue type in the atlas, and we extend the ICBM-SVPA to cover also WM and CSF. In Appendix A, we present a simple procedure of extending GM SVPAs to atlases that divide WM and CSF into distinct brain regions. Obviously, this pragmatic solution takes away a part of the probabilistic interpretation of the extended atlas. However, this presents no real difficulties although some justifications are given based on the probabilistic interpretation of ri(b) as P(si [set membership] Rb).

2.3. Compound MRF based on local MRFs and SVPAs

It remains to be explained how to construct the inhomogeneous MRFs of Section 2.1 in practice. We use a set of homogeneous MRF-image models defined on local image sub-domains and construct a global MRF-based image model using these (we consider the likelihood term to be included within MRFs). The driving idea is that local MRFs describe the data better than a global MRF and hence allow more informed image segmentation. The sub-domains R1, …, RB are derived based on SVPAs and they must be overlapping, i.e. the SVPA must be probabilistic, as applying B distinct MRFs for the image segmentation would be likely to produce discontinuities in tissue classifications. It is better to combine these local MRFs into a single MRF and perform the segmentation with respect to this compound MRF. An additional benefit of the presented novel combination scheme is that it results in a global image model. We define regional likelihood terms pb(·|·) and the regional prior terms Pb(·) for b = 1, …, B. We interpret the regional likelihood via pb(xi|ci) = p(xi|ci, si [set membership] Rb) and similarly for the regional prior. We combine pb(·|·) (b = 1, …, B) and Pb(·) (b = 1, …, B), respectively, into a compound likelihood term and a compound prior term whose regions of support cover the whole image domain. Note that the regional likelihoods do not depend on the site si.

2.3.1. Likelihood term

We write the site likelihoods in the likelihood term of Eq. (2) as

p(xi[mid ]ci,si)=b=1Bri(b)pb(xi[mid ]ci).

Interpreting ri(b) as P(si [set membership] Rb), assuming that the events si [set membership] Rb, b = 1, …, B are mutually exclusive and assuming that ri(b) does not depend on ci, (8) can be derived as

p(xi[mid ]ci,si)=bp(xi,si[set membership]Rb[mid ]ci)=bP(si[set membership]Rb)pb(xi[mid ]ci,si[set membership]Rb)=b=1Bri(b)pb(xi[mid ]ci).

2.3.2. Prior term

The combination is not as straight-forward for the prior probability term as it was for the likelihood term. A mixture b=1Bri(b)Pb(C) is dependent on the site si (via ri(b)), and hence the compound prior term can not be derived this way. Averaging b=1Bri(b)Pb(C) over the all sites is not a feasible solution either since this would take away the Markovian property (4).

Instead, we give the compound energy via regional energies and then derive the required conditionals based on this energy. We denote the regional G and H functions by Gb and Hb, respectively. Now the compound energy function is

U(C)=si[set membership]S(ln(b=1Bri(b)exp(Gb(ci[mid ]si)))+12sj[set membership]Niln(b=1B(0.5rj(b)+0.5ri(b))exp(Hb(ci,cj[mid ]si,sj)))).

The compound energy function defines a second order MRF based on pairwise interactions, which leads to efficient computations. This order preservation property relies on the separate combination of the 1st order and 2nd order terms. The second order term in Eq. (9) is weighted by 0.5rj(b)+0.5ri(b) to preserve the symmetry of the 2nd order term in the energy function. This particular combination strategy is analogous to the combination of the local likelihood terms. We will show later on that it has a few interesting properties such as if all regional terms are equal then the energy in Eq. (9) reduces to the standard Gibbs energy in Eq. (6).

In this work, we assume that Hb(ci, cj|si, sj) = H(ci, ci) for all b. This is assumption is made for practical purposes: usefulness of generating different 2nd order MRF terms for different regions would depend on the application of tissue classification and is beyond of the scope of the current work. However, starting from a more general model (9) than absolutely necessary demonstrates the extensibility of the general principles we propose. Now, because Hb(ci, cj|si, sj) = H(ci, ci) and Σb ri(b) = 1 for all i, it follows that ln(b=1B0.5(ri(b)+rj(b))exp(H(ci,cj)))=H(ci,cj). Consequently, Eq. (9) simplifies to

U(C)=si[set membership]S(ln(b=1Bri(b)exp(Gb(ci[mid ]si)))+12sj[set membership]NiH(ci,cj)).

This is the MRF used in this paper.

The conditionals of Eq. (10) are given by (see Eq. (7)):

P(ci[mid ]si,CNi)[proportional, variant]exp(ln(b=1Bri(b)expGb(ci[mid ]si))sj[set membership]NiH(ci,cj))

=(b=1Bri(b)exp(Gb(ci[mid ]si)))(exp(sj[set membership]NiH(ci,cj)))

=b=1Bri(b)exp(Gb(ci[mid ]si)sj[set membership]NiH(ci,cj)).

The fact that H(ci, cj) = H(cj, ci) was used to simplify the conditionals.

The compound prior defined in Eq. (10) has two interesting properties:

  1. If we set Gb(ci|si) [equals, single dot above] G(ci|si) for all b, i.e. we apply the same prior models for each region, Eq. (10) reduces to Eq. (6) with no site dependence in the H-functions. In other words, the compound MRF prior reduces to the ordinary 2nd order MRF prior. This was one of the reasons for choosing the above strategy for constructing the compound prior.
  2. When we set H(ci, cj) = 0 for all ci, cj, i.e. the second order term in the MRF is ignored, the energy in Eq. (9) can be written as
    U(C)=si[set membership]Sln(b=1Bri(b)exp(Gb(ci[mid ]si))).
    We write Pb1st(ci[mid ]si)[equals, single dot above]exp(Gb(ci[mid ]si)) and assume that ci[set membership]LPb1st(ci[mid ]si)=1. Pb1st(·[mid ]·) are interpreted as regional prior terms. This yields the prior
    P(C)=[product]si[set membership]Sb=1Bri(b)Pb1st(ci[mid ]si)ci[set membership]Lb=1Bri(b)Pb1st(ci[mid ]si)=[product]si[set membership]Sb=1Bri(b)Pb1st(ci[mid ]si).
    Thus, if the higher order terms in the MRF are ignored, the combination strategy for priors is the mean rule in the terminology of the classifier combination. Also, this prior can be derived analogously to the derivation of the compound likelihood term.

2.3.3. Compound MRF and discussion

Substituting (3), (8), and (10) into (2) yields the posterior:

P(C[mid ]X)[proportional, variant][product]i=1m(b=1Bri(b)pb(xi[mid ]ci))×exp(si[set membership]Sln(b=1Bri(b)exp(Gb(ci[mid ]si)))12sj[set membership]NiH(ci,cj)),

which we maximize with respect to C. Before we specify the likelihood and prior terms in our implementation, we discuss briefly why we have selected this particular strategy of combining local image models into a global image model among several possibilities. The combination strategy we have chosen has a few desirable properties: 1) it results in a global image model constructed from local models; 2) it contains a component for modeling spatial dependency of the tissue classes of neighboring voxels; and 3) it results in a computationally feasible segmentation algorithm suitable for a desktop computer. For example, an appealing way of generating a global posterior would be to proceed in a reverse order to what we have done and first combine regional priors and likelihood terms into regional posteriors and then combine the regional posteriors into a global posterior model. However, for this strategy, it is not immediately clear how to define the regional priors containing a component that models the spatial interaction between tissue classes of neighboring voxels. Further, in general, the partition functions of the MRFs that might be required for this reverse strategy are impossible to compute exactly [38]. Of course, one could assume a single prior model describing the whole image domain. However, according to property 1 of the Section 2.3.2, this would lead to the exactly the same posterior P(C|X) as our scheme.

We still compare our combination scheme to an alternative scheme. In this alternative scheme, one would have B × K tissue classes (e.g. GM in occipital lobe) instead of K tissue classes and B sub-domains. The class of the voxel would be the class maximizing the posterior probability among B × K classes and K class classification would be obtained by selecting the class among K classes corresponding to the maximizing class in B ×K class classification (e.g. GM in occipital lobe would become GM). The priors for classes could be constructed by multiplying the priors for K classes with ri(b)s. This would be similar to the ‘max-rule’ in the classifier combination while our approach is more similar to ‘mean rule’ [45]. While this approach may seem simpler than the one proposed here, it has a few shortcomings which are both theoretical and practical. Theoretically, this might lead to problems of continuity in image segmentations if neighboring sub-domains have very different intensity characteristics. A practical difficulty arises if the likelihood values for each tissue class and voxel need to be stored for computational efficiency - this is the case for the algorithm introduced in the next section. Then, storage requirements would be B times higher than for our method which would be likely to complicate the segmentation of high resolution MR images on a desktop computer.

3. Methods

3.1. Algorithm overview

Our algorithm for tissue classification into CSF, GM, and WM using the compound MRF introduced in the previous section proceeds in two stages. First, the parameters for the image models (specified in Section 3.2) are estimated for each of B regions. Then, we substitute these estimated parameter values into (14) and solve the minimization problem with the Iterated Conditional Modes (ICM) algorithm [39]. The resulting algorithm is called SVPASEG.

Before any computations can be performed, the compound MRF must be fully specified. In Section 3.2, we specify the exact forms of the likelihood terms pb(·|ci) for all b and ci, and give a method for the image parameter estimation. In Section 3.3, Gb (the 1st order terms of the local MRFs) and H (2nd order term of the local MRFs) functions of the prior term are specified. In Section 3.4, the complete algorithm is summarized.

3.2. Image model

3.2.1. Likelihood term

The same parametric image model, which follows the mixel model [46], is applied for all regions Rb. However, this model is assumed to have different parameter values for each region. Our image model does not model the instrumentation non-uniformity artifact within a region.

The images are composed of two kinds of voxels: those that contain only one type of tissue (pure voxels) and those that contain multiple types of tissues (partial volume (PV) voxels). The pure voxel classes are WM, GM, and CSF. The PV classes are CSF/GM, WM/GM, and CSF/background. The WM/CSF class is not considered due to a very small number of WM/CSF voxels in normal human brain MRI. In each region Rb, it is assumed that the intensities of the pure voxels follow the normal density:

pb(x[mid ]μkb,σkb2,lk)=(2πσkb2)1/2exp(xμkb)22σkb2

where μkb is the mean and σkb2>0 is the variance of the class lk in the region Rb. The thermal noise in magnitude MR images is Rician, but with signal to noise ratios typical to anatomical MRI this Rician distribution can be well approximated by Gaussian distribution [47]. The pdfs for PV voxels are constructed by marginalization [48]. We assume that there are no more than two tissue types present in a voxel. The pdfs of PV voxels are dependent on the parameters of the appropriate pure voxel classes. For the mixture of tissue types lu and lv, the pdf of the PV class is

pb(x[mid ]lk,θub,θvb)=01exp((xwμub(1w)μvb)22(w2σub2+(1w)2σvb2))(2π)(w2σub2+(1w)2σvb2)dw,

where θub=[μub,σub2]. The derivation of Eq. (16) presented in [49] starts from the assumption that each voxel is divided between different tissue types which is slightly different from the starting assumption of [48]. To simplify later notation, we denote θkb = [θub, θvb] when lk is a mixture of lu and lv.

3.2.2. Parameter estimation

We formulate the estimation of the image parameters as finite mixture model (FMM) fitting problem. Rather than solving the image parameters μkb, σkb2 for all regions simultaneously, we formulate a weighted likelihood (WL) approximation for solving them for each region separately. Let Θb={αkb,μkb,σkb2;k=1,,K}, where the mixing parameter αkb represents the proportion of the tissue type lk in the region b, denote the set of parameters to be estimated. The parameter estimation problem can now be posed as

Θb*=argmaxΘb[product]i=1m(lk[set membership]Lαkbpb(xi[mid ]θkb,lk))ri(b)=argmaxΘbi=1mri(b)ln(lk[set membership]Lαkbpb(xi[mid ]θkb,lk))

subject to k=1Kαkb=1. The optimization problem (17) is solved by a novel real coded genetic algorithm (GA) [13]. This algorithm is global in the sense that it is not sensitive to its initialization and it effectively avoids getting caught in weak local optima. This is important because the parameter estimation for FMMs often leads to complex optimization problems with many local optima. In this work, for even more accurate parameter estimates, we run the GA 10 times and select the parameters with the highest WL value as our estimate.

The tissue type parameters are estimated for each region separately rather than for all regions together partly because of computational considerations and partly because there could exist severe local optimum problems (see [50]) if trying to estimate all the parameters for all regions at once. The WL formulation is motivated by the idea that in the ICBM SVPA we have a number of voxels that belong to only one region Rb, i.e. ri(b) = 1 and ri(b′) = 0 for b′ ≠ b. Therefore, in principle, it could be possible to estimate image parameters for each region by just sampling these voxels. However, this would lead to ignoring a large number of voxel intensities and complicate the estimation of αkb. The concept of WL can be used to obviate these problems [51]. In the WL formulation, each data point is weighted according to its relevance to the estimation task. In our case, the relevance of xi to the estimation of parameters for Rb is given by ri(b). It is interpreted that the basic sample to estimate parameters for Rb is {xi: rb(i) = 1}, and intensities xi with 0 < ri(b) < 1 give complementary information to the estimation process. We refer to [51] for further details on the WL estimation.

The WL formulation departs from the assumptions made for the constructing of the compound MRF. However, as stated above, estimating the parameters for all regions simultaneously would be likely to produce many additional local optima. Indeed, an expectation-maximization (EM) type algorithm can be derived to solve the parameter estimation problem for all regions simultaneously using an FMM-formulation of the site likelihoods in Eq. (8). We have implemented such an algorithm and experimentally found that it was dependent on initial conditions to an extent that the robustness of the automatic segmentation would be compromised. Further, in Section 4.2, we demonstrate that the optimization of the WL objective (17) provides accurate parameter estimates for our purpose and thus this approximation serves well our purposes.

It is important for computational reasons that the WL function can be approximated by a Kullback-Leibler divergence similarly to the approximation of the ordinary likelihood presented in [13]. The only difference to the analysis presented in [13] is that the Parzen density estimate is replaced by a weighted version. This offers a considerable (up 10000 fold) speed-up for GA. A similar consideration is also valid for other parameter estimation algorithms.

3.3. Prior term

We introduce algorithms equipped with two different choices for the prior term, or more particularly, the Gb terms in the prior. SVPASEG-TPM is to be used with images of poor quality or/and low resolution, and SVPASEG-PVE is to be used with high quality and resolution images.

  • In the SVPASEG-TPM algorithm, if ci is a pure tissue class (i.e. if ci [set membership] {CSF, GM, WM}), we set Gb(ci|si) = − ln Patlas(ci|si), where Patlas(ci|si) is the probability of a class ci at the voxel si drawn from a tissue probability map (TPM, see section 2.2). Gb(ci) = ∞ if ci is a mixed tissue class. Hence, this algorithm cannot classify voxels to mixed tissue classes although the partial volume effect is accounted during the parameter estimation. We use the tissue probability maps provided in the ICBM SVPA that we used in the regional parcellation of the image domain.
  • In the SVPASEG-PVE algorithm, we obtain Gb based on the estimated mixture model parameters. This is done by setting Gb(ci)=lnαkb* when ci = lk, where αkb* is the WL estimate of αkb. This choice is valid if the regions are relatively small and the estimates αk* are accurate. The small size of regions is required because then it can be assumed αkb in the mixture model models the prior probability of the voxel si belonging to the class lk.

The prior of the SVPASEG-TPM algorithm makes it more robust to noise and other image imperfections partly due to its use of TPMs and partly due its ability to decide the pure tissue class for each voxel during the optimization of the MRF. SVPASEG-PVE algorithm requires an additional re-classification step for mixed voxels if the interest is in the segmentation into CSF, GM, and WM. We do not use any information besides the voxel intensity value for this re-classification. This makes it more vulnerable to image noise but, at the same time, this reduces the chances that the MRF term oversmooths the segmentation. Also, the estimates of the partial volume coefficients at each voxel are available based on this algorithm (see [49]). However, the focus of the current work is in crisp tissue classification.

We apply an extended Potts model similar to that suggested in [16] to model the functions H(·, ·). It is defined by


where β is a user tunable parameter, d(si, sj) is the distance between the centers of voxels si and sj, and


We set β = 0.05 in all experiments unless otherwise mentioned. This value is based on our experience with this algorithm and our earlier work on MRI segmentation.

3.4. Segmentation algorithm

We now summarize the complete segmentation algorithm. We assume that the atlas consisting of probabilities ri(b), i = 1, …, m, b = 1, …, B and the 2nd order term for the MRF are given. Pre-processing can include intensity non-uniformity correction, and in practice we assume that the images have been skull-stripped. Given that these preliminaries are met, the algorithm segmenting the brain image into WM, GM and CSF is as follows.

  1. For each region Rb, b = 1, …, B, estimate parameters αkb and the parameter vectors θkb, k = 1, …, K by maximizing (17).
  2. Find the MAP segmentation C* by maximizing Eq. (14), where regional likelihood terms and priors are given in sections 3.2 and 3.3, respectively. The optimization is performed with the ICM algorithm [39].
  3. For the SVPASEG-PVE, assign PV voxels into a pure tissue class. This is done by estimating the amount of each tissue type within a voxel as in [49]. The voxel is then assigned to the pure tissue type that is dominant in the voxel. This step is not needed for the SVPASEG-TPM algorithm as it does not allow the PV classes during the step 2.

While the ICM algorithm is a local optimization method, it has favorable properties compared to the global methods such as the Gibbs sampler in addition to computational efficacy: Besag [39] argued that because the MRFs are constructed based on local considerations, it can be better to solve the resulting MAP segmentations locally due to often unknown properties of the global MAP estimate. We note that a number of powerful optimization approaches for MRFs based, for example, on graph cuts and belief propagation have surfaced during recent years (c.f. [52] for an overview and an experimental comparison). However, as these algorithms have not been widely applied to MRI segmentation and the main purpose of this work is to introduce a novel MRF-based image model, we prefer a standard ICM-based optimization strategy.

4. Experiments and results

4.1. Performance metrics

We evaluated our segmentation methods with several datasets for which the ground-truth segmentations are available. Each of these datasets served a specific purpose for our evaluation. The experiments reported in Sections 4.2 and 4.3 depict the performance of SVPASEG in the presence of biological and instrumentation non-uniformity, respectively. The experiments reported in Section 4.4 depict the success of SVPASEG with MR images that correspond to the typical image quality in current neuroscience research.

The Jaccard coefficient1 between the ground-truth and automatic segmentation was used as the performance measure. It was computed separately for each tissue class (CSF, GM, WM). The Jaccard coefficient between the sets U and V is defined as |UV |/|U [union or logical sum] V | [53]. The value 1.0 corresponds to the perfect match and the value 0.0 corresponds to no match. The Dice coefficient defined as 2|UV |/(|U| + |V |) is an alternative to the Jaccard coefficient. The Dice coefficient has also been applied in the evaluation of tissue classification methods. When it is necessary for comparison purposes, we report both Dice and Jaccard coefficient values. There exists a bijective correspondence between the Jaccard and Dice coefficients [16]. However, since this correspondence is non-linear, comparing the average of Jaccard coefficients to the average of Dice coefficients is not feasible.

4.2. Biological non-uniformity: Experiments with simulated data

4.2.1. Material

We simulated MR images while modeling the natural variation of T1 and T2 relaxation times across the brain to study the performance of our method when no instrumentation non-uniformity exists. The anatomical model was constructed based on the BrainWeb [54] and AAL atlases [55]. The BrainWeb atlas consists of a set of 3-D fuzzy tissue membership volumes, one for each tissue class, thus modeling the partial volume effect. To model the T1 and T2 variability, BrainWeb’s GM volume was further divided into 22 (12 cortical, 8 sub-cortical, and 2 cerebellar) membership volumes using the AAL atlas, which was created by manually segmenting a simulated T1-weighted BrainWeb image. Note that AAL atlas is different than the atlas we use for dividing the brain into local domains. Different T1 and T2 relaxation times, drawn from literature, were assigned for these 22 GM regions [26, 27, 29], thus modeling the biological non-uniformity. See Appendix B for details. Proton density (PD) values for different tissue types as well as T1 and T2 for other types of tissue than GM were the same as used in the BrainWeb simulations, see

Given the T1, T2, PD values modeling the acquisition at 1.5T, the signal intensity I was generated using the steady-state equation for the spin-echo sequence with TR of 550 ms and TE of 15ms: I = PD(1 −exp(−TR/T1))exp(−TE/T2). The signal intensities were generated separately for 22 GM structures, WM, CSF, and extra-cerebral structures. The noise-free intensity of each voxel was then generated by multiplying the tissue membership value for each structure at the voxel by the signal intensity of the structure and summing these over all structures. Next, Rician noise was added to the images by adding white Gaussian noise to the real and imaginary parts of the signal and computing the complex modulus. The noise standard deviation was relative to the white matter intensity value. The noise levels were 1%, 3%, 5%, 7% and 9%. Examples of the simulated images are displayed in Fig. 1. Ten noise realizations were generated from each image. The image dimensions were 181 × 217 × 181 with the voxel size of 1mm3.

Figure 1
Examples of axial cross-sections of the simulated images.

4.2.2. Experimental settings

We compared our local MRFs based method (SVPASEG) to the corresponding tissue classification algorithm based on the homogeneous global model (MRFSEG) to investigate if the use of local MRFs improves the segmentation accuracy; MRFSEG was based on the homogeneous second order MRF with the image model parameters estimated by the same GA as used in the SVPASEG method. The MRFSEG algorithm can be formed from the SVPASEG algorithm by a modification of the atlas: in the global MRF-SEG algorithm, the atlas has just one region (the brain) and ri(1) = 1 for every voxel within the brain. We considered both MRFSEG-PVE and MRFSEG-TPM defined as their SVPASEG counterparts (see Section 3.3). For all algorithms, non-brain tissues were removed from consideration based on the ground truth anatomy. As the bias field was not simulated, the bias correction was not performed.

We segmented the images using the unified segmentation algorithm of the SPM5 software package ([10], This algorithm was chosen as the reference method because it is widely applied to MR brain image tissue classification. It incorporates tissue classification, nonlinear stereotactic registration, and non-uniformity correction into a single algorithm. We used two sets of parameters for the algorithm: the default parameters (SPM5), and parameters that discourage the bias field correction (SPM5 no BFC, bias regularization was set to 10). The second parameter set was used because no instrumentation bias was simulated and thus it might be that bias correction would have negative effect to the results. We tested a few other parameter settings, but the default parameter settings appeared to be the best. The ground truth brain mask was used to prune the SPM5 segmentation, which incorporates the brain extraction into the same algorithm.

We subjected the resulting Jaccard coefficients to graphical analysis and statistical hypothesis testing. The hypothesis testing was performed using SOCR: Statistics Online Computational Resource[56]. All p-values we report were based on the (two paired sample) sign test. This test was selected because it is robust with respect to distributional assumptions and the Jaccard coefficient values cannot be assumed to be normally or even symmetrically distributed.

4.2.3. Results

The Jaccard coefficient values averaged over 10 noise realizations are presented in Table 1. Example segmentations are shown in Fig. 2. The results of the SVPASEG algorithms were better than those of the MRFSEG counterpart when the noise level was 1% (one sided p < 10−7) or 3% (one sided p < 10−5). The p-values were computed using the two paired sample sign test and pooling all noise realizations, tissue types, and algorithm types resulting in 60 pairs of Jaccard coefficients. When the noise level was 7% or 9%, the situation was reversed (for both noise levels p < 10−9). This was not surprising since when the noise level increases, the biological variations in image intensities become smaller compared to the level of the instrumentation noise, and the increased complexity of the SVPASEG model (compared to that of MRFSEG) causes more errors in the parameter estimation. Also, the estimated noise variances were typically lower with SVPASEG than with MRFSEG giving more weight to the prior term with MRFSEG. Increasing β would be likely to eliminate the difference between MRFSEG and SVPASEG with higher noise levels.

Figure 2
Examples of axial slices of segmentation results. a) Ground truth; b) SVPASEG-TPM, σ = 1%; c) SVPASEG-PVE, σ = 1%; d) SVPASEG-TPM, σ = 5%; e) SVPASEG-PVE, σ = 5%.
Table 1
The Jaccard coefficients for the simulated images.

The performance of SVPASEG-PVE was superior to SVPASEG-TPM at the lowest noise level (this was evident based on the graphical, line-chart based analysis of the Jaccard coefficient values and no hypothesis test was required; see also Fig. 2 b) and c); ). When the noise level was 3%, no considerable differences in the performance of the algorithms were noticed. When the noise level was higher than 3%, SVPASEG-TPM, which only considers PVE during the parameter estimation, was the best choice. Similar considerations were valid for MRFSEG algorithms. This was not surprising since the PV voxel reclassification step does not use spatial prior information and the inaccuracy of the fractional content estimation increases with the noise level. Thus, it can be expected that SVPASEG-PVE is more vulnerable to noise near the tissue type boundaries than SVPASEG-TPM (see Fig. 2 d) and e)). Conversely, when the noise level was low, the Potts model based MRF without PV classes slightly over-regularized the segmentation near the tissue type boundaries. This effect appeared to be avoided by the MRF including PV classes.

The Jaccard coefficients of SPM5 [10] are also listed in Table 1. We limit our discussion to the comparison of the SPM5 and SVPASEG-TPM because both use tissue probability maps as an aid to classification. SVPASEG-TPM outperformed both SPM5 methods (in both cases one-sided p < 10−13, 150 pairs of samples). This serves as a clear indication that in certain settings using SVPASEG for tissue classification is advantageous.

We studied the accuracy of image parameters estimated by optimizing the WL objective. Based on the anatomical ground-truth, we computed the true image parameters (means and variances for each pure tissue class) and the correct proportions of each tissue type within each of the 12 brain regions with the procedure described in [49]. The averaged (over all domains, noise realizations, and tissue types) absolute error in the estimated tissue class means varied between 0.52 (1% noise) to 2.83 (9% noise) when the image intensities were in the range [0,255]. To illustrate the effect of these parameter estimation errors to the segmentation results, we compared SVPASEG-TPM equipped with true tissue parameters instead of the estimated ones. The average improvement in the Jaccard coefficient was only 0.002 (averaged over all tissue types, noise realizations and noise levels). Thus, we can conclude that the parameter estimation was sufficiently accurate for our purposes.

With these high resolution synthetic images, the time consumption of our C implementation of the SVPASEG algorithm was below 30 minutes for a single image. The time consumptions are reported for a IBM xSeries 335 with 3.0 GHz Intel Xeon processor running on Linux.

4.3. Instrumentation non-uniformity: The IBSR dataset

4.3.1. Material

The proposed tissue classification method was tested with the IBSR (Internet Brain Segmentation Repository) dataset. The IBSR dataset comprises 20 normal MR brain data sets and their semiautomatic segmentations. These are provided by the Center for Morphometric Analysis at Massachusetts General Hospital and are available at We call this dataset IBSR1 to distinguish between it and a more recent dataset, IBSR2, used in the next subsection. These images have been used to evaluate several tissue classification methods. This dataset is especially suitable for the evaluation of the capability of our method to reduce the non uniformity artifacts, since some of the images contain quite high levels of intensity non uniformity. The dimensions for the images were typically 256 × 256 × 62. The voxel size was 1mm × 1mm × 3mm.

4.3.2. Experimental settings

Preprocessing of the images consisted of correction of inter-slice intensity non-uniformity [57] and the brain extraction by the Brain Surface Extractor (BSE) [16]. The ICBM atlas was registered to the images with an affine transformation [58] and resampled to the image dimensions. We considered two sets of images: the first set was corrected for the 3D intensity non-uniformity by the N3 algorithm [15] and the second set images was not corrected for the 3D intensity non-uniformity. We will abbreviate the former set as N3 set and the latter set as No INUC set. Only SVPASEG-TPM and MRFSEG-TPM algorithms were considered due to the low image resolution. The SVPASEG-PVE and MRFSEG-PVE algorithms classify the PV voxels, numerous due to low resolution, without the use of spatial information that is expected to degrade the quality of segmentation results. In addition, intermediate PV classifications were not expected to be particularly useful with these low resolution images.

4.3.3. Results

The average Jaccard coefficient values are listed in Table 2. SVPASEG-TPM achieved the mean Jaccard coefficient values of 0.698 for GM and 0.685 for WM when the preprocessing included the N3 non-uniformity correction and 0.698 for GM and 0.675 for WM when the non-uniformity correction was skipped.

Table 2
The average Jaccard similarity coefficients between automatic and manual segmentation results over the IBSR dataset for various algorithms. With SVPASEG algorithms, N3 refers to the results with N3 non-uniformity correction and No INUC refers to the results ...

MRFSEG-TPM’s mean Jaccard coefficients were 0.695 for GM and 0.673 for WM (with N3 non-uniformity correction). It is not reasonable to test MRFSEG without non-uniformity correction as a preprocessing step since this algorithm assumes that the tissue class intensities follow the same model across the brain. The Jaccard coefficients of individual images for MRFSEG and SVPASEG algorithms are visualized in Fig. 3. Especially with the images containing large intensity non-uniformity artifacts, the results with SVPASEG-TPM were better than with MRFSEG-TPM (see Fig. 3C for an example). Surprisingly, MRFSEG-TPM’s Jaccard coefficients for GM were in some cases better than SVPASEG-TPM’s Jaccard coefficients when both used N3-correction. This did not occur with WM Jaccard coefficients indicating that this lowering was probably due to differences in CSF/GM boundary between the methods (see Fig. 3D for an example). No notable differences between the GM Jaccard coefficient values of SVPASEG-TPM N3 and SVPASEG-TPM No Inuc were observed. For some images, WM Jaccard coefficients were lower for the No Inuc set indicating that the N3 correction was useful.

Figure 3
The GM (A) and WM (B) Jaccard coefficients of MRFSEG-TPM, SVPASEG-TPM (N3), and SVPASEG-TPM (No Inuc). The coefficients ordered so that the GM Jaccard coefficient for MRFSEG-TPM is increasing. Panel C: an example slice of the segmentation results of an ...

Several authors have evaluated their methods for the tissue classification using the IBSR1 dataset. The average Jaccard coefficient values of recent methods are provided in Table 2 along with the average Jaccard coefficient values of the MRFSEG and SVPASEG methods. The Jaccard coefficients included in Table are by Adaptive MAP (A-MAP, [37]), BrainSuite [16], MPM-MAP [23]; PVE-TMCD [49], and CGMM [59]. The average Jaccard coefficient values in Table 2 are as reported in the respective papers except for A-MAP which were obtained from The pre-processing choices (skull stripping, non-uniformity correction, whether they use TPMs) of the algorithms varied. The skull-stripping was implemented in the same way as here with the BrainSuite whereas for the other methods skull stripping method was different. For example, the PVE-TMCD results were based on the brain extraction using the ground-truth brain mask. The MPM-MAP used tissue probability maps to aid the tissue classification similarly to SVPASEG-TPM and MRFSEG-TPM whereas the other methods did not. SVPASEG-TPM produced a higher average Jaccard coefficient values than any of the competing methods for both GM and WM when N3 non-uniformity correction was included in the pre-processing.

The tissue classification algorithms from widely used software packages SPM5 [10] and the FSL’s FIRST [60] were recently evaluated with these data [61]. The Dice coefficient was used as the performance measure. For GM (WM measures were not provided in [61]), the average Dice coefficients were 0.790 (SPM5) and 0.756 (FIRST) while for our methods these were 0.822 (SVPASEG-TPM and N3), 0.821 (SVPASEG-TPM No Inuc) and 0.818 (MRFSEG-TPM).

With the IBSR1 dataset, the processing time of our C implementation of the SVPASEG algorithm was less than 10 minutes for a single image (excluding preprocessing).

4.4. Anatomical variation: the IBSR dataset version 2.0

4.4.1. Material

The IBSR project has also made available a second dataset at This IBSR2 dataset consists of 18 T1-weighted MR brain images and their manual segmentations. The image size was 256 × 256 × 128 with the voxel dimensions varying between 0.84×0.84×1.5 mm3 and 1×1×1.5mm3. The images in the IBSR2 dataset are of better quality than the images in the original IBSR1 dataset and they correspond to the typical image quality in current neuroscience research. The 18 subjects represented a wide age range from 7 years to 71 years of age. Thus, considerable inter-subject differences in the anatomy existed. The images have been ‘positionally normalized’ into the Talairach orientation (rotation only) and non-uniformity corrected by the CMA ‘autoseg’ biasfield correction routines.

4.4.2. Experimental settings

The pre-processing included brain extraction by BSE and affine registration of the ICBM atlas to the images by MNI’s autoreg routines [58] followed by resampling of the atlas to image dimensions. No further bias field correction was done.

We evaluated the segmentation results using both Jaccard and Dice metrics. The Jaccard metric was used to provide a comparison to the other experiments in this work. The Dice metric was used for comparing our method to other recent tissue classification methods evaluated using this data and the Dice similarity metric.

4.4.3. Results

The average quantitative results are displayed in Table 3. The results of SVPASEG and MRFSEG are compared to the CGMM method [59], the EMS method [20], and the Awate method [62]. The results of the CGMM and Awate methods were extracted from the respective papers and the results of the EMS method were extracted from [59].

Table 3
The quantitative results with IBSR2 dataset.

The SVPASEG algorithms produced higher average Dice coefficients than the corresponding MRFSEG algorithms. This can be seen as evidence that the SVPASEG indeed complements the existing nonuniformity correction methods or copes better with the biological nonuniformity. We tried to apply N3 non-uniformity correction to the images, but this degraded the image quality and subsequently the segmentation results, sometimes considerably. The SVPASEG-PVE algorithm was the best of the algorithms, thus the PVE-model for the tissue classification appears to be advantageous for these images. The algorithms using the atlas prior (SVPASEG-TPM and MRFSEG-TPM) produced better segmentations of the GM and worse segmentations of the WM than the others. This was mainly because two reasons: the atlas prior resulted in less GM classified as CSF and more WM classified as GM. The first effect increased the Dice coefficient for GM while the second one decreased the Dice coefficient for WM.

The SVPASEG algorithms produced better average Dice coefficients than the competing methods with a single exception - SVPASEG-TPM’s average Dice coefficient for WM was lower than the Awate’s method. The increase in the average Dice coefficient for SVPASEG-PVE compared to competing methods was 3.6% - 6.4% for GM and 0.4% - 5.2% for WM.

5. Discussion

We have introduced a novel framework for tissue classification in brain MRI founded on local, MRF-based image models. The driving idea was that modeling image content locally rather than globally can provide more accurate classification results. There were two main reasons for this argument. First, natural biological variation of tissue types in different parts of the brain can cause problems for traditional tissue classification schemes where the whole image is assumed to have the same intensity characteristics after the intensity non-uniformity correction. Second, we hoped to cope better with large scale intensity non-uniformity artifacts that may not be completely eliminated by correction algorithms. To meet the aim of locally modeling image content, we introduced the concept of the local MRFs. These local MRFs were then combined to create an MRF-model describing the whole volume. Our framework included modeling of the partial volume effect and spatial dependency of the tissue classes of neighboring voxels. The regions of support for local image models were derived based on SVPAs and they were defined as overlapping fuzzy sets. Thus the resulting global MRF model is spatially continuous.

We evaluated our algorithm with respect to both of the reasons for designing it. The evaluation related to the biological variation was performed based on realistic synthetic images so that we could control the level of biological non-uniformity. The SVPASEG algorithm was shown to provide more accurate segmentation than the corresponding global image model based tissue classification algorithm (MRFSEG) when the noise level was reasonable. Obviously, it is not the case that SVPASEG would lead to improvements in the classification accuracy when the level of instrumentation noise significantly exceeds the level of biological non-uniformity and this was confirmed by the experiments. The evaluation against the instrumentation non-uniformity was performed based on the IBSR1 dataset. This dataset has two favorable features for this evaluation: many tissue classification algorithms have been evaluated using it and it contains images with strong non-uniformity artifacts. In this experiment, SVPASEG-TPM performed favorably compared to MRFSEG-TPM, and particularly, SVPASEG-TPM gave slightly better results than other recent algorithms evaluated using this dataset. We additionally performed an evaluation with IBSR2 dataset comprising of 18 high resolution MRIs of the subjects from a wide age range. The SVPASEG algorithms outperformed the MRFSEG algorithms and a few recent tissue classification methods with these data.

There exist several approaches for brain MRI tissue classification which combine an MRF modeling of the voxel labelling to a non-stationary intensity model accounting for the instrumentation non-uniformity. Many of these, for example [60] and [21], model the intensity non-uniformity as spatially smoothly varying multiplicative noise. The assumption about the multiplicative nature of intensity non-uniformity was criticized in [23] and the authors presented a segmentation algorithm (MPM-MAP) which does not evoke this assumption. However, the smoothness of the non-uniformity was assumed within a tissue class. Measurement noise was modeled with a single noise term which was independent on the tissue-class. A non-parametric, MRF-based approach for tissue classification was suggested in [62] using an adaptive model of image neighborhoods that avoids the explicit modeling of the bias field. However, in [62] no particular attention was drawn to the biological non-uniformity. While these two algorithms [23, 62] are partly based on similar considerations as ours, there are important differences. First, we consider the intensity non-uniformity to arise both from the MR imaging methodology and from biological variations of the tissue composition across the human brain. The biological non-uniformity can violate the assumptions that the intensity non-uniformity is spatially smooth or that the noise level is constant all over the image. Also, there is a certain interplay between the instrumentation and biological non-uniformities in MRI [63]. Second, our approach models the partial volume effect in such a way that allows the estimation of the fractions of the tissue types in each voxel. It would be straightforward to generate the tissue fraction maps with SVPASEG-PVE although we concentrated on hard segmentation in this work. Third, our approach allows the use of different explicit MRF-prior models for different brain regions. This might be useful when tissue classification in a certain brain region is of interest. For example, WM/GM segmentation of the occipital cortex, required in vision research, has been reported to be especially difficult for automatic algorithms. A specific prior for accounting particular features of this brain region could be designed to alleviate these problems. We also evaluated SVPASEG with the same data as MPM-MAP [23] (IBSR1) and Awate’s method [62] (IBSR2). SVPASEG-TPM, which is the most relevant version of the algorithm to compare to MPM-MAP, produced higher average Jaccard coefficients than MPM-MAP. Both versions of SVPASEG produced a higher mean Dice coefficient (averaged over all subjects and both tissue types) than Awate’s method.

We have introduced two versions of the SVPASEG algorithm in this work. SVPASEG-PVE used partial volume (PV) classes during the tissue classification. This was noted to be advantageous with low noise high resolution images. The advantage probably followed from the ability of the algorithm to treat PV voxels (mostly existing at the tissue type boundaries) differently from pure tissue classes and decide their final (pure) class based on the intensity value only. As expected, when the noise level increased or the image quality otherwise degraded, including PV classes into the tissue classification became disadvantageous and SVPASEG-TPM, modeling PV effect only during the parameter estimation, outperformed SVPASEG-PVE. The SVPASEG-TPM algorithm also included an SPM-style tissue probability map prior, which is expected to provide more robustness to the segmentation when the images are of poor quality.

The parameter estimation for local intensity models was performed relying on FMMs and weighted likelihood formulation. These challenging optimization problems were solved by a novel GA [13]. The GA, as a global optimization method, is not overly sensitive to its initialization and can provide accurate parameter estimates even when the optimization problem is difficult. The FMM based method was selected in part by its generality. For example, it would be possible to apply the SVPASEG method for tissue classification in brain MRI from other species in addition to human brain MRI. Indeed, algorithms for tissue classification of human brain MRI are not typically directly suitable for tissue classification in mouse brain MRI as we have demonstrated [13]. We experimentally confirmed with the synthetic MR images that the classification accuracy improves only minimally if the true values of the tissue type parameters were used instead of the parameter values estimated by the GA. This confirmation served two related purposes. It demonstrated that the heuristic weighted likelihood formulation leads to accurate tissue-class parameter estimates, and more importantly, it appears that no major improvements to the tissue classification accuracy can be gained by an alternate parameter estimation approach. However, if the interest lies in partial volume estimation (estimation of tissue fraction maps) instead of the tissue classification, we have previously found that application specific parameter estimation techniques offer an advantage over general techniques [64]. A minor disadvantage of the specialized parameter estimation techniques such as those based on outlier detection [64, 49] is that the estimation of the number of voxels of a particular tissue type, required for the SVPASEG-PVE algorithm, would not be possible.

In summary, we have proposed a framework that makes less assumptions about the sources and nature of intensity non-uniformity within brain MRI than many of the existing methods for tissue classification and we have supplied computational tools to support this framework. This tissue classification method can be combined with an existing non-uniformity correction method if that is considered to be beneficial. We have supplied experimental evidence about the increased segmentation accuracy of the general framework proposed in several settings. Dividing the brain into 12 sub-domains, as we have done, may not be optimal. It is obvious that the performance can be improved if a specific division is utilized. A more precise sub-division based on an analysis of the biological and instrumentation intensity non-uniformities and their combined effects to tissue classification could bring further improvements to the tissue classification results. However, more precise sub-divisions might be dependent on a particular pulse sequence or scanner and advantages they offer would be more dependent on the accuracy of stereotactic registration than with the coarse sub-division used here.

Supplementary Material



Thanks to Grace Liang for her help in typing. The SVPASEG and MRF-SEG algorithms are available at

This work is funded by the National Institutes of Health through the NIH Roadmap for Medical Research, Grant U54 RR021813. Information on the National Centers for Biomedical Computing can be obtained from This work was supported by research grants P41 RR013642 and R01 MH071940 (funded by the NIMH). J. Tohka’s work was supported by the Academy of Finland (application number 129657, Finnish Programme for Centres of Excellence in Research, 2006-2011) and the University Alliance Finland Cluster of Excellence STAT-CORE.

Appendix A: Extending gray matter SVPA to white matter and CSF regions

We describe the procedure used to extend to gray matter ICBM SVPA to cover also white matter and CSF regions. Denote the set of voxels in the brain by BM and the set of voxels belonging to a GM region (in the ICBM-SVPA) with probability greater than zero by GM. Our purpose is generate a SVPA {ri(b), si [set membership] BM, b = 1, …, B} covering whole brain given a partial SVPA { ri*(b), si [set membership] GM, b = 1,…,B} such as ICBM SVPA. First, we create an initial SVPA by setting

ri0(b)=ri*(b)ifsi[set membership]GM;ri0(b)=0otherwise.

Given this initialization we then proceed by evolving the fields { ri0(b)} by the 3D heat equation for all b = 1, …, B separately. Note that this amounts to iterated Gaussian smoothing [65]. The evolution is terminated when all rit(b)>0 for some b. Then, we obtain our SVPA by setting

ri(b)=ri*(b)ifsi[set membership]GM;ri(b)=rit(b)b=1Brit(b)otherwise.

Appendix B: T1 and T2 values for simulated images

T1 and T2 relaxation times in ms used in the simulation are provided in table 4. T1 values reported in [27] at 3.0T were converted to corresponding values at 1.5T using the relationship T1 = a(fq), where f is the Larmor frequency of the scanner, and a and q are constants [66]. The constants were solved assuming that the T1 values at frontal lobe should be equal in [26] and [27]. T2 values should theoretically be independent on the field strength. However, considerable differences in T2 values between [29] at 1.5T and [27] at 3.0T were observed; these were corrected for in the same way as T1 relaxation times.

Table 4
T1 and T2 relaxation times used in the simulation in ms.


1The Jaccard coefficient is also known as the Tanimoto coefficient in image processing literature.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


1. Toga A, Mazziotta J, editors. Brain Mapping: The Methods. 2. Academic Press; San Diego: 2002.
2. Dale A, Fischl B, Sereno M. Cortical surface-based analysis. I. segmentation and surface reconstruction. NeuroImage. 1999;9:179–194. [PubMed]
3. MacDonald D, Kabani N, Avis D, Evans A. Automated 3-D extraction of inner and outer surfaces of cerebral cortex from MRI. Neuroimage. 2000;12(3):340–356. [PubMed]
4. Zeng X, Staib L, Schultz R, Duncan J. Segmentation and measurement of the cortex from 3-D MR images using coupled-surfaces propagation. IEEE Trans Med Imaging. 1999;18(10):927–937. [PubMed]
5. Xu C, Pham D, Rettman M, Yu D, Prince J. Reconstruction of the human cerebral cortex from magnetic resonance images. IEEE Trans Med Imaging. 1999;18(6):467–480. [PubMed]
6. Shattuck D, Leahy R. BrainSuite: An automated cortical surface identification tool. Med Image Anal. 2002;6:129–142. [PubMed]
7. Kim JS, Singh V, Lee JK, Lerch J, Ad-Dab’bagh Y, Mac-Donald D, Lee JM, Kim SI, Evans AC. Automated 3-d extraction and evaluation of the inner and outer cortical surfaces using a laplacian map and partial volume effect classification. Neuroimage. 2005;27(1):210–221. [PubMed]
8. Warfield SK, Kaus M, Jolesz FA, Kikinis R. Adaptive, template moderated, spatially varying statistical classification. Med Image Anal. 2000;4(1):43–55. [PubMed]
9. Fischl B, Salat DH, van der Kouwe AJ, Makris N, Segonne F, Quinn BT, Dale AM. Sequence-independent segmentation of magnetic resonance images. NeuroImage. 2004;23:S69–S84. [PubMed]
10. Ashburner J, Friston K. Unified segmentation. NeuroImage. 2005;26:839–851. [PubMed]
11. Pohl K, Fisher J, Grimson W, Kikinis R, Wells W. A bayesian model for joint segmentation and registration. Neuroimage. 2006;31:228–239. [PubMed]
12. Heckemann RA, Hajnal JV, Aljabar P, Rueckert D, Hammers A. Automatic anatomical brain mri segmentation combining label propagation and decision fusion. Neuroimage. 2006;33(1):115–126. [PubMed]
13. Tohka J, Krestyannikov E, Dinov ID, MacKenzie-Graham A, Shattuck DW, Ruotsalainen U, Toga AW. Genetic algorithms for finite mixture model based voxel classification in neuroimaging. IEEE Trans Med Imaging. 2007;26(5):696–711. [PMC free article] [PubMed]
14. Belaroussi B, Milles J, Carme S, Zhu Y, Benoit-Cattin H. Intensity non-uniformity correction in mri: Existing methods and their validation. Med Image Anal. 2006;10:234–246. [PubMed]
15. Sled JG, Zijdenbos AP, Evans AC. A non-parametric method for automatic correction of intensity non-uniformity in MRI data. IEEE Trans Med Imaging. 1998;17(1):87–97. [PubMed]
16. Shattuck DW, Sandor-Leahy SR, Schaper KA, Rottenberg DA, Leahy RM. Magnetic resonance image tissue classification using a partial volume model. Neuroimage. 2001;13(5):856–876. [PubMed]
17. Likar B, Viergever MA, Pernus F. Retrospective correction of mr intensity inhomogeneity by information minimization. IEEE Trans Med Imaging. 2001;20(12):1398–1410. [PubMed]
18. Manjon JV, Lull JJ, Carbonell-Caballero J, Garcia-Marti G, Marti-Bonmati L, Robles M. A nonparametric mri inhomogeneity correction method. Med Image Anal. 2007;11(4):336–345. [PubMed]
19. Wells W, III, Grimson W, Kikinis R, Jolesz FA. Adaptive segmentation of MRI data. IEEE Trans Med Imaging. 1996;15(4):429–442. [PubMed]
20. Van Leemput K, Maes F, Vandermeulen D, Suetens P. Automated model-based tissue classification of MR images of the brain. IEEE Trans Med Imaging. 1999;18(10):897–908. [PubMed]
21. Van Leemput K, Maes F, Vandermeulen D, Suetens P. Automated model-based bias field correction of mr images of the brain. IEEE Trans Med Imaging. 1999;18(10):885–896. [PubMed]
22. Sled J, Pike G. Understanding intensity non-uniformity in mri. Medical Image Computing and Computer Assisted Intervention (MIC-CAI98), Lecture notes in computer science. 1998;1496:615–622.
23. Marroquin J, Vemuri B, Botello S, Calderon F, Fernandez-Bouzas A. An accurate and effcient bayesian method for automatic segmentation of brain MRI. IEEE Trans Med Imaging. 2002;21(8):934–945. [PubMed]
24. Fischl B, Salat D, Busa E, Albert M, Dietrich M, Haselgrove C, van der Kouwe A, Killany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale A. Whole brain segmentation: Automated labeling of neuroanatomical structures in the human brain. Neuron. 2002;33:341–355. [PubMed]
25. Amato U, Larobina M, Antoniadis A, Alfano B. Segmentation of magnetic resonance brain images through discriminant analysis. J Neurosci Methods. 2003;131:65–74. [PubMed]
26. Steen R, Reddick W, Ogg R. More than meets the eye: significant regional heterogeneity in human cortical T1. Magn Reson Imaging. 2000;18:361–368. [PubMed]
27. Wansapura JP, Holland SK, Dunn RS, Ball WS. Nmr relaxation times in the human brain at 3.0 tesla. J Magn Reson Imaging. 1999;9(4):531–538. [PubMed]
28. Cho S, Jones D, Reddick W, Ogg R, Steen R. Estabilishing norms for age-related changes in proton t1 of human brain tissue in vivo. Magn Reson Imaging. 1997;15:1133–1143. [PubMed]
29. Supprian T, Hofmann E, Warmuth-Metz M, Franzek E, Becker T. MRI t2 relaxation times of brain regions in schizophrenic patients and control subjects. Psychiatry Res. 1997;75(3):173–182. [PubMed]
30. Warntjes JBM, Dahlqvist O, Lundberg P. Novel method for rapid, simultaneous t1, t*2, and proton density quantification. Magn Reson Med. 2007;57(3):528–537. [PubMed]
31. Grabowski TJ, Frank RJ, Szumski NR, Brown CK, Damasio H. Validation of partial tissue segmentation of single-channel magnetic resonance images of the brain. Neuroimage. 2000;12(6):640–656. [PubMed]
32. Kovacevic N, Lobaugh NJ, Bronskill MJ, Levine B, Feinstein A, Black SE. A robust method for extraction and automatic segmentation of brain images. Neuroimage. 2002;17(3):1087–1100. [PubMed]
33. Nocera L, Gee J. Robust partial volume tissue classification of cerebral MRI scans. SPIE Medical Imaging 1997: Image Processing. 1997:312–322.
34. Pappas T. An adaptive clustering algorithm for image segmentation. IEEE Transactions on Signal Processing. 1992;40:901–914.
35. Yan M, Karp J. Segmentation of 3D brain MR using an adaptive k-means clustering algorithm. 1994 Nuclear Science Symposium and Medical Imaging Conference; 1994. pp. 1529–1533.
36. Yan M, Karp J. An adaptive bayesian approach to three-dimensional MR brain segmentation. Proceedings of Information Processing in Medical Imaging (IPMI95) 1995:201–213.
37. Rajapakse JC, Giedd JN, Rapoport JL. Statistical approach to segmentation of single-channel cerebral MR images. IEEE Trans Med Imaging. 1997;16(2):176–186. [PubMed]
38. Li Z. Markov random fields in computer vision. Springer Verlag; 1995.
39. Besag J. On the statistical analysis of dirty pictures. J R Stat Soc, Ser B. 1986;48(3):259–302.
40. Kaiser M, Cressie N. The construction of multivariate distributions from markov random fields. J Multivariate Analysis. 2000;73:199–220.
41. Geman S, Geman D. Stochastic relaxation, Gibbs distributions and the Bayesin restoration of images. IEEE Trans Patt Anal Mach Intell. 1984;6(6):721–741. [PubMed]
42. Mega M, Dinov I, Thompson P, Manese M, Lindshield C, Moussai J, Tran N, Olsen K, Felix J, Zoumalan C, Woods R, Toga A, Mazziotta J. Automated brain tissue assessment in the elderly and demented population: Construction and validation of a sub-volume probabilistic brain atlas. NeuroImage. 2005;26(4):1009–1018. [PubMed]
43. Kamber M, Shinghal R, Collins DL, Francis GS, Evans AC. Model-based 3-D segmentation of multiple sclerosis lesions in magnetic resonance brain images. IEEE Transactions in Medical Imaging. 1995;14(3):442–453. [PubMed]
44. Mazziotta J, Toga AW, Evans AC, et al. A four-dimensional probabilistic atlas of the human brain. J Am Med Inform Assoc. 2001;8:401–430. [PMC free article] [PubMed]
45. Kittler J. Combining classifiers: A theoretical framework. Pattern Analysis and Applications. 1998;1:18–27.
46. Choi HS, Haynor DR, Kim Y. Partial volume tissue classification of multichannel magnetic resonance images - a mixel model. IEEE Trans Med Imag. 1991;10(3):395–407. [PubMed]
47. Gudbjartsson H, Patz S. The rician distribution of noisy mri data. Magn Reson Med. 1995;34(6):910–914. [PMC free article] [PubMed]
48. Santago P, Gage HD. Statistical models of partial volume effect. IEEE Trans Image Process. 1995;4(11):1531–40. [PubMed]
49. Tohka J, Zijdenbos A, Evans A. Fast and robust parameter estimation for statistical partial volume models in brain MRI. NeuroImage. 2004;23(1):84–97. [PubMed]
50. McLachlan G, Peel D. Finite Mixture Models. Wiley; New York: 2000.
51. Hu F, Zidek J. The weighted likelihood. The Canadian Journal of Statistics. 2002;30:347–371.
52. Szeliski R, Zabih R, Scharstein D, Veksler O, Kolmogorov V, Agarwala A, Tappen M, Rother C. A comparative study of energy minimization methods for markov random fields with smoothness-based priors. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2008;30:1068–1080. [PubMed]
53. Jackson D, Somers K, Harvey H. Similarity coefficients: Measures of co-occurrence and association or simply measures of occurrence? The American Naturalist. 1989;133(3):436–453.
54. Collins DL, Zijdenbos A, Kollokian V, Sled J, Kabani N, Holmes C, Evans AC. Design and construction of a realistic digital brain phantom. IEEE Trans Med Imaging. 1998;17(3):463–468. [PubMed]
55. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M. Automated anatomical labeling of activations in spm using a macroscopic anatomical parcellation of the mni mri single-subject brain. Neuroimage. 2002;15(1):273–289. [PubMed]
56. Dinov I. Statistics online computational resource. Journal of Statistical Software. 2006;16:1–16. [PMC free article] [PubMed]
57. Zijdenbos A, Dawant BM, Margolin RA. Intensity correction and its effect on measurement variability in the computer-aided analysis of MRI. Proc. of the 9th Internation Symposium and Exhibition on Computer Assisted Radiology(CAR); 1995. pp. 216–221.
58. Collins DL, Neelin P, Peters TM, Evans AC. Automatic 3d inter-subject registration of mr volumetric data in standardized talairach space. J Comput Assist Tomogr. 1994;18(2):192–205. [PubMed]
59. Greenspan H, Ruf A, Goldberger J. Constrained gaussian mixture model framework for automatic segmentation of mr brain images. IEEE Trans Med Imaging. 2006;25(9):1233–1245. [PubMed]
60. Zhang Y, Brady M, Smith S. Segmentation of brain mr images through a hidden random markov field model and the expectation-maximization algorithm. IEEE Trans Med Imaging. 2001;20(1):45–57. [PubMed]
61. Tsang O, Gholipour A, Kehtarnavaz N, Gopinath K, Briggs R, Panahi I. Comparison of tissue segmentation algorithms in neuroimage analysis software tools. Engineering in Medicine and Biology Society; EMBS 2008. 30th Annual International Conference of the IEEE; 2008; 2008. pp. 3924–3928. [PubMed]
62. Awate SP, Tasdizen T, Foster N, Whitaker RT. Adaptive markov modeling for mutual-information-based, unsupervised mri brain-tissue classification. Med Image Anal. 2006;10(5):726–739. [PubMed]
63. Vovk U, Pernus F, Likar B. A review of methods for correction of intensity inhomogeneity in mri. IEEE Trans Med Imaging. 2007;26(3):405–421. [PubMed]
64. Manjon J, Tohka J, Garcia-Marti G, Carbonell-Caballero J, Lull J, Marti-Bonmati L, Robles M. Robust MRI brain tissue parameter estimation by multi-stage outlier detection. Magn Reson Med. 2008;59:866–873. [PubMed]
65. Koenderink J. The structure of images. Biol Cybern. 1984;50:363–370. [PubMed]
66. Bottomley PA, Foster TH, Argersinger RE, Pfeifer LM. A review of normal tissue hydrogen nmr relaxation times and relaxation mechanisms from 1–100 mhz: dependence on tissue type, nmr frequency, temperature, species, excision, and age. Med Phys. 1984;11(4):425–448. [PubMed]