Neuroinformatics. Author manuscript; available in PMC 2018 January 1.
PMCID: PMC5438876

Metric Learning for Multi-atlas based Segmentation of Hippocampus

Hancan Zhu,1 Hewei Cheng,2 Xuesong Yang,3 Yong Fan,4 and for the Alzheimer's Disease Neuroimaging Initiative*


Abstract

Automatic and reliable segmentation of the hippocampus from MR brain images is of great importance in studies of neurological diseases, such as epilepsy and Alzheimer’s disease. In this paper, we propose a novel metric learning method to fuse segmentation labels in multi-atlas based image segmentation. Different from current label fusion methods, which typically adopt a predefined distance metric model to compute a similarity measure between image patches of atlas images and the image to be segmented, we learn a distance metric model from the atlases that keeps image patches of the same structure close to each other while separating those of different structures. The learned distance metric model is then used to compute the similarity measure between image patches in the label fusion. The proposed method has been validated for segmenting the hippocampus on the EADC-ADNI dataset, which provides manually labelled hippocampi for 100 subjects. The experimental results demonstrate that our method achieved a statistically significant improvement in segmentation accuracy compared with state-of-the-art multi-atlas image segmentation methods.

Keywords: Multi-atlas image segmentation, Hippocampus segmentation, Metric learning, Label fusion


Introduction

The hippocampus is an important subcortical structure whose function is associated with learning and memory (den Heijer et al., 2012). Volumetric analysis of the hippocampus based on magnetic resonance imaging (MRI) has been widely adopted in studies of neurological diseases, such as epilepsy (Akhondi-Asl et al., 2011) and Alzheimer’s disease (Wolz et al., 2014). However, manual segmentation of the hippocampus from MRI brain images is time consuming (Carmichael et al., 2005) and suffers from high intra-operator and inter-operator variability (Chupin et al., 2007). Therefore, automatic and reliable segmentation of the hippocampus from MR brain images has been an active research topic in medical image analysis.

In the last decade, multi-atlas based image segmentation (MAIS) methods have been developed and widely adopted in studies of the hippocampus segmentation (Warfield et al., 2004, Heckemann et al., 2006, Artaechevarria et al., 2009, Dill et al., 2014, Iglesias and Sabuncu, 2015). A typical MAIS method consists of three steps: atlas image selection, atlas image registration, and segmentation label fusion. In the atlas image selection step, a subset of atlas images is selected for a given target image based on a pre-defined measurement of anatomical similarity, usually according to image intensities, e.g., sum of squared differences, correlation, or mutual information (Aljabar et al., 2009, Xie and Ruan, 2014, Yan et al., 2015). In the atlas image registration step, the spatial correspondence between each atlas image and the target image is determined and the atlas images and their corresponding label maps are aligned to the target image (Lötjönen et al., 2010, Doshi et al., 2015). Finally, in the segmentation label fusion step, the warped label maps are fused to get a consensus label map for the target image (Warfield et al., 2004, Artaechevarria et al., 2009, Coupé et al., 2011, Hao et al., 2014).

Although a variety of atlas image selection strategies and different image registration techniques can be adopted in an MAIS method, existing MAIS methods are typically characterized by their label fusion strategies. Among the existing label fusion strategies, weighted voting label fusion methods have attracted considerable attention. Assuming that the image registration from atlas images to the target image is reliable, traditional weighted voting label fusion strategies combine the corresponding labels based on predefined weighting models (Rohlfing et al., 2004, Heckemann et al., 2006, Artaechevarria et al., 2009, Sabuncu et al., 2010). The simplest method is majority voting, which assigns a constant weight to all atlases (Rohlfing et al., 2004, Heckemann et al., 2006). Better segmentation performance can be obtained with more sophisticated voting strategies, such as local weighted voting with an inverse similarity metric (Artaechevarria et al., 2009) and local weighted voting with a Gauss similarity metric (Sabuncu et al., 2010). It has been shown that local weighted voting strategies outperform global methods in segmenting high-contrast structures, whereas global techniques are less sensitive to noise when contrast between neighboring structures is low (Artaechevarria et al., 2009). Some of the weighted voting label fusion methods can be seen as special cases of a probabilistic generative model (Sabuncu et al., 2010).

Due to inter-subject anatomical variability, the registered atlas images are not always aligned with the target image perfectly. The image registration errors may hamper the label fusion if it is based on local image similarity measures under the assumption that a voxel-to-voxel correspondence exists between the atlas images and the target image. Such a problem can be effectively overcome by nonlocal patch based weighted voting methods (Coupé et al., 2011, Rousseau et al., 2011). In the nonlocal patch based weighted voting methods, all voxels in a search region of each warped atlas image are selected, and patches centered at these voxels are extracted as image patches. Voting weights are then computed according to the intensity similarities between the atlas image patches and the target image patch.

Many approaches have been proposed to obtain weighting coefficients for improving segmentation accuracy and robustness of the nonlocal patch based weighted voting methods, for example reconstruction based methods (Liao et al., 2013, Wu et al., 2014) and joint label fusion (JLF) method (Wang et al., 2013). Reconstruction based methods computed the reconstruction coefficients of the target patch from a patch library by sparse representation (Liao et al., 2013) or local independent projection (Wu et al., 2014), and then used them to combine atlas labels to label the target voxel. Since different atlases may produce similar label errors (Wang et al., 2013), the JLF method minimized the total expectation of labeling error by explicitly modeling pair-wise dependency between atlases as a joint probability of two atlases that make similar segmentation errors.

The existing MAIS methods typically measure the similarity of image patches based on the Euclidean distance metric. However, the Euclidean distance metric is not necessarily optimal for the label fusion since it does not characterize any statistical distributions of image intensities in the patches. The statistical distributions of image intensities could be estimated from the atlas images and their associated segmentation labels, but might vary at different image locations. It has been reported that patches with similar intensity values may have different segmentation labels, which will lead to segmentation errors in MAIS methods (Bai et al., 2015). To overcome this problem, we present a kernel classification method for metric learning such that image patches of the same structure stay close to each other and those of different structures are separated. With the obtained metric, we develop an optimal nonlocal weighted voting label fusion method. We have validated the proposed method for segmenting the hippocampus from MRI brain images, and compared our method with state-of-the-art MAIS techniques, including the majority voting method (MV) (Rohlfing et al., 2004, Heckemann et al., 2006), local weighted voting with an inverse similarity metric (LW-INV) (Artaechevarria et al., 2009), local weighted voting with a Gauss similarity metric (LW-GU) (Sabuncu et al., 2010), nonlocal patch based weighted voting with a Gauss similarity metric (NLW-GU) (Coupé et al., 2011, Rousseau et al., 2011), local label learning (LLL) (Hao et al., 2014), and the JLF method (Wang et al., 2013). The experimental results have demonstrated that our method could achieve better segmentation performance than the state-of-the-art MAIS methods.

Materials and Methods

Image Dataset

The proposed algorithm was validated for segmenting the hippocampus on the first release of the EADC-ADNI dataset, consisting of MRI scans and their corresponding hippocampus labels for 100 subjects. These images were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, and the subjects come from 3 diagnosis groups: normal controls (NC), mild cognitive impairment (MCI), and patients with Alzheimer’s disease (AD).

The Principal Investigator of the ADNI is Michael W. Weiner, MD, VA Medical Center and University of California-San Francisco. ADNI is the result of efforts of many co-investigators from a broad range of academic institutions and private corporations, and subjects have been recruited from over 50 sites across the U.S. and Canada. The initial goal of ADNI was to recruit 800 adults, ages 55 to 90, to participate in the research: approximately 200 cognitively normal older individuals to be followed for 3 years, 400 people with MCI to be followed for 3 years, and 200 people with early AD to be followed for 2 years. For up-to-date information, see the ADNI website.

Each of the MRI brain images was manually labeled according to a harmonized protocol (Boccardi et al., 2015). All images were processed using a standard preprocessing protocol, including alignment along the line passing through the anterior and posterior commissures of the brain (AC-PC line) and bias field correction, and were then warped into the MNI152 template space using linear image registration with an affine transformation. We randomly selected 40 subjects as the training set and the other 60 subjects as the testing set. Clinical scores and demographic information of these subjects are summarized in Table 1.

Table 1
Demographic data and clinical scores of the subjects.

Metric Learning for Multi-atlas based Image Segmentation

Given a target image I, and N atlases Ãi = (Ĩi, Li), i = 1,2, …, N, where Ĩi is the i-th image and Li is its segmentation label with value 1 indicating foreground and 0 indicating background, the multi-atlas segmentation method registers each atlas image Ĩi to the target image and propagates the corresponding segmentation Li to the target space, resulting in N warped atlases Ai = (Ii, Li), i = 1,2, …, N. Then, it infers the label of each voxel of the target image from the warped atlases. Figure 1 shows a flowchart for segmenting an image with a typical multi-atlas image segmentation method.

Figure 1
The flowchart for segmenting a target image with the multi-atlas based image segmentation method.

Identification of a bounding box of hippocampus

Since all images were aligned to the MNI152 template using linear image registration with an affine transformation and resampled to a voxel size of 1×1×1 mm³, a bounding box can be identified for both the left and the right hippocampus that covers the hippocampus of an unseen target image. In particular, we scan all the atlases to find the minimum and maximum x, y, and z positions of the hippocampus and add a margin of 7 voxels in each direction to cover the hippocampus of unseen testing images.
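
As a concrete sketch, the bounding box computation described above amounts to scanning the atlas label maps for the extreme foreground coordinates and padding by the margin. The following minimal Python illustration is not the authors' MATLAB implementation; the function name and array conventions are ours.

```python
import numpy as np

def hippocampus_bounding_box(label_maps, margin=7):
    """Compute a bounding box covering the hippocampus across all atlases.

    label_maps: list of 3-D binary arrays (1 = hippocampus) aligned in the
    MNI152 template space; margin: voxels added in each direction.
    """
    lo, hi = None, None
    for lab in label_maps:
        coords = np.argwhere(lab > 0)  # (n, 3) voxel indices of the foreground
        lo = coords.min(0) if lo is None else np.minimum(lo, coords.min(0))
        hi = coords.max(0) if hi is None else np.maximum(hi, coords.max(0))
    shape = np.asarray(label_maps[0].shape)
    lo = np.maximum(lo - margin, 0)            # clip the margin at the volume
    hi = np.minimum(hi + margin, shape - 1)    # boundaries
    return tuple(zip(lo.tolist(), hi.tolist()))  # ((x0,x1),(y0,y1),(z0,z1))
```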

Atlas selection and Image Registration

For each target image, we select the top 20 most similar atlases based on the normalized mutual information (NMI) between the target image and the atlas images within the bounding box (Hao et al., 2014). After the atlas selection, we register each atlas image to the target image using a nonlinear, cross-correlation-driven image registration algorithm, namely ANTs (Avants et al., 2008), with the following command: ANTS 3 -m CC[target.nii,source.nii,1,2] -i 100x100x10 -o output.nii -t SyN[0.25] -r Gauss[3,0]. The nonlinear registration was applied to the image blocks within the bounding box.
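
The NMI-based ranking can be sketched with a joint-histogram estimate of mutual information. This is a hedged Python illustration, not the authors' code: the histogram bin count and the NMI form (H(a) + H(b)) / H(a, b) are our assumptions, and the arrays are taken to be already cropped to the bounding box.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=64):
    """NMI(a, b) = (H(a) + H(b)) / H(a, b) from a joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)

    def entropy(p):
        p = p[p > 0]                      # drop empty bins (0 log 0 = 0)
        return -np.sum(p * np.log(p))

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())

def select_atlases(target, atlas_images, n_select=20):
    """Rank atlases by NMI with the target and keep the top n_select."""
    scores = [normalized_mutual_information(target, atlas) for atlas in atlas_images]
    order = np.argsort(scores)[::-1]      # highest NMI first
    return order[:n_select].tolist()
```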

Initial segmentation with majority voting

To reduce the computational cost, we adopt the majority voting based label fusion to obtain an initial segmentation result of the target image. For each voxel, the output of the majority voting label fusion is a probability value of the voxel belonging to the hippocampus. The segmentation result of voxels with 100% certainty (probability value of 1 or 0) can be directly taken as the final segmentation result (Hao et al., 2014). Then, our method is applied to voxels with probability values greater than 0 and smaller than 1.
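
The initialization step above can be sketched as follows: averaging the warped binary label maps gives a per-voxel foreground probability, unanimous voxels are finalized, and the remaining ambiguous voxels are passed to the proposed label fusion. A minimal Python illustration with our own function name:

```python
import numpy as np

def majority_vote_initialization(warped_labels):
    """Average the warped binary label maps into a foreground probability map,
    and mark the voxels the atlases fully agree on (probability 0 or 1)."""
    prob = np.mean(np.stack(warped_labels, axis=0), axis=0)
    certain = (prob == 0.0) | (prob == 1.0)
    initial = (prob >= 0.5).astype(np.uint8)  # final label wherever `certain`
    ambiguous = ~certain                      # voxels left for label fusion
    return initial, ambiguous
```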

Training patch library construction

To label a voxel of the target image, a set of voxel-wise training samples is identified from the warped atlases. Since the registered atlas images are not always aligned with the target image perfectly, we adopt the nonlocal patch based label fusion framework to construct a training library of image patches (Coupé et al., 2011, Rousseau et al., 2011). For labeling a target voxel, voxels in a cube-shaped searching neighborhood V with size (2rs + 1) × (2rs + 1) × (2rs + 1) of each atlas image are selected, and patches centered at these voxels are extracted and vectorized to form a patch library P = [p1, p2, …, pn], where n = N · (2rs + 1)3 is the number of selected patches. The segmentation label of each image patch’s center voxel is used as the image patch’s label, li, i = 1,2, …, n. Thus, we construct a training dataset Δ = {(pi, li)|i = 1,2, …, n}, where pi is the i-th image patch in the patch library P and li is the label of its center voxel.
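
The library construction can be sketched for a single target voxel: for every voxel within the search radius rs in each warped atlas, a cubic patch of radius rp is vectorized and paired with its center label. A minimal Python illustration (names are ours; the center voxel is assumed to lie far enough inside the bounding box that all patches fit):

```python
import numpy as np

def build_patch_library(warped_atlases, center, rp=1, rs=1):
    """Collect vectorized patches (radius rp) around every voxel within a
    search radius rs of `center` in each warped atlas, labelled by the
    segmentation label of each patch's center voxel.

    warped_atlases: list of (image, label) pairs of identical 3-D shape.
    Returns P with shape (n_patches, patch_dim) and the label vector l.
    """
    cx, cy, cz = center
    patches, labels = [], []
    for image, label in warped_atlases:
        for dx in range(-rs, rs + 1):
            for dy in range(-rs, rs + 1):
                for dz in range(-rs, rs + 1):
                    x, y, z = cx + dx, cy + dy, cz + dz
                    patch = image[x - rp:x + rp + 1,
                                  y - rp:y + rp + 1,
                                  z - rp:z + rp + 1]
                    patches.append(patch.ravel())
                    labels.append(label[x, y, z])
    return np.array(patches), np.array(labels)
```

With N atlases this yields the n = N · (2rs + 1)³ patches described in the text.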

Metric learning

Learning a distance metric from training samples is an important machine learning topic. Many methods have been proposed to learn distance/similarity metrics (Xing et al., 2002). Among them, learning a Mahalanobis distance metric for k-nearest neighbor classification has been successfully applied to many computer vision problems (Guillaumin et al., 2009). In this study, we adopt a supervised metric learning method to learn a Mahalanobis distance metric from the training dataset of image patches (Wang et al., 2015).

Given any two samples (pi, li) and (pj, lj) from the training dataset Δ, we obtain a doublet (pi, pj) with a label h, where h = −1 if li = lj, and h = 1 otherwise. For each training sample pi, we find its m1 nearest similar neighbors, denoted by {p_{i,1}^s, …, p_{i,m1}^s}, and its m2 nearest dissimilar neighbors, denoted by {p_{i,1}^d, …, p_{i,m2}^d}, and construct (m1 + m2) doublets:

(p_i, p_{i,j}^s) with label h = −1, j = 1, 2, …, m1, and (p_i, p_{i,j}^d) with label h = 1, j = 1, 2, …, m2.

By collecting all possible doublets, we build a doublet set, denoted by {z1, …, zNd}, where zj = (p_{j,1}, p_{j,2}), j = 1,2, …, Nd, and the label of zj is denoted by hj. Given the doublet set {z1, …, zNd}, we use a kernel method to learn a classifier

g(z) = Σ_{j=1}^{Nd} h_j α_j K(z_j, z) + b,

where zj is the j-th doublet, hj is its label, z = (p_{k1}, p_{k2}) is a testing doublet, and K(·,·) is a degree-2 polynomial kernel, defined as

K(z_j, z) = ((p_{j,1} − p_{j,2})^T (p_{k1} − p_{k2}))².

Then, we have

g(z) = (p_{k1} − p_{k2})^T M (p_{k1} − p_{k2}) + b,

where M = Σ_{j=1}^{Nd} h_j α_j (p_{j,1} − p_{j,2})(p_{j,1} − p_{j,2})^T is the matrix to be learned in the Mahalanobis distance metric. Once M is obtained, the kernel decision function g(z) can be used to determine whether p_{k1} and p_{k2} are similar or dissimilar to each other.

To learn M in the Mahalanobis metric, we adopt a support vector machine (SVM) model:

min_{M, b, ξ} (1/2)‖M‖F² + C Σ_{j=1}^{Nd} ξ_j
s.t. h_j ((p_{j,1} − p_{j,2})^T M (p_{j,1} − p_{j,2}) + b) ≥ 1 − ξ_j, ξ_j ≥ 0, j = 1, 2, …, Nd,

where ‖·‖F is the Frobenius norm. The Lagrange dual problem of the above doublet-SVM model is

max_{α} Σ_{j=1}^{Nd} α_j − (1/2) Σ_{i=1}^{Nd} Σ_{j=1}^{Nd} h_i h_j α_i α_j K(z_i, z_j)
s.t. Σ_{j=1}^{Nd} h_j α_j = 0, 0 ≤ α_j ≤ C, j = 1, 2, …, Nd.
The optimization problem can be solved using SVM solvers. In the current study, we implemented the metric learning method based on LibSVM and metric learning codes (Chang and Lin, 2011, Wang et al., 2015).
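
As a concrete sketch of this pipeline, the doublet construction and the dual problem can be handed to a generic SVM solver through a precomputed kernel on the difference vectors d_j = p_{j,1} − p_{j,2}. The snippet below is a minimal Python illustration, not the authors' LibSVM/MATLAB implementation: it substitutes scikit-learn's SVC for LibSVM, uses m1 = m2 = 1 Euclidean nearest neighbors, and all function names are ours.

```python
import numpy as np
from sklearn.svm import SVC

def build_doublets(P, l, m1=1, m2=1):
    """Pair each patch with its m1 nearest same-label neighbors (h = -1)
    and its m2 nearest different-label neighbors (h = +1); return the
    difference vectors d = p1 - p2 and the doublet labels h."""
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)        # a patch is not its own neighbor
    diffs, hs = [], []
    for i in range(len(P)):
        same = np.array([j for j in range(len(P)) if l[j] == l[i] and j != i])
        other = np.array([j for j in range(len(P)) if l[j] != l[i]])
        for j in same[np.argsort(dist[i, same])][:m1]:
            diffs.append(P[i] - P[j])
            hs.append(-1)
        for j in other[np.argsort(dist[i, other])][:m2]:
            diffs.append(P[i] - P[j])
            hs.append(+1)
    return np.array(diffs), np.array(hs)

def learn_metric(P, l, C=1.0):
    """Doublet-SVM sketch: solve the dual with a standard SVM solver via the
    precomputed kernel K(z_i, z_j) = (d_i^T d_j)^2, then recover
    M = sum_j h_j alpha_j d_j d_j^T from the dual coefficients."""
    D, h = build_doublets(P, l)
    K = (D @ D.T) ** 2                    # degree-2 polynomial kernel
    svm = SVC(kernel="precomputed", C=C).fit(K, h)
    M = np.zeros((P.shape[1], P.shape[1]))
    for coef, idx in zip(svm.dual_coef_[0], svm.support_):
        M += coef * np.outer(D[idx], D[idx])  # coef equals h_j * alpha_j
    return M
```

Since SVC stores h_j α_j for the support vectors in `dual_coef_`, the accumulation loop reproduces the expression for M given above.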

To ensure that M is positive semi-definite, we compute a singular value decomposition M = UΛV, and preserve only the positive singular values in Λ to form another diagonal matrix Λ+. Then, we let M+ = UΛ+V.
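
A small numpy sketch of this projection: since the learned M is symmetric, the decomposition can be taken as an eigendecomposition, and truncating negative values yields the closest positive semi-definite matrix in Frobenius norm. The function name is ours.

```python
import numpy as np

def project_to_psd(M):
    """Zero out the negative eigenvalues of the (symmetrized) learned matrix,
    returning a positive semi-definite M+."""
    M = (M + M.T) / 2                 # symmetrize against numerical rounding
    w, U = np.linalg.eigh(M)
    return (U * np.maximum(w, 0)) @ U.T
```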

Label fusion with the learned metric

With the learned Mahalanobis distance metric M, we obtain a new metric space by introducing a norm ‖x‖M = √(x^T M x), and the distance between two samples is defined by d(x, y) = ‖x − y‖M.

Given a target image patch px and training image patches pi, i = 1,2, …, n, we compute their distances by

d(px, pi) = ‖px − pi‖M = √((px − pi)^T M (px − pi)).

According to these distances, we select the k nearest training samples {(psj, lsj)|j = 1,2, …, k} to form a nearest neighborhood set 𝒩k(px) and assign their similarity weights to be one, and all others to be zero:

w(px, pi) = 1 if pi ∈ 𝒩k(px), and w(px, pi) = 0 otherwise.

Then, we use L̂(x) = (Σ_{i=1}^{n} w(px, pi) li) / (Σ_{i=1}^{n} w(px, pi)) to compute the target voxel’s label. Finally, the estimated label L̂(x) is thresholded to obtain a binary segmentation label: L(x) = 1 if L̂(x) > 0.5, and L(x) = 0 otherwise.

Our label fusion method is essentially a k-nearest neighbor (k-NN) classification method. In weighted voting label fusion, two strategies are available: single-point estimation and multi-point estimation. In the single-point estimation strategy, the label estimated from each image patch is applied to its center voxel. In the multi-point estimation strategy, the label estimated from each patch is applied to all voxels covered by the patch itself (Rousseau et al., 2011, Wang et al., 2013, Sanroma et al., 2015). Since each voxel then receives multiple estimated labels from the image patches that cover it, majority voting over these estimates can be adopted to compute the final segmentation label.
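
The fusion step can be sketched compactly for the single-point variant: compute the Mahalanobis distances under the learned M, let the k nearest training patches vote with weight one, and threshold the mean vote. This is a hedged Python illustration with our own names, not the authors' implementation.

```python
import numpy as np

def fuse_labels(p_x, P, l, M, k=9):
    """k-NN label fusion under the learned Mahalanobis metric:
    d(x, p_i)^2 = (x - p_i)^T M (x - p_i); the k nearest training patches
    vote with weight one, and the mean vote is thresholded at 0.5."""
    diffs = P - p_x
    d2 = np.einsum("ij,jk,ik->i", diffs, M, diffs)  # squared distances
    nearest = np.argsort(d2)[:k]
    l_hat = l[nearest].mean()                       # estimated soft label
    return int(l_hat > 0.5), l_hat
```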


Results

We optimized the parameters of our method on the training dataset, and then evaluated the segmentation performance on the testing dataset. We adopted 9 evaluation measures to assess the image segmentation results (Jafari-Khouzani et al., 2011). Denoting by A the manual segmentation, by B the automated segmentation, and by V(X) the volume of a segmentation X, these evaluation measures include:

Dice = 2V(A ∩ B) / (V(A) + V(B)),

Jaccard = V(A ∩ B) / V(A ∪ B),

HD = max{ max_{e ∈ ∂A} min_{f ∈ ∂B} d(e, f), max_{f ∈ ∂B} min_{e ∈ ∂A} d(e, f) },

HD95: similar to HD, except that the 5% of data points with the largest distances are removed before calculation,

RMSD = √((D_A² + D_B²) / (card{∂A} + card{∂B})), where D_A² = Σ_{e ∈ ∂A} (min_{f ∈ ∂B} d(e, f))² and D_B² is defined analogously.

In the above definitions, ∂A denotes the set of boundary voxels of A, d(·,·) is the Euclidean distance between two points, and card{·} is the cardinality of a set.
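
The overlap measures are straightforward to compute from binary masks; a minimal Python sketch of the Dice and Jaccard indices (names are ours):

```python
import numpy as np

def dice_jaccard(A, B):
    """Overlap measures between binary masks A (manual) and B (automated)."""
    A = A.astype(bool)
    B = B.astype(bool)
    inter = np.logical_and(A, B).sum()
    union = np.logical_or(A, B).sum()
    dice = 2.0 * inter / (A.sum() + B.sum())
    jaccard = inter / union
    return dice, jaccard
```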

Optimization of parameters

The proposed method has the following parameters: the patch radius rp, the searching radius rs, the regularization parameter C in the SVM, the numbers of nearest similar and dissimilar neighbors, m1 and m2, for constructing doublets, and the number of nearest neighbors k for selecting the most similar samples for label fusion. Following (Wang et al., 2015), we set C = 1 and m1 = m2 = 1. We also fixed the searching radius rs = 1 (a searching neighborhood of 3×3×3), since a nonlinear image registration algorithm was used to warp the atlas images to the target image.

The other two parameters, rp and k, were determined empirically from {1, 2, 3} and {3, 9, 27}, respectively, based on the training set with 40 leave-one-out cross-validation experiments. Figure 2 shows the average segmentation accuracy, measured by the Dice index, across the 40 leave-one-out cross-validation experiments with different combinations of parameters, indicating that the optimal segmentation performance could be obtained with rp = 1 and k = 9.

Figure 2
Average segmentation accuracy measured by Dice index for segmentation results obtained in 40 leave-one-out cross-validation experiments with different combinations of parameters rp and k.

Comparison with existing MAIS methods

The proposed method, referred to as nonlocal patch based weighted voting with metric learning (NLW-ML) hereafter, was compared with 6 state-of-the-art MAIS methods, including MV (Rohlfing et al., 2004, Heckemann et al., 2006), LW-INV (Artaechevarria et al., 2009), LW-GU (Sabuncu et al., 2010), NLW-GU (Coupé et al., 2011, Rousseau et al., 2011), LLL (Hao et al., 2014), and JLF (Wang et al., 2013).

The parameters of all these methods were optimized on the same training dataset with the same parameter selection strategy. For LW-GU, the patch radius rp and σx in the Gauss similarity metric need to be determined. With cross-validation, the optimal value of rp was 2, selected from {1, 2, 3}, and σx was adaptively set as σx = min_i {‖P(x) − P(xi)‖₂ + ε}, i = 1, …, N, where ε is a small constant (1e-20) to ensure numerical stability. LW-INV has 2 parameters, namely the patch radius rp and γ in the inverse function model. The optimal values were rp = 2 and γ = −3, selected from {1, 2, 3} and {−0.5, −1, −2, −3}, respectively. NLW-GU has 3 parameters, namely the searching radius rs, the patch radius rp, and σx in the Gauss similarity metric model. As in NLW-ML, the searching radius rs was set to 1. Based on the same cross-validation strategy, the optimal value of rp was 1, selected from {1, 2, 3}, and σx was adaptively set as σx = min_{s,j} {‖P(x) − P(x_{s,j})‖₂ + ε}, s = 1, …, N, j ∈ V, with the same small constant ε.

The only difference between NLW-GU and LW-GU is the image patches that they use: NLW-GU uses nonlocal image patches (searching radius rs > 0), whereas LW-GU uses local image patches (rs = 0). Since both NLW-GU and the proposed NLW-ML method use nonlocal image patches, the only difference between them is the distance metric used to measure similarity between image patches. In our experiments, we found that the multi-point estimation strategy was better than the single-point strategy for all of these label fusion methods; thus, we only report the results obtained with the multi-point strategy.

Similar to NLW-ML and NLW-GU, the searching radius rs for LLL and JLF was set to 1. The other parameters of these two methods were optimized on the same training set with the same parameter optimization strategy as adopted by the proposed method. For the LLL method, the optimal patch radius rp and the optimal number of training samples K were rp = 3 and K = 300, selected from {1, 2, 3} and {300, 400, 500}, respectively. Sparse linear SVM classifiers with the default parameter (C = 1) were built to fuse labels in the LLL method, using the single-point label fusion strategy. For the JLF method, the optimal patch radius rp and the optimal parameter β in the pairwise joint label difference term were rp = 1 and β = 1, selected from {1, 2, 3} and {0.5, 1, 1.5, 2}, respectively.

Table 2 summarizes segmentation results of the testing images obtained by the segmentation methods under comparison, including MV, LW-INV, LW-GU, NLW-GU, LLL, JLF, and NLW-ML. For each segmentation evaluation measure, the best value is shown in bold. These results indicated that the proposed method achieved the best overall performance. Specifically, Wilcoxon signed rank tests indicated that the proposed method performed significantly better than MV, LW-INV, LW-GU, NLW-GU, LLL (p<0.001) and JLF (p<0.05) in terms of Dice and Jaccard index values of their segmentation results. The results also demonstrated that NLW-GU performed better than LW-GU, indicating that the non-local patch based methods had better performance than traditional methods that adopted only corresponding image patches for label fusion (Coupé et al., 2011, Rousseau et al., 2011).

Table 2
Segmentation results of different label fusion methods (mean±std).

Figure 3 shows box plots of Dice and Jaccard index values of segmentation results obtained by different methods, indicating that our proposed method performed consistently better than other label fusion methods. The superior performance of our method was also confirmed by the visualization results, as shown in Figure 4.

Figure 3
Comparison of different methods for segmenting left hippocampus (denoted by red boxes) and right hippocampus (denoted by green boxes) with respect to Dice index and Jaccard index. In each box, the central mark is the median and edges are the 25th and ...
Figure 4
Hippocampal segmentation results obtained by different methods. One subject was randomly chosen from the dataset. The first row shows the segmentation results produced by different methods, the second row shows their corresponding surface rendering results, ...


Discussion

The proposed method is a voting based label fusion method (Liao et al., 2013, Wu et al., 2014, Tong et al., 2015, Wu et al., 2015) with an integrated learning component (Hao et al., 2012b, Hao et al., 2014, Wang et al., 2014, Bai et al., 2015, Zhu et al., 2015). Voting based label fusion methods compute the voting weights by comparing the target image patch with each atlas image patch, and use them to combine atlas labels. In contrast, machine learning based methods utilize machine learning techniques to build a mapping between the segmentation label and the image appearance. The voting based methods typically assume that image patches with similar intensity information have the same segmentation label. Although this assumption is valid in most cases, a recent study has shown that similar image patches can bear different labels (Bai et al., 2015). The machine learning based methods overcome this limitation by learning a mapping function between the image patch and the label. The proposed method combines advantages of the existing methods by first adopting a classification method to learn the relationship between image patches and segmentation labels, and then fusing the labels based on weights obtained with the learned distance metric.

Metric learning is essentially a preprocessing step in pattern recognition, aiming to learn from a given training dataset a distance metric with which data samples can be more effectively classified (Weinberger and Saul, 2009). In this study, we empirically demonstrated that metric learning in conjunction with a k-NN classifier could lead to better performance for segmenting the hippocampus from MRI scans than state-of-the-art MAIS methods, including the LLL and JLF methods. We postulate that its promising performance might be due to the k-NN classifier's ability to capture nonlinear relationships that better model the image patches of background and hippocampus than the linear models built by the other methods, such as the sparse linear SVM adopted in the LLL method. In fact, many metric learning methods have been demonstrated to achieve state-of-the-art performance on pattern recognition problems (Weinberger and Saul, 2009).

In our method, we used nonlinear image registration to register image blocks of the hippocampus. Our results demonstrated that a small patch size was sufficient to capture inter-subject anatomical differences. Since the metric learning can adaptively learn a distance metric for image patches from training data, our method is not as sensitive to the patch size as traditional patch based methods.

The computational burden of image registration is a major issue in multi-atlas segmentation methods. To avoid the high computational cost of non-rigid image registration, non-local patch-based image labeling strategies were proposed so that linear image registration could be used to align the image to be segmented with the atlas images (Coupé et al., 2011). However, a non-local image patch searching procedure has to be adopted to identify similar image patches in the label fusion step, which often leads to a higher computational cost than using non-rigid image registration in the atlas registration step (Rousseau et al., 2011). More recently, an optimized patch match strategy was proposed to improve the segmentation (Giraud et al., 2016). In the current study, we adopted an atlas selection strategy to reduce the computational cost associated with the nonlinear image registration (Aljabar et al., 2009, Hao et al., 2014). In particular, the most informative atlases were selected before the nonlinear image registration; following (Hao et al., 2014), we selected 20 atlas images for segmenting each target image. The computational complexity of our label fusion method is similar to that of classification based methods (Hao et al., 2014, Bai et al., 2015). With a MATLAB based implementation of our algorithm, fusing labels to segment one side of the hippocampus took ~20 min on a personal computer with a 4-core 3.4 GHz CPU.

It is straightforward to extend the metric learning method to multi-class classification problems, since the metric learning maximizes the margin between differences of intra-class and inter-class samples. However, for most brain region segmentation problems with multiple regions to be segmented, we could formulate the multi-class classification problem as multiple one-against-the-rest binary classification problems. Such a setting might better handle unbalanced training samples, since we build local classifiers for different voxels of the brain instead of a global one for all brain voxels.

Our future work will integrate the supervised metric learning method and more sophisticated weighted voting label fusion methods, such as joint label fusion (Wang et al., 2013), in which label error is measured by the distance of patches with a predefined distance metric. Furthermore, our method can also be adopted in the shape constrained segmentation framework (Hao et al., 2012a). We will also combine our method with functional MRI image based hippocampus parcellation (Cheng and Fan, 2014).


Conclusion

In this paper, we propose a novel nonlocal patch based weighted voting label fusion method with a learned distance metric for measuring similarity between image patches. The validation experiments demonstrated that the proposed method achieves better segmentation performance than state-of-the-art MAIS methods, indicating that a learned distance metric for measuring the similarity of image patches can improve segmentation performance.


Acknowledgements

This work was supported in part by the National Key Basic Research and Development Program (No. 2015CB856404), the National Natural Science Foundation of China (No. 81271514, 61473296), and NIH grants EB022573 and AG014971.


Information Sharing Statement

Software developed in this manuscript is available upon request from Dr. Fan or Dr. Zhu.


References

  • Akhondi-Asl A, Jafari-Khouzani K, Elisevich K, Soltanian-Zadeh H. Hippocampal volumetry for lateralization of temporal lobe epilepsy: automated versus manual methods. NeuroImage. 2011;54:S218–S226.
  • Aljabar P, Heckemann R, Hammers A, Hajnal J, Rueckert D. Multi-atlas based segmentation of brain images: Atlas selection and its effect on accuracy. NeuroImage. 2009;46:726–738.
  • Artaechevarria X, Munoz-Barrutia A, Ortiz-de-Solorzano C. Combination strategies in multi-atlas image segmentation: Application to brain MR data. IEEE Transactions on Medical Imaging. 2009;28:1266–1277.
  • Avants BB, Epstein CL, Grossman M, Gee JC. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis. 2008;12:26–41.
  • Bai W, Shi W, Ledig C, Rueckert D. Multi-atlas segmentation with augmented features for cardiac MR images. Medical Image Analysis. 2015;19:98–109.
  • Boccardi M, Bocchetta M, Morency FC, Collins DL, Nishikawa M, Ganzola R, Grothe MJ, Wolf D, Redolfi A, Pievani M. Training labels for hippocampal segmentation based on the EADC-ADNI harmonized hippocampal protocol. Alzheimer's & Dementia. 2015;11:175–183.
  • Carmichael OT, Aizenstein HA, Davis SW, Becker JT, Thompson PM, Meltzer CC, Liu Y. Atlas-based hippocampus segmentation in Alzheimer's disease and mild cognitive impairment. NeuroImage. 2005;27:979–990.
  • Chang CC, Lin CJ. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011;2.
  • Cheng H, Fan Y. Functional parcellation of the hippocampus by clustering resting state fMRI signals. 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI); 2014. pp. 5–8.
  • Chupin M, Mukuna-Bantumbakulu AR, Hasboun D, Bardinet E, Baillet S, Kinkingnéhun S, Lemieux L, Dubois B, Garnero L. Anatomically constrained region deformation for the automated segmentation of the hippocampus and the amygdala: Method and validation on controls and patients with Alzheimer’s disease. NeuroImage. 2007;34:996–1019.
  • Coupé P, Manjón JV, Fonov V, Pruessner J, Robles M, Collins DL. Patch-based segmentation using expert priors: Application to hippocampus and ventricle segmentation. NeuroImage. 2011;54:940–954.
  • den Heijer T, van der Lijn F, Vernooij MW, de Groot M, Koudstaal P, van der Lugt A, Krestin GP, Hofman A, Niessen WJ, Breteler MM. Structural and diffusion MRI measures of the hippocampus and memory performance. NeuroImage. 2012;63:1782–1789.
  • Dill V, Franco AR, Pinho MS. Automated methods for hippocampus segmentation: the evolution and a review of the state of the art. Neuroinformatics. 2014:1–18.
  • Doshi J, Erus G, Ou Y, Resnick SM, Gur RC, Gur RE, Satterthwaite TD, Furth S, Davatzikos C. MUSE: MUlti-atlas region Segmentation utilizing Ensembles of registration algorithms and parameters, and locally optimal atlas selection. NeuroImage. 2015.
  • Giraud R, Ta V-T, Papadakis N, Manjón JV, Collins DL, Coupé P. An Optimized PatchMatch for multi-scale and multi-feature label fusion. NeuroImage. 2016;124:770–782.
  • Guillaumin M, Verbeek J, Schmid C. Is that you? Metric learning approaches for face identification. 2009 IEEE 12th International Conference on Computer Vision; 2009. pp. 498–505.
  • Hao Y, Jiang T, Fan Y. Shape-constrained multi-atlas based segmentation with multichannel registration. Medical Imaging 2012: Image Processing, Proc. SPIE 8314; 2012a.
  • Hao Y, Liu J, Duan Y, Zhang X, Yu C, Jiang T, Fan Y. Local label learning (L3) for multi-atlas based segmentation. SPIE Medical Imaging; 2012b. p. 83142E.
  • Hao Y, Wang T, Zhang X, Duan Y, Yu C, Jiang T, Fan Y. Local label learning (LLL) for subcortical structure segmentation: Application to hippocampus segmentation. Human brain mapping. 2014;35:2674–2697. [PubMed]
  • Heckemann RA, Hajnal JV, Aljabar P, Rueckert D, Hammers A. Automatic anatomical brain MRI segmentation combining label propagation and decision fusion. NeuroImage. 2006;33:115–126. [PubMed]
  • Iglesias JE, Sabuncu MR. Multi-atlas segmentation of biomedical images: A survey. Medical image analysis. 2015;24:205–219. [PMC free article] [PubMed]
  • Jafari-Khouzani K, Elisevich KV, Patel S, Soltanian-Zadeh H. Dataset of magnetic resonance images of nonepileptic subjects and temporal lobe epilepsy patients for validation of hippocampal segmentation techniques. Neuroinformatics. 2011;9:335–346. [PMC free article] [PubMed]
  • Liao S, Gao Y, Lian J, Shen D. Sparse patch-based label propagation for accurate prostate localization in CT images. Medical Imaging, IEEE Transactions on. 2013;32:419–434. [PMC free article] [PubMed]
  • Lötjönen JMP, Wolz R, Koikkalainen JR, Thurfjell L, Waldemar G, Soininen H, Rueckert D. Fast and robust multi-atlas segmentation of brain magnetic resonance images. NeuroImage. 2010;49:2352–2365. [PubMed]
  • Rohlfing T, Brandt R, Menzel R, Maurer CR., Jr Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains. NeuroImage. 2004;21:1428–1442. [PubMed]
  • Rousseau F, Habas PA, Studholme C. A supervised patch-based approach for human brain labeling. Medical Imaging, IEEE Transactions on. 2011;30:1852–1862. [PMC free article] [PubMed]
  • Sabuncu MR, Yeo BTT, Van Leemput K, Fischl B, Golland P. A generative model for image segmentation based on label fusion. Medical Imaging, IEEE Transactions on. 2010;29:1714–1729. [PMC free article] [PubMed]
  • Sanroma G, Wu G, Gao Y, Thung K-H, Guo Y, Shen D. A transversal approach for patch-based label fusion via matrix completion. Medical image analysis. 2015;24:135–148. [PMC free article] [PubMed]
  • Tong T, Wolz R, Wang Z, Gao Q, Misawa K, Fujiwara M, Mori K, Hajnal JV, Rueckert D. Discriminative dictionary learning for abdominal multi-organ segmentation. Medical image analysis. 2015;23:92–104. [PubMed]
  • Wang F, Zuo W, Zhang L, Meng D, Zhang D. A kernel classification framework for metric learning. Neural Networks and Learning Systems, IEEE Transactions on. 2015;26:1950–1962. [PubMed]
  • Wang H, Cao Y, Syeda-Mahmood T. Multi-atlas Segmentation with Learning-Based Label Fusion. Machine Learning in Medical Imaging. 2014:256–263.
  • Wang H, Suh JW, Das SR, Pluta JB, Craige C, Yushkevich PA. Multi-atlas segmentation with joint label fusion. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 2013;35:611–623. [PMC free article] [PubMed]
  • Warfield SK, Zou KH, Wells WM. Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. Medical Imaging, IEEE Transactions on. 2004;23:903–921. [PMC free article] [PubMed]
  • Weinberger KQ, Saul LK. Distance Metric Learning for Large Margin Nearest Neighbor Classification. J Mach Learn Res. 2009;10:207–244.
  • Wolz R, Schwarz AJ, Yu P, Cole PE, Rueckert D, Jack CR, Raunig D, Hill D. Initiative AsDN. Robustness of automated hippocampal volumetry across magnetic resonance field strengths and repeat images. Alzheimer's & Dementia. 2014;10:430–438. e432. [PubMed]
  • Wu G, Kim M, Sanroma G, Wang Q, Munsell BC, Shen D. Initiative AsDN. Hierarchical multi-atlas label fusion with multi-scale feature representation and label-specific patch partition. NeuroImage. 2015;106:34–46. [PMC free article] [PubMed]
  • Wu Y, Liu G, Huang M, Guo J, Jiang J, Yang W, Chen W, Feng Q. Prostate segmentation based on variant scale patch and local independent projection. Medical Imaging, IEEE Transactions on. 2014;33:1290–1303. [PubMed]
  • Xie Q, Ruan D. Low-complexity atlas-based prostate segmentation by combining global, regional, and local metrics. Medical physics. 2014;41:041909. [PubMed]
  • Xing EP, Jordan MI, Russell S, Ng AY. Distance metric learning with application to clustering with side-information. Advances in neural information processing systems. 2002:505–512.
  • Yan P-g, Cao Y, Yuan Y, Turkbey B, Choyke PL. Label Image Constrained Multiatlas Selection. Cybernetics, IEEE transactions on. 2015;45:1158–1168. [PubMed]
  • Zhu H, Cheng H, Fan Y. SPIE Medical Imaging. International Society for Optics and Photonics; 2015. Random local binary pattern based label learning for multi-atlas segmentation; p. 94131B-94131B-94138.