All the GUI-based packages we examined require minimal prerequisite computer skills and were perceived as very user friendly. We consider SliceOmatic a very good choice for segmentation of abdominal adipose tissue on MR images: it performed very well for all observed areas, and the processing time was low for both observers. Since the agreement of the experts on the segmentation results proved to be equivalent for all examined software packages, we find it hard to recommend one software package based solely on reproducibility. Analyze was rated lower because the multitude of options this package provides made it hard to determine an optimal segmentation routine. SliceOmatic also offers a number of additional features, but its segmentation routine was easily determined and performed. HippoFat was rated slightly lower: although its basic use is easy and straightforward, the manual correction of mislabeled areas was very tedious.
The time it took to segment the MR images varied between the two observers and, again, NIHImage performed more poorly than the other software packages, most likely because it was the first package we examined. Both observers had the lowest processing times using SliceOmatic. Although EasyVision required manual segmentation, the processing time was equal to the time required to perform the semi-automated segmentation with Analyze. Using Analyze or NIHImage was perceived as more complicated and time consuming than segmenting images with SliceOmatic.
The amount of effort and time laboratory members are willing to commit to understanding and choosing between particular segmentation algorithms and software packages is an important consideration. The trade-off between manual interaction and performance is a significant factor. Manual interaction can improve accuracy by incorporating prior anatomical knowledge of the operator. The type of interaction can vary from complete manual delineation of adipose tissue to thresholding or selection of a seed point for region growing. In this study, the Philips EasyVision workstation was the only software package that offered none of the computer-aided approaches, but the manual segmentation was straightforward and did not require more time than some of the computer-aided approaches, although the reproducibility of the VAT measurements was not as good as that of Analyze or SliceOmatic. Analyze, NIHImage and SliceOmatic required some user interaction for specifying initial parameters, such as a threshold or the manual separation of SAT and VAT. The values of these parameters can significantly affect performance and results. An automated and unsupervised approach would allow data sets to be processed in a short time without any user interaction and would be inherently free from any variability introduced by the human observer. Positano et al.36 reported a high correlation (SAT: r <0.0001; VAT: r =0.0001) between the unsupervised method they proposed and manual segmentation of 20 patients (32 images each) using their software package HippoFat. They reported that the unsupervised method slightly overestimated the SAT and underestimated the VAT. Using the same software package for the automated segmentation of the MR images examined in this study resulted in unsatisfactory segmentation. This was most likely due to the complex structure of the viscera, patient movement, partial volume averaging, tissue inhomogeneity, poor contrast, poor signal-to-noise ratio or the image acquisition protocol, although image quality did not affect any other software package to this extent. The results show that substantial operator involvement was required for the re-delineation of VAT. Sometimes the SAT selection had to be adjusted as well. While the unsupervised algorithm itself was fairly quick, both observers perceived the manual corrections as time consuming and tedious.
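The thresholding and seed-based region growing mentioned above can be illustrated with a minimal Python sketch. It is not the routine implemented by any of the packages examined here. The function name `region_grow`, the toy intensity values and the threshold of 150 are assumptions chosen for illustration, with adipose tissue taken to appear bright, as on T1-weighted images.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Collect the pixels 4-connected to `seed` whose intensity is >= threshold.

    `image` is a list of rows of intensities; `seed` is a (row, col) tuple
    chosen by the operator. Returns a set of (row, col) tuples.
    """
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    if image[seed_row][seed_col] < threshold:
        return set()  # the seed itself is not bright enough
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Hypothetical 3 x 4 intensity patch (illustrative values, not study data).
image = [
    [200, 210, 40, 220],
    [190, 50, 45, 230],
    [30, 35, 40, 225],
]

# Operator places a seed in the bright upper-left compartment.
compartment = region_grow(image, seed=(0, 0), threshold=150)
compartment_area = len(compartment)  # area in pixels

# A plain global threshold would also pick up the disconnected right column.
thresholded_area = sum(v >= 150 for row in image for v in row)
```

In this toy patch the seeded region covers only the connected bright pixels, whereas the global threshold also counts the separate bright column. This is one way to see why the choice of seed and threshold can change the measured area, and why such parameters introduce observer-dependent variability.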
Financial constraints might also affect the decision. NIHImage and HippoFat are free software packages. Many NIHImage users have added plug-ins, which are also free and can be downloaded. The advantages of acquiring commercial software are better support and a very comprehensive manual, both of which Analyze and SliceOmatic offer.
Nonetheless, the intra- and inter-rater reproducibility were very good for all packages. We decided to use NIHImage as the standard for our comparisons. We report the intra-class correlation comparing NIHImage to all other packages, since it is a freely available program package that has been frequently used to measure adipose tissue compartments. Using the pairwise correlations (data not shown) among all programs, we obtained similar results. The reproducibility of our measurements was affected by the quality of the image. Because of a smaller partial volume effect, the SAT could be best distinguished from the lean tissue, and its inter-observer reproducibility was highest. The intra- and inter-rater reproducibility of VAT measurements were slightly lower. This might be because the signal of intestinal content with short T1 and of bone marrow could be interpreted as VAT. Another issue is that the small amount of intramuscular and paravertebral adipose tissue may influence the accuracy of the VAT measurement. However, it is still controversial whether such compartments of adipose tissue should be included in VAT. We tried to exclude these adipose tissues from VAT. Since our aim was the analysis of large amounts of image data, such as in epidemiological studies, a simplified adipose tissue classification that ignores small numbers of high-intensity non-fat pixels could be justified. When comparing the results of the five packages, we noted that the slightly inferior results for NIHImage, especially the low VAT reproducibility, might be due to the fact that it was the first software package the observers used. This can account for a larger number of misclassified pixels and, as a result, a higher variability of measurement results. NIHImage was therefore found more difficult to understand and it took longer to establish a routine, whereas the other packages benefited from the knowledge we had acquired using NIHImage.
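The text does not state which form of the intra-class correlation was computed. As a sketch only, assuming the one-way random-effects form, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) MSW), reproducibility between two sets of area measurements could be computed as follows. The function name `icc_oneway` and the example areas are hypothetical and are not data from this study.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for a list of per-subject measurement lists.

    Each inner list holds the k measurements of one subject (for example the
    same image measured by k observers or with k software packages).
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # measurements per subject
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subject mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical areas (cm^2) for three subjects, each measured twice.
areas = [
    [100.0, 104.0],
    [150.0, 146.0],
    [200.0, 202.0],
]
icc_value = icc_oneway(areas)

# Identical repeated measurements give an ICC of exactly 1.
icc_perfect = icc_oneway([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
```

Values close to 1 indicate that most of the variance lies between subjects rather than between repeated measurements. Other ICC forms (two-way models, consistency versus absolute agreement) can give different values on the same data.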
Limitations of this study must be acknowledged. The reliability and reproducibility of MR assessment of adipose tissue might vary across degrees of adiposity, but only obese individuals were included in this study. However, we are reassured that in cadaver and animal studies of lean and obese subjects, the MRI areas of adipose tissue correlated well with dissection results.20,22
We evaluated only one fully automated program, one free software package and two commercially available programs, whereas a number of other programs are available. Most in-house packages had been developed to process MR images and estimate adipose tissue; none of the free software packages or the commercial packages were similarly specialized. Since we did not expect to find large differences between the commercial image processing packages, we selected the commercially available programs based on their cost and the frequency of citations they received in other publications assessing abdominal adipose tissue. At the time this project started, only a few fully automated approaches had been published and made available. Recently published algorithms, like the one developed by Liou et al.37 may perform as well or better and should be validated. Furthermore, we did not validate the software packages through use of a phantom, animal or cadaver model, but relied on previously published studies.17,20,22,24 Instead, we compared the segmentation of semi-automated, automated and manual delineation approaches, none of which guarantees a perfectly true model.
The high correlation between the results obtained with NIHImage and those obtained with each of the four other software packages supports our opinion that each of the examined software packages will allow for rapid, reliable and accurate measurements of abdominal SAT and VAT. We conclude that rather than trying to fully automate the segmentation process, the aim should be to support the medical expert with an image-based, objective segmentation tool, without giving up the expert’s visual control and ability to interact.
In summary, four packages provided essentially the same results with respect to the inter- and intra-rater reproducibility; only the fully automated approach (HippoFat) proved slightly inferior. The choice of which package to adopt would depend on the users’ financial resources and computer skills. NIHImage is free and can essentially perform most routines that Analyze and SliceOmatic offer. SliceOmatic has been widely used for MR image segmentation and is quick and reliable. Analyze requires a little more computer knowledge, but it is a very powerful software package that can be used for a large variety of image analysis purposes. The use of the EasyVision software is limited to a certain group of users since it is associated with a Philips scanner. To be able to compare measurements in longitudinal studies, researchers should consider reporting the reliability of their data, especially if more than one observer is evaluating the MR images. Although it may be beneficial to use the same software package for repeated measurements, our results using SliceOmatic, Analyze or NIHImage were comparable, and these packages could be used interchangeably. New fully automated approaches, like the one published by Liou et al.37 should be compared to the semi-automated measurements, since they could shorten the time and effort dramatically and perhaps provide even better reproducibility of the results.