In the continuing battle against lung cancer, computed tomography (CT) scanning has been found to increase the detection rate of pulmonary nodules.1
Much work has been done to develop computer-assisted diagnosis and detection (CAD) systems for pulmonary nodules in CT. We hypothesize that we can also reduce the radiologist's uncertainty in identifying suspicious pulmonary nodules by providing a visual comparison of a given nodule with a collection of similar nodules of known pathology. To eventually test this hypothesis, we first need to develop a content-based image retrieval (CBIR) system for pulmonary nodules in CT. In such a system, a nodule is segmented from a clinical case, either manually by the human observer (radiologist), semi-automatically, or automatically. The system then computes a set of quantitative descriptors for that nodule (our current work focuses on texture-based descriptors) and compares them to the descriptors of known nodules. The underlying assertion is that if a known malignant nodule has certain computable features, then unknown nodules with similar computable features would also be malignant.
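The descriptor comparison described above can be illustrated with a minimal sketch. It assumes that each nodule is summarized by a fixed-length feature vector and that similarity is measured by Euclidean distance; the distance measure, the function names (`euclidean`, `retrieve`), and the numeric values are hypothetical choices for illustration, not the descriptors or similarity measures used in this work.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query: Sequence[float],
             database: Dict[str, Sequence[float]],
             k: int = 3) -> List[Tuple[str, float]]:
    """Return the k database nodules closest to the query, nearest first."""
    ranked = sorted(
        ((nodule_id, euclidean(query, features))
         for nodule_id, features in database.items()),
        key=lambda pair: pair[1],  # sort by distance, smallest first
    )
    return ranked[:k]


# Hypothetical descriptors (made-up numbers, not real nodule data).
database = {
    "nodule_a": [0.10, 0.80, 0.30],
    "nodule_b": [0.90, 0.20, 0.70],
    "nodule_c": [0.15, 0.75, 0.35],
}
query = [0.12, 0.78, 0.33]
results = retrieve(query, database, k=2)
for nodule_id, distance in results:
    print(nodule_id, round(distance, 4))
```

In a real system the pathology of each returned nodule would be displayed alongside its image, so that the radiologist can compare the query nodule with its nearest known neighbors.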
Simply put, our system provides a way of performing a “look-up” on a query image to return similar images from a collection. Much research is being done to see which methods of comparing and retrieving similar images are best. For a detailed description of CBIR systems for the medical field, we suggest the review by Muller et al.2
Our work compares three different sets of texture feature descriptors to determine which one has the best precision in retrieving similar nodules.
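As a point of reference for the term "precision", the sketch below computes the fraction of retrieved items that are relevant to the query. How relevance is decided for a pair of nodules is not specified in this section, so the relevant set here is simply assumed to be given; the function name `precision_at_k` and the identifiers are hypothetical.

```python
from typing import List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    if not top_k:
        return 0.0  # nothing was retrieved
    hits = sum(1 for item in top_k if item in relevant)
    # Divide by the number of items actually returned (at most k).
    return hits / len(top_k)


# Hypothetical ranked result list and relevance judgments.
retrieved = ["n1", "n2", "n3", "n4"]
relevant = {"n1", "n3", "n7"}
print(precision_at_k(retrieved, relevant, k=4))
```

Comparing feature sets then amounts to running the same queries with each set of descriptors and comparing the resulting precision values.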
There are generally two types of medical CBIR systems: (1) those that retrieve entire anatomic structures, and (2) those that retrieve abnormalities or pathologies within an anatomical structure. The latter problem is more complex than the former, but more useful for CAD. Thus we have focused our efforts on images of pulmonary nodules, rather than images of the entire lung.
The first known large-scale comparison of texture features was done by Ohanian and Dubes in 1992.3
They tested 16 Haralick co-occurrence features, 4 Markov random field features, 16 Gabor filter features, and 4 fractal geometry features on 3200 32 × 32 sub-images and found that the co-occurrence features performed best. However, whereas Ohanian and Dubes evaluated the feature types with respect to their ability to classify an image’s texture correctly, we sought to evaluate the features by their performance in an image retrieval system. There are several other CBIR projects currently underway in the medical field in general and with lung CT images in particular. One of these, called ASSERT, is being developed at Purdue University and uses a variety of image features, including co-occurrence statistics, shape descriptors, Fourier transforms, and global gray level statistics. The system also includes physician-provided ratings of features such as homogeneity, calcification, and artery size.4,5
There are, however, problems associated with content-based retrieval of medical images, such as the difficulty of automatic segmentation, the large variability of feature selection, and the lack of standardized toolkits and evaluation methods.6–8
There have been several efforts over recent years to solve some of these problems. For instance, the Lung Image Database Consortium (LIDC) collection was specifically developed to support evaluation and comparison of chest CAD systems.9
It can be used similarly to develop, evaluate and compare CBIR systems.
There are also a growing number of open source frameworks for medical imaging applications, such as the Visualization Toolkit (VTK),10
the Insight Toolkit (ITK)11
for segmentation and registration, and the Image-Guided Surgery Toolkit (IGSTK).12
All of these projects are community-driven and freely available on their websites. In addition, the National Cancer Institute is funding the development of an eXtensible Imaging Platform (XIP) through its Cancer Bioinformatics Grid (caBIG) program.13
We believe that the nature of pulmonary nodules (characterized by very small images and significant physician disagreement) justifies the creation of a specialized system for nodule retrieval. Our goal was to build an open source, independent, extensible CBIR system for pulmonary nodules in CT images and to contribute this system to the growing open source medical imaging community.