We have created a content-based image retrieval framework for computed tomography images of pulmonary nodules. When presented with a nodule image, the system retrieves images of similar nodules from a collection prepared by the Lung Image Database Consortium (LIDC). The system (1) extracts images of individual nodules from the LIDC collection based on LIDC expert annotations, (2) stores the extracted data in a flat XML database, (3) calculates a set of quantitative descriptors for each nodule that provide a high-level characterization of its texture, and (4) uses various measures to determine the similarity of two nodules and perform queries on a selected query nodule. Using our framework, we compared three feature extraction methods: Haralick co-occurrence, Gabor filters, and Markov random fields. Gabor and Markov descriptors perform better at retrieving similar nodules than do Haralick co-occurrence techniques, with best retrieval precisions in excess of 88%. Because the software we have developed and the reference images are both open source and publicly available, they may be incorporated into both commercial and academic imaging workstations and extended by others in their research.
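The query step in (4) can be illustrated with a minimal sketch. It assumes Euclidean distance as the similarity measure and uses made-up three-element descriptor vectors; the abstract says only that "various measures" were used, so the function names, the data, and the choice of distance are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k):
    """Return the ids of the k database items closest to the query vector."""
    ranked = sorted(database, key=lambda item: euclidean(query, item["features"]))
    return [item["id"] for item in ranked[:k]]

def precision(retrieved_ids, relevant_ids):
    """Fraction of retrieved items that are in the relevant set."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for i in retrieved_ids if i in relevant_ids)
    return hits / len(retrieved_ids)

# Hypothetical toy data: one short texture descriptor per nodule.
database = [
    {"id": "n1", "features": [0.10, 0.80, 0.30]},
    {"id": "n2", "features": [0.15, 0.75, 0.35]},
    {"id": "n3", "features": [0.90, 0.10, 0.50]},
    {"id": "n4", "features": [0.12, 0.80, 0.31]},
]
query = [0.10, 0.80, 0.31]

top = retrieve(query, database, k=2)
print(top, precision(top, {"n1", "n2"}))
```

Precision here is the share of retrieved nodules that a reference labeling counts as similar to the query; how the original study defined "similar" is not stated in the abstract.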
Content-based image retrieval; open source; lung; computer-aided diagnosis (CAD); extensible markup language (XML); image database; software design; computed tomography; texture feature; nodule
Rationale and Objectives
Computer-aided diagnostic (CAD) systems fundamentally require the opinions of expert human observers to establish “truth” for algorithm development, training, and testing. The integrity of this “truth,” however, must be established before investigators commit to this “gold standard” as the basis for their research. The purpose of this study was to develop a quality assurance (QA) model as an integral component of the “truth” collection process concerning the location and spatial extent of lung nodules observed on computed tomography (CT) scans to be included in the Lung Image Database Consortium (LIDC) public database.
Materials and Methods
One hundred CT scans were interpreted by four radiologists through a two-phase process. For the first of these reads (the “blinded read phase”), radiologists independently identified and annotated lesions, assigning each to one of three categories: “nodule ≥ 3mm,” “nodule < 3mm,” or “non-nodule ≥ 3mm.” For the second read (the “unblinded read phase”), the same radiologists independently evaluated the same CT scans but with all of the annotations from the previously performed blinded reads presented; each radiologist could add marks, edit or delete their own marks, change the lesion category of their own marks, or leave their marks unchanged. The post-unblinded-read set of marks was grouped into discrete nodules and subjected to the QA process, which consisted of (1) identification of potential errors introduced during the complete image annotation process (such as two marks on what appears to be a single lesion or an incomplete nodule contour) and (2) correction of those errors. Seven categories of potential error were defined; any nodule with a mark that satisfied the criterion for one of these categories was referred to the radiologist who assigned that mark for either correction or confirmation that the mark was intentional.
A total of 105 QA issues were identified across 45 (45.0%) of the 100 CT scans. Radiologist review resulted in modifications to 101 (96.2%) of these potential errors. Twenty-one lesions erroneously marked as lung nodules after the unblinded reads had this designation removed through the QA process.
The establishment of “truth” must incorporate a QA process to guarantee the integrity of the datasets that will provide the basis for the development, training, and testing of CAD systems.
lung nodule; computed tomography (CT); thoracic imaging; database construction; computer-aided diagnosis (CAD); annotation; quality assurance (QA)
Rationale and Objectives
To investigate the effects of choosing between different metrics in estimating the size of pulmonary nodules as a factor both of nodule characterization and of performance of computer-aided detection systems, since the latter are always qualified with respect to a given size range of nodules.
Materials and Methods
This study used 265 whole-lung CT scans documented by the Lung Image Database Consortium using their protocol for nodule evaluation. Each inspected lesion was reviewed independently by four experienced radiologists who provided boundary markings for nodules larger than 3 mm. Four size metrics, based on the boundary markings, were considered: a uni-dimensional and two bi-dimensional measures on a single image slice and a volumetric measurement based on all the image slices. The radiologist boundaries were processed and those with four markings were analyzed to characterize the inter-radiologist variation, while those with at least one marking were used to examine the difference between the metrics.
The processing of the annotations found 127 nodules marked by all of the four radiologists and an extended set of 518 nodules each having at least one observation with three-dimensional sizes ranging from 2.03 to 29.4 mm (average 7.05 mm, median 5.71 mm). A very high inter-observer variation was observed for all these metrics: 95% of estimated standard deviations were in the following ranges [0.49, 1.25], [0.67, 2.55], [0.78, 2.11], and [0.96, 2.69] for the three-dimensional, the uni-dimensional, and the two bi-dimensional size metrics respectively (in mm). Also a very large difference among the metrics was observed: 0.95 probability-coverage region widths for the volume estimation conditional on uni-dimensional, and the two bi-dimensional size measurements of 10mm were 7.32, 7.72, and 6.29 mm respectively.
The selection of data subsets for performance evaluation is highly impacted by the size metric choice. The LIDC plans to include a single size measure for each nodule in its database. This metric is not intended as a gold standard for nodule size; rather, it is intended to facilitate the selection of unique repeatable size limited nodule subsets.
Quantitative image analysis; X-ray CT; Detection; Lung nodule annotation; Size metrics
Ideally, an image should be reported and interpreted in the same way (e.g., the same perceived likelihood of malignancy) or similarly by any two radiologists; however, as much research has demonstrated, this is not often the case. Various efforts have made an attempt at tackling the problem of reducing the variability in radiologists’ interpretations of images. The Lung Image Database Consortium (LIDC) has provided a database of lung nodule images and associated radiologist ratings in an effort to provide images to aid in the analysis of computer-aided tools. Likewise, the Radiological Society of North America has developed a radiological lexicon called RadLex. As such, the goal of this paper is to investigate the feasibility of associating LIDC characteristics and terminology with RadLex terminology. If matches between LIDC characteristics and RadLex terms are found, probabilistic models based on image features may be used as decision-based rules to predict if an image or lung nodule could be characterized or classified as an associated RadLex term. The results of this study were matches for 25 (74%) out of 34 LIDC terms in RadLex. This suggests that LIDC characteristics and associated rating terminology may be better conceptualized or reduced to produce even more matches with RadLex. Ultimately, the goal is to identify and establish a more standardized rating system and terminology to reduce the subjective variability between radiologist annotations. A standardized rating system can then be utilized by future researchers to develop automatic annotation models and tools for computer-aided decision systems.
Chest CT; digital imaging; image data; image interpretation; imaging informatics; lung; radiographic image interpretation; computer-assisted; reporting; RadLex; semantic; LIDC
Rationale and Objectives
The purpose of this study was to analyze the variability of experienced thoracic radiologists in the identification of lung nodules on CT scans and thereby to investigate variability in the establishment of the “truth” against which nodule-based studies are measured.
Materials and Methods
Thirty CT scans were reviewed twice by four thoracic radiologists through a two-phase image annotation process. During the initial “blinded read” phase, radiologists independently marked lesions they identified as “nodule ≥ 3mm (diameter),” “nodule < 3mm,” or “non-nodule ≥ 3mm.” During the subsequent “unblinded read” phase, the blinded read results of all radiologists were revealed to each of the four radiologists, who then independently reviewed their marks along with the anonymous marks of their colleagues; a radiologist’s own marks then could be deleted, added, or left unchanged. This approach was developed to identify, as completely as possible, all nodules in a scan without requiring forced consensus.
After the initial blinded read phase, a total of 71 lesions received “nodule ≥ 3mm” marks from at least one radiologist; however, all four radiologists assigned such marks to only 24 (33.8%) of these lesions. Following the unblinded reads, a total of 59 lesions were marked as “nodule ≥ 3 mm” by at least one radiologist. 27 (45.8%) of these lesions received such marks from all four radiologists, 3 (5.1%) were identified as such by three radiologists, 12 (20.3%) were identified by two radiologists, and 17 (28.8%) were identified by only a single radiologist.
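The agreement percentages reported for the unblinded reads can be reproduced from the lesion counts alone. The sketch below is only an arithmetic check; the function name and the list-of-counts representation are illustrative assumptions, not part of the study.

```python
from collections import Counter

def agreement_distribution(marks_per_lesion, n_readers=4):
    """Map each agreement level to (lesion count, percent of all lesions)."""
    counts = Counter(marks_per_lesion)
    total = len(marks_per_lesion)
    return {
        level: (counts.get(level, 0),
                round(100.0 * counts.get(level, 0) / total, 1))
        for level in range(n_readers, 0, -1)
    }

# Reported counts after the unblinded reads: 27, 3, 12, and 17 lesions
# marked as "nodule >= 3 mm" by four, three, two, and one radiologist.
marks = [4] * 27 + [3] * 3 + [2] * 12 + [1] * 17
dist = agreement_distribution(marks)

for level, (count, pct) in dist.items():
    print(f"{level} radiologist(s): {count} lesions ({pct}%)")
```

The same calculation applied to the blinded reads gives 24 of 71 lesions, or 33.8%, with four-reader agreement.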
The two-phase image annotation process yields improved agreement among radiologists in the interpretation of nodules ≥ 3mm. Nevertheless, substantial variability remains across radiologists in the task of lung nodule identification.
lung nodule; computed tomography (CT); thoracic imaging; inter-observer variability; computer-aided diagnosis (CAD)
Traditionally, image studies evaluating the effectiveness of computer-aided diagnosis (CAD) use a single label from a medical expert compared with a single label produced by CAD. The purpose of this research is to present a CAD system based on the Belief Decision Tree classification algorithm, capable of learning from probabilistic input (based on intra-reader variability) and providing probabilistic output. We compared our approach against a traditional decision tree approach with respect to a traditional performance metric (accuracy) and a probabilistic one (area under the distance–threshold curve, AuCdt). The probabilistic classification technique showed notable performance improvement in comparison with the traditional one with respect to both evaluation metrics. Specifically, when applying cross-validation on the training subset of instances, boosts of 28.26% and 30.28% were noted for the probabilistic approach with respect to accuracy and AuCdt, respectively. Furthermore, on the validation subset of instances, boosts of 20.64% and 23.21% were noted again for the probabilistic approach with respect to the same two metrics. In addition, we compared our CAD system results with diagnostic data available for a small subset of the Lung Image Database Consortium database. We discovered that when our CAD system errs, it generally does so with low confidence. Predictions produced by the system also agree with diagnoses of truly benign nodules more often than radiologists, offering the possibility of reducing the false positives.
Chest CT; Computer-aided diagnosis (CAD); Feature extraction; Image analysis; Machine learning; Radiographic image interpretation; Computer-assisted
Rationale and Objectives
Integral to the mission of the National Institutes of Health–sponsored Lung Imaging Database Consortium is the accurate definition of the spatial location of pulmonary nodules. Because the majority of small lung nodules are not resected, a reference standard from histopathology is generally unavailable. Thus assessing the source of variability in defining the spatial location of lung nodules by expert radiologists using different software tools as an alternative form of truth is necessary.
Materials and Methods
The relative differences in performance of six radiologists each applying three annotation methods to the task of defining the spatial extent of 23 different lung nodules were evaluated. The variability of radiologists’ spatial definitions for a nodule was measured using both volumes and probability maps (p-map). Results were analyzed using a linear mixed-effects model that included nested random effects.
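One common way to build a probability map from several readers' outlines is to take, for each pixel, the fraction of readers whose mask includes it. The abstract does not define its p-map, so the sketch below is an assumption about the construction, with hypothetical binary masks stored as nested lists.

```python
def probability_map(masks):
    """Per-pixel fraction of readers whose binary mask includes the pixel."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 2x3 masks from four readers (1 = pixel inside the nodule).
masks = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 1, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 0]],
]
pmap = probability_map(masks)
print(pmap)
```

A pixel with value 1.0 was included by every reader, while intermediate values mark the boundary region where readers disagree.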
Across the combination of all nodules, volume and p-map model parameters were found to be significant at P < .05 for all methods, all radiologists, and all second-order interactions except one. The radiologist and methods variables accounted for 15% and 3.5% of the total p-map variance, respectively, and 40.4% and 31.1% of the total volume variance, respectively.
Radiologists represent the major source of variance as compared with drawing tools independent of drawing metric used. Although the random noise component is larger for the p-map analysis than for volume estimation, the p-map analysis appears to have more power to detect differences in radiologist-method combinations. The standard deviation of the volume measurement task appears to be proportional to nodule volume.
LIDC drawing experiment; lung nodule annotation; edge mask; p-map; volume; linear mixed-effects model
Much work is being done to develop computer-assisted diagnosis and detection (CAD) technologies and systems to improve the diagnostic quality for pulmonary nodules. Another way to improve accuracy of diagnosis on new images is to recall or find images with similar features from archived historical images which already have confirmed diagnostic results, and the content-based image retrieval (CBIR) technology has been proposed for this purpose. In this paper, we present a method to find and select texture features of solitary pulmonary nodules (SPNs) detected by computed tomography (CT) and evaluate the performance of support vector machine (SVM)-based classifiers in differentiating benign from malignant SPNs. Seventy-seven biopsy-confirmed CT cases of SPNs were included in this study. A total of 67 features were extracted by a feature extraction procedure, and around 25 features were finally selected after 300 genetic generations. We constructed the SVM-based classifier with the selected features and evaluated the performance of the classifier by comparing the classification results of the SVM-based classifier with six senior radiologists' observations. The evaluation results not only showed that most of the selected features are characteristics frequently considered by radiologists and used in CAD analyses previously reported in classifying SPNs, but also indicated that some newly found features make an important contribution in differentiating benign from malignant SPNs in SVM-based feature space. The results of this research can be used to build the highly efficient feature index of a CBIR system for CT images with pulmonary nodules.
Feature selection; content-based image retrieval; classification; CT images; lung diseases
Rationale and Objectives
To retrospectively investigate the effect of a computer-aided detection (CAD) system on radiologists’ performance for detecting small pulmonary nodules in CT examinations, with a panel of expert radiologists serving as the reference standard.
Materials and Methods
Institutional review board approval was obtained. Our data set contained 52 CT examinations collected by the Lung Image Database Consortium, and 33 from our institution. All CTs were read by multiple expert thoracic radiologists to identify the reference standard for detection. Six other thoracic radiologists read the CT examinations first without, and then with CAD. Performance was evaluated using free-response receiver operating characteristics (FROC) and the jackknife FROC analysis methods (JAFROC) for nodules above different diameter thresholds.
241 nodules, ranging in size from 3.0 to 18.6 mm (mean 5.3 mm) were identified as the reference standard. At diameter thresholds of 3, 4, 5, and 6 mm, the CAD system had a sensitivity of 54%, 64%, 68%, and 76%, respectively, with an average of 5.6 false-positives (FPs) per scan. Without CAD, the average figures-of-merit (FOMs) for the six radiologists, obtained from JAFROC analysis, were 0.661, 0.729, 0.793 and 0.838 for the same nodule diameter thresholds, respectively. With CAD, the corresponding average FOMs improved to 0.705, 0.763, 0.810 and 0.862, respectively. The improvement achieved statistical significance for nodules at the 3 and 4 mm thresholds (p=0.002 and 0.020, respectively), and did not achieve significance at 5 and 6 mm (p=0.18 and 0.13, respectively). At a nodule diameter threshold of 3 mm, the radiologists’ average sensitivity and FP rate were 0.56 and 0.67, respectively, without CAD, and 0.67 and 0.78 with CAD.
CAD improves thoracic radiologists’ performance for detecting pulmonary nodules under 5 mm on CT examinations, which are often overlooked by visual inspection alone.
Lung Nodule; CT; Computer-aided detection
This paper describes part of a content-based image retrieval (CBIR) system that has been developed for mammograms. Details are presented of methods implemented to derive measures of similarity based upon structural characteristics and distributions of density of the fibroglandular tissue, as well as the anatomical size and shape of the breast region as seen on the mammogram. Well-known features related to shape, size, and texture (statistics of the gray-level histogram, Haralick’s texture features, and moment-based features) were applied, as well as less-explored features based in the Radon domain and granulometric measures. The Kohonen self-organizing map (SOM) neural network was used to perform the retrieval operation. Performance evaluation was done using precision and recall curves obtained from comparison between the query and retrieved images. The proposed methodology was tested with 1,080 mammograms, including craniocaudal and mediolateral-oblique views. Precision rates obtained are in the range from 79% to 83% considering the total image set. Considering the first 50% of the retrieved images, the precision rates are in the range from 78% to 83%; the rates are in the range from 79% to 86% considering the first 25% of the retrieved images. Results obtained indicate the potential of the implemented methodology to serve as a part of a CBIR system for mammography.
Mammography; content-based image retrieval; Kohonen self-organizing map; texture features; granulometric measures; radon transform domain; breast density
We have been developing a computer-aided diagnostic (CAD) scheme for lung nodule detection in order to assist radiologists in the detection of lung cancer in thin-section computed tomography (CT) images. Our database consisted of 117 thin-section CT scans with 153 nodules, obtained from a lung cancer screening program at a Japanese university (85 scans, 91 nodules) and from clinical work at an American university (32 scans, 62 nodules). The database included nodules of different sizes (4-28 mm, mean 10.2 mm), shapes, and patterns (solid and ground-glass opacity (GGO)). Our CAD scheme consisted of modules for lung segmentation, selective nodule enhancement, initial nodule detection, feature extraction, and classification. The selective nodule enhancement filter was a key technique for significant enhancement of nodules and suppression of normal anatomic structures such as blood vessels, which are the main sources of false positives. Use of an automated rule-based classifier for reduction of false positives was another key technique; it resulted in a minimized overtraining effect and an improved classification performance. We employed a case-based four-fold cross-validation testing method for evaluation of the performance levels of our computerized detection scheme. Our CAD scheme achieved an overall sensitivity of 86% (small: 76%, medium-sized: 94%, large: 95%; solid: 86%, mixed GGO: 89%, pure GGO: 81%) with 6.6 false positives per scan; an overall sensitivity of 81% (small: 69%, medium-sized: 91%, large: 91%; solid: 79%, mixed GGO: 88%, pure GGO: 81%) with 3.3 false positives per scan; and an overall sensitivity of 75% (small: 60%, medium-sized: 88%, large: 87%; solid: 70%, mixed GGO: 87%, pure GGO: 81%) with 1.6 false positives per scan. The experimental results indicate that our CAD scheme with its two key techniques can achieve a relatively high performance for nodules presenting large variations in size, shape, and pattern.
nodule detection; computer-aided diagnosis; CAD; CT scan; rule-based classifier
An algorithm was developed to segment solid pulmonary nodules attached to the chest wall in computed tomography scans. The pleural surface was estimated and used to segment the nodule from the chest wall. To estimate the surface, a robust approach was used to identify points that lie on the pleural surface but not on the nodule. A 3D surface was estimated from the identified surface points. The segmentation performance of the algorithm was evaluated on a database of 150 solid juxtapleural pulmonary nodules. Segmented images were rated on a scale of 1 to 4 based on visual inspection, with 3 and 4 considered acceptable. This algorithm offers a large improvement in the success rate of juxtapleural nodule segmentation, successfully segmenting 98.0% of nodules compared to 81.3% for a previously published plane-fitting algorithm, which will provide for the development of more robust automated nodule
Searching for relevant knowledge across heterogeneous geospatial databases requires an extensive knowledge of the semantic meaning of images, a keen eye for visual patterns, and efficient strategies for collecting and analyzing data with minimal human intervention. In this paper, we present our recently developed content-based multimodal Geospatial Information Retrieval and Indexing System (GeoIRIS) which includes automatic feature extraction, visual content mining from large-scale image databases, and high-dimensional database indexing for fast retrieval. Using these underpinnings, we have developed techniques for complex queries that merge information from heterogeneous geospatial databases, retrievals of objects based on shape and visual characteristics, analysis of multiobject relationships for the retrieval of objects in specific spatial configurations, and semantic models to link low-level image features with high-level visual descriptors. GeoIRIS brings this diverse set of technologies together into a coherent system with an aim of allowing image analysts to more rapidly identify relevant imagery. GeoIRIS is able to answer analysts’ questions in seconds, such as “given a query image, show me database satellite images that have similar objects and spatial relationship that are within a certain radius of a landmark.”
Geospatial intelligence; image database; information mining
A complete texture image retrieval system includes two techniques: texture feature extraction and similarity measurement. Specifically, similarity measurement is a key problem for texture image retrieval study. In this paper, we present an effective similarity measurement formula. The MIT vision texture database, the Brodatz texture database, and the Outex texture database were used to verify the retrieval performance of the proposed similarity measurement method. Dual-tree complex wavelet transform and nonsubsampled contourlet transform were used to extract texture features. Experimental results show that the proposed similarity measurement method achieves better retrieval performance than some existing similarity measurement methods.
The objectives of this study were to evaluate the influence of iterative reconstruction (IR) on pulmonary nodule volumetry with chest computed tomography (CT).
Twenty patients (12 women and 8 men, mean age 61.9, range 32–87) underwent evaluation of pulmonary nodules with a 64-slice CT-scanner. Data were reconstructed using filtered back projection (FBP) and IR (Philips Healthcare, iDose4-levels 2, 4 and 6) at similar radiation dose. Volumetric nodule measurements were performed with semi-automatic software on thin slice reconstructions. Only solid pulmonary nodules were measured; no additional selection criteria were used for the nature of nodules. For intra-observer and inter-observer variability, measurements were performed once by one observer and twice by another observer. Algorithms were compared using the concordance correlation-coefficient (pc) and Friedman-test, and post-hoc analysis with the Wilcoxon-signed ranks-test with Bonferroni-correction (significance-level p<0.017).
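For readers unfamiliar with the concordance correlation coefficient, the sketch below assumes the study used Lin's definition, which divides twice the covariance by the sum of the two variances plus the squared difference of the means. The volume values are hypothetical and serve only to show that a constant offset between two methods lowers the coefficient even when the measurements are perfectly correlated.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# Hypothetical nodule volumes (mm^3) from two reconstructions.
fbp = [1.0, 2.0, 3.0]
same = [1.0, 2.0, 3.0]      # identical measurements
shifted = [2.0, 3.0, 4.0]   # perfectly correlated but offset by 1

print(concordance_correlation(fbp, same))
print(concordance_correlation(fbp, shifted))
```

Unlike the Pearson coefficient, this measure penalizes systematic bias, which is why it suits comparisons of reconstruction algorithms on the same nodules.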
Seventy-eight nodules were present including 56 small nodules (volume<200 mm3, diameter<8 mm) and 22 large nodules (volume≥200 mm3, diameter≥8 mm). No significant differences in measured pulmonary nodule volumes between FBP, iDose4-levels 2, 4 and 6 were found in both small nodules and large nodules. FBP and iDose4-levels 2, 4 and 6 were correlated with pc-values of 0.98 or higher for both small and large nodules. Pc-values of intra-observer and inter-observer variability were 0.98 or higher.
Solid pulmonary nodule volumes measured with standard FBP were comparable with those measured with IR, regardless of the IR level, and no significant differences between measured volumes of both small and large solid nodules were found.
Computer tomography (CT) imaging plays an important role in cancer detection and quantitative assessment in clinical trials. High-resolution imaging studies on large cohorts of patients generate vast data sets, which are infeasible to analyze through manual interpretation.
In this article we describe a comprehensive architecture for computer-aided detection (CAD) and surveillance on lung nodules in CT images. Central to this architecture are the analytic components: an automated nodule detection system, nodule tracking capabilities and volume measurement, which are integrated within a data management system that includes mechanisms for receiving and archiving images, a database for storing quantitative nodule measurements and visualization, and reporting tools.
We describe two studies to evaluate CAD technology within this architecture, and the potential application in large clinical trials. The first study involves performance assessment of an automated nodule detection system and its ability to increase radiologist sensitivity when used to provide a second opinion. The second study investigates nodule volume measurements on CT made using a semi-automated technique and shows that volumetric analysis yields significantly different tumor response classifications than a 2D diameter approach. These studies demonstrate the potential of automated CAD tools to assist in quantitative image analysis for clinical trials.
“Computer-Aided Diagnosis”; “Lung Nodules”; “CT”
Rationale and Objectives
We developed a computerized scheme for detection of lung nodules in the lateral views of chest radiographs, in order to improve the overall performance in combination with the computer-aided diagnostic (CAD) scheme for posterior-anterior (PA) views.
Materials and Methods
We used 106 pairs of PA and lateral views of chest radiographs (122 lung nodules) for development of the CAD scheme. In the CAD scheme for lateral views, initial candidates of lung nodules were identified by use of a nodule enhancement filter based on the edge gradients. Thirty-four image features extracted from the original and the nodule-enhanced images were employed for the rule-based scheme and for artificial neural networks (ANNs) for removal of some false-positive candidates. The computer performance was evaluated with a leave-one-case-out test method for ANNs. For PA views, we used the existing CAD scheme, which was trained with one half of 924 chest images and then tested with the remaining images.
When the CAD scheme was applied only to PA views, the sensitivity in the detection of lung nodules was 70.5%, with 4.9 false positives per image. Although the performance of the computerized scheme for lateral views was relatively low (60.7% sensitivity with 1.7 false positives per image), the overall sensitivity (86.9%) was improved (6.6 false positives per two views), because 20 (16.4%) of the 122 nodules were detected only on lateral views.
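The combined figures follow from the reported numbers by simple arithmetic, as the short check below shows. It assumes that the overall result counts a nodule as detected when it is found on either view and that false positives from the two views add; the variable names are illustrative.

```python
total_nodules = 122
pa_sensitivity = 0.705        # reported PA-only sensitivity
lateral_only_detected = 20    # nodules found only on lateral views

# Nodules detected on PA views (0.705 * 122 is about 86).
pa_detected = round(pa_sensitivity * total_nodules)

# A nodule counts as detected if it is found on either view.
combined_detected = pa_detected + lateral_only_detected
combined_sensitivity_pct = round(100 * combined_detected / total_nodules, 1)

# False positives per image from both views add up per pair of views.
fp_per_pair = round(4.9 + 1.7, 1)

print(pa_detected, combined_detected, combined_sensitivity_pct, fp_per_pair)
```

The lateral view therefore adds 20 detections at a cost of 1.7 additional false positives per pair of views.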
The CAD scheme by use of lateral-view images has the potential to improve the overall performance for detection of lung nodules on chest radiographs when combined with a conventional CAD scheme for standard PA views.
The Structural Descriptor Database (SDDB) is a web-based tool that predicts the function of proteins and functional site positions based on the structural properties of related protein families. Structural alignments and functional residues of a known protein set (defined as the training set) are used to build special Hidden Markov Models (HMM) called HMM descriptors. SDDB uses previously calculated and stored HMM descriptors for predicting active sites, binding residues, and protein function. The database integrates biologically relevant data filtered from several databases such as PDB, PDBSUM, CSA and SCOP. It accepts queries in fasta format and predicts functional residue positions, protein-ligand interactions, and protein function, based on the SCOP database.
To assess the SDDB performance, we used different data sets. The Trypsin-like Serine protease data set assessed how well SDDB predicts functional sites when curated data is available. The SCOP family data set was used to analyze SDDB performance by using training data extracted from PDBSUM (binding sites) and from CSA (active sites). The ATP-binding experiment was used to compare our approach with the most current method. For all evaluations, significant improvements were obtained with SDDB.
SDDB performed better when trustworthy training data was available. SDDB worked better in predicting active sites than binding sites because the former are more conserved than the latter. Nevertheless, by using our prediction method we obtained results with precision above 70%.
Wind field analysis from synthetic aperture radar images allows the estimation of wind direction and speed based on image descriptors. In this paper, we propose a framework to automate wind direction retrieval based on wavelet decomposition associated with spectral processing. We extend existing undecimated wavelet transform approaches by including à trous with B3 spline scaling function, in addition to other wavelet bases such as Gabor and Mexican-hat. The purpose is to extract more reliable directional information when wind speed values range from 5 to 10 m s−1. Using C-band empirical models, associated with the estimated directional information, we calculate local wind speed values and compare our results with QuikSCAT scatterometer data. The proposed approach has potential application in the evaluation of oil spills and wind farms.
SAR; wind direction; FFT; CMOD4; wind speed
In this paper, we present a comprehensive framework to support classification of nuclei in digital microscopy images of diffuse gliomas. This system integrates multiple modules designed for convenient human annotations, standard-based data management, efficient data query and analysis. In our study, 2770 nuclei of six types are annotated by neuropathologists from 29 whole-slide images of glioma biopsies. After machine-based nuclei segmentation for whole-slide images, a set of features describing nuclear shape, texture and cytoplasmic staining is calculated to describe each nucleus. These features along with nuclear boundaries are represented by a standardized data model and saved in the spatial relational database in our framework. Features derived from nuclei classified by neuropathologists are retrieved from the database through efficient spatial queries and used to train distinct classifiers. The best average classification accuracy is 87.43% for 100 independent five-fold cross validations. This suggests that the derived nuclear and cytoplasmic features can achieve promising classification results for six nuclear classes commonly presented in gliomas. Our framework is generic, and can be easily adapted for other related applications.
Nuclei classification; feature selection; microscopy image analysis; metadata model; diffuse glioma
The analysis of natural images with independent component analysis (ICA) yields localized bandpass Gabor-type filters similar to receptive fields of simple cells in visual cortex. We applied ICA to a subset of patches called position-centered patches, selected to form a translation-invariant representation of small patches. The resulting filters were qualitatively different in two respects. One novel feature was the emergence of filters we call double-Gabor filters. In contrast to Gabor functions, which are modulated in one direction, double-Gabor filters are sinusoidally modulated in two orthogonal directions. In addition, the filters were more extended in space and frequency compared to standard ICA filters and better matched the distribution in experimental recordings from neurons in primary visual cortex. We further found a dual role for double-Gabor filters as edge and texture detectors, which could have engineering applications.
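The distinction between a Gabor function and a filter modulated in two orthogonal directions can be made concrete with a small sketch. The exact parametric form of the double-Gabor filters is not given here, so the product of two cosines under a shared isotropic Gaussian envelope is an assumption chosen as the simplest function with that property; the parameter names (`sigma`, `freq`, `theta`) are likewise illustrative.

```python
import math


def _rotate(x, y, theta):
    """Express (x, y) in axes rotated by theta radians."""
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    return xr, yr


def gabor(x, y, sigma, freq, theta):
    """Gaussian envelope times a cosine carrier along ONE direction."""
    xr, _ = _rotate(x, y, theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * freq * xr)


def double_gabor(x, y, sigma, freq, theta):
    """Assumed form: same envelope, cosine carriers along TWO orthogonal directions."""
    xr, yr = _rotate(x, y, theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return (envelope
            * math.cos(2.0 * math.pi * freq * xr)
            * math.cos(2.0 * math.pi * freq * yr))
```

With `theta = 0`, moving along the y axis leaves the sign of `gabor` unchanged (only the envelope decays), whereas `double_gabor` alternates in sign, giving a checkerboard-like pattern rather than parallel stripes.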
We are developing a molecular image-directed, 3D ultrasound-guided, targeted biopsy system for improved detection of prostate cancer. In this paper, we propose an automatic 3D segmentation method for transrectal ultrasound (TRUS) images, which is based on multi-atlas registration and a statistical texture prior. The atlas database includes registered TRUS images from previous patients and their segmented prostate surfaces. Three orthogonal Gabor filter banks are used to extract texture features from each image in the database. Patient-specific Gabor features from the atlas database are used to train kernel support vector machines (KSVMs) and then to segment the prostate image from a new patient. The segmentation method was tested on TRUS data from 5 patients. The average surface distance between our method and manual segmentation is 1.61 ± 0.35 mm, indicating that the atlas-based automatic segmentation method works well and could be used for 3D ultrasound-guided prostate biopsy.
Automatic 3D segmentation; atlas registration; Gabor filter; support vector machine; ultrasound imaging; prostate cancer
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at
Thyroid nodules are a common yet challenging clinical problem. The vast majority of these nodules are benign; however, deciding which nodule should undergo biopsy is difficult because the imaging appearances of benign and malignant thyroid nodules overlap. High-resolution ultrasound is the primary imaging modality for evaluating thyroid nodules. Many sonographic features have been studied individually as predictors of thyroid malignancy. There has been little work to create predictive models that combine multiple predictors, both imaging features and demographic factors. We have created a Bayesian classifier to predict whether a thyroid nodule is benign or malignant using sonographic and demographic findings. Our classifier performed similarly to or slightly better than experienced radiologists when evaluated using 41 thyroid nodules with known pathologic diagnosis. This classifier could be helpful in providing practitioners with an objective basis for deciding whether to biopsy suspicious thyroid nodules.
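How a Bayesian classifier combines several findings into one posterior probability can be sketched as below. The abstract does not specify the model structure, so the naive Bayes form (conditionally independent binary findings with Laplace smoothing) is an assumption, and the names `finding_a` and `finding_b` in the usage example are placeholders rather than the predictors actually used; the toy records are synthetic and carry no clinical meaning.

```python
import math


def train(records, labels, alpha=1.0):
    """Estimate class priors and per-class P(finding present) with Laplace smoothing."""
    classes = sorted(set(labels))
    findings = sorted({name for r in records for name in r})
    prior, cond = {}, {}
    for c in classes:
        rows = [r for r, y in zip(records, labels) if y == c]
        prior[c] = len(rows) / len(records)
        cond[c] = {
            f: (sum(1 for r in rows if r.get(f)) + alpha) / (len(rows) + 2 * alpha)
            for f in findings
        }
    return prior, cond


def posterior(model, record):
    """Return P(class | findings) under the naive independence assumption."""
    prior, cond = model
    log_score = {}
    for c in prior:
        s = math.log(prior[c])
        for f, p in cond[c].items():
            s += math.log(p if record.get(f) else 1.0 - p)
        log_score[c] = s
    top = max(log_score.values())
    unnorm = {c: math.exp(s - top) for c, s in log_score.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Synthetic toy data with placeholder finding names (illustration only).
toy_records = [
    {"finding_a": True, "finding_b": True},
    {"finding_a": True, "finding_b": False},
    {"finding_a": False, "finding_b": False},
    {"finding_a": False, "finding_b": True},
    {"finding_a": False, "finding_b": False},
]
toy_labels = ["malignant", "malignant", "benign", "benign", "benign"]
toy_model = train(toy_records, toy_labels)
```

Calling `posterior(toy_model, {"finding_a": True, "finding_b": True})` returns a dictionary of class probabilities that sum to one; a biopsy threshold could then be applied to the malignant probability.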
To evaluate performance of computer-aided detection (CAD) beyond double reading for pulmonary nodules on low-dose computed tomography (CT) by nodule volume.
A total of 400 low-dose chest CT examinations were randomly selected from the NELSON lung cancer screening trial. CTs were evaluated by two independent readers and processed by CAD. A total of 1,667 findings marked by readers and/or CAD were evaluated by a consensus panel of expert chest radiologists. Performance was evaluated by calculating sensitivity of pulmonary nodule detection and number of false positives, by nodule characteristics and volume.
According to the screening protocol, 90.9 % of the findings could be excluded from further evaluation, 49.2 % being small nodules (less than 50 mm³). Excluding small nodules reduced false-positive detections by CAD from 3.7 to 1.9 per examination. Of 151 findings that needed further evaluation, 33 (21.9 %) were detected by CAD only, one of them being diagnosed as lung cancer the following year. The sensitivity of nodule detection was 78.1 % for double reading and 96.7 % for CAD. A total of 69.7 % of nodules undetected by readers were attached nodules, of which 78.3 % were vessel-attached.
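The reported sensitivities can be checked against the counts given above with a few lines of arithmetic. The reader count follows directly from the text (151 findings minus the 33 detected by CAD only). The CAD count of 146 is not stated; it is back-calculated here as the only integer consistent with 96.7 % of 151 and should be read as an assumption.

```python
def sensitivity_percent(detected, total):
    """Sensitivity as a percentage of findings that needed further evaluation."""
    return 100.0 * detected / total


needing_evaluation = 151          # stated in the text
cad_only = 33                     # stated in the text
reader_detected = needing_evaluation - cad_only   # 118, follows from the above
cad_detected = 146                # ASSUMPTION: back-calculated from 96.7 %

cad_only_share = round(sensitivity_percent(cad_only, needing_evaluation), 1)
reader_sensitivity = round(sensitivity_percent(reader_detected, needing_evaluation), 1)
cad_sensitivity = round(sensitivity_percent(cad_detected, needing_evaluation), 1)
```

Under these counts, the three rounded values agree with the 21.9 %, 78.1 %, and 96.7 % reported in the text.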
CAD is valuable in lung cancer screening to improve sensitivity of pulmonary nodule detection beyond double reading, at a low false-positive rate when excluding small nodules.
• Computer-aided detection (CAD) has known advantages for computed tomography (CT).
• Combined CAD/nodule size cut-off parameters assist CT lung cancer screening.
• This combination improves the sensitivity of pulmonary nodule detection by CT.
• It increases the positive predictive value for cancer detection.
Computer-aided detection; Multi-detector computed tomography; Pulmonary nodules; Low dose; Volumetry