We develop an automatic histological image classification system that uses biologically interpretable shape-based features. These features capture the distribution of shape patterns, described by Fourier shape descriptors, in different stains of a histological image. We use this system to classify hematoxylin and eosin (H&E) stained renal tumor images and assess its classification performance by comparing it to methods based on textural, morphological, and topological features.
The application of this system to cancer is important because, despite progress in treatment (e.g., early diagnosis, reduction of mortality rates, and improvement of survival), cancer is still a major health problem in the United States. Specifically, an estimated 60,920 new kidney and renal pelvis cancer cases occurred in the United States in 2011, resulting in 13,120 deaths [1]. Successful prognosis or treatment of renal cell carcinoma (RCC) depends on the disease subtype, because each subtype exhibits distinct clinical behavior and underlying genetic mutations [2]. Thus, it is important to accurately determine which of the most common subtypes an RCC patient has: clear cell (CC, 70% of cases), papillary (PA, 15%), or chromophobe (CH, 5%) [3]. It is also important to identify benign renal tumors, the most common of which are the renal oncocytomas (ON, 5% of cases). Figure 1 shows typical examples of H&E-stained renal tumor images. Pathologists, guided by the World Health Organization (WHO) system, manually classify renal tumors using light microscopy based on typical features [3]. Although the WHO system can classify typical examples, some cases are more difficult. For example, ON and CH are often confused because both have granular cytoplasm. CH and CC can also be confused because both have prominent cell membranes. Moreover, there are two reported subtypes of PA that vary in visual appearance [3]. Thus, a pathologist’s diagnosis may be subjective.
Figure 1 Example images of four H&E-stained histological renal tumor subtypes in datasets A (a-d) and B (e-h). Among four subtypes, three are renal cell carcinoma (RCC) subtypes: (a and e) clear cell, (b and f) chromophobe, and (c and g) papillary.
Over the last decade, several automated systems have been developed to aid histological cancer diagnosis and to reduce subjectivity. All of these systems attempt to mimic pathologists by extracting features from histological images. Important feature types include color, nuclear shape, fractal, textural (e.g., gray-level co-occurrence matrices, GLCM), wavelet, and topological features, among others [4]. Several diagnostic systems for RCC illustrate the utility of these features. For example, Chaudry et al. proposed a system using textural and morphological features with automated region-of-interest selection for RCC subtype classification [6]. Waheed et al. performed a similar analysis but included fractal as well as textural and morphological features [8]. Choi et al. extended the morphological analysis to three-dimensional nuclei and applied their system to RCC grading [9]. In addition to morphological features, Francois et al. used cell kinetic features in their RCC grading system [10]. Finally, Raza et al. used a scale-invariant feature transform (SIFT) method to classify RCC subtypes [11]. Despite the diagnostic accuracy of these systems, their widespread use is limited by a lack of feature interpretability. Some researchers have provided visual interpretation of features. For example, some topological features have been related to the amount of differentiation in varying cancer grades [12]. In contrast, pathologists may not be receptive to, or confident in, features such as wavelet or fractal representations of images because they are not easy to interpret biologically. Moreover, most existing systems exploit morphological properties of nuclear shapes and ignore cytoplasmic and glandular structures despite evidence of their utility [13]. Thus, methods based on a holistic view of shapes and colors may more accurately reflect the process by which a pathologist interprets a renal tumor image [3].
Fourier shape descriptors, described by Kuhl and Giardina [14], have been reported to be effective shape descriptors. They are robust to high-frequency noise because the higher harmonics can be discarded. Researchers have used Fourier shape descriptors for various medical imaging applications, including shape-based vertebral image retrieval [15] and classification of breast tumors [16]. The medical images involved in these studies typically have definite shapes with consistent landmarks. In addition, researchers have used Fourier shape descriptors for analyzing the shapes of nuclear structures [17]. Histological images, however, lack such landmarks and tend to exhibit multiple highly variable shapes. As such, it is difficult to compare histological images using common techniques such as template matching with an image atlas [20] or shape-based similarity measures after registration of the shapes in a histological image [21]. Therefore, to characterize and compare histological images in terms of shapes, we quantify the distribution of shape patterns in an image using Fourier shape descriptors.
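To make the descriptors concrete, the following minimal Python sketch computes elliptic Fourier coefficients for a closed polygonal contour using the standard Kuhl and Giardina formulation. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the `sample_contour` test shape, and the choice of five harmonics are hypothetical, and contour extraction from a segmented stain mask is assumed to have happened upstream.

```python
import math


def elliptic_fourier_coeffs(contour, n_harmonics):
    """Return [(a_n, b_n, c_n, d_n)] for n = 1..n_harmonics of a closed polygon.

    `contour` is a list of (x, y) vertices; the last vertex is joined
    back to the first to close the shape.
    """
    k = len(contour)
    edges = []
    for i in range(k):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % k]
        dx, dy = x1 - x0, y1 - y0
        dt = math.hypot(dx, dy)
        if dt > 0:  # ignore repeated vertices (zero-length edges)
            edges.append((dx, dy, dt))
    perimeter = sum(dt for _, _, dt in edges)

    coeffs = []
    for n in range(1, n_harmonics + 1):
        scale = perimeter / (2.0 * n * n * math.pi ** 2)
        w = 2.0 * n * math.pi / perimeter
        a = b = c = d = 0.0
        t_prev = 0.0
        for dx, dy, dt in edges:
            t_next = t_prev + dt
            dcos = math.cos(w * t_next) - math.cos(w * t_prev)
            dsin = math.sin(w * t_next) - math.sin(w * t_prev)
            a += dx / dt * dcos
            b += dx / dt * dsin
            c += dy / dt * dcos
            d += dy / dt * dsin
            t_prev = t_next
        coeffs.append((scale * a, scale * b, scale * c, scale * d))
    return coeffs


def harmonic_power(coeffs):
    """Sum of squared coefficients per harmonic."""
    return [a * a + b * b + c * c + d * d for a, b, c, d in coeffs]


def sample_contour(n_points, ripple=0.0, ripple_freq=40):
    """Hypothetical test shape: unit circle with an optional high-frequency ripple."""
    points = []
    for i in range(n_points):
        theta = 2.0 * math.pi * i / n_points
        r = 1.0 + ripple * math.cos(ripple_freq * theta)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points
```

Comparing `elliptic_fourier_coeffs(sample_contour(800), 5)` with `elliptic_fourier_coeffs(sample_contour(800, ripple=0.01), 5)` shows the property described above: the rippled boundary leaves the low-order coefficients nearly unchanged, because the ripple contributes mainly to harmonics that the truncation discards.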
We use three steps to build a diagnostic model from a set of histological images: (1) shape-based feature extraction, (2) feature selection, and (3) classifier model selection (Figure 2). We then evaluate this model-building process by examining the biological relevance of shapes (i.e., examining the subtype-specific tissue shapes and cellular structures that correspond to the best features of the classification model) and testing the classifier prediction performance using independent images. Finally, we compare the shape-based diagnostic model to diagnostic models based on traditional histological image features. We show that Fourier shape-based features (1) are capable of classifying H&E-stained renal tumor histological images, (2) outperform or complement traditional histological image features used in existing automated systems, and (3) are biologically interpretable.
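The feature selection and classifier model selection steps can be sketched on toy data as follows. The sketch assumes step (1) has already produced one numeric feature vector per image. The Fisher-score ranking, the k-nearest-neighbor classifier, and leave-one-out cross-validation are stand-ins chosen for brevity and are not claimed to be the methods used in this work; all function names are hypothetical.

```python
import math
from collections import Counter


def fisher_score(values, labels):
    """Between-class over within-class scatter for one feature (higher is better)."""
    overall = sum(values) / len(values)
    between = within = 0.0
    for cls in sorted(set(labels)):
        vals = [v for v, lab in zip(values, labels) if lab == cls]
        mean = sum(vals) / len(vals)
        between += len(vals) * (mean - overall) ** 2
        within += sum((v - mean) ** 2 for v in vals)
    return between / within if within > 0 else float("inf")


def select_features(X, y, n_keep):
    """Step 2: keep the indices of the n_keep highest-scoring features."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training vectors closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out cross-validation accuracy of k-NN on (X, y)."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


def build_model(X, y, n_keep, k_candidates):
    """Steps 2 and 3: select features, then pick k by cross-validation.

    Features are selected once on all training images, so the
    cross-validation estimate is optimistic; independent images are
    still needed to test prediction performance.
    """
    keep = select_features(X, y, n_keep)
    reduced = [[row[j] for j in keep] for row in X]
    best_k = max(k_candidates, key=lambda k: loo_accuracy(reduced, y, k))
    return keep, best_k, reduced
```

For example, given six toy vectors labeled `"CC"` or `"PA"` in which only the first feature separates the classes, `build_model(X, y, n_keep=1, k_candidates=[1, 3])` keeps feature index 0 and returns the candidate `k` with the highest leave-one-out accuracy.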
Figure 2 Building and evaluating a shape-based diagnostic model using histological images. We use three steps to derive a shape-based diagnostic model from histological images: 1) shape-based feature extraction (including automatic color segmentation, individual ...)