Pathology is essential for research in disease and development, as well as for clinical decision making. For more than 100 years, pathology practice has involved analyzing images of stained, thin tissue sections by a trained human using an optical microscope. Technological advances are now driving major changes in this paradigm toward digital pathology (DP). The digital transformation of pathology goes beyond recording, archiving, and retrieving images, providing new computational tools to inform better decision making for precision medicine. First, we discuss some emerging innovations in both computational image analytics and imaging instrumentation in DP. Second, we discuss molecular contrast in pathology. Molecular DP has traditionally been an extension of pathology with molecularly specific dyes. Label-free, spectroscopic images are rapidly emerging as another important information source, and we describe the benefits and potential of this evolution. Third, we describe multimodal DP, which is enabled by computational algorithms and combines the best characteristics of structural and molecular pathology. Finally, we provide examples of application areas in telepathology, education, and precision medicine. We conclude by discussing challenges and emerging opportunities in this area.
Digital pathology (DP) can be broadly defined to include all aspects of acquisition, process management, and data interpretation to yield pathology information from a digitized pathology sample's image. First, the field is largely based on utilizing the advantages of modern computer and electronic (digital) systems (1–4) to improve and enhance pathology practice. Second, the development of whole-slide imaging (WSI) technology has enabled digitization of tissue slides, giving rise to new areas such as telepathology (5) and remote diagnosis (6, 7). Third, informatics and big data analytic methods (8) are providing unprecedented detail about data from the subcellular to the tissue level [in, e.g., nuclei (9–13), mitoses (14, 15), and lymphocytes (12, 16)]. Moreover, recently proposed machine learning–based approaches, in conjunction with “subvisual” image biomarkers of disease architecture, could provide information about the state of aggressiveness of the disease and enable prognostic prediction of therapeutic outcome (17).
A major shift is also taking place by including markers derived from molecular biology or chemical analysis of tissues. Whereas long-used stains such as hematoxylin and eosin (H&E) are fairly nonspecific, chemically or molecularly specific stains are increasingly being employed to understand disease progression (18). Immunohistochemical (IHC) or molecular imaging is becoming increasingly quantitative (19) and is evolving for multiplexed analysis (20). More recently, label-free methods for pathology have been developed using spectral imaging (19, 21) due to advances in instrumentation (22–25). Direct recording of chemical composition eliminates the need for dyes or stains and could be much more informative because it is not restricted to the imaging of known molecular species. The complex spectral data, however, need informatics approaches in order to be interpreted and used robustly in routine analysis (26).
The focus of this review is on the two major themes of structural and molecular trends in DP enabled by informatics. Complementing related reviews (27–31), we first discuss advances in computer algorithms. Next, we describe the state of the art in instrumentation and informatics that enable molecular DP (mDP). Because excellent reviews and monographs are available for IHC imaging and spectral analysis in molecular imaging, we focus on recent, fast-moving developments in label-free spectral imaging and the role of informatics therein. Finally, using illustrative examples, we discuss opportunities for both DP and mDP in the context of applications areas such as telepathology, education, and precision medicine. We also briefly discuss the regulatory and infrastructural challenges involved in incorporating these technologies into routine practice, both in the United States and globally.
Instrumentation for DP consists of a microscope and computing, storage, and visualization hardware. The microscope and computing hardware have been extensively described. Enhancements for fluorescence or multispectral optical imaging are commercially available. Instrumentation has been transformed from static snapshots of specific fields of view taken with camera-equipped microscopes to scanning of whole glass slides with integrated, often quantitative histologic evaluation. The primary challenge involves ensuring high quality and fidelity of digital images so that pathologists' diagnostic performance is not compromised when they rely on digital data. Whereas pathologists have traditionally had to pore over stained tissues on glass slides, technology will enable them to use the same basic equipment but will advance information content, visualization, and analytical tools to greatly improve decision making.
Computerized image analysis is at least 50 years old. Although face recognition algorithms (32) were proposed in the early 1970s, image processing and pattern recognition algorithms began to be applied to medical data with the digitization of radiology 30 years ago (33). Similarly, although analysis algorithms for microscopic cellular images (34) were put forward nearly 50 years ago, only recently, with the introduction of WSI, did tools to aid the pathologist (for, e.g., mitosis and nuclear counting) become available. These approaches aim to extend human cognition by predicting disease outcome and aggressiveness (35, 36), thereby providing true decision support. In this review, we focus first on the emerging theme of image informatics as histologic companion diagnostics. Although visually reading routine pathologic slides can allow pathologists to make diagnoses, sophisticated computer-aided image analysis can uncover more-revealing sub-visual attributes from morphology (e.g., texture, shape, architecture). Researchers have shown that these histologic biomarkers are correlated with disease progression independently of existing clinical and pathologic features. Careful processing steps are needed in order to capture subvisual cues from DP that may be associated with tumor heterogeneity and disease outcome, as described in the following subsections.
The visual appearance of digital histopathology slides has many sources of variance, two of which are the staining protocol and the choice of digital slide scanner. Although pathologists are unaffected by these factors when rendering a diagnosis from H&E images, these variances often limit the ability of computer-based approaches to generalize. Consequently, development of color standardization and normalization algorithms to help improve the performance of subsequent image analysis algorithms (13, 37) has been a primary goal of recent research. Often, a single image with optimal tissue staining and visual appearance is designated the template, and the intensity distributions of subsequent images are mapped to match the distribution of the template image. Several studies (38–40) have suggested that matching distributions of tissue subtypes (e.g., epithelium, nuclei, stroma) is more effective than aligning global image distributions. In the context of histopathology, this process might involve identifying stromal tissue, nuclei, lymphocytes, adipose tissue, and cancer epithelium within both the target and template images and then establishing correspondences between the tissue partitions in the two images. This two-level matching process can be performed either with images from one contrast mechanism or with multimodal imaging and alignment (41).
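As a minimal sketch of the template-matching idea, the following Python code maps the intensities of a target image onto the distribution of a template by quantile matching, first globally and then separately per tissue class. The function names, the flat pixel lists, and the tissue labels are illustrative assumptions rather than any published implementation, and the per-class step presumes that a tissue segmentation is already available.

```python
from bisect import bisect_right


def match_histogram(source, template):
    """Map each source intensity to the template intensity at the same quantile."""
    src_sorted = sorted(source)
    tpl_sorted = sorted(template)
    n_src, n_tpl = len(src_sorted), len(tpl_sorted)
    matched = []
    for value in source:
        # Empirical cumulative probability of this value within the source.
        quantile = bisect_right(src_sorted, value) / n_src
        # Template sample that sits at the same cumulative probability.
        idx = min(n_tpl - 1, max(0, round(quantile * n_tpl) - 1))
        matched.append(tpl_sorted[idx])
    return matched


def match_by_tissue(source, source_labels, template, template_labels):
    """Match distributions separately within each tissue class (hypothetical labels)."""
    out = list(source)
    for label in set(source_labels):
        src_idx = [i for i, l in enumerate(source_labels) if l == label]
        tpl_vals = [v for v, l in zip(template, template_labels) if l == label]
        if not tpl_vals:
            continue  # class absent from the template: leave pixels unchanged
        matched = match_histogram([source[i] for i in src_idx], tpl_vals)
        for i, v in zip(src_idx, matched):
            out[i] = v
    return out


# Toy single-channel example with made-up intensities.
target = [10, 20, 200, 220]
target_labels = ["nuclei", "nuclei", "stroma", "stroma"]
template = [50, 60, 150, 170]
template_labels = ["nuclei", "nuclei", "stroma", "stroma"]
normalized = match_by_tissue(target, target_labels, template, template_labels)
```

In practice each color channel (or each stain channel after color deconvolution) would be handled this way; the sketch treats a single channel only.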
Automated counting and enumeration of histologic primitives such as nuclei (42) in H&E images have attracted significant attention. Recently, there has been interest in grand challenge competitions in digital pathology image analysis, in which groups compete with their algorithms on specific problems by using a set of common data sets. These grand challenge competitions have involved detection algorithms to identify nuclei (29), lymphocytes (43), and mitoses (14). In addition to aiding pathologists in manual disease grading (9, 44–47), these identifications are also critical for subsequent automated feature analysis algorithms. Several of these detection algorithms have employed well-established computer vision and image processing approaches with heuristic rules such as active contours (48), level sets (49), and active shape models (50). Deep learning approaches, which involve the use of multiple layers of convolutional neural networks for unsupervised learning of features in order to identify the object of interest from a set of training images, led their creators to win recent grand challenges in DP image analysis (51). Figure 1 shows results of a deep learning model used to distinguish areas of invasive cancer from benign stromal areas in breast cancer H&E images (52).
Quantitative histomorphometry (QH) involves computerized image analysis tools for quantitatively assessing cancer tissue and non–cancer tissue morphology and architecture. QH measurements can be divided broadly into three groups: architectural, shape, and texture based.
Architectural features capture the arrangement and spatial topology of histologic primitives such as individual nuclei, tubules, mitoses, and lymphocytes. The spatial location of a particular primitive is considered to be a node in a graph. The nodes are then connected using graph construction algorithms [e.g., Voronoi (53), Delaunay (16), minimum spanning tree (54)]. Measurements on the graph (e.g., internode distance, clustering coefficient of the nodes) then quantitatively characterize the graph and, hence, the image. Analysis of the global spatial architecture and orientation of all nuclei and glands on DP images has recently been expanded to encompass the local architecture of the primitives through the use of algorithms such as cell cluster graphs (55, 56), which successfully predicted progression in p16+ oropharyngeal tumors (57). Figure 2 shows an example of global and cell cluster graphs from a digitized image of a radical prostatectomy specimen.
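To illustrate how a graph built on nuclear centroids yields architectural features, the sketch below constructs a minimum spanning tree with Prim's algorithm and summarizes its edge lengths. The centroid coordinates and the choice of summary statistics are assumptions made for illustration; the cited studies may use different graph constructions and feature sets.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) edges connecting all points.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection to the tree
    best_from = [-1] * n         # tree node providing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Relax distances from the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def edge_length_stats(edges):
    """Summarize internode distances along the tree (expects at least one edge)."""
    lengths = [length for _, _, length in edges]
    mean = sum(lengths) / len(lengths)
    var = sum((x - mean) ** 2 for x in lengths) / len(lengths)
    return {"mean": mean, "std": math.sqrt(var), "max": max(lengths)}


# Hypothetical nuclear centroids in pixel coordinates.
centroids = [(0, 0), (1, 0), (2, 0), (2, 3)]
tree = minimum_spanning_tree(centroids)
features = edge_length_stats(tree)
```

The resulting statistics form one small feature vector per image region; Voronoi or Delaunay constructions would yield analogous measurements such as polygon area or triangle side length.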
The shape of individual histologic primitives can indicate the presence of disease. Shape features such as fractal dimension (59), angularity, size, and smoothness of the boundary (47) differ between nuclei and glands in high and low grades of prostate and breast cancers. Finally, disorder (or entropy) in the orientation of nuclei and glands in prostate tissue predicts biochemical recurrence (BCR) postsurgery in patients with prostate cancer (58, 60).
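One simple way to quantify disorder in orientation is the Shannon entropy of binned orientation angles. The sketch below is a deliberately simplified stand-in for the orientation-entropy features in the cited work, which this review does not specify in detail; the 30-degree bin width and the example angles are assumptions.

```python
import math
from collections import Counter


def orientation_entropy(angles_deg, bin_width=30):
    """Shannon entropy (in bits) of orientations binned over [0, 180) degrees.

    Orientations are axial, so an angle and the same angle plus 180 are equivalent.
    """
    bins = Counter(int((a % 180) // bin_width) for a in angles_deg)
    total = sum(bins.values())
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


# Made-up gland orientations: nearly aligned versus widely scattered.
aligned = [10, 12, 15, 11]
scattered = [0, 45, 90, 135]
low_disorder = orientation_entropy(aligned)
high_disorder = orientation_entropy(scattered)
```

Higher values indicate that orientations are spread across many directions, which is the sense in which entropy captures architectural disorder.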
Texture refers to quantitative measures of spatial neighborhood interactions between pixel intensities within local neighborhoods in an image. These could include first-order spatial intensity interactions (e.g., mean, standard deviation, median, variance) within local neighborhoods and second-order interactions (e.g., co-occurrence features). More complex textural features can also be extracted; these include steerable and multiscale gradient features via mathematical operators such as Gabor filters (61), local binary patterns (62), and Laws filters (63). The shape and texture of nuclei within the stroma are significantly correlated with disease recurrence and patient outcome in breast (64), prostate (65), and oropharyngeal cancers (55). Figure 3 shows the digital stain representation of a routine H&E image, with overlays of nuclear architecture networks and capture of stromal and epithelial textural variations.
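The distinction between first-order and second-order texture statistics can be made concrete with a small example. The sketch below computes the mean and variance of pixel intensities and a contrast feature from a gray-level co-occurrence matrix; the tiny images and the single horizontal offset are illustrative assumptions.

```python
def first_order(image):
    """Mean and variance of pixel intensities (no spatial information)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, var


def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized counts of gray-level pairs (image[r][c], image[r+dy][c+dx])."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def contrast(glcm):
    """Second-order feature: expected squared gray-level difference between neighbors."""
    return sum(p * (i - j) ** 2 for i, row in enumerate(glcm) for j, p in enumerate(row))


# Two 2x2 toy images quantized to two gray levels.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
flat_contrast = contrast(cooccurrence_matrix(flat, levels=2))
checker_contrast = contrast(cooccurrence_matrix(checker, levels=2))
```

The flat image has no neighbor-to-neighbor variation, whereas the checkerboard alternates at every step, so the co-occurrence contrast separates them through spatial arrangement rather than through intensity statistics alone.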
A typical analysis pipeline involves a machine learning classifier that takes as input a series of manually or computer-extracted features and employs those features to render a prediction. In the context of DP, predictions might involve a low-level recognition (e.g., Is the primitive a nucleus or not?), a diagnostic decision (e.g., Is the tissue region of interest cancerous or not?), or a prognostication (e.g., Will the patient have early or distant disease recurrence?). Machine learning classifiers can be broadly categorized into supervised and unsupervised approaches. Whereas unsupervised learning approaches try to identify natural groupings directly from the feature space, supervised approaches rely on labeled training exemplars in order to learn the best classifier that optimally distinguishes instances from the classes of interest. However, training supervised classifiers on WSI data is challenging because of the data density and the resulting computational complexity. To address these issues in processing WSI data, one can use multiresolution classifiers that initially perform predictions at lower resolutions and map results to the next level of image resolution. A new, higher-resolution classifier is then employed at the next level in the image pyramid and is more efficient because the interrogation is limited to areas identified as suspicious at the lower resolution. This approach, modeled on the way pathologists tend to review slides under a microscope, is employed for both diagnosis (66) and grading (45).
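The coarse-to-fine strategy can be sketched as follows. The pyramid is assumed to be a list of 2D grids of per-tile scores in which each level doubles the resolution of the previous one along both axes, and the two threshold rules stand in for trained classifiers; both are hypothetical simplifications for illustration.

```python
def coarse_to_fine(pyramid, classifiers):
    """Evaluate tiles level by level, descending only into flagged tiles.

    pyramid[0] is the coarsest grid; classifiers[k](value) returns True
    when a tile at level k looks suspicious.
    Returns (flagged tiles at the finest level, number of tiles evaluated).
    """
    if not pyramid or len(pyramid) != len(classifiers):
        raise ValueError("need one classifier per pyramid level")
    rows, cols = len(pyramid[0]), len(pyramid[0][0])
    candidates = [(r, c) for r in range(rows) for c in range(cols)]
    evaluated = 0
    for level, (grid, is_suspicious) in enumerate(zip(pyramid, classifiers)):
        flagged = [(r, c) for r, c in candidates if is_suspicious(grid[r][c])]
        evaluated += len(candidates)
        if level < len(pyramid) - 1:
            # Each flagged tile maps to its four children at the next level.
            candidates = [(2 * r + dr, 2 * c + dc)
                          for r, c in flagged for dr in (0, 1) for dc in (0, 1)]
    return flagged, evaluated


def coarse_rule(score):
    """Hypothetical low-resolution classifier: permissive threshold."""
    return score > 0.5


def fine_rule(score):
    """Hypothetical high-resolution classifier: stricter threshold."""
    return score > 0.75


# Made-up tile scores: a 1x2 coarse level and a 2x4 fine level.
pyramid = [
    [[0.9, 0.1]],
    [[0.8, 0.95, 0.0, 0.1],
     [0.2, 0.7, 0.1, 0.0]],
]
suspicious, n_evaluated = coarse_to_fine(pyramid, [coarse_rule, fine_rule])
```

In this toy case six tiles are evaluated instead of the eight that an exhaustive pass over the fine level would require; on a real whole-slide image with many pyramid levels the savings grow with the fraction of tissue that is ruled out early.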
Both supervised and unsupervised classification approaches tend to be susceptible to the curse of dimensionality (67), an issue when the size of the feature space is significantly larger than the number of data samples. To address this issue, especially for multispectral images, investigators are increasingly using data dimensionality reduction methods. These methods can be broadly categorized into linear and nonlinear methods (68) on the basis of the approach used to map the higher-dimensional feature space onto a set of eigenvectors. Supervised and unsupervised classifiers can then be constructed in the lower-dimensional eigenvector space instead of the original high-dimensional feature vector space. A problem posed by dimensionality reduction methods is that they tend to obfuscate the original feature space, leading to a lack of transparency in the resultant classifiers. A method called FINE (feature importance in nonlinear methods) (69) has recently been proposed to reverse map eigenvector representations to the original set of image features.
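As an example of a linear dimensionality reduction method, the sketch below performs a one-component principal component analysis by power iteration on the sample covariance matrix and projects each sample onto the resulting eigenvector. It is a simplified illustration under stated assumptions: only the leading component is extracted, and the uniform starting vector would fail in the special case where it is exactly orthogonal to that component.

```python
import math


def first_principal_component(samples, iterations=200):
    """Leading eigenvector of the covariance matrix and the projected scores."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - means[j] for j in range(d)] for s in samples]
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0 / math.sqrt(d)] * d  # assumption: not orthogonal to the leading axis
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # degenerate data or an unlucky starting vector
        vec = [x / norm for x in nxt]
    # One coordinate per sample in the reduced, one-dimensional space.
    scores = [sum(row[j] * vec[j] for j in range(d)) for row in centered]
    return vec, scores


# Made-up two-feature measurements that vary along a single direction.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(samples)
```

The entries of the eigenvector show how strongly each original feature contributes to the reduced coordinate; recovering feature importance from such mappings becomes much harder for nonlinear methods, where no such direct loading exists.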
Applications of image analysis for DP include not only computer-assisted diagnosis of disease but also telepathology for remote consulting, education, and precision medicine. In this section, we describe some key problems in health care and use selected examples to show how DP solutions are emerging to address them.
There is a broad need for predictive and prognostic assays to distinguish aggressive from less-aggressive phenotypes of cancer so as to identify optimal therapies in individual patients and guide clinical trials (70). Most prognostic tests in the United States and Europe are based on gene expression assays. Recent studies have shown extensive genetic heterogeneity among cancer cells between tumors and even within the same tumor. In fact, molecular signatures for both good and bad prognoses can be found in the same tumor. Interestingly, tumor morphology observed on standard H&E staining is still remarkably useful for tumor characterization. In reality, tumor morphology reflects the sum of all temporal genetic and epigenetic changes and alterations in tumor cells, thereby providing considerable utility for predicting tumor biology, clinical behavior, and treatment response. Consequently, there has been recent interest in identifying computer-assisted image features from routine H&E images that can help predict disease progression (71) and disease recurrence (58, 64). The successful validation of these H&E-based tests could have a significant impact in low- and middle-income countries that do not have access to the more expensive, molecular-based prognostic tests and assays. Figure 4 shows an example of the use of image informatics for outcome prediction in oropharyngeal cancers. Specifically, Lewis and colleagues (71) showed how the combination of nuclear graphs within both the stromal and epithelial compartments can predict progression in p16+ oropharyngeal cancers.
Telepathology (5, 72, 73)—the remote analysis of biopsy samples using digital imaging technology—has been steadily increasing in popularity, not only in the United States (74) and Europe but also in low- and middle-income countries, for two reasons. First, the technology sector and Internet infrastructure are rapidly growing. Second, WSI scanners are becoming increasingly available, enabling rapid digitization in routine pathology. Telepathology can be highly effective. For example, a recent two-center study showed that a clinically important diagnosis was achieved via telepathology in 93% of 100 patient cases (75). Most cases (79%) were analyzed within 3 days, demonstrating the strong potential of telepathology for servicing low-resource areas.
Digitized slides can be shared and accessed from any location with an Internet connection (76, 77), which can prove beneficial for medical practice as well as education. For example, Ventana Medical Systems, Inc. has developed PathXchange (78), an educational image sharing system that has a global professional community and case library for networking, telepathology, sharing, and archival. A recent study (79) showed that second- and third-year medical students easily adapted to virtual microscopy, found it user friendly, and thought that the opportunity to view slides remotely was a huge advantage. The study also found that virtual microscopy increased collaboration and interactions among students. Recently, researchers introduced an Internet-based training tool called Score the Core, which employs tissue microarrays to train pathologists to visually score estrogen receptor, progesterone receptor (percent positive), and Ki-67 (percent positive) (80). Other investigators developed a software infrastructure for tracking viewed regions in WSI and, using the regions tracked during a practical exam in oral pathology (81), analyzed the collected data to discover students' viewing behavior.
Because the acquisition of IHC microscopic images is similar to the use of current instrumentation in pathology, we focus on the state of the art of spectroscopic imaging, or chemical imaging (CI). IR imaging (82) is one of the major modalities that promises excellent image contrast, fast data recording, and exceptional molecular sensitivity; therefore, we consider it an example of the progress and potential of using CI for DP.
Given that vibrational frequencies within molecules directly resonate with optical frequencies in the mid-IR spectral region, light absorption provides a quantitative molecular fingerprint of the material, providing ample molecular biomarkers. No dyes or stains are needed to visualize molecular content, so data can be recorded from a variety of samples without prior knowledge of the type or composition of the sample (Figure 5a–e). Thereafter, informatics techniques are used to extract the desired information or to discover new information (83).
In state-of-the-art instrumentation, an interferometer is combined with a microscope to yield both spectral multiplexing and spatial multichannel advantages. An image is produced with a full spectrum acquired for every pixel. In this Fourier transform (FT) approach, entire spectra (typically, 1,024–2,048 points for 4–8 cm⁻¹ resolution and 2ⁿ points for fast FT) from many pixels (typically, 16–16,384) are rapidly and simultaneously acquired (84). The small format of these detectors remains a major impediment to data recording speed and will likely be improved in order to enable WSI.
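Fast FT algorithms are most commonly applied to records whose length is a power of two, which is why recorded interferograms are often zero-padded before transformation. The short sketch below shows that padding step only; the function names and the example record length are assumptions, and no claim is made about how any particular instrument handles it.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    if n < 1:
        raise ValueError("length must be positive")
    return 1 << (n - 1).bit_length()


def zero_pad(interferogram):
    """Append zeros so the record length becomes a power of two."""
    target = next_power_of_two(len(interferogram))
    return list(interferogram) + [0.0] * (target - len(interferogram))


# A hypothetical 1,500-point interferogram is padded to 2,048 points.
padded = zero_pad([1.0] * 1500)
```

Padding changes the spacing of the computed spectral points but does not add spectral information; the resolution remains set by the optical path difference that was actually scanned.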
Interestingly, classification methods have shown (85) that collection of entire spectra is unnecessary. This idea drove the development of discrete frequency IR (DFIR) spectroscopy and imaging (82). DFIR imaging was initially achieved with the use of inexpensive micro-fabricated filters (86, 87), but the throughput was low; advances in tunable lasers can considerably enhance the DFIR approach and make it practical for use in mDP. In addition to hardware, another exciting development involves increasing the quality and speed of data from CI.
Akin to optical microscopes, the vast majority of experiments in IR imaging have generally relied on transmission sampling measurements. In attenuated total reflection imaging, the evanescent electric field generated from a solid immersion lens (SIL) probes a limited sample volume that is in contact with the lens. First, this technique can allow imaging of fresh-frozen samples, potentially enabling intraoperative mDP by reducing water absorption (88). Second, subwavelength SIL imaging provides ~1.25-μm spatial resolution (of the surface layer) with no need for sectioning, fixation, or staining. Numerous biomedical applications have been reported (89, 90); however, the unique optical configuration, loss of throughput, and need for contact complicate the experiment. Investigators have also used a subwavelength tip probe (91) for nanoscale measurements. Use of nanoscale imaging systems (92) in research to identify molecular changes is likely imminent, but clinical applications may not be practical in the near future.
Guided by new instrumentation (93) and theoretical developments (94), the information content of IR images has been shown to be much higher than previously thought, resulting in high-definition (HD) IR imaging. Figure 6 depicts an example of the improvement in image quality using HD optics compared with the conventional design. This approach, either with or without computational extensions (95), implies that, despite the longer wavelengths in the IR region, image quality similar to low-power optical microscopy can be obtained. HD instrumentation, with rationally designed optics (96), also poses challenges (e.g., lower throughput, need for greater speed), which can be addressed with signal processing approaches (97) and/or higher throughput, new sources (98, 99) such as quantum cascade lasers (QCLs) (100–102), and DFIR imaging. Recent studies on HD tissue imaging (103, 104), which has subcellular sensitivity, present an additional challenge to informatics methods. Broadband QCL microscopy was only recently shown to be faster than FT-IR imaging (25). Although initial attempts to use the new laser technology (105) did not provide data of sufficiently high quality, new commercial instruments (106) and laboratory prototypes have demonstrated great potential (107). In particular, through the coupling of spectral variable selection and imaging, a three-orders-of-magnitude-shorter time was recently achieved in comparison to the fastest available HD FT-IR imaging system. We anticipate that the new technology will allow imaging of biopsy samples in only a few minutes.
The straightforward hardware coupling of an interferometer with a microscope and array detector in FT-IR imaging incorrectly signaled that the theoretical models used to understand recorded data were also straightforward. Investigators now realize that a sample's geometry, morphology, microenvironment, and imaging optics also affect the spectrum (96, 108). Although one way to find features that are relatively insensitive to these effects is to use ANOVA (analysis of variance) (109) or bar coding (110), another is to use models (111, 112) of scattering prior to employing informatics approaches. Accurate modeling of the image formation process (113) to recover structural, optical, and morphologic properties of the sample requires more research. Complementary to informatics experiments, many studies seek to understand the biological origins of the signal itself. The important transformations and factors relevant to pathology that have been studied (114–117) include the cell cycle, sample preparation, and the response of biological systems to various stimuli. Although it is impossible to review the large body of research here, fundamental studies remain necessary to provide input for theory-based reconstruction efforts as well as to correlate pathologic identities and optical readouts.
Because CI requires no prior sample knowledge, whole slides can typically be imaged. Pixel-wise computer algorithms typically identify cell types or the presence of disease, as shown on the left side of Figure 3 and described in, for example, References 118 and 119. Whereas informatics methods mirror those for DP, the inputs are spectral and the domain of operation is spectral and not usually spatial. In two schools of thought, either specific spectral features or full spectra (raw or derived quantities such as principal components) serve as input. The advantages of both discrete data points and continuous spectra can be exploited through the use of sparse sampling methods (120). More commonly, dimensionality reduction and classification are combined to provide a classifier that is robust and fast. To this end, a recent trend has been to use optimized hardware–software approaches for real-time processing (121).
Because many frequencies are imaged, data sets are considerably larger for spectroscopic imaging. As a result, many more features become available for classification and data sizes are often more extensive than in optical microscopy. The information obtained is also multifaceted (molecular, morphologic, and diagnostic), presenting unique challenges that go beyond those for DP. Below is a brief review of informatics and computational approaches to the quantification of single and multiplexed biomarkers, as well as to spectroscopic imaging and stainless staining modalities.
IHC methods are frequently used to assess biomarkers in order to aid diagnoses. Because IHC staining is variable and nonlinear and the manual interpretation is subjective, several approaches have been developed for automated biomarker quantification (see Table 1 for a list of examples). Most studies use supervised learning to identify pixels stained for the specific biomarker, typically through the use of a low-level color or texture feature.
Apart from simply quantifying the expression of the individual biomarkers, understanding the spatial interaction between the differently expressing biomarkers (122) has attracted researchers' interest; an example is the three epitopes (123) involved in human tonsil pathology. Through quantitation of the fluorescence intensity for each marker, a colocalization pattern demonstrated that 12% of the total CD34 was colocalized with CK18, whereas only 1% of the CK18 was colocalized with CD34. Similarly, software in conjunction with a multispectral imaging system has been used to analyze CK18, α-methylacyl-CoA racemase, and androgen receptor expression in prostate cancer (124). A multiplexed biomarker imaging approach (8) used to study the tissue systems of Barrett's esophagus showed 14 epithelial and stromal biomarkers, providing significant differences between high-grade dysplasia and reactive atypia.
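The asymmetry in the colocalization figures (a given share of one marker overlapping a second marker need not equal the reverse) follows from how an intensity-weighted overlap fraction is defined. The sketch below uses a Manders-style coefficient as an assumed measure; the cited study's exact method is not described here, and the pixel intensities and thresholds are made up for illustration.

```python
def colocalized_fraction(marker_a, marker_b, threshold_b):
    """Fraction of total marker-A intensity located in pixels where marker B is present.

    A pixel counts as containing marker B when its B intensity exceeds threshold_b.
    """
    total = sum(marker_a)
    if total == 0:
        return 0.0
    overlap = sum(a for a, b in zip(marker_a, marker_b) if b > threshold_b)
    return overlap / total


# Made-up per-pixel fluorescence intensities for two channels.
cd34 = [10, 0, 30, 60]
ck18 = [0, 200, 50, 0]
cd34_in_ck18 = colocalized_fraction(cd34, ck18, threshold_b=10)
ck18_in_cd34 = colocalized_fraction(ck18, cd34, threshold_b=10)
```

Because each fraction is normalized by the total intensity of a different marker, the two values differ whenever one marker is much more abundant or more widely distributed than the other.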
Although mass spectrometry and Raman imaging show great promise, the discussion in this review is limited to IR imaging. An overwhelming majority of studies utilize standard statistical pattern recognition techniques, including linear discriminant analysis, Bayesian methods, neural networks, and random forests, to find cell types or signatures of disease (reviewed in Reference 125). Investigators are now focusing on patient issues and outcomes (126). Notably, recent studies show that it is possible to classify tissue into carcinoma and benign at human-competitive accuracy (127), relate the cancer detected to the most probable grade (128), and prognosticate (129).
Initial studies typically involved hundreds of spectra, a few cell types, and tens of samples. A paradigm change has occurred with the common use of tissue microarrays (TMAs) in high-throughput sampling, fast imaging, and robust informatics to provide statistically validated (130) protocols (131, 132). Whereas early techniques focused on reproducing histopathology, newer approaches are exploring data in finer detail in terms of progression in time (133), involvement of the spatial microenvironment (134), secretions (135), variants of tumors (136), and engineered model systems (137). Both supervised and unsupervised methods can be used for these purposes.
Although staining is used for routine pathology (138), molecular techniques often require multiple markers for disease stratification (139) to be imaged on the same platform. The label-free methods described above can be useful in this regard, but because of their color-coded images they do not easily translate to these established practices in pathology. In a new approach (140), CI data were used to generate computationally stained images resembling molecular stains that are common in mDP. Because the tissue is not actually stained and the histopathologic information is algorithmically obtained, this approach has been termed stainless staining. Figure 5f shows a metrics approach using a neural network model to relate the biochemical input to molecular or dye parameters in order to evaluate various stains. Eliminating the need to stain can enable faster mDP in time-limited (intraoperative) settings, allow precious or limited samples to be imaged without perturbation, and permit exploratory mDP for many epitopes.
Tumors with similar morphologic phenotypes may have significantly different behaviors and outcomes (141, 142). There is a general consensus that the intelligent combination of multiple, independent sources of clinical, molecular, and pathological data (both visual and subvisual) can provide more predictive power (143–150), but combining these diverse channels of information is challenging because the dimensionalities of the individual features differ. Multikernel learning (MKL) and canonical correlation analysis (CCA) (151, 152) enable combinations of these attributes to create a unified predictor. MKL strategies (153) allow one to mathematically project an individual set of measurements into new eigenvector representations (which have a lower dimensionality than the original features), which can then be combined into a fused representation (150). CCA strategies (151, 152) aim to find a common representation for multimodal data in which class separation is maximized while noise is minimized. Supervised classifiers can then be constructed in the MKL and CCA reduced spaces. For example, in glioblastoma multiforme (GBM) (154), a multidimensional representation of the computer-extracted nuclear shape and architecture features from whole-mount tissue sections enabled identification of subtypes on the basis of morphometric indices and the predictive capability of each subtype, and the molecular correlates of the subtypes were consistent with previous findings.
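One reason kernel-based fusion copes with feature sets of different dimensionality is that each modality is first converted into a patient-by-patient similarity matrix, whose size depends only on the number of patients. The sketch below combines two such matrices with fixed weights. This is a simplification: in MKL proper the weights are learned, and the feature values, weights, and function names here are assumptions for illustration.

```python
import math


def linear_kernel(features):
    """Patient-by-patient similarity matrix from inner products of feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]


def normalize_kernel(kernel):
    """Cosine normalization so that every patient has unit self-similarity."""
    n = len(kernel)
    return [[kernel[i][j] / math.sqrt(kernel[i][i] * kernel[j][j])
             if kernel[i][i] > 0 and kernel[j][j] > 0 else 0.0
             for j in range(n)] for i in range(n)]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices (weights fixed here, not learned)."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n)]
            for i in range(n)]


# Two hypothetical patients: three morphometric features and one molecular feature each.
morphometric = [(1.0, 0.0, 2.0), (0.5, 1.0, 0.0)]
molecular = [(3.0,), (1.0,)]
fused = combine_kernels(
    [normalize_kernel(linear_kernel(morphometric)),
     normalize_kernel(linear_kernel(molecular))],
    weights=[0.5, 0.5],
)
```

The fused matrix can then be passed to any kernel-based classifier or decomposed into eigenvectors to obtain a low-dimensional fused representation.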
The use of CI data for multimodal DP (mmDP) is less well established but is becoming more common (Figure 7). For example, cell type information (IR histology) has been combined with morphologic variables from H&E-stained images (41). The morphological features thus extracted, optimized by a two-stage feature selection method using a minimum-redundancy maximal-relevance criterion and sequential floating forward selection, were used to classify tissue samples as either cancer or noncancer. This study achieved high accuracy [area under the receiver operating characteristic (ROC) curve (AUC) > 0.97] in cross-validation trials on each of two data sets that were stained under different conditions. In the absence of IR data, the performance of the same classification system decreased both within data sets and between data sets. Only through a combination of IR and optical data could high-accuracy classification be achieved. Other investigators have combined IR with mass spectrometric data and overlain IHC (155) and IR data (156).
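Because classifier performance in these studies is reported as AUC, a brief worked example may help. The AUC equals the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative sample, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores and labels are made up and are not data from the cited studies.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels use 1 for the positive class (e.g., cancer) and 0 for the negative class.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four tissue samples.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes. The pairwise form is quadratic in the number of samples; rank-based formulations give the same value more efficiently for large data sets.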
Recently, there has been interest in using DP informatics to predict disease recurrence. Research (157) has shown that the combination of nuclear architecture features from H&E- and Feulgen-stained images enabled prediction of BCR in prostate cancer patients; the combination of Feulgen and H&E resulted in better prognosis prediction than did H&E- or Feulgen-derived measurements alone. The results of this study are shown in Figure 8. The 10 top-ranked computer-extracted features from among nuclear shape, nuclear architecture (described via the Voronoi and Delaunay tessellations and cell graphs), orientation, and texture for each stain type were identified via feature selection and used to train a random forest classifier to predict BCR. The features derived from Feulgen, H&E, and both Feulgen and H&E measurements resulted in statistically significant separation of the predicted BCR and nonrecurrence groups; the combination yielded the best discrimination between BCR and nonrecurrence populations (Figure 8).
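The kind of nuclear architecture descriptor mentioned above can be illustrated with a simple cell graph. The sketch below is an assumption-laden stand-in rather than the method of the cited study: instead of a Voronoi or Delaunay construction, it links any two nuclear centroids that lie within a distance threshold and summarizes the resulting graph. The function name `cell_graph_features`, the threshold, and the coordinates are hypothetical.

```python
import math
from itertools import combinations

def cell_graph_features(centroids, max_dist):
    """Summarize a distance-threshold graph built on nuclear centroids."""
    n = len(centroids)
    degree = [0] * n
    lengths = []
    for i, j in combinations(range(n), 2):
        d = math.dist(centroids[i], centroids[j])
        if d <= max_dist:          # connect nuclei that are close together
            degree[i] += 1
            degree[j] += 1
            lengths.append(d)
    return {
        "n_edges": len(lengths),
        "mean_degree": sum(degree) / n if n else 0.0,
        "mean_edge_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "isolated_fraction": sum(1 for k in degree if k == 0) / n if n else 0.0,
    }

# Hypothetical centroids: a tight cluster of four nuclei and one distant nucleus.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
features = cell_graph_features(centroids, max_dist=1.5)
print(features)
```

Feature vectors of this kind, computed per image or per region, could then be ranked by a feature selection step and passed to a classifier such as a random forest.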
In a recent study (152), machine learning and data fusion methods (i.e., CCA) were used to computationally combine computer-extracted histomorphometric features and protein expression features (via mass spectrometry) from prostate tissue specimens. The fused predictor predicted the risk of BCR within 5 years of surgery with significantly greater accuracy (AUC = 0.92) than morphometric or proteomic features alone.
The use of CI demonstrated both epithelial and stromal changes associated with cancer recurrence without the use of dyes or destructive analysis. Furthermore, because no stains were required, the stromal changes could be probed on a single sample, providing an effective means of adding microenvironment data. In a midgrade dominant prostate cancer patient cohort for whom existing prognostic tools were ineffective, the CI approach outperformed two widely used tools, the Kattan nomogram and the CAPRA-S (Cancer of the Prostate Risk Assessment, postsurgical) score, while providing a histologic explanation for the prediction (Figure 9). An interesting aspect of this study was the use of a frequent pattern mining approach, which enabled similar groups of cells in both the tumor and its microenvironment to be probed simultaneously. The IR spectral information was found to be independent of currently used variables in nomograms in a logistic regression model, providing a new avenue for investigation both of other aspects of the disease, such as therapy response, and of other tumors. Currently, it is not possible to make these numerous stromal observations in a laboratory setting practical for a clinical test. The challenges of measuring multiple cell types via multiple cumbersome steps and of manually combining the data from multiple sections and with clinical management have limited the potential of using the microenvironment. mmDP addresses this need directly.
Technological advances and innovations, in both instrumentation and informatics tools, are driving several exciting new directions in DP. We have summarized the state of the art with selected examples that introduce the reader to this emerging area. In particular, we note that the field of pathology is moving toward more informative multimodal tissue imaging, wherein integration of computer-derived morphologic and functional tissue-based measurements is becoming a central theme. In addition to conventional decision making, these transitions in DP will lead to a greater impact in the areas of telepathology, education, and precision medicine. The present challenges extend beyond the need for progress in hardware or software to encompass engineering pipelines for robust decision making and improving patient outcomes. This progress will require the integration of engineering and pathology at an unprecedented scale.
DP has not yet been approved for primary diagnosis in the United States. The slow pace of regulatory approval is changing (158), but further impetus is needed to drive clinical adoption (159). Pathology has been practiced with a traditional microscope for more than 100 years; understandably, mass adoption is unlikely to happen spontaneously. Lessons from radiology are pertinent: Following the introduction of digital mammography, radiologists soon gave up looking at mammographic films and made the transition to digital techniques. DP involves additional barriers, including the fact that, unlike radiology, it is not implicitly digital (the tissue slide still needs to be constructed prior to digitization), but the feature sets (both morphologic and molecular) are much richer. The use of advanced computational image analytic tools will offer substantial help to pathologists in efficiently navigating and interpreting the digitized imagery. This issue will become more important as multiplexed tissue biomarkers and multimodal tissue imaging become more routine, requiring analytic and informatics methods to scale up to the impending digital data deluge (160). To be accessible to a practicing pathologist, these tools must be able to run on a stand-alone desktop computer. Clearly, in order for DP to be broadly adopted in the clinic, it will be critical to optimize these methods.
The rapid pace of progress in mDP and multimodal DP has been driven by advances in instrumentation, but emerging measurement techniques now need to be standardized (161) and broadly validated. These two advances can offer a wealth of new, independent information that not only allows clinicians to make more effective clinical decisions but also helps them discover underlying biological changes. Such data can also inspire research and aid in the formulation of new hypotheses. In addition to conventional clinical diagnoses, the additional information obtained with mDP and mmDP may enable more molecular-specific diagnoses and rapid intraoperative assessment (162).
Recent progress has led to a transformation in the practice of pathology, one that may rival the introduction of the microscope by Antonie van Leeuwenhoek in the 1600s. The impact and spread of this revolution, however, will hinge on mass adoption following regulatory approvals, and the field represents a rich area of investigation for biomedical engineers and data scientists.
Research reported by R.B. in this review is currently supported by the National Institutes of Health (NIH) (grants R01CA197516, R01GM117594, U01MH109062, R21CA190120, and R01EB009745). Research reported by A.M. in this review was supported by the National Cancer Institute of the NIH (awards 1U24CA199374-01, R21CA179327-01, and R21CA195152-01); the National Institute of Diabetes and Digestive and Kidney Diseases (award R01DK098503-02); a US Department of Defense (DOD) Prostate Cancer Synergistic Idea Development Award (PC120857); a DOD Lung Cancer Idea Development New Investigator Award (LC130463); a DOD Prostate Cancer Idea Development Award; a Case Comprehensive Cancer Center Pilot Grant; and a VeloSano Grant from the Cleveland Clinic and the Wallace H. Coulter Foundation Program in the Department of Biomedical Engineering at Case Western Reserve University. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Disclosure Statement: A.M. is a scientific advisory board member/consultant and has equity in Inspirata Inc. A.M. also has equity in Elucid Bioimaging and is working with PathCore Inc. on National Institutes of Health grant 1U24CA199374-01.