We used reflectance and fluorescence spectroscopy to noninvasively and quantitatively distinguish benign from dysplastic/malignant oral lesions. We designed diagnostic algorithms to account for differences in the spectral properties among anatomic sites (gingiva, buccal mucosa, etc).
In vivo reflectance and fluorescence spectra were collected from 71 patients with oral lesions. The tissue was then biopsied and the specimen evaluated by histopathology. Quantitative parameters related to tissue morphology and biochemistry were extracted from the spectra. Diagnostic algorithms specific for combinations of sites with similar spectral properties were developed.
Discrimination of benign from dysplastic/malignant lesions was most successful when algorithms were designed for individual sites (area under the receiver operator characteristic curve [ROC-AUC], 0.75 for the lateral surface of the tongue) and was least accurate when all sites were combined (ROC-AUC, 0.60). The combination of sites with similar spectral properties (floor of mouth and lateral surface of the tongue) yielded an ROC-AUC of 0.71.
Accurate spectroscopic detection of oral disease must account for spectral variations among anatomic sites. Anatomy-based algorithms for single sites or combinations of sites demonstrated good diagnostic performance in distinguishing benign lesions from dysplastic/malignant lesions and consistently performed better than algorithms developed for all sites combined.
Currently, definitive detection and diagnosis of oral cancer requires biopsy followed by histopathologic assessment of the excised tissue.1 However, there are several shortcomings to this scheme. First, only a limited number of biopsy specimens can be taken because of the invasiveness of the procedure. On the basis of his or her experience, the physician selects the area of the lesion most likely to show significant disease as the biopsy site. The absence or presence of disease in this specimen is assumed to be representative of the extent of disease in the suspicious lesion as a whole, and this finding often determines whether treatment is indicated.2 Given the subjective nature of this process, regions of disease can be missed. Significant underdiagnosis has been noted with biopsy, particularly when the lesion is nonhomogeneous or only a single biopsy specimen is obtained.2,3 Second, the accuracy of pathological classification is limited by significant interobserver and intraobserver variability, largely due to the qualitative nature of the markers used for assessment.4–6
Spectroscopy may provide an objective and noninvasive tool for disease diagnosis. Promising findings have been reported for identifying oral lesions by using reflectance and fluorescence spectroscopy.7–13 The majority of these studies have focused on distinguishing healthy mucosa from visible lesions (either a group consisting of dysplastic and malignant lesions [“dysplastic/malignant”] or malignant lesions).7,8,12–19 However, this separation has no clinical relevance, because these categories are easily distinguished by visual inspection. In some cases, benign lesions are grouped with healthy mucosa samples.9,20 Most studies combine data from several anatomic sites (eg, gingiva, buccal mucosa) in each category in generating diagnostic algorithms based on differences in the spectral properties (spectral contrast).7,10,11,13,14,18,21 This approach may not be ideal, because numerous studies have shown significant differences in the spectral properties between various anatomic sites, even for healthy oral mucosa.7,15,22–24 We refer to the spectral contrast produced by variations in histologic characteristics from site to site as anatomic spectral contrast. The presence of keratin on some sites (particularly the gingiva and hard palate) produces marked anatomic spectral contrast between these sites and nonkeratinized sites.9,24 A recent study of clinically normal mucosa in healthy volunteers (HVs) by our group demonstrated considerable anatomic spectral contrast even among nonkeratinized sites.25 The results of our investigation suggested that spectral diagnostic algorithms must be site-specific to ensure accurate disease diagnosis. We use the term anatomy-based algorithms to refer to algorithms that meet this condition, in that they are developed from and applied to a specific site or group of sites. Because most studies combine several sites, the reported diagnostic accuracies may be confounded by anatomic spectral contrast and therefore unreliable.
The distinction of benign lesions from dysplastic/malignant lesions is not possible by visual inspection alone and has considerable clinical significance, yet few spectroscopy studies focus on this separation.21,26–30 Anatomic spectral contrast can also influence the diagnostic results for this separation if it is not taken into account. In a study by de Veld et al,28 for most analysis methods, values of less than 0.70 were obtained for the area under the receiver operator characteristic (ROC) curve (ROC-AUC) when data from 11 anatomic sites were combined. Using fluorescence spectroscopy, van Staveren et al11 combined all sites and achieved a sensitivity of 31% and a specificity of 67%. Greater success has been achieved in distinguishing lesions when the analysis was limited to a single site or a specific group of sites. Wang et al30 performed a fluorescence study of patients with lesions located only on the buccal mucosa. They obtained a sensitivity of 81% and a specificity of 96% for distinguishing benign lesions from dysplastic/malignant lesions. Mallia et al21 successfully developed diagnostic algorithms to discriminate various oral lesions by using fluorescence spectroscopy, but only after excluding lesions of the vermilion border of the lip and of the dorsal and lateral surfaces of the tongue. They suggested that a separate spectral database is needed for these 3 sites. In another study by Mallia et al,26 reflectance spectroscopy was used to develop algorithms for all sites combined and for the buccal mucosa (ie, an anatomy-based algorithm). The sensitivities and specificities obtained for algorithms specific to the buccal mucosa were consistently higher for the discrimination of benign lesions from dysplastic lesions and dysplastic lesions from malignant lesions.
Müller et al9 developed a diagnostic algorithm to distinguish benign lesions from dysplastic/malignant lesions specifically for keratinized sites, and they obtained a higher sensitivity and specificity than when keratinized and nonkeratinized sites were combined. Although diagnostic algorithms for single sites or limited groups of sites have been investigated, a systematic approach for developing spectral algorithms for multiple sites has not been developed.
The goal of the present study was to develop a strategy for designing anatomy-based algorithms for multiple anatomic sites. Although the spectral properties of different anatomic sites may vary considerably, certain subsets may be similar enough in their spectral properties such that they can be combined. Furthermore, in many settings, developing a single diagnostic algorithm for spectrally similar sites may be more practical and less time-consuming than developing a diagnostic algorithm for each individual site. Anatomy-based spectral diagnostic algorithms are developed for 2 applications: 1) distinguishing visibly healthy mucosa from visible lesions, and 2) distinguishing benign lesions from dysplastic/malignant lesions. The distinction of clinically normal (healthy) mucosa from clinically abnormal mucosa (lesions) has been the focus of most of the previous studies in the oral cavity. We address the question of whether anatomic spectral contrast affects the diagnostic accuracy of this separation. The second application is the primary objective of this work, since lesions demonstrating disease (dysplasia or malignancy) require intervention.
From the tissue spectra, we extract parameters related to the morphological and biochemical properties. To account for anatomic spectral contrast, we first study healthy mucosa from several anatomic sites to characterize the similarities and differences in their spectroscopy parameters. We then apply these findings by developing common diagnostic algorithms for combinations of sites that share a high degree of spectral similarity. To evaluate the benefit of anatomy-based algorithms, we compared the performance of algorithms developed for these specific combinations of sites to those developed for all anatomic sites combined and to those for a single anatomic site. We also examine the trends in the spectroscopy parameters on the basis of disease category.
Patients were recruited from the Department of Otolaryngology–Head and Neck Surgery at Boston Medical Center. The experimental protocol for in vivo data collection was approved by the Institutional Review Board of Boston Medical Center and the Committee on the Use of Humans as Experimental Subjects at the Massachusetts Institute of Technology. Subjects enrolled in the study included patients undergoing biopsy for clinically suspicious lesions, as well as patients undergoing surgical resection of known dysplastic or malignant lesions. Pregnant women and individuals under the age of 18 years were excluded from the study. Written informed consent was obtained from all subjects to indicate their willingness to participate. Spectral data were collected from tissues with visible abnormalities identified by the clinician. The clinical appearance of the lesions (eg, leukoplakia, erythroplakia) was recorded. After collecting the spectral data, the physician scored the exact area from which the data were acquired using the blade of a small punch biopsy instrument (1.5 or 2 mm in diameter). The tissue sample was subsequently removed using a larger punch biopsy instrument (3.5 mm diameter) and sent for histopathologic evaluation. In some cases, a scalpel biopsy was performed because of the lesion architecture or location. Spectral data were collected from multiple tissue samples in a patient when the physician selected multiple areas of the lesion for biopsy, or when several biopsies were performed of a lesion to be resected. The histopathologic assessment was performed by 3 pathologists. Specimens were classified as benign, dysplastic, malignant, or indefinite for dysplasia. The consensus diagnosis (agreement between at least 2 of the 3 pathologists) was considered the final diagnosis.
An additional set of 710 spectra previously collected from 79 HVs at 9 anatomic sites was also analyzed, as described below.25 Tissue samples from HVs were assumed to represent healthy mucosa and were not biopsied.
Reflectance and fluorescence spectra were collected from patients using the Fast Excitation Emission Matrix (FastEEM), an investigational instrument developed by our laboratory that has been previously described.31 A 1.3-mm-diameter optical fiber probe was used to deliver the excitation light and collect the light emitted from the tissue. The optical fiber probe was disinfected with Cidex OPA (Advanced Sterilization Products, Irvine, California) according to the manufacturer’s specifications, and placed in contact with the tissue during data collection. No exogenous contrast agents were applied to the tissue. The data collection procedures have been further described elsewhere.25 Data were collected and analyzed from the following anatomic sites: buccal mucosa (BM), dorsal surface of the tongue (DT), floor of the mouth (FM), gingiva (GI), hard palate (HP), lateral surface of the tongue (LT), retromolar trigone (RT), soft palate (SP), and ventral surface of the tongue (VT).
We employed physical models to extract morphological and biochemical parameters from the tissue reflectance and fluorescence spectra. Details of the models are presented in other reports.25,32,33 The inputs to the reflectance model are the reduced scattering coefficient, μs′(λ), and the absorption coefficient, μa(λ). By modeling the wavelength dependence of μs′, we extract 3 parameters: A, B, and C. The A parameter is a scaling parameter proportional to the density of scatterers. The B parameter reflects the size of the scattering particles.34 The C parameter represents the magnitude of scattering by small scatterers.
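The functional form of the scattering model is not given in this text; it is defined in the cited reports. Purely as a hedged illustration of how three parameters with the roles described for A, B, and C could enter such a model, the sketch below assumes a two-term power law, μs′(λ) = A(λ/λ0)^(−B) + C(λ/λ0)^(−4). The form itself, the reference wavelength `LAMBDA0_NM`, the function name, and the example values are all assumptions made for illustration.

```python
# Hypothetical two-term power-law model for the reduced scattering coefficient.
# ASSUMPTION: the functional form and the 700 nm reference wavelength are
# illustrative only; the cited reports define the model actually used.
LAMBDA0_NM = 700.0


def reduced_scattering(wavelength_nm, A, B, C):
    """Return mu_s'(lambda) = A*(l/l0)**(-B) + C*(l/l0)**(-4)."""
    ratio = wavelength_nm / LAMBDA0_NM
    large_scatterers = A * ratio ** (-B)  # A scales the amplitude, B sets the slope
    small_scatterers = C * ratio ** (-4)  # steep term for scatterers much smaller than lambda
    return large_scatterers + small_scatterers


if __name__ == "__main__":
    # Example values are arbitrary and carry no physical meaning.
    for wl in (400.0, 550.0, 700.0):
        print(wl, round(reduced_scattering(wl, A=1.0, B=0.5, C=0.2), 4))
```

Under this assumed form, a larger B steepens the wavelength dependence of the first term, and C controls how much the short-wavelength end is lifted by small scatterers.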
The absorption coefficient was modeled as the sum of the contributions from 2 absorbers, hemoglobin (Hb) and β-carotene (βC). Hemoglobin absorption, μaHb(λ), was modeled with use of a correction for the effect of vessel packaging, which provides the effective blood vessel radius as an additional fitting parameter.25 By modeling the absorption, we extracted the following 4 parameters: the concentration of Hb (cHb), the oxygen saturation (α), the effective vessel radius, and the concentration of βC (cβC). Hence, from the reflectance spectra we extracted 7 parameters in all: 3 scattering parameters (A, B, C) and 4 absorption parameters.
We applied a physical model to the measured fluorescence spectra in order to remove distortions introduced by scattering and absorption and obtain what we call the intrinsic fluorescence.33,35 In order to apply the findings of a study of the spectral properties of healthy mucosa by McGee et al25 to the present study, data were analyzed at the same 2 excitation wavelengths (308 nm and 340 nm). At 308-nm excitation, the spectra were fit with a linear combination of tryptophan (308 nm), Coll (308 nm), and NADH (308 nm). The number in parentheses represents the excitation wavelength, Coll refers to collagen, and NADH refers to the reduced form of nicotinamide adenine dinucleotide. At 340-nm excitation, the spectra were fit with a linear combination of NADH (340 nm), Coll401 (340 nm), and Coll427 (340 nm). Coll401 (340 nm) and Coll427 (340 nm) are attributed to 2 distinct collagen components.25 We obtained a total of 6 fluorescence parameters. Overall, a total of 13 spectroscopy parameters (7 from reflectance and 6 from fluorescence) were used to characterize the degree of spectral similarity between various sites and to construct diagnostic algorithms.
We compared the spectroscopy parameters extracted from each of the 9 sites in the HV data to identify anatomic sites that shared similar spectral properties and could be combined. The interquartile range exclusion criteria were applied to the data to remove outliers (eg, from probe slippage, tissue movement).36 For each spectroscopy parameter, a multiple comparison test was performed to compare each pair of sites and identify those that had statistically different distributions. The Tukey-Kramer correction was applied to compensate for multiple comparisons. For each pair of anatomic sites, we calculated a “similarity score,” the total number of spectroscopy parameters for which the distributions for the 2 sites were not statistically different. Perfect correspondence between the spectroscopy parameter distributions for a pair of sites would be indicated by a similarity score of 13 (total number of spectroscopy parameters). We considered pairs of sites with a similarity score of at least 10 to be spectrally similar sites.
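The outlier exclusion and the similarity score described above can be sketched as follows. This is a minimal illustration and not the authors' code: the Tukey-Kramer test itself is not implemented, and the sketch instead takes already-adjusted p-values for one pair of sites as input. The significance level `ALPHA`, the 1.5 multiplier in `iqr_filter`, and the parameter labels are assumptions; only the count of 13 parameters and the cutoff of 10 come from the text.

```python
import statistics

ALPHA = 0.05            # ASSUMPTION: the significance level is not stated in the text
SIMILARITY_CUTOFF = 10  # from the text: a score of at least 10 means "spectrally similar"

# Labels for the 13 spectroscopy parameters (the names are illustrative).
PARAMETERS = [
    "A", "B", "C", "cHb", "alpha", "vessel_radius", "cBC",
    "Trp_308", "Coll_308", "NADH_308", "NADH_340", "Coll401_340", "Coll427_340",
]


def iqr_filter(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def similarity_score(adjusted_p_values):
    """Count parameters whose distributions are NOT statistically different."""
    return sum(1 for name in PARAMETERS if adjusted_p_values[name] >= ALPHA)


def spectrally_similar(adjusted_p_values):
    """Apply the cutoff of at least 10 non-different parameters out of 13."""
    return similarity_score(adjusted_p_values) >= SIMILARITY_CUTOFF


if __name__ == "__main__":
    # Made-up p-values for one hypothetical pair of sites.
    p_values = {name: 0.40 for name in PARAMETERS}
    p_values.update({"B": 0.01, "cHb": 0.002, "Trp_308": 0.03})
    print(similarity_score(p_values), spectrally_similar(p_values))
```

With 9 sites there are 36 distinct pairs, so a full analysis would compute one such score per pair.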
The diagnostic performance of anatomy-based algorithms developed for spectrally similar sites or individual sites was compared to that of algorithms designed for all sites combined. Logistic regression was used to develop the diagnostic algorithms. The log-likelihood ratio test was used to identify the spectroscopy parameters with the greatest diagnostic potential for the distinction of healthy mucosa from lesions and of benign lesions from dysplastic/malignant lesions.37 The discriminatory power of each algorithm was evaluated by leave-one-out cross-validation and by calculating the ROC-AUC, sensitivity, and specificity. The sensitivities and specificities quoted in the subsequent analysis were selected based on the Youden index, the point on the ROC curve with the maximum vertical distance from the 45° line.38
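As a hedged illustration of the evaluation metrics named above, the sketch below computes the ROC-AUC and the Youden-index operating point from a list of diagnostic scores. It does not reproduce the logistic regression or the leave-one-out loop; the scores are assumed to be cross-validated outputs obtained elsewhere, and the example values are invented.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr, threshold) triples; a sample is positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos, thr))
    return points


def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_operating_point(scores, labels):
    """Pick the threshold maximizing J = sensitivity + specificity - 1 = TPR - FPR."""
    fpr, tpr, thr = max(roc_points(scores, labels), key=lambda pt: pt[1] - pt[0])
    return {"threshold": thr, "sensitivity": tpr, "specificity": 1.0 - fpr}


if __name__ == "__main__":
    # Invented scores: label 1 = dysplastic/malignant, label 0 = benign.
    scores = [0.1, 0.2, 0.6, 0.4, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 1, 1, 1]
    print(roc_auc(scores, labels))
    print(youden_operating_point(scores, labels))
```

Maximizing TPR minus FPR is the same as finding the ROC point farthest above the 45° line, which is how the operating point is described in the text.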
The fewest parameters necessary to achieve the maximum ROC-AUC for a comparison were retained. To prevent overtraining, we limited the maximum number of spectroscopy parameters used in the model to n/5, where n is the number of training samples in the category with the fewest samples. To further ensure the reliable assessment of the diagnostic algorithms, we only analyzed categories in which there were at least 10 samples.
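The two sample-size rules above amount to a small check, sketched here under stated assumptions: the function name and the example counts are hypothetical, and rounding n/5 down to a whole number is an assumption, since the text does not say how fractional values were handled.

```python
MIN_SAMPLES_PER_CATEGORY = 10  # from the text: categories with fewer samples are not analyzed


def max_model_parameters(category_counts):
    """Return the n/5 cap on model parameters, or None if a category is too small."""
    n_smallest = min(category_counts.values())
    if n_smallest < MIN_SAMPLES_PER_CATEGORY:
        return None  # comparison is skipped entirely
    return n_smallest // 5  # ASSUMPTION: n/5 is rounded down


if __name__ == "__main__":
    # Hypothetical sample counts, not taken from the study tables.
    print(max_model_parameters({"benign": 23, "dysplastic_malignant": 12}))
    print(max_model_parameters({"benign": 9, "dysplastic_malignant": 40}))
```

For instance, a comparison whose smaller category holds 12 samples would be limited to 2 parameters under this reading.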
A total of 71 patients were recruited for this study. The average age (±SD) of the patients was 56 ± 14 years. Of the 101 spectra collected from patients, 12 spectra from 7 patients were excluded because of an error noted during data collection, a broken probe, or the poor quality of the biopsy specimen, or because the pathology results were unavailable. Two samples, one of which was a lymphoma and another of which was classified as indefinite for dysplasia, were also excluded. Thus, a total of 87 spectra collected from 64 patients were included in the final analysis (Table 1). The abbreviations for the 9 sites are also shown in Table 1. The average age (±SD) of the 64 patients was 56 ± 15 years. Data collected from HVs on the BM, DT, FM, GI, HP, LT, RT, SP, and VT were used in the subsequent analysis. These groups consisted of 100, 124, 112, 58, 54, 112, 33, 43, and 74 spectra, respectively (not shown in Table 1).
The set of 710 spectra collected from 9 anatomic sites in HVs was used to calculate similarity scores for each pair of sites and to identify which specific pairs could be combined. There were pronounced differences in the spectral properties among the various anatomic sites (Table 2). For 75% of the comparisons, the similarity score was 7 or less. We identified sites with a high degree of similarity on the basis of a similarity score of at least 10. The following 3 pairs of sites met this criterion: 1) BM and SP, 2) FM and VT, and 3) GI and HP. We refer to pairs of sites with high similarity scores as spectrally similar sites. Just as for the dysplastic/malignant category, we use the slash notation (eg, FM/VT) in the text that follows to refer to a group that combines samples from 2 sites.
For algorithm development, it is important to have a sufficient number of samples in each category. In this study, we used the criterion of at least 10 samples per category. Table 3 lists the numbers of benign and dysplastic/malignant lesions for each of the 3 pairs. For the FM/VT group, we analyzed samples from every category (healthy mucosa, lesions, benign lesions, and dysplastic/malignant lesions). For the BM/SP group, we analyzed samples from the healthy mucosa, lesion, and benign lesion categories.
We first examined the benefit of anatomy-based algorithms when applied to the distinction of healthy mucosa samples from lesions. Table 4 lists the ROC-AUC results for discriminating healthy mucosa (HV data) from visible lesions (patient data). In each case, the results are shown for all sites combined (9 sites total), nonkeratinized sites (all sites except the GI and HP), individual sites, and spectrally similar sites. Our prior study in HVs clearly demonstrated that the GI and HP display significant spectral contrast from most other sites for several spectroscopy parameters.25 By analyzing nonkeratinized sites separately, we eliminate spectral contrast due to keratin, and can determine whether eliminating it improves the identification of lesions among the remaining sites in the group.
The results in Table 4 indicate that excellent discrimination between healthy mucosa and lesions is possible with spectroscopy (ROC-AUC, 0.81 to 0.97), particularly when the diagnostic algorithms are developed and applied to individual sites or spectrally similar sites. The diagnostic performance of the spectral algorithms was poorest for the group in which all sites were combined (ROC-AUC, 0.81 to 0.85). The spectroscopy parameters frequently identified as diagnostic included NADH (340 nm), cβC, tryptophan (308 nm), Coll427 (340 nm), and α. The effective vessel radius, Coll (308 nm), and A and C parameters were not diagnostic.
When lesions were compared to healthy mucosa, the parameter trends were found to be identical for all 3 separations: healthy mucosa versus all lesions, benign lesions, and dysplastic/malignant lesions, respectively. Lesions were characterized by higher B parameter, cHb, α, tryptophan (308 nm), and NADH (340 nm) and by lower cβC, NADH (308 nm), Coll401 (340 nm), and Coll427 (340 nm). We compared the parameter trends for lesions relative to healthy mucosa to the trends for healthy keratinized sites (GI and HP) relative to healthy nonkeratinized sites. The latter comparison was used as a marker for the influence of keratin; however, mucosal features specific to the GI and HP may also contribute to the trends. The trends for both comparisons were identical for 7 of the 9 spectroscopy parameters employed in the spectral diagnostic models (data not shown).
We next evaluated the benefit of anatomy-based algorithms for a clinically relevant distinction, the discrimination of benign lesions from dysplastic/malignant lesions. Spectral algorithms were developed for the 4 groups with at least 10 samples in the 2 lesion categories: all sites combined, nonkeratinized sites, the LT alone, and the FM/VT group. The Figure shows ROC curves for all sites combined, the LT alone, and the FM/VT group. The ROC curve for all sites combined demonstrated the poorest discriminatory power.
Table 5 lists the ROC-AUC values, sensitivities, and specificities for the discrimination of benign lesions from dysplastic/malignant lesions. The highest ROC-AUC value (0.75) was obtained for the LT, and once again, the lowest value (0.60) was obtained when all sites were combined. The diagnostic performance of the FM/VT group was considerably better than that of all sites combined.
Table 6 lists the spectroscopy parameters that were combined to yield the performance values shown in Table 5. The diagnostic parameters were almost entirely different from those used to separate healthy mucosa from lesions (data not shown). Although the C parameter was not diagnostic in distinguishing healthy mucosa from lesions, it played a significant role in separating benign lesions from dysplastic/malignant lesions. Similarly, Coll401 (340 nm) had greater significance in separating benign lesions from dysplastic/malignant lesions than in separating healthy mucosa from lesions. The cβC continued to be an important diagnostic parameter. Dysplastic/malignant lesions were characterized by higher C parameter and cHb and by lower cβC, Coll401 (340 nm), and Coll427 (340 nm).
The oral cavity is complex, in that it consists of multiple morphologically distinct anatomic sites, each of which is characterized by its own spectral properties. In many clinical studies, only a limited amount of data can be collected in a practical time period; therefore, data from all anatomic sites are combined to maximize the number of samples in each category. Consequently, differences in the histologic characteristics between the anatomic sites may confound the detection of spectral contrast related to the presence of disease. We developed a strategy to evaluate whether certain pairs of sites are comparable in their spectral properties so that one could combine those sites with a high degree of similarity. This approach was designed to address 2 potential pitfalls: 1) combining dissimilar sites may broaden parameter distributions and make it more difficult to detect disease-related spectral contrast and 2) if the sites being combined have different spectral properties and samples are unevenly distributed between the 2 categories being compared, the likelihood is increased that anatomic spectral contrast may compete with or contribute to the apparent disease-related contrast. We identified 3 pairs of spectrally similar sites: 1) BM and SP, 2) FM and VT, and 3) GI and HP.
The results for the distinction of healthy mucosa from lesions demonstrated the significant influence of anatomic spectral contrast on diagnostic accuracy, as well as a clear benefit to combining spectrally similar sites. As in other studies, we obtained excellent results for this comparison; however, anatomy-based algorithms were more accurate (ROC-AUC, 0.90 to 0.97) than were algorithms developed for all sites combined (ROC-AUC, 0.81 to 0.85). We also tested an alternative approach for combining multiple sites on the basis of a morphological criterion, in which only nonkeratinized sites were combined. The diagnostic algorithm for the nonkeratinized group did not produce ROC-AUC values dramatically higher than those for all sites combined. The number of lesion samples from keratinized sites in our data set was limited (5 samples total); however, this finding is consistent with our previous observation of significant anatomic spectral contrast among nonkeratinized sites, and supports the need for a more effective strategy for determining which sites to combine.25
The excellent separations achieved for comparisons between healthy mucosa and lesions may be influenced by hyperkeratosis, ulceration, or other mucosal changes associated with lesions that are not directly linked to malignancy. The spectral contrast between healthy mucosa and visible lesions is largely independent of the presence of disease, as shown by the comparable ROC-AUC values for discriminating healthy mucosa from benign lesions and healthy mucosa from dysplastic/malignant lesions (0.81 to 0.96 and 0.85 to 0.97, respectively). The parallel parameter trends for lesions compared to healthy mucosa and healthy keratinized sites compared to healthy nonkeratinized sites provide added support for the hypothesis that keratin, rather than disease, is a major source of spectral contrast. These results indicate that clinically healthy mucosa samples and samples from clinically abnormal mucosa (lesions) should be treated as distinct entities. The high values for the diagnostic accuracy reported for this comparison in previous studies may in fact be unrealistically high because of the confounding effects of mucosal changes that are unrelated to disease.
The parameter trends we observed when comparing lesions to healthy mucosa are consistent with those of other studies. In distinguishing healthy mucosa from malignant lesions, Amelink et al12 noted that malignant lesions exhibited a lower α, an increase in blood content, and an increase in scattering slope (B parameter). A decrease in scattering amplitude (A parameter) was also observed, which was not found in our analysis. Müller et al9 also noted increased NADH and decreased collagen fluorescence for lesions excited at 340 nm. The increase in the B parameter is most likely due to scattering from keratin present in leukoplakias, as previous work has shown that this parameter is greatly affected by this feature.25 Vascular dilation or inflammation in ulcerated lesions or erythroplakias may give rise to an increased cHb. The decreased collagen fluorescence at 340 nm excitation, which may result from degradation of the extracellular matrix by matrix metalloproteinases, has been noted in numerous spectroscopic studies.27,39–41 The increased NADH at 340 nm excitation may be due to the increased metabolic activity of abnormal proliferating epithelial cells.
The results for the distinction of healthy mucosa from lesions demonstrated the benefit of anatomy-based algorithms, but this application has no real clinical importance. Therefore, we applied the anatomy-based algorithms to the distinction of benign lesions from dysplastic/malignant lesions. Once again, the best diagnostic performance was obtained for the individual site (LT), and the poorest performance for all sites combined (ROC-AUC, 0.60). The FM/VT group (spectrally similar sites) also produced more accurate results (ROC-AUC, 0.71) than did all sites combined. We did not evaluate the BM/SP group, because it did not meet the criterion of at least 10 samples in each category.
The ROC-AUC values obtained for the distinction of benign lesions from dysplastic/malignant lesions were significantly lower than those for the distinction of healthy mucosa from lesions. Furthermore, very few parameters were statistically significant in the diagnostic model for the former comparison. Therefore, the diagnostic accuracies reported in studies comparing healthy mucosa to lesions may be misleading, because they do not reflect how spectroscopy performs when applied to clinically important diagnostic distinctions. The heterogeneity of lesions within each category contributes to the challenge of distinguishing lesion categories and can further obscure subtle disease-related changes in the spectroscopy parameters. Some malignant lesions had small foci of invasion in a large area of relatively normal mucosa, whereas others demonstrated more diffuse disease. Similarly, the benign lesions ranged from mildly to highly hyperkeratotic, with various degrees of hyperplasia and inflammation. Interindividual variation is another source of spectral contrast that may affect diagnostic performance.
Another approach for combining sites, employed by some researchers, is to combine healthy mucosa samples with benign lesions. This can dramatically increase the number of samples in the negative-for-dysplasia/malignancy category, because healthy mucosa samples are readily available. However, this approach may produce misleading results for 2 reasons. First, we have shown that there is significant spectral contrast between healthy mucosa and lesions even in the absence of disease. Second, our findings demonstrate a marked difference in spectral contrast between benign lesions and dysplastic/malignant lesions, as compared to healthy mucosa and visible lesions. Therefore, the diagnostic performance may falsely appear to improve as the proportion of healthy mucosa samples in the negative group increases. To test this hypothesis, we combined 10 LT healthy mucosa samples with the benign LT lesions. Whereas an ROC-AUC value of 0.75 was obtained for distinguishing benign lesions from dysplastic/malignant lesions, the ROC-AUC value increased to 0.85 when we used the combined healthy mucosa/benign lesion category (data not shown). When 50 LT healthy mucosa samples were combined with the same benign samples, the ROC-AUC value increased further, to 0.91. These results underscore the importance of considering how data are combined in order to reliably evaluate spectral diagnostic algorithms.
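One way to see why the ROC-AUC can rise as healthy samples are added to the negative group: for a fixed diagnostic score, the AUC against a pooled negative group equals the sample-weighted average of the AUCs against each negative subgroup. The sketch below checks this identity on toy numbers. It is a simplification offered as an illustration only; it assumes one fixed score for all groupings, whereas an algorithm refit for each grouping need not follow the identity exactly, and the example scores are invented rather than taken from the study.

```python
def pairwise_auc(pos, neg):
    """Probability that a positive sample outscores a negative one (ties count 1/2)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pooled_auc(auc_benign, n_benign, auc_healthy, n_healthy):
    """Sample-weighted average of the two subgroup AUCs (valid for a fixed score)."""
    total = n_benign + n_healthy
    return (n_benign * auc_benign + n_healthy * auc_healthy) / total


if __name__ == "__main__":
    # Invented scores; higher means "more likely dysplastic/malignant".
    diseased = [0.6, 0.7, 0.9]
    benign = [0.5, 0.65, 0.8]
    healthy = [0.1, 0.15, 0.2, 0.3]

    direct = pairwise_auc(diseased, benign + healthy)
    weighted = pooled_auc(
        pairwise_auc(diseased, benign), len(benign),
        pairwise_auc(diseased, healthy), len(healthy),
    )
    print(round(direct, 4), round(weighted, 4))
```

Because healthy mucosa separates from lesions far more easily than benign lesions do, each added healthy sample pulls the weighted average toward the higher of the two subgroup values.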
We examined the trends in the spectral parameters used in the diagnostic models with increasing disease severity. The increase in the C parameter is consistent with increased scattering by small particles, as may occur with increased epithelial proliferation. The higher cHb may be the result of vascular dilation and inflammation as seen in erythroplakias or with ulceration (features that are more likely to be associated with dysplasia/malignancy), whereas white lesions are less likely to be dysplastic/malignant upon biopsy.42,43 Decreased collagen fluorescence was observed once again for this distinction. The reason for the decreased cβC is unknown. β-Carotene is an antioxidant and can be metabolized into vitamin A, which is involved in the differentiation of normal epithelial cells.44
The current standard screening method for oral cancer — inspection with palpation — enables inspection of the entire oral cavity. Our probe may not be appropriate for this purpose, because it samples only a small area of tissue. However, when patients with suspicious lesions are referred to an otolaryngologist, spectroscopy may be useful for distinguishing benign lesions from dysplastic/malignant lesions. A study by Waldron and Shafer42 of 3,256 leukoplakias (the most commonly encountered oral lesions) found that 42.9% of biopsies from FM lesions and 24.2% of biopsies from tongue lesions exhibited dysplasia/malignancy. Of the 3,256 cases of leukoplakia, 6.8% and 8.6%, respectively, were located at these sites. If we consider 1,000 patients with leukoplakia of the tongue or FM, a total of 442 FM lesions and 558 tongue lesions would be expected, given the relative frequency of leukoplakia at these 2 sites noted in that study. Of these patients, 32.5% would be expected to show dysplasia/malignancy (42.9% of FM lesions and 24.2% of tongue lesions). The performance values for the LT and FM/VT sites are similar, so we use the average values of 93% and 64% as the sensitivity and specificity, respectively, of the spectral diagnostic test. Based on these values and the aforementioned prevalence estimates for diseased lesions, the positive and negative predictive values for our test are 55% and 95%, respectively. These results indicate that our spectral algorithms can identify the absence of dysplasia/malignancy, with few false negatives, thereby reducing the number of unnecessary biopsies. A large prospective study is needed to rigorously confirm these findings.
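The predictive-value arithmetic in this paragraph can be reproduced with a short calculation. The sketch below uses only the figures quoted above (site frequencies of 6.8% and 8.6%, disease rates of 42.9% and 24.2%, sensitivity of 93%, specificity of 64%); the function and variable names are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def leukoplakia_example():
    """Rework the 1,000-patient example using the percentages quoted in the text."""
    share_fm = 6.8 / (6.8 + 8.6)   # fraction of FM lesions among FM + tongue lesions
    share_tongue = 1 - share_fm
    prevalence = share_fm * 0.429 + share_tongue * 0.242
    ppv, npv = predictive_values(0.93, 0.64, prevalence)
    return {
        "fm_per_1000": round(1000 * share_fm),
        "tongue_per_1000": round(1000 * share_tongue),
        "prevalence_pct": round(100 * prevalence, 1),
        "ppv_pct": round(100 * ppv),
        "npv_pct": round(100 * npv),
    }


if __name__ == "__main__":
    print(leukoplakia_example())
```

The calculation recovers the 442 and 558 lesion counts, the 32.5% prevalence, and the 55% and 95% predictive values quoted above. The high negative predictive value follows mainly from the high sensitivity combined with a prevalence well below 50%.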
The results of this study demonstrate the importance of developing anatomy-based algorithms for disease detection in the oral cavity. Combining multiple anatomic sites without accounting for anatomic spectral contrast reduced disease-related contrast and resulted in poor diagnostic performance. Spectral algorithms designed specifically for multiple anatomic sites sharing a high degree of spectral similarity considerably improved diagnostic accuracies. The best diagnostic performance was achieved when anatomy-based algorithms were developed for individual sites. The discrimination of healthy mucosa from lesions yielded excellent results, as in previous studies, but the findings do not extend to the clinically relevant distinction of benign lesions from dysplastic/malignant lesions. However, anatomy-based algorithms significantly improved diagnostic performance for the latter distinction. Diagnostic accuracies for distinguishing healthy mucosa/benign lesions from dysplastic/malignant lesions were dependent on the proportion of healthy mucosa samples and were therefore unreliable, even with the use of anatomy-based algorithms. This study demonstrates a successful approach for developing common algorithms for multiple anatomic sites while accounting for anatomic spectral contrast.
Developing a spectral diagnostic algorithm for every anatomic site may not be necessary to provide a substantial clinical benefit. One potential future direction is to develop a reliable diagnostic tool designed specifically for sites at high risk for oral cancer. To ensure that dysplasia or malignancy is not missed, biopsies are likely to be taken of lesions at these sites; therefore, a tool for identifying benign lesions could reduce the number of unnecessary biopsies. Alternatively, unnecessary biopsies could be reduced by developing a tool for sites at which lesions are frequently encountered but generally prove benign by histopathology (low likelihood of disease). Large, prospective clinical studies are needed to reliably evaluate the efficacy of spectroscopy for these applications. Ultimately, we would like to detect invisible disease, particularly in high-risk individuals and at surgical margins.
This research was supported by National Institutes of Health grants R01-CA097966 and P41-RR02594-21.
We thank Luis Galindo and Jon Nazemi for developing the spectroscopic clinical instrumentation.