Elastic scattering spectroscopy (ESS) may be used to detect high-grade dysplasia (HGD) or cancer in Barrett’s esophagus (BE). When spectra are measured in vivo by a hand-held optical probe, variability among replicated spectra from the same site can hinder the development of a diagnostic model for cancer risk. An experiment was carried out on excised tissue to investigate how two potential sources of this variability, pressure and angle, influence spectral variability, and the results were compared with the variations observed in spectra collected in vivo from patients with Barrett’s esophagus. A statistical method called error removal by orthogonal subtraction (EROS) was applied to model and remove this measurement variability, which accounted for 96.6% of the variation in the spectra, from the in vivo data. Its removal allowed the construction of a diagnostic model with specificity improved from 67% to 82% (with sensitivity fixed at 90%). The improvement was maintained in predictions on an independent in vivo data set. EROS is an effective pretreatment for in vivo Barrett’s data: it identifies measurement variability and reduces its effect. The procedure reduces the complexity and increases the accuracy and interpretability of the model for classification and detection of cancer risk in Barrett’s esophagus.
The incidence of esophageal adenocarcinoma has increased dramatically since the 1970s, and it is now the fifth commonest cause of cancer death in the UK. The five-year survival rate for this cancer is less than 10%.1,2 Barrett’s esophagus (BE) is a premalignant condition in which the normal squamous epithelium of the esophagus is replaced by metaplastic columnar epithelium,3 increasing the risk of developing adenocarcinoma by 30 to 125 times4–7 when compared to the general population. Systematic endoscopic surveillance of BE has been shown to detect esophageal adenocarcinoma at an early and curable stage.8 High-grade dysplasia (HGD) is currently the most robust predictor of future cancer risk in patients with BE, with around 50% progressing to adenocarcinoma within five years if it is not treated.9,10
Endoscopic surveillance relies on regularly spaced, but essentially random, biopsies being taken from the four quadrants of the Barrett’s segment every 2 cm. It is time-consuming and labor intensive and has a low detection rate for HGD, even when abnormalities exist.11,12 The challenge for clinicians and scientists is to develop new technologies for detecting patients at high risk of progression to cancer. Ideally, this would be accurate, easy to use, and inexpensive and would provide results rapidly, preferably without the need to remove tissue.
Elastic scattering spectroscopy (ESS) is an in vivo optical point measurement, which, using an appropriate optical geometry, is sensitive to changes in the physical properties of tissue.13 The optical probe is passed through the working channel of an endoscope and is placed in direct contact with tissue. Short pulses of white light interrogate a small volume of superficial tissue approximately 0.5 mm in diameter and 0.5 mm deep (details are given below). Results are available within fractions of a second. Since the technology uses white light and produces a strong backscattered signal, components are inexpensive and the system is simple to manufacture. It is also safe, because only light in the range 320 to 900 nm is used for illumination, with shorter-wavelength ultraviolet light being filtered out. Many of the features that pathologists look for in diagnosing HGD have also been shown to affect light scattering, including the nuclear:cytoplasmic ratio (in Monte Carlo modeling14); the cellular packing density;15 and the nuclear size.16 The nuclear chromatin content has also been shown to affect the spectra of both singly scattered light17,18 and high angle scatter in ESS.19 ESS has been demonstrated clinically in a number of organ areas and disease types.20
The problem is how to maximize the discrimination between ESS spectra taken from high- and low-risk sites in order to accurately detect the patients at high future cancer risk. The difficulty is that the spectral differences between normal and abnormal tissue can be subtle compared to major sources of variation that are of little or no predictive value for the detection of cancer risk.
In the clinical setting, it is extremely difficult even for experienced endoscopists to accurately control all aspects of ESS spectral acquisition, especially with respect to the angle and pressure of the optical probe in relation to the tissue with which it is in contact. It is known that pressure affects the spectra.21 Movement and other artifacts also occur and can affect the optical measurements collected. This is in part due to the proximity of the esophagus to the heart and lungs in the mediastinum, making it susceptible to coughing and breathing artifacts during endoscopy. If replicated spectra were to be taken in rapid succession from the same site (typically four in under one second) an appropriate statistical pretreatment method might be able to identify and reduce this intermeasurement variability, thus helping in the construction of an effective diagnostic model.
In this work, we have employed a new statistical pretreatment method called error removal by orthogonal subtraction (EROS), which was introduced in an earlier paper.22 EROS uses replicated measurements to learn the structure of the measurement variability and an orthogonal projection matrix to subtract it mathematically from the spectra.
The ESS system consists of a pulsed xenon arc lamp, an optical probe, a spectrometer, and a computer to control these components and record the spectra. The arc lamp, spectrometer, and power supply are housed in a portable, briefcase-sized unit to which the laptop computer is connected. Short pulses of white light (320 to 920 nm) from the xenon arc lamp (Perkin Elmer, Inc., Fremont, California) are directed through a flexible optical fiber, with the probe tip touching the tissue to be interrogated. Ultraviolet B and C (100 to 315 nm) light is filtered out to avoid risk to patients. A collection fiber (200 μm diameter), with a fixed separation distance of ~350 μm (center-to-center) from an illumination fiber (400 μm diameter), collects backscattered light from the upper layers of the tissue and conveys it to the spectrometer (S2000 Ocean Optics, Dunedin, Florida). The spectrometer outputs the spectrum to the computer for recording and further analysis (Fig. 1). The fiber assembly is housed in a plastic sheath (outer diameter 2.0 mm), which can pass into the esophagus via the biopsy channel of a standard endoscope. Collection and recording of a single spectrum takes approximately 200 ms.
Two potential sources of significant measurement variability when collecting data in vivo are variations in the pressure of the probe on the tissue and in the angle at which the probe is held to the tissue.23,24 These factors are difficult to control during measurement under typical clinical conditions. In the experiment reported here, they were deliberately varied in a controlled fashion in order to investigate their effects on the spectra.
Two different types of tissue, squamous-cell-lined pig esophagus and columnar-cell-lined pig stomach, were resected. A portion of approximately 4 cm² of each tissue was extended on a small piece of cork and fixed with pins. This preparation was placed on an electronic balance, as illustrated in Fig. 2. All measurements were carried out with a 2.5-mm (outer diameter) optical biopsy probe, which contains both the illuminating and collecting fibers. As described earlier, these fibers had a center-to-center separation of 350 μm.25 Data were collected at six random sites for each tissue. At each site, 10 replicate measurements were taken under each of the 16 possible combinations of four pressures (0 kPa, 10 kPa, 20 kPa, 30 kPa) and four angles to the vertical (0 deg, 15 deg, 30 deg, 45 deg). Thus, the total number of spectra measured for each tissue was 6 × 16 × 10 = 960. To achieve the conditions, the probe was fixed to a micromanipulator at the desired angle and pushed downward until the balance gave the desired reading. The force measured by the balance was converted to a pressure using an approximate contact area of 5 mm² for the probe tip.
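The conversion from balance reading to pressure, and the count of spectra per tissue, can be checked with a short sketch. It assumes the balance displays mass in grams and that standard gravity applies; neither is stated in the text, and the function names are illustrative only.

```python
from itertools import product

G = 9.80665              # standard gravity, m/s^2 (assumption)
CONTACT_AREA_MM2 = 5.0   # approximate probe-tip contact area from the text


def balance_reading_to_kpa(grams, area_mm2=CONTACT_AREA_MM2):
    """Convert a balance reading in grams to contact pressure in kPa."""
    force_n = (grams / 1000.0) * G        # kg * m/s^2 = N
    area_m2 = area_mm2 * 1e-6             # mm^2 -> m^2
    return force_n / area_m2 / 1000.0     # Pa -> kPa


def kpa_to_balance_reading(kpa, area_mm2=CONTACT_AREA_MM2):
    """Balance reading in grams needed to reach a target pressure in kPa."""
    force_n = kpa * 1000.0 * area_mm2 * 1e-6
    return force_n / G * 1000.0


pressures_kpa = [0, 10, 20, 30]
angles_deg = [0, 15, 30, 45]
conditions = list(product(pressures_kpa, angles_deg))   # 16 combinations

n_sites, n_replicates = 6, 10
spectra_per_tissue = n_sites * len(conditions) * n_replicates
```

Under these assumptions a target of 10 kPa corresponds to a reading of roughly 5 g, which shows how small the applied forces are in this design.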
This study was approved by the joint University College London/University College London Hospitals (UCLH) Ethics Committee. During routine endoscopy, optical measurements were taken, followed immediately by biopsy from the same sites. A total of 152 matched optical and histological biopsy sites were collected from 81 patients referred to our tertiary referral center between 2000 and 2003 for the management of high-grade dysplasia (HGD) or early cancer in BE. Informed consent was obtained from all patients prior to their participation in the study.
Before any tissue spectra were taken, a white reference spectrum was recorded from the spectrally flat diffuse reflector (Spectralon, Labsphere, Inc., North Sutton, New Hampshire). This provided calibration of the overall system response, to account for spectral variations in the light source, spectrometer, fiber transmission, and fiber coupling. ESS spectral data used in our analysis is the ratio of the spectral intensity of backscattered light from the tissue to that of the standard reference spectrum from Spectralon. Each spectrum was made up of 1801 pixels from the detector, spanning the wavelength range 320 to 920 nm, although the actual spectral resolution was about 3 nm (~9 pixels).
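As a minimal sketch of the calibration step, the ratio to the white reference can be taken pixel by pixel. The evenly spaced wavelength axis used for the pixel-spacing arithmetic is an assumption; a real spectrometer calibration is usually slightly nonlinear.

```python
def reflectance_ratio(tissue_counts, reference_counts):
    """Divide tissue intensity by white-reference intensity, pixel by pixel."""
    if len(tissue_counts) != len(reference_counts):
        raise ValueError("spectra must have the same number of pixels")
    return [t / r for t, r in zip(tissue_counts, reference_counts)]


N_PIXELS = 1801
WL_MIN_NM, WL_MAX_NM = 320.0, 920.0

# Assuming evenly spaced pixels: 600 nm over 1800 intervals.
pixel_spacing_nm = (WL_MAX_NM - WL_MIN_NM) / (N_PIXELS - 1)

# A 3 nm optical resolution therefore spans about 9 pixels.
pixels_per_resolution = 3.0 / pixel_spacing_nm
```

This arithmetic is consistent with the statement that the 3 nm resolution corresponds to about 9 pixels.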
Spectra were taken from one to three sites per patient, with a median of four spectra from each site (mean 3.3). Of the 152 matched optical and histological biopsies (corresponding to 506 spectra), 122 (corresponding to 413 spectra) were from low-grade dysplasia or no dysplasia (low risk) and 30 (corresponding to 93 spectra) were from high-grade dysplasia or cancer (high risk). Each biopsy was assessed and assigned as either high or low risk by three pathologists, who met to generate a consensus in cases of disagreement. This data set was used to train classification rules, as described in Sec. 2.4.
A new data set with a total of 68 matched optical and histological biopsies (corresponding to 276 spectra) was collected from another 20 patients. These measurements were made later in time than those in the training set and were not looked at until after the diagnostic models had been selected and trained. The set included 50 biopsies (corresponding to 202 spectra) from low-grade dysplasia or no dysplasia (low risk) and 18 biopsies (corresponding to 74 spectra) from high-grade dysplasia or cancer (high risk). This data set was used for prospective prediction, as described in Sec. 2.4.
All raw spectra were visually examined for any obvious outliers caused by acquisition errors, poor contact of the optical probe with the tissue, dirt/blood on the probe tip, or other artifacts. These spectra, from 16 biopsy sites, were excluded from subsequent analysis.
Standard data preprocessing was carried out on the spectra to improve signal quality.25,26 The spectra were first smoothed using the Savitzky-Golay method,27 spanning a 7-point window below 620 nm and spanning a 20-point window above 620 nm, where noise was greater. To speed subsequent manipulation, the smoothed data were then reduced from the 1801 pixels, given the spectrometer resolution of 9 pixels, by taking alternate points only. To remove the regions of the spectra with low signal-to-noise ratios arising from the lower system response, only the wavelengths between 370 and 800 nm, with 637 points, were used in the analysis. Using the standard normal variate (SNV)28 method, the spectra were then normalized by setting the mean intensity of each spectrum to zero and the variance to one.
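The decimation, wavelength cropping, and SNV steps can be sketched as follows. This is an illustration in Python with NumPy, not the authors' R code. It omits the Savitzky-Golay smoothing, assumes an evenly spaced wavelength axis, and uses the sample standard deviation (`ddof=1`) for SNV, so the number of retained points need not match the 637 reported.

```python
import numpy as np


def decimate_and_crop(spectra, wavelengths, lo_nm=370.0, hi_nm=800.0):
    """Keep alternate pixels, then keep only wavelengths in [lo_nm, hi_nm]."""
    spectra = np.asarray(spectra, dtype=float)[:, ::2]
    wavelengths = np.asarray(wavelengths, dtype=float)[::2]
    keep = (wavelengths >= lo_nm) & (wavelengths <= hi_nm)
    return spectra[:, keep], wavelengths[keep]


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)   # ddof=1 is an assumption
    return (spectra - mean) / std


# Synthetic stand-in for smoothed spectra: 4 spectra x 1801 pixels.
rng = np.random.default_rng(0)
wavelengths = np.linspace(320.0, 920.0, 1801)
raw = 1.0 + 0.001 * wavelengths + rng.normal(scale=0.01, size=(4, 1801))

cropped, wl = decimate_and_crop(raw, wavelengths)
normalized = snv(cropped)
```

Because SNV is applied to each spectrum separately, it removes overall offset and scale differences between spectra but not wavelength-dependent effects such as a sloping baseline.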
To study the spectra from the laboratory experiment, a principal component analysis (PCA) based on the pooled within-site covariance matrix22 was carried out separately for each type of tissue. The loadings for the first few principal components describe how pressure and angle affect the experimental spectra. The results for the two types of tissue being similar, the within-site covariance matrix was then pooled over all 12 sites, and a further PCA carried out. A PCA was also carried out on the pooled within-site covariance matrix of the in vivo spectra. The loadings for the first few principal components from this analysis describe the variability in in vivo replicate measurements. The two sets of loadings, for experimental and in vivo spectra, were compared.
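A PCA of the pooled within-site covariance matrix can be sketched as below, in Python with NumPy rather than the R used for the study. The divisor (number of spectra minus number of sites) is the usual pooled-covariance convention and is an assumption here, as are the function name and the synthetic data.

```python
import numpy as np


def pooled_within_site_pca(spectra, site_ids):
    """PCA of the pooled within-site covariance matrix.

    Returns (loadings, variances); loadings has one column per component,
    ordered by decreasing within-site variance.
    """
    spectra = np.asarray(spectra, dtype=float)
    site_ids = np.asarray(site_ids)
    sites = np.unique(site_ids)

    # Subtract each site's own mean so that only replicate-to-replicate
    # variation remains.
    residuals = np.empty_like(spectra)
    for s in sites:
        rows = site_ids == s
        residuals[rows] = spectra[rows] - spectra[rows].mean(axis=0)

    dof = spectra.shape[0] - sites.size          # pooled degrees of freedom
    cov = residuals.T @ residuals / dof

    variances, loadings = np.linalg.eigh(cov)    # eigh returns ascending order
    order = np.argsort(variances)[::-1]
    return loadings[:, order], variances[order]


# Synthetic check: replicates at each site differ mainly along a linear trend.
rng = np.random.default_rng(1)
n_sites, n_reps, p = 5, 4, 20
trend = np.linspace(-1.0, 1.0, p)
trend /= np.linalg.norm(trend)
site_ids = np.repeat(np.arange(n_sites), n_reps)
site_means = rng.normal(size=(n_sites, p))[site_ids]
replicate_effect = rng.normal(scale=2.0, size=(n_sites * n_reps, 1)) * trend
noise = rng.normal(scale=0.01, size=(n_sites * n_reps, p))
X = site_means + replicate_effect + noise

loadings, variances = pooled_within_site_pca(X, site_ids)
```

In this toy example the first loading recovers the linear trend even though the site means differ substantially, because between-site differences are removed before the covariance is formed.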
A new spectral pretreatment method, error removal by orthogonal subtraction (EROS), described in detail elsewhere,22 was applied to the in vivo spectra. EROS deals with problems related to measurement variability. It uses replicated measurements from what is nominally the same site to model the structure of the measurement variability, and then subtracts this variability from the spectra. The modeling involves a PCA of the pooled within-site covariance matrix of the in vivo spectra, as described earlier. The result of this is a set of principal component loadings that describe the measurement variability. The subtraction is carried out by projecting the spectra onto the space orthogonal to that spanned by a chosen number, k, of these principal components. This is equivalent to subtracting from each spectrum the best fitting (in a least-squares sense) linear combination of the k selected principal component loadings.
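The projection step, and its equivalence to a least-squares subtraction, can be illustrated with a short sketch. The orthonormal loadings here are random stand-ins generated by a QR decomposition. In practice they would come from the PCA of the pooled within-site covariance matrix. Function and variable names are illustrative and are not taken from the cited EROS paper.

```python
import numpy as np


def eros_project(spectra, loadings, k):
    """Project spectra onto the space orthogonal to the first k loadings.

    `loadings` must have orthonormal columns, as returned by a PCA.
    """
    spectra = np.asarray(spectra, dtype=float)
    V = np.asarray(loadings, dtype=float)[:, :k]
    P = np.eye(V.shape[0]) - V @ V.T        # orthogonal projection matrix
    return spectra @ P


rng = np.random.default_rng(2)
p, k = 12, 3
loadings, _ = np.linalg.qr(rng.normal(size=(p, p)))   # stand-in loadings
X = rng.normal(size=(6, p))                           # stand-in spectra (rows)

X_eros = eros_project(X, loadings, k)

# Least-squares view: regress each spectrum on the k loadings and
# subtract the fitted part.
V = loadings[:, :k]
coeffs, *_ = np.linalg.lstsq(V, X.T, rcond=None)
X_ls = X - (V @ coeffs).T
```

The two results agree to numerical precision, and with k = 0 the spectra are returned unchanged, which corresponds to the "no EROS" case.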
Using the in vivo training data set described in Sec. 2.3, classification rules were derived from both the original spectra and the EROS pretreated spectra using principal component discriminant analysis (PCDA): an initial PCA on the spectra, followed by linear discriminant analysis (LDA) on the first q PC scores.22 EROS and PCDA were carried out for a grid of values of k and q, with k ranging from 0 (no dimensions subtracted) to 7 and q from 5 to 30. Leave-out-one-site cross-validation22 trained the algorithm, i.e., carried out the EROS pretreatment followed by the PCA and the LDA, on all the data except one site, and the excluded site was then predicted. This was used to assess the performance of the classification, as measured by sensitivity, specificity and AUC, the area under a receiver operating characteristic (ROC) curve showing the sensitivity versus the false positive rate (1-specificity), generated by varying the threshold canonical score. In this per-site analysis, if any one of the spectra from a particular biopsy site was classified as a spectrum from a high-risk site, the whole biopsy site was regarded as a high-risk one.
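The leave-out-one-site splitting and the per-site decision rule can be sketched in a few lines. The per-spectrum calls below are hypothetical toy values, not study data, and the classifier itself (EROS, PCA, LDA) is left out.

```python
from collections import defaultdict


def leave_one_site_out(site_ids):
    """Yield (held_out_site, train_indices, test_indices) for each site."""
    for held_out in sorted(set(site_ids)):
        train = [i for i, s in enumerate(site_ids) if s != held_out]
        test = [i for i, s in enumerate(site_ids) if s == held_out]
        yield held_out, train, test


def site_level_calls(spectrum_calls, site_ids):
    """A site is called high risk if ANY of its spectra is called high risk."""
    calls = defaultdict(bool)
    for is_high, site in zip(spectrum_calls, site_ids):
        calls[site] = calls[site] or bool(is_high)
    return dict(calls)


def sensitivity_specificity(site_calls, site_truth):
    """Both arguments map site id -> True for high risk."""
    tp = sum(site_calls[s] and site_truth[s] for s in site_truth)
    tn = sum((not site_calls[s]) and (not site_truth[s]) for s in site_truth)
    n_pos = sum(site_truth.values())
    n_neg = len(site_truth) - n_pos
    return tp / n_pos, tn / n_neg


# Toy example: 4 sites with hypothetical per-spectrum classifier output.
site_ids = ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"]
spectrum_calls = [0, 1, 0, 0, 0, 1, 1, 0, 0, 1]
site_truth = {"A": True, "B": False, "C": True, "D": False}

folds = list(leave_one_site_out(site_ids))
site_calls = site_level_calls(spectrum_calls, site_ids)
sens, spec = sensitivity_specificity(site_calls, site_truth)
```

Splitting by site rather than by spectrum keeps replicate spectra from the same biopsy out of the training data when that site is being predicted. The "any spectrum" rule favors sensitivity, since a single high-risk call is enough to flag the site.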
To test the robustness of the models, predictions were made for the separate Barrett’s test set described in Sec. 2.3. The same spectral pretreatment procedures were implemented on these data as mentioned earlier, followed by an application of the classification rules derived from the training data described in Sec. 2.3 using cutoffs also learned from the training data. All the computations and analyses were performed using the R statistical language (www.r-project.org). R is a free software environment for statistical computing and graphics.
Examination of the laboratory data, where pressure and angle have been deliberately varied, and comparison of these data with in vivo spectra showed that these factors do affect the spectra and are probably responsible for much of the variability in the in vivo spectra.
That pressure and angle affect the spectra is evident from Fig. 3, with a clear ordering of the mean spectra for the squamous tissue being visible as each factor is varied. The picture is similar for the columnar tissue. It is not surprising that these factors affect the spectra, for they will affect the contact of the probe and the density of the tissue beneath it. To demonstrate the similarity between the spectral variability in the laboratory and in vivo situations, the first three principal component loadings, based on the pooled within-site covariance matrix, are plotted in Fig. 4 for both the experimental data and the in vivo Barrett’s data. For each principal component, the experimental and in vivo loadings are similar. The first component, almost a linear trend with wavelength, and the second, resembling an inverted mean spectrum, suggest that variations in baseline and scale are dominant in both systems. The third component is harder to interpret, but still shows a broadly similar shape in the two cases. For the experimental data, the pooled within-site covariance captures the variation caused by pressure and angle under controlled laboratory conditions. For the in vivo Barrett’s data, it describes the variability between replicate measurements. The similarity in the loadings supports the contention that much of the variability in the in vivo replicate measurements comes from differences in pressure and/or angle.
The two panels of Fig. 5 show the mean spectra for low- and high-risk sites, before and after pretreatment with EROS, in which k=5 dimensions were removed. Differences between the means are much more evident in the right-hand panel, after pretreatment. What cannot be seen from the figure, because the spectra have been rescaled, is that the pretreatment has removed 96.6% of the variability in the original spectra.
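One plausible way to quantify the share of variability removed is the fraction of total variance (about the mean spectrum) that lies along the k subtracted loadings. The source does not say how the 96.6% figure was computed, so the definition below is an assumption, and the data are a constructed toy example.

```python
import numpy as np


def fraction_of_variance_removed(spectra, loadings, k):
    """Share of total variance lying along the first k orthonormal loadings."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre on the mean spectrum
    V = np.asarray(loadings, dtype=float)[:, :k]
    removed = np.sum((Xc @ V) ** 2)
    total = np.sum(Xc ** 2)
    return removed / total


# Toy data: variation of size 2 along axis 0 and size 1 along axis 1.
a = np.array([2.0, -2.0, 2.0, -2.0])
b = np.array([1.0, 1.0, -1.0, -1.0])
X = np.column_stack([a, b, np.zeros(4)])
loadings = np.eye(3)

frac = fraction_of_variance_removed(X, loadings, k=1)   # 16 / 20
```

A large removed fraction does not by itself imply lost diagnostic information, since the subtracted directions are chosen to describe replicate-to-replicate variation rather than differences between sites.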
The in vivo data were used to construct diagnostic rules for the detection of HGD or cancer both with and without the use of EROS, fixing the sensitivity at 90% in each case. Examination of plots that show how each wavelength contributes to the discrimination revealed a reduction in noise and an improvement in interpretability when EROS was used.
Table 1 shows the leave-out-one-site cross-validation results for detection of HGD or cancer for various combinations of k, the number of dimensions removed by the EROS pre-treatment, and q, the number of principal components used in the construction of the diagnostic rule. In deriving the rule, the “cutoff” canonical score between the high-risk and low-risk sites was chosen to give a sensitivity of 90% each time. This high sensitivity comes at the expense of specificity, but it was felt to be a greater omission to potentially “miss” patients at high risk than to have to collect a few additional biopsies (due to ESS false positives). In the clinical model we envision, conventional biopsies would be taken only if the ESS spectrum was indicative of dysplasia or cancer.
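Choosing the cutoff to give a fixed sensitivity can be sketched as follows. The sketch assumes that a higher canonical score indicates higher risk and that each site is summarized by the maximum score over its spectra, which matches the "any spectrum" rule. The scores are invented for illustration.

```python
import math


def cutoff_for_sensitivity(site_scores, site_is_high, target=0.90):
    """Largest cutoff at which at least `target` of high-risk sites score >= cutoff."""
    pos = sorted((s for s, h in zip(site_scores, site_is_high) if h), reverse=True)
    # Number of high-risk sites that must be detected; the small epsilon
    # guards against floating-point round-up in target * len(pos).
    needed = max(1, math.ceil(target * len(pos) - 1e-9))
    return pos[needed - 1]


def sensitivity_at(site_scores, site_is_high, cutoff):
    pos = [s for s, h in zip(site_scores, site_is_high) if h]
    return sum(s >= cutoff for s in pos) / len(pos)


def specificity_at(site_scores, site_is_high, cutoff):
    neg = [s for s, h in zip(site_scores, site_is_high) if not h]
    return sum(s < cutoff for s in neg) / len(neg)


# Invented site-level scores: 10 high-risk sites and 5 low-risk sites.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0, 0.5, 1.5, 3, 5]
is_high = [True] * 10 + [False] * 5

cutoff = cutoff_for_sensitivity(scores, is_high, target=0.90)
sens = sensitivity_at(scores, is_high, cutoff)
spec = specificity_at(scores, is_high, cutoff)
```

With sensitivity pinned in this way, models can be compared on specificity alone, which is how the combinations of k and q are ranked in the text.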
Without pretreatment by EROS (k=0 in Table 1), the choice q=30 gives the best results with sensitivity, specificity, and AUC of 90%, 67%, and 0.82, respectively, comparable with those reported in Ref. 25 for a related but slightly different data set (92%, 60%, 0.85). When at least k=3 dimensions are removed by EROS, the number of components required to construct a good diagnostic rule falls substantially. In particular, the combination k=5, q=5 gives the best results, with sensitivity, specificity, and AUC of 90%, 82%, and 0.86, respectively.
Figure 6 compares the loadings for the LDA discriminant functions using k=0, q=30 (no EROS) in the left panel and k=5, q=5 in the right panel. These loadings show the contribution at each wavelength to the linear diagnostic rule, and thus permit interpretation of its spectral basis. With EROS, and thus using many fewer factors in the classification model, the loading vector is much less noisy and can be related much more easily to features in the spectra. In the right-hand panel of Fig. 6, the most obvious feature is a large positive LDA loading in the region of 650 to 800 nm and peaking at around 760 nm, corresponding to clear differences between the mean spectra. The peak at 760 nm is consistent with a lowered oxygen saturation of haemoglobin in dysplastic tissue, as has been noted by various research groups.29,30 Two peaks in the LDA loading around 540 and 580 nm may correspond to the absorption dips of HbO2 at 542 and 577 nm, which would be deeper in the spectra of high-risk sites because of increased haemoglobin content. It is known that cancers and precancerous tissues are characterized by increased microvascular volume, and hence increased blood content.31
The two best models, those using k=0, q=30 (no EROS) and k=5, q=5 (EROS), were applied to the independent test set. The EROS model gave better prediction results, with sensitivity of 83% and specificity of 84%, compared with a sensitivity of 78% and specificity of 66% without EROS. Both models showed some loss of sensitivity compared with the 90% on the training set, but the specificity was maintained in both cases, so the specificity advantage of EROS carried over to the test set.
The importance of variations in pressure and angle at the time of spectral measurement has been confirmed by a designed experiment. Pretreatment with EROS was used to characterize and remove this variability from in vivo spectral data before developing a diagnostic rule. In this case, the diagnostic rule employed PCA followed by LDA, but it would be perfectly possible to combine EROS with other approaches to classification. The resulting simplification of the spectral data has two benefits. The removal of noise should lead to more robust predictions, as demonstrated on the independent test set. This is crucial for real-time clinical diagnostic application. In addition, simpler and smoother loadings for the classification rule will facilitate interpretation of the spectral basis for the rule. This interpretation of the LDA loading will be further developed in future work. In conclusion, this work provides a better understanding of how spectral changes are generated by scattering and absorption and how they contribute to classification. EROS allows the development of a more accurate and robust diagnostic algorithm, with which ESS can act as a straightforward, reliable, and valuable technique for the rapid and accurate early detection of cancer risk in Barrett’s esophagus.
The authors are grateful to the Peacock Trust, the National Cancer Institute Network for Translational Research into Optical Imaging (NTROI) of NIH, Experimental Cancer Medicine Centre (ECMC), and Comprehensive Biomedical Research Centre (CBRC) for their support of this work at University College London (UCL).
Ying Zhu, University College London, National Medical Laser Centre, Academic Division of Surgery Specialties, Charles Bell House, 67-73 Riding House Street, London W1W 7EJ, United Kingdom and University College London, Department of Statistical Science, Gower Street, London, WC1E 6BT, United Kingdom.
Tom Fearn, University College London, Department of Statistical Science, Gower Street, London, WC1E 6BT, United Kingdom.
Gary Mackenzie, University College London, National Medical Laser Centre, 67-73 Riding House Street, London, W1W 7EJ, United Kingdom.
Ben Clark, University College London, National Medical Laser Centre, 67-73 Riding House Street, London, W1W 7EJ, United Kingdom.
Jason M. Dunn, University College London, National Medical Laser Centre, 67-73 Riding House Street, London, W1W 7EJ, United Kingdom.
Irving J. Bigio, Boston University, Departments of Biomedical Engineering and Electrical and Computer Engineering, 44 Cummington Street, Boston, Massachusetts 02215.
Stephen G. Bown, University College London, National Medical Laser Centre, 67-73 Riding House Street, London, W1W 7EJ, United Kingdom.
Laurence B. Lovat, University College London, National Medical Laser Centre, 67-73 Riding House Street, London, W1W 7EJ, United Kingdom.