Chromosome copy gain, loss and LOH involving most chromosomes have been reported in many cancers, but less is known about chromosome instability in premalignant conditions. 17p LOH and DNA content abnormalities have been previously reported to predict progression from Barrett’s esophagus (BE) to esophageal adenocarcinoma (EA). Here, we evaluated genome-wide chromosomal instability in multiple stages of BE and EA in whole biopsies.
42 patients were selected to represent different stages of progression from BE to EA. Whole BE or EA biopsies were minced, aliquots processed for flow cytometry and genotyped with a paired constitutive control for each patient using 33,423 SNPs.
Copy gains, losses, and LOH increased in frequency and size between early and late stage BE (p<0.001), with SNP abnormalities increasing from <2% to >30% in early and late stages, respectively. A set of statistically significant events were unique to either early, late or both stages, including previously reported and novel abnormalities. The total number of SNP alterations was highly correlated with DNA content aneuploidy and was sensitive and specific to identify patients with concurrent EA (empirical ROC AUC=0.91).
With the exception of 9p LOH, most copy gains, losses, and LOH detected in early stages of BE were smaller than those detected in later stages, and few chromosomal events were common in all stages of progression. Measures of chromosomal instability can be quantified in whole biopsies using SNP-based genotyping and have potential to be an integrated platform for cancer risk stratification in BE.
Genomic instability has been hypothesized to play a crucial role in development of cancer for decades(1), and there is substantial evidence to support this hypothesis. Numerous studies have investigated genome-wide copy number changes and LOH(2–5) in EA and other cancers, including renal cell(6), colon(7, 8), cervical(9), pancreas(10), lung(11, 12), and glioblastoma(13). These studies have reported that the cancer genome acquires extensive chromosomal changes during neoplastic progression from a normal cell to cancer. Many of these alterations confer a selective advantage for the cancers (“drivers”) relative to normal cells, but others appear to be hitchhikers that are evolutionarily neutral (“passengers”)(14). Few studies have utilized SNP-based technologies to evaluate genome-wide copy number and LOH in premalignant stages of neoplastic progression compared to those in cancer(15–17). Characterization of premalignant conditions is important to understand the biology of neoplastic progression as well as to identify potential targets for cancer prevention and to develop methods for cancer risk stratification or early diagnosis of an asymptomatic cancer.
The incidence of esophageal adenocarcinoma (EA) is rising more rapidly than any other cancer in the US(18, 19). EA is typically detected at an advanced stage in which it is rapidly fatal with mortality greater than 80%(20, 21). Barrett’s esophagus (BE) is a condition that develops in a subset of patients with chronic gastroesophageal reflux disease in which the normal squamous epithelium of the esophagus is replaced by intestinal metaplasia(22). Although BE is the only known precursor of EA, most patients with BE die of causes unrelated to BE or EA(23, 24). Further, recent evidence suggests that in most cases BE appears to be a successful adaptation to a harsh intraesophageal environment with chronic acid and bile reflux associated with erosion, ulceration, reactive oxygen species and oxidative damage(25–28). Thus, there is a need for biomarkers that can distinguish common, benign BE patients for whom surveillance can be safely prolonged, rare cases that will progress and therefore merit careful surveillance or intervention to prevent EA, and patients who require early detection because they have already developed a small EA that cannot be visualized endoscopically.
BE is one of the best models of human intraepithelial neoplasia because the premalignant epithelium can be safely visualized and biopsied so that genomic changes can be compared in different stages of neoplastic evolution and then studied longitudinally by endoscopic biopsy surveillance for EA risk management(29, 30). Genetic progression models and longitudinal studies of oral leukoplakia have also shown successes in using somatic genetic biomarkers for risk stratification in head and neck cancer(31–35). This is in contrast to many other premalignant conditions that are typically removed when detected, such as colonic adenomas, or are unable to be systematically sampled for biopsy examination over time because of limited accessibility or potential clinical complications. Studies of BE indicate that inactivation of CDKN2A by loss of heterozygosity (LOH), methylation and/or mutation is selected as an early event that predisposes to large clonal expansions in the BE segment(36–38). Inactivation of TP53 by mutation and LOH is subsequently selected, typically in a CDKN2A deficient background, and predisposes to progression to increased 4N fractions (G2/tetraploidy), aneuploidy and EA(14, 29, 32–34, 38–42).
We reported recently in a 10-year prospective cohort study that a panel of chromosome instability biomarkers, including LOH spanning the CDKN2A and TP53 loci on chromosome arms 9p and 17p, and DNA content abnormalities (increased 4N, and aneuploidy) identified a high-risk patient subset that had a 79% five-year cumulative incidence of EA and a low-risk subset with a 0% cumulative EA incidence to nearly eight years of follow-up after their baseline endoscopic biopsy surveillance(43). This and other studies support the hypothesis that progressive chromosomal instability and clonal evolution drive progression from a benign premalignant state to cancer(29, 32–34, 42, 44). However, it has been difficult to combine these biomarkers into a single platform for clinical use because various and complex sample processing methods are required, including flow cytometric cell sorting and multiple research assays for LOH detection. Further, regions of LOH evaluated in these studies were based on low density microsatellite polymorphisms(45–47). In addition, genome-wide studies of chromosomal instability in EA have identified common regions of copy gain, loss, copy neutral LOH, and a small number of homozygous deletions(2), but these events have not been quantified in early stages of progression in BE.
Here, we evaluate genome-wide copy gain, loss, LOH and assess SNP-based quantification of aneuploidy in a cross-sectional study of patients representing different stages of progression from early BE metaplasia to advanced EA using paired samples with Illumina multisample 33K SNP BeadChips. Selection of patients and biopsies in this study was guided by previous extensive characterization of the Seattle Barrett’s esophagus cohort, which allowed more precise estimation of risk based on molecular characterization and more endoscopic follow-up data than are typically available. Chromosome alterations including genome-wide copy gain, loss, and LOH were evaluated and compared across the progression stages using whole, unpurified biopsies. Overall genomic instability was quantified relative to DNA content aneuploidy in the same biopsy to assess feasibility of using SNP-arrays as a common platform for biomarkers of EA risk.
Participants were enrolled in the Seattle Barrett’s Esophagus Study, which was approved by the Human Subjects Division of the University of Washington in 1983 and renewed annually thereafter with reciprocity from the Fred Hutchinson Cancer Research Center (FHCRC) Institutional Review Board from 1993 to 2001. Since 2001, the study has been approved by the FHCRC IRB with reciprocity from the University of Washington Human Subjects Division.
Mucosal biopsies used in this cross sectional study were obtained from 42 patients using previously defined protocols for endoscopic mucosal biopsy and evaluation of surgical resection specimens(48, 49). A constitutive control sample from gastric tissue was collected for each of the 42 participants and used as the reference for paired LOH and copy number analysis. The 42 participants each had histologically diagnosed specialized intestinal metaplasia and represented four different stages of neoplastic progression in Barrett’s esophagus defined by
Whole biopsy specimens and gastric controls, including fresh/frozen endoscopic biopsies (n=48) or surgical resection specimens (n=36), were processed. For the BE and EA samples, whole endoscopic biopsies (n=24) or surgical resection specimens (n=18) were minced in the buffer used for preparing tissue for flow cytometry (146 mM NaCl, 10 mM Tris base (pH 7.5), 1 mM CaCl2, 0.5 mM MgSO4, 0.05% bovine serum albumin (BSA), 21 mM MgCl2, 0.2% Igepal CA-630 [Sigma I3021]) as previously described(50). Approximately 90 µl of unfixed nuclei were saturated with DAPI (5 µg/ml, Accurate Chemical, Westbury, NY) and evaluated for DNA content on a Cytopeia inFlux flow cytometer (Seattle, WA). Listmode files were analyzed using Multicycle (Phoenix Flow Systems, San Diego, CA) as described previously(51, 52).
DNA from the remaining minced tissue of each whole biopsy was extracted using the standard Puregene DNA Isolation Kit as recommended by the manufacturer (Gentra Systems, Inc., Minneapolis, MN), and quantitated with the Quant-iT™ PicoGreen® dsDNA Assay Kit following the manufacturer’s protocol (Invitrogen) on the CytoFluor II plate reader. DNA for SNP arrays was not purified by flow cytometry because this study is designed to identify abnormalities that ultimately may be assessed in whole biopsies without further purification in a clinical setting. Each sample was genotyped using an Illumina 12-sample Human Hap300_Pool10 Beadchip targeting 33,423 SNP loci (33K SNP array) from the HumanHap300 BeadChip and processed at Illumina using standard multi-sample Infinium protocols and reagents according to the manufacturer’s instructions. DNA (average 223.8 ng input) was amplified, enzymatically fragmented, precipitated, resuspended in hybridization buffer, denatured, and hybridized to the BeadChip overnight at 48°C. After the overnight incubation, the BeadChip was washed, primer extended, and stained on a Tecan Genesis/EVO robot using a Tecan GenePaint slide processing system. After staining, BeadChips were washed and coated, then scanned using Illumina’s BeadArray Reader. The probe intensity data were extracted and analyzed using Illumina's BeadStudio 3.1 software.
Genotyping data consist of two-channel intensity data corresponding to the two alleles. Data are generated as rectangular coordinates of the raw A versus raw B allele intensities. After normalization with BeadArray Reader, the genotyping data were transformed to a polar coordinate plot of normalized intensity R = Xnorm + Ynorm and allelic intensity ratio theta = (2/pi) × arctan(Ynorm/Xnorm), where Xnorm and Ynorm represent transformed normalized signals from alleles A and B for a particular locus. For each patient, a pair of normal gastric (reference) and BE/EA (subject) samples was processed in paired mode, using default reference clustering. The B allele frequency parameter, based upon the theta values, was calculated for each locus as described previously(53).
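The polar transformation described above can be illustrated with a minimal sketch. The function name `polar_transform` is hypothetical, and `math.atan2` is used in place of a plain arctangent of the ratio as an assumption, so that a locus with zero A-allele signal does not cause a division by zero; for non-negative intensities the two give the same value. This is not the BeadStudio implementation.

```python
import math


def polar_transform(x_norm, y_norm):
    """Return (R, theta) for normalized A (x) and B (y) allele intensities.

    R     = Xnorm + Ynorm                      (total normalized intensity)
    theta = (2/pi) * arctan(Ynorm / Xnorm)     (0 = all A signal, 1 = all B signal)
    """
    r = x_norm + y_norm
    # atan2 equals arctan(y/x) for non-negative inputs and tolerates x == 0.
    theta = (2.0 / math.pi) * math.atan2(y_norm, x_norm)
    return r, theta


# Idealized loci: homozygous AA, heterozygous AB, homozygous BB.
aa = polar_transform(1.0, 0.0)
ab = polar_transform(1.0, 1.0)
bb = polar_transform(0.0, 1.0)
```

Under these idealized intensities, theta places an AA locus at 0, a balanced AB locus at 0.5, and a BB locus at 1, which is the scale on which the B allele frequency is then estimated.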
We examined the MA plots (signal intensity vs. signal ratio plot) for each sample and found no bias. Paired constitutive controls for each patient were used to ensure that all abnormalities detected were somatic lesions and not constitutive copy number variations that could arise when using a single reference sample for all patients(54, 55). Paired sample analysis of BE or EA samples (defined as “sub”) paired with the control gastric sample (defined as “ref”) from each patient generated two parameters, the log-ratio log2(Rsub/Rref) and |dAlleleFreq sub-ref| that were used for copy number and LOH detection(53). The |dAlleleFreq sub-ref| represents the absolute value of the difference in allele frequency between the subject (BE or EA) and reference (paired constitutive control) samples. Since both samples are from the same patient, the difference of allele frequency at any given locus should be approximately zero except in regions exhibiting somatic chromosomal anomalies such as copy number differences or LOH. Two stages of data analysis were performed in this study. The first identified copy gain, copy loss, and LOH for each chromosome arm. The second tested whether or not the frequency of a gain, loss or LOH identified in a given segment along the chromosome arm reached statistical significance across patients.
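As an illustration of the two paired parameters, the following sketch computes log2(Rsub/Rref) and |dAlleleFreq sub-ref| for a single locus. The function name and the example values are hypothetical, and the examples assume a pure (uncontaminated) cell population with intensity proportional to copy number, which whole biopsies will generally not satisfy.

```python
import math


def paired_parameters(r_sub, r_ref, baf_sub, baf_ref):
    """Return (log-ratio, |dAlleleFreq sub-ref|) for one SNP locus.

    r_sub, r_ref     : normalized intensity R in subject and reference samples
    baf_sub, baf_ref : B allele frequency in subject and reference samples
    """
    log_ratio = math.log2(r_sub / r_ref)
    d_allele_freq = abs(baf_sub - baf_ref)
    return log_ratio, d_allele_freq


# Constitutive genotype AB (reference BAF 0.5) in each idealized example.
no_change = paired_parameters(1.0, 1.0, 0.5, 0.5)         # unchanged locus
copy_neutral_loh = paired_parameters(1.0, 1.0, 1.0, 0.5)  # AB -> BB, same copy number
one_copy_loss = paired_parameters(0.5, 1.0, 0.0, 0.5)     # AB -> A, half the signal
```

In this idealized setting, copy-neutral LOH leaves the log-ratio at zero while shifting the allele frequency difference, whereas a single-copy loss moves both parameters; this is why the two signals are evaluated together.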
Specifically, we defined the normal baseline copy number (2N) on each chromosome using |dAlleleFreq sub-ref|. After wavelet processing (see below) of |dAlleleFreq sub-ref|, regions with signal values including at least 10% contiguous SNPs on either chromosome arm with a |dAlleleFreq sub-ref| equal to zero were identified. These regions were used to set the baseline chromosome copy number (Log2(Rsub/Rref) was adjusted to zero, and the adjustment was applied specifically to that chromosome). If a contiguous region of 10% of SNPs could not be found on a chromosome, then no adjustment was made for a baseline copy number. A wavelet method was used to smooth both copy number Log2(Rsub/Rref) and |dAlleleFreq sub-ref| signal values(56, 57). Daubechies3 wavelets were used in this study, and the array signal was decomposed to the second level with the wavelets and reconstructed following dynamic thresholding (depending on the signal-to-noise ratio for that sample). For a specific constitutive genotype “AB”, after the wavelet processing described above (including thresholding), probes with Log2(Rsub/Rref) signal values that deviated from zero were identified as either copy gain (allelic imbalance, e.g., AAB, or balanced, e.g., AABB) or loss (deletion or homozygous deletion). Probes with |dAlleleFreq sub-ref| that deviated from zero after wavelet processing including thresholding were identified as LOH (allelic imbalance, e.g., A and AAB, or copy neutral LOH, e.g., AA). Infrequently, there were a few noncontiguous probe signal levels that were more than four standard deviations from the mean signal of a chromosome arm. Since some of those signal values could be smoothed out by wavelets, those signal values were included as lesions for that patient and subjected to consistency testing in the second stage of analysis. A Kruskal-Wallis test was used to compare the median number of lesions and sizes of lesions in each group (Figures 1 and 2).
For the size of events, we defined a gain, loss or LOH segment as continuous until there were at least two adjacent probes without that lesion. Thus, the boundaries of a lesion segment were defined by at least two or more probes without the lesion (Figure 2). Poisson regression was used to test the frequency of the lesion in each group (Figure 2).
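The segment definition above can be sketched as follows, assuming per-probe lesion calls are available as an ordered list of booleans along a chromosome arm. The function name `call_segments` and the example calls are hypothetical; the sketch only illustrates the boundary rule (a single probe without the lesion does not end a segment, two or more adjacent ones do).

```python
def call_segments(flags, min_gap=2):
    """Group per-probe lesion calls into segments.

    flags   : ordered sequence of truthy/falsy lesion calls, one per probe
    min_gap : number of adjacent lesion-free probes that ends a segment
    Returns a list of (first_probe_index, last_probe_index), inclusive.
    """
    segments = []
    start = None      # index of the first lesion probe in the open segment
    last_hit = None   # index of the most recent lesion probe
    for i, flag in enumerate(flags):
        if not flag:
            continue
        if start is None:
            start = i
        elif i - last_hit - 1 >= min_gap:
            # At least min_gap lesion-free probes separate the two hits.
            segments.append((start, last_hit))
            start = i
        last_hit = i
    if start is not None:
        segments.append((start, last_hit))
    return segments


# One lesion-free probe (index 3) is bridged; two (indices 5 and 6) are not.
example_calls = [0, 1, 1, 0, 1, 0, 0, 1, 0]
example_segments = call_segments(example_calls)
```

For the example calls, probes 1 through 4 form one segment and probe 7 forms a second, single-probe segment. Segment size in base pairs would then follow from the genomic positions of the first and last probes.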
In the second stage of analysis, the significance of frequencies of lesions across the patients was tested. Copy number gain, loss, and LOH were tested separately along the chromosome arms. Specifically, for each of the three types of lesions, every chromosome arm was divided into 0.5Mb windows, although the last window may be smaller than 0.5Mb. If one or more lesions were included in a window in the first stage of the analysis, then one event was counted for that window. If no lesion was called in the first stage, the window was given a value of zero. Fisher’s exact test was used to test the number of events across the patients of a comparison group. We compared the lesion frequency within each window in both the early and late stage separately, relative to the null hypothesis of no lesion in a window. The p-values of these comparisons were adjusted to have a false positive discovery number of 1 or less(58). Starting and ending positions of any given lesion on a chromosome may be different across multiple patients. Therefore, to test whether a lesion was common among patients, we used the window method as a practical approach, considering sample size and SNP informativity rates. The 0.5Mb window size was determined by considering the specific array probe density (coverage of the genome) and SNP informativity (heterozygosity) rate for LOH detection(59). Larger windows have a higher probability of containing an informative SNP, but the resolution for detection is lower, whereas a smaller window will have higher resolution but may not contain informative SNPs. Using an average SNP informativity rate of 0.26 in dbSNP and considering the multi-sample Illumina 33K BeadChip density, we selected a 0.5Mb window size in order to have 0.8 LOH detection power. We evaluated window sizes from 0.1–1.0 Mb and obtained essentially identical results. Empirical receiver operating characteristic (ROC) curves were generated according to previously described methods(60).
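The window scoring and the trade-off between window size and LOH detection power can be sketched as follows. Both function names are hypothetical. The power function rests on a simplifying assumption that is not stated in the text: each SNP in a window is independently informative with the given probability, and a single informative SNP suffices to detect LOH. It is an illustration of the reasoning, not necessarily the calculation used to choose the window size.

```python
import math

WINDOW_BP = 500_000  # 0.5 Mb windows, as in the text


def windows_with_lesion(lesions, arm_length_bp, window_bp=WINDOW_BP):
    """Score each window on a chromosome arm as 1 (any lesion) or 0 (none).

    lesions : list of (start_bp, end_bp) positions on the arm, inclusive
    The last window may be shorter than window_bp.
    """
    n_windows = math.ceil(arm_length_bp / window_bp)
    hits = [0] * n_windows
    for start, end in lesions:
        first = start // window_bp
        last = min(end // window_bp, n_windows - 1)
        for w in range(first, last + 1):
            hits[w] = 1  # one event per window, however many lesions overlap it
    return hits


def loh_detection_power(n_snps, informativity):
    """Probability that at least one of n_snps is informative (assumed independent)."""
    return 1.0 - (1.0 - informativity) ** n_snps


# A 1.2 Mb arm with one lesion spanning the first two windows.
example_hits = windows_with_lesion([(100_000, 600_000)], 1_200_000)
power_5 = loh_detection_power(5, 0.26)
power_6 = loh_detection_power(6, 0.26)
```

Under the independence assumption and an informativity rate of 0.26, five SNPs per window give a power just below 0.8 and six SNPs give a power above it, which shows why windows much smaller than the array's probe spacing allows would frequently contain no informative SNP.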
All analyses were carried out in Matlab (version 7.1, The Mathworks Inc.) and Statistical Analysis System (SAS version 9.1, SAS Institute Inc.).
The 42 pairs of biopsies evaluated in this study were obtained from 42 patients including 6 women and 36 men of mean age 64.3 years (range 43 to 93). Table 1 depicts the participant selection criteria with regard to molecular and pathology measures for each stage of progression. Patients were classified using previously evaluated flow-cytometrically purified biopsies (mean 9.64 biopsies per patient) for 17p LOH and TP53 mutation (at baseline only), and DNA content abnormalities by flow cytometry as described previously(29, 43).
Genome-wide chromosome instability was evaluated in patient samples representing the four different stages of neoplastic progression described in Table 1. Figure 1 shows the average percent of SNPs with copy gain or loss abnormalities, and percentage of informative SNPs with allelic imbalance due to copy gain, or loss of heterozygosity [LOH] including copy loss and copy neutral events using paired constitutive controls (see methods). The number of SNPs with copy gain, loss, and LOH shows a highly statistically significant trend of increasing number of altered SNPs across each of the four progression stages (p<0.001). In early stages of progression, very few SNPs have copy number or LOH abnormalities (Figure 1). In late stages of BE progression, the frequency of SNP alterations increased dramatically (Figure 1). Direct comparisons of the number of SNP probes with gain, loss and LOH between the early stages of progression (BE early vs. BE instability groups), and also between the late stages of BE (advanced BE and EA groups), did not reach the p=0.01 significance level; the statistical power of these tests is low because the sample size of each group is small.
To distinguish between a high number of small chromosomal events and a small number of larger events (for example, 100 regions each including 10 SNPs compared to 10 regions each including 100 SNPs), we assessed the size of altered chromosomal copy number regions. Figure 2 shows the distribution of copy gain and loss segment sizes in the four stages of BE progression.
There was a trend for increased copy gain segment size with progression stages (p<0.01). However, copy gain frequency and segment size did not reach statistical significance between early BE and BE instability stages, given the sample sizes of the comparison groups (p=0.65, p=0.70 respectively, Figure 2). Similarly, the frequency and segment size of copy gain also did not reach statistical significance at p=0.01 between advanced BE and EA stages (p=0.84, p=0.04, respectively). However, when the combined early stages (BE early and BE instability) were compared to the combined late stages (advanced BE and EA), the frequency and segment size of copy gain were both significantly different (p<0.001).
There was a statistically significant difference in the number of regions of copy loss between the BE early and BE instability groups (p=0.005, Figure 2). However, many of these regions were based on loss of a single SNP. Neither the total number of SNPs with loss (p=0.12, Figure 1) nor the median sizes of loss (p=0.20, Figure 2) were different in the two groups. Similarly, neither frequency nor segment size of copy loss reached a p=0.01 level of statistical significance between advanced BE and EA (p=0.04, p=0.23, respectively) given the sample size. However, when the combined early stages were compared to the combined late stages, frequency and segment size of copy loss were both significantly different (p<0.001).
SNP distribution patterns along the chromosome and the heterozygosity rate of specific SNPs affect the ability to call LOH segment sizes precisely(59). Therefore, we plotted the distribution of LOH using allele frequency differences between paired constitutive controls and BE/EA samples (|dAlleleFreq sub-ref|) for each informative probe displayed by chromosome in each sample (Figure 3).
Somatic copy gain, loss and LOH lesions might confer a selective advantage to the clone. Alternatively, a lesion may be the result of random, non-selected events arising during neoplastic evolution, especially in later stages of progression after inactivation of TP53. Therefore, we examined the frequency of each lesion across patients. The numbers of patients in each of the four stages are small and the results from Figures 1 and 2 indicate that BE early and BE instability groups have statistically similar magnitudes (albeit not the same) of copy gain, loss, and LOH frequencies and sizes relative to each other as compared with advanced BE and EA groups. Therefore, patients from the BE early and BE instability groups were combined into a single early stage group in order to get more power for lesion detection. Also, advanced BE and EA have similar (albeit not the same) magnitude of instability relative to each other as compared to the two early groups. Therefore, the advanced BE and EA patients were combined into a late stage group and evaluated separately. Statistically significant levels of chromosome copy change and LOH events among patients are shown for early stage (supplemental Figure 1A, 1B) and late stage (supplemental Figure 1C, 1D). A complete list of statistically significant abnormalities identified in each 0.5Mb window detected in early and late stages of progression is presented in the supplemental materials (Table S1 and Table S2).
We also evaluated the relationship between the significant alterations identified independently in genome-wide analysis of early (BE early and BE instability) and late (Advanced BE and EA) stages of progression. Figure 4 illustrates the relationships among the significant copy gain (Figure 4A), copy loss (4B) and LOH (4C) windows. Of the 56 0.5Mb windows identified with copy gain in late stages, only 2 of 56 (4%) also reached statistical significance in early stages of progression. Of the windows identified as reaching statistical significance in late stages, 22 of 350 (6%) copy loss and 71 of 878 (8%) LOH events also reached statistical significance in early stages of progression. The specific lesions and locations that were common in both early and late stages are shown in Table 2. For the patients in this study, the average rate of informative SNPs on this array is 0.33. Thus, there is a significant probability for small segments with copy loss to have no informative SNPs available for LOH analysis. The LOH events listed in Table 2 that did not reach statistical significance in the small regions of copy loss on chromosomes 1, 4, 10, 11, 12, 18, and 20 were mainly due to lack of informative SNPs.
DNA content abnormalities as measured by flow cytometry have been reported to be useful for cancer risk stratification in BE(48, 61, 62). To test the ability of a SNP platform to detect populations of cells with abnormal DNA content, we minced each biopsy, divided the homogenate in half, and processed half using the multi-sample Illumina 33K SNP arrays and the other half by DNA content flow cytometry using our standard protocol and analysis as described in methods. Multiple genetic mechanisms can generate allelic imbalance resulting in LOH measured by allele frequency differences between paired constitutive controls and BE/EA samples (|dAlleleFreq sub-ref|), including copy gain in a heterogeneous population, copy loss and copy neutral LOH(55, 63). Although small regions of LOH may not be detected if there are insufficient informative SNPs, genome-wide chromosomal instability arising in a population of cells would be detected by the LOH signal of a SNP-based platform, with the exception of balanced copy number alterations such as a pure tetraploid population or pure population with homozygous deletions. We compared genome-wide LOH results obtained in unpurified biopsies to DNA content flow cytometry for aneuploidy detection in the same biopsies (Table 3). Aneuploidy was detected by flow cytometry in 11 of 42 biopsies. A threshold of ≥ 1000 SNPs with LOH throughout the genome (gw-LOH), which represents about 9% of the average number of informative heterozygous SNPs on the 33K platform, provided the optimal sensitivity and specificity for discriminating flow-cytometric diploid and aneuploid samples. Ten of the 11 (~91%) aneuploid populations that were detected by flow cytometry had ≥ 1000 SNPs with LOH. The one exception was the only near-diploid aneuploid sample in the study. This sample had a flow-cytometric DNA content of 2.3N. Patients with near-diploid aneuploid populations are at lower risk of progressing to EA(61).
There were 7 samples with ≥ 1000 SNPs with LOH but which were diploid by flow cytometry. Six of seven samples were from patients with EA or advanced BE. Thus, gw-LOH has potential to be a surrogate for flow cytometric DNA content aneuploidy.
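The counts reported above can be turned into sensitivity and specificity of the ≥ 1000 SNP gw-LOH threshold, treating flow-cytometric aneuploidy as the reference standard. The function name is hypothetical. The number of diploid samples below the threshold (24) is not stated directly; it is derived here by subtraction from the reported totals (42 biopsies, 11 aneuploid, 7 diploid samples at or above the threshold).

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from a 2x2 classification table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


total_biopsies = 42
aneuploid = 11            # aneuploid by flow cytometry
aneuploid_high_loh = 10   # aneuploid and >= 1000 SNPs with LOH
diploid_high_loh = 7      # diploid by flow cytometry but >= 1000 SNPs with LOH

diploid = total_biopsies - aneuploid        # derived: 31
diploid_low_loh = diploid - diploid_high_loh  # derived: 24

sens, spec = sensitivity_specificity(
    tp=aneuploid_high_loh,
    fn=aneuploid - aneuploid_high_loh,
    fp=diploid_high_loh,
    tn=diploid_low_loh,
)
```

This gives a sensitivity of 10/11 and a specificity of 24/31 relative to flow cytometry. Because six of the seven discordant samples came from patients with EA or advanced BE, specificity measured against flow cytometry alone may understate how well gw-LOH tracks stage of progression.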
We also evaluated the sensitivity and specificity of genome wide LOH in whole biopsies using the 33k array for distinguishing patients with or without concurrent cancer. The total number of informative SNPs with LOH (LOHTotal) across all 22 chromosomes was summed for each patient. The mean LOHTotal for early stages (BE early and BE-instability) was 456 (standard error = 123) compared to 3445 (standard error = 587) for advanced stages (advanced BE and EA). If LOHTotal was used to distinguish between early and late stages, we obtained an AUC of 0.91 as shown in the empirical ROC curves with the raw data in Figure 5. The results were essentially identical using paired samples for copy gain and loss.
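An empirical AUC of this kind can be computed directly from per-patient LOHTotal values without fitting a curve. The sketch below uses the pairwise-comparison (Mann-Whitney) formulation, which equals the area under the empirical ROC curve; it is offered as an illustration and may differ in detail from the cited method(60). The function name and the LOHTotal values are hypothetical and are not the study data.

```python
def empirical_auc(negatives, positives):
    """Area under the empirical ROC curve.

    Equals the fraction of (positive, negative) pairs in which the positive
    sample has the higher score, counting ties as one half.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical LOHTotal values, for illustration only.
early_stage = [120, 300, 456, 900, 1500]
late_stage = [800, 2500, 3445, 5000]
auc = empirical_auc(early_stage, late_stage)
```

With these made-up values, 18 of the 20 early/late pairs are ordered correctly, giving an AUC of 0.9. An AUC of 0.5 would indicate no discrimination and 1.0 perfect separation of the two groups.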
Finally, we compared genome wide LOH measurements to DNA content flow cytometric aneuploidy for sample classification of early and late stages of progression (Table 4). The results are promising for SNP-based assessment of aneuploidy for cancer risk in whole, unpurified biopsies.
A previous chromosome instability biomarker panel for cancer risk stratification in Barrett’s esophagus required Ki67/DNA content multi-parameter flow cytometric cell sorting for enrichment of proliferating epithelial cells, a panel of STR polymorphisms on chromosomes 9 and 17 for LOH analysis, and DNA content flow cytometric assessment of tetraploidy and aneuploidy (43). Although this chromosome instability panel accurately identified patients at high- and low risk for progression to EA, it required a constellation of technologies that are difficult to perform outside of research centers. Here, we report that 9p and 17p LOH could be detected in whole, unprocessed biopsies, and that genome-wide SNP based measures of LOH and copy number performed as well or better than DNA content flow cytometry for detection of aneuploidy, providing a significant advance towards validating the biomarkers in a clinically compatible, SNP DNA array platform. A similar platform might also be useful in assessing cancer risk in oral leukoplakia, based on the previous panel of 9p and 3p LOH, TP53 abnormalities and chromosome polysomy (43). We used SNP DNA arrays on neoplastic and paired constitutive control samples from each individual to provide better resolution for smaller genomic events, improve signal-to-noise ratios, and differentiate constitutive copy number variation from somatic genetic events that arise during neoplastic progression. Since DNA arrays can measure LOH, copy number, and known mutations as well as DNA methylation, in which bisulfite treatment produces a C→T SNP at CpG sites, this technology may provide a common platform to compare these biomarkers in validation studies to efficiently select those markers most appropriate for patient management.
In this study, we demonstrated feasibility for SNP-based assessment of somatic copy number gain, loss and LOH in patients at different stages of neoplastic progression as well as a genome-wide measure of aneuploidy. Validation of a common biomarker platform as well as validation of reliable markers for risk stratification in BE will require one or more large cohort studies. The ideal cohort study would include patients at all levels of EA risk and the biomarker(s) would be evaluated on prospectively collected samples. Testing of multiple biopsies, for example, one biopsy every two cm, from each endoscopic procedure for a given patient will likely increase reliability because neoplastic clones occupy variably sized regions of a Barrett’s segment. Well designed cohort studies using a common platform to detect LOH, copy change, and DNA methylation abnormalities should be able to identify a small set of unbiased, reliable markers for BE risk stratification. Validated biomarkers from the cohort studies could be translated into a less expansive, clinically compatible SNP analysis platform, such as custom GoldenGate, VeraCode BeadExpress, or Pyrosequencing assays, for routine, large-scale sample testing in the clinical setting. New tools such as massively parallel sequencing technologies are likely to significantly advance our understanding of the underlying somatic-genetic mechanisms in cancer, including detection of tandem duplications, balanced translocations or inversions, and discovery of as yet unidentified critical genes. However, these technologies are not yet feasible for large scale cohort studies or clinical application.
Using paired constitutive controls, we consistently detected recurrent copy gains, losses and LOH in unpurified biopsies even in early benign BE that showed no evidence of progression to EA for a mean of 10 years. At advanced stages of progression, the neoplastic genome contained frequent and extensive regions of copy gain, loss and LOH, consistent with a recent report characterizing EA using high-density SNP arrays(2). To characterize chromosomal instability during neoplastic progression and identify regions that may contain genes useful for cancer prevention, cancer risk stratification and early detection, we examined chromosomal abnormalities found only in patients at early stages of progression, those common in patients at both early and late stages and those only detected in patients at late stages of progression. Only a small number of abnormalities were frequent in early stages of progression, and with the single exception of chromosome arm 9p, the sizes of chromosome abnormalities were small at early stages compared to late stages. Of the significant 0.5Mb windows with copy gain, loss and LOH identified in late stages of progression, only 4%, 6%, and 8%, respectively, also reached statistical significance in early stages of progression. Chromosome arm 9p was unusual in the sense that it acquired large regions of copy loss and LOH spanning most of the arm at early stages of progression. Within these large 9p LOH events we found three statistically significant, distinct regions of copy loss spanning 9.0–12.1, 20.5–25.0, and 28.5–30.5 Mb, with the most frequent being potential homozygous deletions at the CDKN2A/ARF locus, which have been reported previously in multiple cancer types(2, 64–67). CDKN2A abnormalities have been associated with large clonal expansions in BE as well as other conditions(14, 36, 37, 68), and may be selected in BE as part of an adaptation to a harsh acid reflux environment(25–28).
Additionally, chromosome 9 is highly structurally polymorphic with more than 1000 annotated genes and has evolved many intra- and interchromosomal duplications(69, 70). Two other common abnormalities were significant in early stage BE: copy loss spanning the FHIT (3p, 59.8 to 60.6 Mb) and WWOX (16q, 77.3 Mb) loci, detected in paired samples in 46% and 24% of early stage patients, respectively. These regions have been extensively studied in EA and other cancer types(71–82), and recently Nancarrow et al. reported these regions as the two most common regions of homozygous deletions in EA, detected in 74% and 35%, respectively(2). This is consistent with the patients with advanced BE and EA in the present study, in whom we detected FHIT and WWOX copy loss in 78% and 56%, respectively, of the late stage biopsies. Other relatively small lesions that contain multiple genes reached statistical significance among patients at early stages, such as copy gain on 8q24.3 and 10q22.1.
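The comparison of significant 0.5 Mb windows between early and late stages described above can be illustrated with a minimal sketch. It assumes that each significant window is represented as a (chromosome arm, window index) pair; the function names `window_index` and `shared_fraction` and the toy window sets are hypothetical and are not taken from the study's analysis code or data.

```python
def window_index(position_mb, size_mb=0.5):
    """Map a genomic position in Mb to the index of its fixed-size window."""
    return int(position_mb // size_mb)


def shared_fraction(late_windows, early_windows):
    """Fraction of late-stage significant windows that are also
    significant in early stages (0.0 if there are no late-stage windows)."""
    late = set(late_windows)
    if not late:
        return 0.0
    return len(late & set(early_windows)) / len(late)


# Hypothetical toy data for illustration only, not study results.
late = {
    ("9p", window_index(21.9)),
    ("3p", window_index(60.0)),
    ("17p", window_index(7.5)),
    ("18q", window_index(48.0)),
}
early = {
    ("9p", window_index(21.9)),
    ("8q", window_index(145.0)),
}

print(shared_fraction(late, early))
```

Under these assumptions, the same calculation would be run separately for copy gain, copy loss and LOH, giving one percentage per event type.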
The paradox of specific chromosomal abnormalities that appear to be selected both in early and late stages of progression is that these events occur too frequently in early stages to be sufficient for development of EA because the rate of progression to EA is very low in most persons with BE(43, 48, 61, 83–86). This suggests several possibilities. First, these early, high-frequency events might be necessary but not sufficient for progression. Second, the frequent, early events might be selected as part of the adaptation to gastroesophageal reflux, but have relatively little involvement in progression to EA, and could appear in late stages as hitchhikers (“passengers”) on selected events that drive progression to EA. Third, they might represent regions susceptible to chromosome damage such as fragile sites(87) that could then hitchhike on selected abnormalities such as an epigenetically modified progenitor cell population(88). Regardless of mechanism, patients with early stage BE appear to have a low risk of progressing to, or dying of, EA(23, 24, 43, 48, 61, 83–86, 89, 90). Effective identification of high risk patients will be required to guide cancer prevention strategies in BE, focusing on patients who may benefit from the intervention because their risk of cancer is high, and reassuring low-risk patients who are unlikely to benefit from an intervention targeting only the BE mucosa.
We also observed frequent loss or LOH involving whole chromosomes or chromosome arms in late stages of progression, including 3p, 5, 9p, 13, 17p and 18q (Figures 1 to 3 and supplemental Figure 1, Table S2), many of which have recently been reported and reviewed(55), as well as in earlier low density STR allelotypes(45–47, 91). Large recurring regions of copy gain, loss and LOH may provide reproducible assessment of aneuploidy by higher density SNP platforms, and this could be combined with an array assessment of more localized regions of copy gain, loss or LOH in a common platform to assess chromosomal instability for risk stratification in BE and other conditions. We also confirmed 17p status in the SNP arrays compared to previous STR results in 38 of 42 patients (90%), even though we did not use the same biopsies in the current study. At least two of the discordant cases could be shown to be due to sampling limitations because the samples also lacked aneuploid populations that were present in the original biopsies (data not shown).
Our study has limitations, including the relatively small sample size in each of the participant categories, and validation of our results will require large cohort studies. However, our results provide important insight into the level of chromosomal instability relative to stages of neoplastic progression in BE. In addition, supervised statistical methods could be used for sample classification, but we chose not to prematurely exclude or include markers from the present study because of the limited sample sizes and probe densities. Further, although the current SNP platforms are potentially technically limited in detecting a perfect 4N tetraploid population, a large, pure tetraploid clone may be rare in an evolving neoplastic population of cells. Despite these potential limitations, this study succeeded in proof of concept that LOH, copy number and aneuploidy can be assessed in unprocessed whole biopsies from Barrett’s esophagus using a SNP array. By assessing chromosome instability in premalignant and malignant stages of neoplastic progression, the results can guide future hypothesis-driven research to better evaluate genomic abnormalities for cancer risk stratification, early detection and identification of candidate targets for cancer prevention.
In summary, our results indicate that SNP array assessment of genome-wide chromosomal changes that develop during neoplastic progression shows promise for improved EA risk stratification in patients with BE. This hypothesis, as well as others based on DNA methylation, can be efficiently tested in cohort studies using prospectively accumulated biospecimens (EDRN phase 4)(92). This single-platform assessment of chromosome instability biomarkers might also be used to identify patients at high risk of developing EA for cancer prevention trials.
This study was funded by NIH P01CA91955 and CCM was partially supported by the Commonwealth Universal Research Enhancement Program at the Pennsylvania Department of Health and the Pew Charitable Trust.
We thank David Cowan for database management, Christine Karlsen for patient care coordination, and Valerie Cerera for flow cytometry. Illumina, BeadArray, and Infinium are registered trademarks or trademarks of Illumina, Inc.