Esophageal cancer consists of two major histologic types: esophageal squamous cell carcinoma (ESCC) predominant globally and esophageal adenocarcinoma (EAC) with a higher incidence in westernized countries. Five-year overall survival is 15%. Clinical trials frequently combine histologies although they are different diseases with distinct origins. In the evolving era of personalized medicine and targeted therapies, we hypothesized that ESCC and EAC have genomic differences important for developing new therapeutic strategies for esophageal cancer.
We explored DNA copy number abnormalities (CNAs) in 70 ESCCs with publicly available array data and 189 EACs from our group. All data were from Affymetrix single nucleotide polymorphism (SNP) arrays. Analysis was performed with Nexus 5.0 Copy number software using a SNPRank segmentation algorithm. Log ratio thresholds for copy number gain and loss were set at ±0.2 (approximately 2.3 and 1.7 copies respectively).
ESCC and EAC genomes showed some CNAs with similar frequencies (e.g., CDKN2A, EGFR, KRAS, MYC, CDK6, MET) but also many CNAs with different frequencies between histologies, most of which were amplification events. Some of these regions harbor genes for which targeted therapies are currently available (VEGFA, ERBB2) or for which agents are in clinical trials (PIK3CA, FGFR1). Other regions contain putative oncogenes that may be targeted in the future.
Using SNP arrays we compared genomic abnormalities in a large cohort of EAC and ESCC. We report here the similar and different frequencies of CNAs in ESCC and EAC. These results may allow development of histology-specific therapeutic agents for esophageal cancer.
Esophageal cancer, the eighth most common cancer in the world, is composed of two main histologic types: squamous cell carcinoma (ESCC) and adenocarcinoma (EAC). The two tumors have five-year overall survival rates averaging 15%, and are often included together in therapeutic and prognostic clinical studies [2;3]. The histologies may share a poor outcome with current therapeutic strategies, but they differ in causality, cell of origin and epidemiologic distribution. Risk factors for ESCC, the predominant type worldwide, include smoking and alcohol abuse, both inflammatory insults to the esophagus, while EAC is associated with obesity and gastroesophageal reflux disease (GERD) [5;6]. These two malignancies arise from different cell populations: ESCC from squamous epithelium and EAC from intestinal metaplastic epithelial cells or Barrett’s esophagus. They should not necessarily be treated as the same disease in clinical trials and development of therapies.
With a shared poor outcome but distinct biologic, epidemiologic and demographic differences, these tumors are not the same. In the evolving era of personalized medicine and targeted therapies, we hypothesized that ESCC and EAC would have genomic differences important for developing new therapeutic strategies for esophageal cancer. Smaller studies have used different techniques to compare DNA copy number abnormalities in ESCC and EAC, but our study is by far the largest and has the highest resolution to date, allowing a more precise comparison of similarities and differences between the genomes.
Affymetrix 250K Sty GeneChip data from 73 esophageal adenocarcinomas, published as part of Beroukhim et al., were obtained from the authors. Tumor data were normalized to a baseline reference file that was created from matched normals. Additionally, we included our own Affymetrix 6.0 SNP array data from 116 esophageal adenocarcinomas from patients treated at the University of Pittsburgh Medical Center from 2002–2008. The baseline reference file for this population was created from normal DNA obtained from the blood of 15 individuals from the same patient population. Genomic DNA was isolated from tumors and blood using the QiaAmp DNA Mini Kit (Qiagen, Valencia, CA). DNA concentration and purity were assessed by UV absorbance (NanoDrop 1000, Thermo Fisher Scientific, Waltham, MA) and DNA quality was assessed by gel electrophoresis prior to labeling and array hybridization. All patients provided informed consent and this research was approved by appropriate human research institutional review boards.
Raw copy number intensity files (.cel) for 70 ESCC samples from two studies [8;9] were obtained from the Gene Expression Omnibus (GEO). These included 30 samples genotyped with Affymetrix 500K (250K Nsp and 250K Sty) GeneChips by Hu and colleagues. For this analysis, we used only the 250K Sty data from Hu, in conjunction with another 40 samples (including 11 ESCC cell lines) genotyped with 250K Sty by Bass and colleagues. Each dataset was individually referenced to its respective baseline file. The baseline reference file for the Hu dataset was created using the matched normal data provided. A reference file to normalize the Bass data was created from a subset of 37 non-neoplastic (normal) samples from a larger cohort of 1140 samples that were used in the original work.
All individually normalized data were pooled into a single project and analyzed in Nexus 5.0 Copy number software (Biodiscovery, El Segundo, CA) using the SNPRank segmentation algorithm with a minimum of eight probesets and a significance threshold p-value of 10⁻⁶. Log ratio thresholds for copy number gain and loss were set at ±0.2 (approximately 2.3 and 1.7 copies respectively). Genome positions were mapped to NCBI Build 36.1 (hg18). We then compared the two cancer types using the comparison tool in Nexus 5.0 to guide us in identifying both common and different regions of aberrations. We also manually compared the genomes to identify other regions listed in Tables 1 and 2. A two-tailed Fisher’s exact test was performed in GraphPad software [http://www.graphpad.com/quickcalcs/contingency1.cfm] to determine whether there was a significant difference in the frequencies of copy number changes between the two cancer types.
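The correspondence between the ±0.2 log ratio thresholds and the quoted copy numbers can be checked with a short calculation. The sketch below assumes the log ratio is a base-2 ratio against a diploid reference and ignores tumor purity. The function names and the inclusive handling of the threshold are illustrative assumptions and are not taken from the Nexus software.

```python
# Illustrative sketch only: assumes log2 ratios against a diploid (2-copy)
# reference. Function names are hypothetical, not part of Nexus 5.0.

GAIN_THRESHOLD = 0.2    # log ratio threshold for gain, from the text
LOSS_THRESHOLD = -0.2   # log ratio threshold for loss, from the text


def log_ratio_to_copies(log_ratio: float, reference_copies: int = 2) -> float:
    """Convert a log2 ratio into an estimated copy number."""
    return reference_copies * 2.0 ** log_ratio


def call_segment(log_ratio: float) -> str:
    """Label a segment as gain, loss, or neutral.

    Treating a value exactly at the threshold as an event is an assumption.
    """
    if log_ratio >= GAIN_THRESHOLD:
        return "gain"
    if log_ratio <= LOSS_THRESHOLD:
        return "loss"
    return "neutral"


if __name__ == "__main__":
    print(round(log_ratio_to_copies(GAIN_THRESHOLD), 1))  # 2.3
    print(round(log_ratio_to_copies(LOSS_THRESHOLD), 1))  # 1.7
    print(call_segment(0.35), call_segment(-0.5), call_segment(0.05))
```

Under these assumptions, 2 × 2^0.2 ≈ 2.30 and 2 × 2^−0.2 ≈ 1.74, which agrees with the approximate values of 2.3 and 1.7 copies given above.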
Analyses of copy number data from 70 ESCC and 189 EAC samples resulted in at least 18 regions of gain and 14 regions of loss that occur in both ESCC and EAC at similar frequencies (p>0.05 by Fisher’s exact test) (Table 1). These changes include previously reported cancer loci including gains [VEGFA (6p), EGFR (7p), CDK6 (7q21), MET (7q31), KRAS (12p), ERBB2 (17q12)] and losses [FHIT (3p), CSMD1 (8p), and SMAD4 (18q)]. We also observed other regions of gain on 1p, 1q, 6p, 7q22, 8p, 10q, 11q, 12p11, 13q, 15q, 19q, and 20q, and of loss on 4p, 4q, 5q, 6q, 9p, 11p, 12q, 18q, 20p, 21q, and 22q, in both ESCC and EAC. Some of these regions have been previously identified in a similar pattern (gain or loss) in two studies that compared ESCC and EAC but with smaller cohort sizes and different hybridization techniques [11;12] (Table 1). Various putative target genes within these regions have been previously described by other studies [8–12] (Table 1).
In addition to regions with copy number aberrations at similar frequencies, we identified 17 regions of copy number (CN) gains and 13 regions of CN losses with a significant difference in their frequencies (p < 0.05 by Fisher’s exact test) (Table 2). Eleven of the 17 (65%) regions displayed higher frequencies of gain in ESCC and some of these regions harbored known cancer-associated genes such as SOX2, PIK3CA (ESCC=60% vs EAC=15%, p=0.0001), MYC (58% vs 38%, p=0.0046), and CCND1, ORAOV1 (59% vs 18%, p=0.0001) (Table 2). Similarly, 8/13 (62%) of the copy number loss regions were observed at significantly higher frequencies in ESCC in comparison to EAC. Genes in some of these regions include known cell cycle regulatory genes such as CDKN2A/B (57% vs 37%, p=0.0046) and ATM (24% vs 7%, p=0.0003). In addition, we observed higher-frequency gains in ESCC at 2q (16% vs 5%, p=0.0069), 5p (28% vs 8%, p=0.0001), 8p (21% vs 9%, p=0.01), 14q (35% vs 4%, p=0.0001), 17q11 (19% vs 9%, p=0.047), 17q25 (21% vs 10%, p=0.022), and 22q (10% vs 3%, p=0.04). Higher-frequency losses in ESCC were seen at 1p (~10% vs 3%, p<0.05), 2q (16% vs 3%, p=0.0009), and 3p (43% vs 15%, p=0.0001). Putative target genes in many of these regions have been previously pointed out in Beroukhim et al. and include cancer susceptibility genes such as TERT (5p gain), FGFR1 (8p12 gain), NKX2-1, BCL2L2, PAX9 (14q gain), CRKL (22q gain), ING5 (2q loss), and CHL1 (3p loss) (Table 2).
On the other hand, the EAC genome displayed six CN gains and five CN losses that were more frequent in EAC than in ESCC. These events affected cancer-associated loci, with gains at MYB (EAC=11% vs ESCC=3%, p=0.047), TARP (56% vs 18%, p=0.0001), GATA4 (20% vs 4%, p=0.001), and GATA6 (30% vs 4%, p=0.0001), and losses targeting SFN at 1p (8% vs 0%, p=0.01), GMDS (17% vs 4%, p=0.007), WWOX (32% vs 4%, p=0.0001) and TP53 (19% vs 2%, p=0.0001) (Table 2). We also observed higher-frequency gains at 9p (13% vs 4%, p=0.04), which contains putative cancer loci such as CA9 [13–15] and TLN1 [16;17].
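The frequency comparisons above rest on a two-tailed Fisher’s exact test on a 2×2 table (histology by presence or absence of the event). The sketch below is a minimal standard-library implementation of that test, not the GraphPad calculator used in this study. The example counts (42 of 70 ESCC and 28 of 189 EAC with 3q gain) are back-calculated from the rounded percentages 60% and 15% and are therefore approximate, hypothetical inputs.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the sum of the hypergeometric probabilities of all tables
    with the same margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


if __name__ == "__main__":
    # Rows: ESCC, EAC. Columns: with gain, without gain.
    # Counts are approximations derived from rounded percentages.
    p_value = fisher_exact_two_sided(42, 70 - 42, 28, 189 - 28)
    print(f"3q gain, ESCC vs EAC: p = {p_value:.2e}")
```

With these approximate counts the p-value falls well below 0.05, in line with the significant difference reported for this region. Because the inputs are reconstructed from percentages, the exact value should not be read as a reproduction of the published p-values.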
Interestingly, we observed two loci that displayed an opposite pattern of copy number changes in the two cancers (indicated by ‘*’ in Figure 1 and Table 2). The 13q region has been previously identified by Weiss and Rumiato [11;12]. In our analysis, the 13q region was lost in 20% of ESCC and harbors the tumor suppressor gene BRCA2 in addition to other candidate tumor suppressor genes including FOXO1 and STARD13/DLC2 [18–20]. However, this region is amplified in 17% of EAC and also harbors the putative oncogenes ELF1 [21–23] and KLF5 [24;25]. Similarly, we observed 11% loss in EAC vs 11% gain in ESCC on chromosome arm 19p. This region contains approximately 600 genes, but two genes, ZNF492 and ZNF99, have been proposed as candidate genes within this region.
We have performed a comparative genomic analysis of the largest cohort of EAC and ESCC samples and at the highest resolution to date. We found considerable similarity between these two tumor types but also many focal regions of DNA amplification or loss that are more frequent in one histologic type than in the other. This confirms some findings from smaller studies reported previously [11;12] but our considerably larger sample size allows us to more precisely define regions, accurately determine event frequencies in the two histologies and identify significant differences between histologies. Our results demonstrate that while the genomes of EAC and ESCC are similar in many ways, there are distinct differences that may lead to biomarkers or therapeutic strategies unique to, or more effective in, one tumor type versus the other.
Genes of interest that were amplified in both histologies at similar frequencies include MCL1, EGFR, CDK6, SMURF1, KRAS, ERBB2, CCNE1, VEGFA, MET and IGF1R. Of these, EGFR, ERBB2, VEGFA and MET are the targets of currently available therapeutic agents, some of which have been tested or are in clinical trials for locally advanced and metastatic esophageal cancer for both histologies. For example, studies have been conducted with EGFR inhibitors in esophageal cancer and the results showed 8% partial responses in patients with advanced ESCC and no significant benefit for patients with EAC. Among the other genes, CDK6 and IGF1R are currently being explored as potential targets for esophageal cancer therapy. CDK6 in particular is interesting as we have recently reported that amplification of this region is associated with poor survival in EAC. We further demonstrated that expression of CDK6 was associated with amplification and was an even stronger prognostic factor than gene amplification. We also showed that small interfering RNA knockdown, or inhibition with the CDK4/6 inhibitor PD-0332991 (Pfizer, New York, NY), resulted in reduced proliferation and anchorage-independent growth in EAC cell lines. Thus, CDK6 may provide a novel therapeutic target for EAC. Our current study indicates that CDK6 inhibitors should also be assessed in ESCC.
Genes with different amplification frequencies between ESCC and EAC include SOX2, PIK3CA, MYC, CCND1, FGFR1, GATA4 and GATA6. One of the most striking differences observed is amplification of 3q (60% of ESCC versus 15% of EAC), which contains both SOX2 and PIK3CA. SOX2 has clearly been implicated as a lineage-specific oncogene for squamous cell tumors, but the close proximity and possible overexpression of PIK3CA in ESCC should not be ignored, particularly as this pathway is the target of many new therapeutic agents. Also striking is the differential amplification of the transcription factors GATA4 at 8p23.1 and GATA6 at 18q11.2. These very focal amplification events have been identified previously in EAC, and Lin et al. also demonstrated that GATA4 is over-expressed in association with amplification in EAC. The GATA factors are zinc finger DNA binding proteins that control the development of diverse tissues by activating or repressing transcription. Specifically, the GATA transcription factors coordinate cellular maturation with proliferation arrest and cell survival, so it is not surprising that they have been implicated in cancer. Targeting of transcription factors is not currently a practical therapeutic approach but this may change in the future if microRNA-based therapeutics live up to their considerable promise [29;30]. In this case, the GATA transcription factors would be strong potential targets for EAC but less likely targets for ESCC because of the low frequency of amplification.
In addition to regions with well-defined tumor-associated genes, our study also identified several genomic events that occur at different frequencies but for which no known target genes have been confirmed. These include 13q12.2, which is deleted in 20% of ESCC samples but amplified in 17% of the EAC tumors. 13q12.2 contains several interesting genes that could act as putative oncogenes, including FGF9 and KLF5, but also contains the tumor suppressor BRCA2 and FOXO1, a member of the FOXO tumor suppressor family. Among the putative oncogenes, Kruppel-like factor 5 (KLF5) may be important in esophageal cancer because KLF5 is a cell growth mediator in various epithelial cells and has been implicated in intestinal tumorigenesis, where KLF5 was critical in modulating intestinal tumor initiation and progression. Also, expression of KLF5 correlates with cell proliferation in breast cancer and is a poor prognostic factor, with increased expression linked with shorter disease-free and overall survival.
Another dramatic difference between EAC and ESCC is the amplification of chromosome 14, which occurred in 35% of ESCC cases but only 4% of EAC. While many of these events are whole chromosome gains, there does appear to be a focus on the region containing the gene PAX9. This gene is a member of the paired box transcription factor family, which regulates the expression of genes involved in cell proliferation, apoptotic resistance, and cell migration. PAX9 has also been implicated in the development of stratified squamous epithelia. PAX9 has been shown to be a lung cell lineage oncogene, but its role in esophageal cancer development and prognosis is not yet established. Given the large disparity in amplification frequency observed in our study, it is possible that PAX9 represents an oncogene specific to the squamous histologic type of esophageal cancer. Finally, other areas of differential amplification include 19p, which was amplified in 11% of ESCC but lost in 11% of EAC and contains ZNF492, another putative cancer gene. In addition, amplification of 2q31-q33 is more prevalent in ESCC, but this region contains approximately 160 genes with no clear candidate driver.
In summary, we have used high resolution SNP array analysis to determine the frequencies of DNA copy number gains and losses across the entire genome of 70 ESCC and 189 EAC samples. We have found known targets previously reported to be amplified in smaller cohorts, thus corroborating other results, and we have identified new regions that represent potential therapeutic targets, prognostic biomarkers and aberrations of interest for esophageal cancer studies. To our knowledge, this is the first study to directly compare the genomes of ESCC and EAC at this high resolution and with a large enough sample size to enable meaningful comparisons. Our study has identified several specific changes that differ between histologies, including some where therapeutic agents are already available for the putative driver genes.
Moving forward, these data show that EAC and ESCC should not be grouped together when identifying, developing and testing novel therapeutic targets for esophageal cancer. Rather, we conclude that genomic analysis of ESCC and EAC allows for differential selection and potential development of histology-specific targeted therapies for this poor-prognosis disease. Similar studies comparing the same tumor histologies across different organ sites could also be informative in identifying common or site-specific therapeutic strategies.
Presented at the Fifty-eighth Annual Meeting of the Southern Thoracic Surgical Association, San Antonio, TX, November 9-12, 2011.