Samples evaluated in each phase of the study
For the initial massively parallel sequencing phase, 23 fresh-frozen primary tumors (12 ESCCs and 11 EACs) were evaluated at all coding exon positions represented by the SureSelect capture approach. For this study, only non-synonymous mutations were considered. From these data, 255 high-quality mutations in EACs and 95 in ESCCs were chosen for validation by Sanger sequencing of the mutated genes in the same 23 tumors. These mutations were chosen to validate our mutation calling as follows: in EAC, the 8 TP53 mutations, all 117 mutations found in ESO01T and ESO10T, and 130 randomly selected mutations from other samples were subjected to Sanger sequencing; in ESCC, all 28 mutations in TP53, NOTCH1, NOTCH3, FBXW7, KIF16B, KIF21B and MYCBP2, and 67 randomly chosen mutations, were queried with Sanger sequencing. After this methodological validation step, a set of 8 genes (TP53, NOTCH1, NOTCH2, NOTCH3, FBXW7, KIF16B, KIF21B and MYCBP2: the only genes mutated in at least 3 tumors in the ESCC discovery screen) was chosen for “scale-up” Sanger sequencing of all coding exons in a larger, separate cohort comprising 41 fresh-frozen North American ESCCs. Because we were aware of preliminary findings from a parallel exome sequencing study being carried out in a larger cohort of EACs (Adam Bass, personal communication), we did not perform scale-up sequencing in any additional EACs. An additional cohort of 48 fresh-frozen Chinese ESCCs was also examined by Sanger sequencing of all coding exons in TP53, NOTCH1, NOTCH2, NOTCH3, and FBXW7. Finally, in the two patients from whom adequate high-quality DNA was available from matched BE epithelium, all 78 genes confirmed as mutated in ESO01T and all 39 genes confirmed as mutated in ESO10T were Sanger sequenced in the matching benign BE tissues.
Although it would have been preferable to study multiple anatomic locations of BE within each patient, this was not possible in these cases because biopsy material from only one site was available from each patient.
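The composition of the two validation sets can be tallied directly from the counts given above. The short sketch below only restates those counts; the dictionary keys are illustrative labels, not terms from the study.

```python
# Counts of mutations sent for Sanger validation, as stated in the text.
# The key names are illustrative labels chosen for this sketch.
validation_sets = {
    "EAC": {
        "TP53": 8,
        "ESO01T and ESO10T": 117,
        "randomly selected from other samples": 130,
    },
    "ESCC": {
        "TP53, NOTCH1, NOTCH3, FBXW7, KIF16B, KIF21B, MYCBP2": 28,
        "randomly selected": 67,
    },
}

# Sum the components of each validation set.
totals = {subtype: sum(parts.values()) for subtype, parts in validation_sets.items()}

for subtype, total in totals.items():
    print(f"{subtype}: {total} mutations selected for validation")
```

The totals agree with the 255 EAC and 95 ESCC mutations reported for the validation step.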
Patient characteristics and preparation of clinical samples
Patient characteristics are detailed in Supplementary Table 7. Fresh-frozen resected tumor and matched blood were obtained from patients treated under an Institutional Review Board protocol at the Johns Hopkins Hospital, the University of Maryland and the First Affiliated Hospital of Zhengzhou University in China. Tumor tissue was analyzed by frozen section to assess neoplastic cellularity. Tumors were macrodissected to remove residual normal tissue and enhance neoplastic cellularity, as confirmed by multiple frozen sections.
Preparation of Illumina genomic DNA libraries
Genomic DNA libraries were prepared following Illumina’s (Illumina, San Diego, CA) suggested protocol with the following modifications. (1) 3 micrograms (µg) of genomic DNA from tumor or normal cells in 100 microliters (µl) of TE was fragmented in a Covaris sonicator (Covaris, Woburn, MA) to a size of 100–500 bp. DNA was purified with a PCR purification kit (Cat # 28104, Qiagen, Valencia, CA) and eluted in 35 µl of the elution buffer included in the kit. (2) Purified, fragmented DNA was mixed with 40 µl of H2O, 10 µl of 10× T4 ligase buffer with 10 mM ATP, 4 µl of 10 mM dNTP, 5 µl of T4 DNA polymerase, 1 µl of Klenow polymerase, and 5 µl of T4 polynucleotide kinase. All reagents used for this step and those described below were from New England Biolabs (NEB, Ipswich, MA) unless otherwise specified. The 100 µl end-repair mixture was incubated at 20°C for 30 min, purified with a PCR purification kit (Cat # 28104, Qiagen) and eluted with 32 µl of elution buffer (EB). (3) To A-tail, all 32 µl of end-repaired DNA was mixed with 5 µl of 10× buffer (NEB buffer 2), 10 µl of 1 mM dATP and 3 µl of Klenow (exo-). The 50 µl mixture was incubated at 37°C for 30 min before DNA was purified with a MinElute PCR purification kit (Cat # 28004, Qiagen). Purified DNA was eluted with 12.5 µl of 70°C EB to obtain 10 µl of A-tailed DNA. (4) For adaptor ligation, 10 µl of A-tailed DNA was mixed with 10 µl of PE-adaptor (Illumina), 25 µl of 2× Rapid ligase buffer and 5 µl of Rapid Ligase. The ligation mixture was incubated at room temperature (RT) or 20°C for 15 min. (5) To purify adaptor-ligated DNA, 50 µl of ligation mixture from step (4) was mixed with 200 µl of NT buffer from a NucleoSpin Extract II kit (Cat # 636972, Clontech, Mountain View, CA) and loaded onto a NucleoSpin column. The column was centrifuged at 14,000 g in a desktop centrifuge for 1 min, washed once with 600 µl of wash buffer (NT3 from Clontech), and centrifuged again for 2 min to dry completely. DNA was eluted in 50 µl of the elution buffer included in the kit. (6) To obtain an amplified library, ten PCRs of 25 µl each were set up, each including 13.25 µl of H2O, 5 µl of 5× Phusion HF buffer, 0.5 µl of a dNTP mix containing 10 mM of each dNTP, 0.5 µl of Illumina PE primer #1, 0.5 µl of Illumina PE primer #2, 0.25 µl of Hotstart Phusion polymerase, and 5 µl of the DNA from step (5). The PCR program used was: 98°C for 1 min; 6 cycles of 98°C for 20 s, 65°C for 30 s, 72°C for 30 s; and 72°C for 5 min. To purify the PCR product, the 250 µl of pooled PCR mixture (from the ten reactions) was mixed with 500 µl of NT buffer from a NucleoSpin Extract II kit and purified as described in step (5). Library DNA was eluted with elution buffer warmed to 70°C, and the DNA concentration was estimated by absorbance at 260 nm.
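The per-reaction volumes in step (6) can be checked and scaled to the ten-reaction pool with a few lines of code. The following is a minimal sketch for bookkeeping only: the function name `scale_master_mix` and the optional `overage` parameter are illustrative assumptions and are not part of the protocol described above.

```python
# Per-reaction volumes (µl) for the pre-capture library PCR, from step (6).
PER_REACTION_UL = {
    "H2O": 13.25,
    "5x Phusion HF buffer": 5.0,
    "dNTP mix (10 mM each)": 0.5,
    "Illumina PE primer #1": 0.5,
    "Illumina PE primer #2": 0.5,
    "Hotstart Phusion polymerase": 0.25,
    "adaptor-ligated DNA from step (5)": 5.0,
}


def scale_master_mix(per_reaction, n_reactions, overage=0.0):
    """Return the total volume of each component for n_reactions.

    `overage` is a hypothetical pipetting allowance (a fraction such as 0.1);
    the protocol in the text does not mention one, so it defaults to zero.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction.items()}


# Volume of a single reaction and of the ten pooled reactions.
reaction_volume = sum(PER_REACTION_UL.values())
pooled = scale_master_mix(PER_REACTION_UL, n_reactions=10)
pooled_volume = sum(pooled.values())

for name, vol in pooled.items():
    print(f"{name}: {vol} µl")
print(f"single reaction: {reaction_volume} µl, pooled: {pooled_volume} µl")
```

The same helper can be reused for any other set of per-reaction volumes by passing a different dictionary and reaction count.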
Exome and Targeted Subgenomic DNA Capture
Human exome capture was performed following a protocol from Agilent’s SureSelect Paired-End Version 2.0 Human Exome Kit (Agilent, Santa Clara, CA) with the following modifications. (1) A hybridization mixture was prepared containing 25 µl of SureSelect Hyb #1, 1 µl of SureSelect Hyb #2, 10 µl of SureSelect Hyb #3, and 13 µl of SureSelect Hyb #4. (2) 3.4 µl (0.5 µg) of the PE-library DNA described above, 2.5 µl of SureSelect Block #1, 2.5 µl of SureSelect Block #2 and 0.6 µl of Block #3 were loaded into one well of a 384-well Diamond PCR plate (Cat # AB-1111, Thermo-Scientific, Lafayette, CO), sealed with MicroAmp clear adhesive film (Cat # 4306311; ABI, Carlsbad, CA) and placed in a GeneAmp PCR System 9700 thermocycler (Life Sciences Inc., Carlsbad, CA) for 5 minutes at 95°C, then held at 65°C (with the heated lid on). (3) 25–30 µl of hybridization buffer from step (1) was heated for at least 5 minutes at 65°C in another sealed plate with the heated lid on. (4) 5 µl of SureSelect Oligo Capture Library, 1 µl of nuclease-free water, and 1 µl of diluted RNase Block (prepared by diluting RNase Block 1:1 with nuclease-free water) were mixed and heated at 65°C for 2 minutes in another sealed 384-well plate. (5) While keeping all reactions at 65°C, 13 µl of hybridization buffer from step (3) was added to the 7 µl of SureSelect Capture Library mix from step (4), followed by the entire contents (9 µl) of the library well from step (2). The mixture was slowly pipetted up and down 8 to 10 times. (6) The 384-well plate was sealed tightly and the hybridization mixture was incubated for 24 hours at 65°C with a heated lid.
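The volumes quoted in steps (1) to (5) can be cross-checked with simple bookkeeping. The sketch below sums the components of each mixture and derives the final hybridization volume and the implied library concentration; all variable names are illustrative, and the concentration is only an arithmetic consequence of the stated 0.5 µg in 3.4 µl.

```python
# Volumes in µl, copied from steps (1), (2), (4) and (5) of the capture protocol.
HYB_BUFFER = {"Hyb #1": 25.0, "Hyb #2": 1.0, "Hyb #3": 10.0, "Hyb #4": 13.0}
LIBRARY_WELL = {"PE library": 3.4, "Block #1": 2.5, "Block #2": 2.5, "Block #3": 0.6}
CAPTURE_MIX = {
    "Oligo Capture Library": 5.0,
    "nuclease-free water": 1.0,
    "diluted RNase Block": 1.0,
}
HYB_BUFFER_USED = 13.0  # µl of hybridization buffer transferred in step (5)
LIBRARY_DNA_UG = 0.5    # µg of PE-library DNA loaded in step (2)


def total(volumes):
    """Sum a mixture's component volumes, rounded to avoid float noise."""
    return round(sum(volumes.values()), 2)


hyb_buffer_prepared = total(HYB_BUFFER)
library_well_volume = total(LIBRARY_WELL)
capture_mix_volume = total(CAPTURE_MIX)

# Step (5): hybridization buffer + capture library mix + library well contents.
final_hybridization_volume = round(
    HYB_BUFFER_USED + capture_mix_volume + library_well_volume, 2
)

# Concentration implied by loading 0.5 µg of library in 3.4 µl.
library_conc_ng_per_ul = LIBRARY_DNA_UG * 1000 / LIBRARY_WELL["PE library"]

print(f"hybridization buffer prepared: {hyb_buffer_prepared} µl")
print(f"library well: {library_well_volume} µl, capture mix: {capture_mix_volume} µl")
print(f"final hybridization volume: {final_hybridization_volume} µl")
print(f"implied library concentration: {library_conc_ng_per_ul:.1f} ng/µl")
```

The library well and capture mix totals match the 9 µl and 7 µl quoted in step (5).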
After hybridization, five steps were performed to recover and amplify the captured DNA library. (1) To prepare magnetic beads for recovering captured DNA, 50 µl of Dynal MyOne Streptavidin C1 magnetic beads (Cat # 650.02, Invitrogen Dynal, AS Oslo, Norway) was placed in a 1.5 ml microfuge tube and vigorously resuspended on a vortex mixer. Beads were washed three times by adding 200 µl of SureSelect Binding buffer, mixing on a vortex for five seconds and then removing the supernatant after placing the tubes in a Dynal magnetic separator. After the third wash, beads were resuspended in 200 µl of SureSelect Binding buffer. (2) To bind captured DNA, the entire hybridization mixture described above (29 µl) was transferred directly from the thermocycler to the bead solution and mixed gently; the hybridization mix/bead solution was incubated in an Eppendorf thermomixer at 850 rpm for 30 minutes at room temperature. (3) To wash the beads, the supernatant was removed after applying a Dynal magnetic separator, and the beads were resuspended in 500 µl of SureSelect Wash Buffer #1 by mixing on a vortex mixer for 5 seconds and incubated for 15 minutes at room temperature. Wash Buffer #1 was then removed from the beads after magnetic separation. The beads were further washed three times, each with 500 µl of pre-warmed SureSelect Wash Buffer #2 and an incubation at 65°C for 10 minutes. After the final wash, SureSelect Wash Buffer #2 was completely removed. (4) To elute captured DNA, the beads were suspended in 50 µl of SureSelect Elution Buffer, vortex-mixed and incubated for 10 minutes at room temperature. The supernatant was removed after magnetic separation, collected in a new 1.5 ml microcentrifuge tube, and mixed with 50 µl of SureSelect Neutralization Buffer. DNA was purified with a Qiagen MinElute column and eluted in 17 µl of 70°C EB to obtain 15 µl of captured DNA library. (5) The captured DNA library was amplified as follows: 15 PCRs were set up, each containing 9.5 µl of H2O, 3 µl of 5× Phusion HF buffer, 0.3 µl of 10 mM dNTP, 0.75 µl of DMSO, 0.15 µl of Illumina PE primer #1, 0.15 µl of Illumina PE primer #2, 0.15 µl of Hotstart Phusion polymerase, and 1 µl of captured exome library. The PCR program used was: 98°C for 30 s; 14 cycles of 98°C for 10 s, 65°C for 30 s, 72°C for 30 s; and 72°C for 5 min. To purify PCR products, the 225 µl of pooled PCR mixture (from the 15 reactions) was mixed with 450 µl of NT buffer from a NucleoSpin Extract II kit and purified as described above. The final library DNA was eluted with 30 µl of 70°C elution buffer, and the DNA concentration was estimated by OD260 measurement.
Somatic Mutation Identification by Massively Parallel Sequencing
Captured DNA libraries were sequenced with the Illumina GAIIx Genome Analyzer, yielding 150 (2 × 75) base pairs from the final library fragments. Sequencing reads were analyzed and aligned to human genome hg18 with the Eland algorithm in CASAVA 1.6 software (Illumina). A mismatched base was identified as a mutation only when (i) it was identified by more than three distinct tags; (ii) the number of distinct tags containing a particular mismatched base was at least 15% of the total distinct tags; and (iii) it was not present in >0.5% of the tags in the matched normal sample. SNP search databases included the NCBI’s database and the 1000 Genomes Project database (59
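The three calling criteria (i) to (iii) can be expressed as a simple filter over per-site tag counts. The sketch below is an illustration under stated assumptions, not the pipeline used in the study: the `CandidateSite` record, its field names and the function `passes_filters` are hypothetical, and the handling of sites with no coverage (rejected here) is an assumption the text does not address.

```python
from dataclasses import dataclass


@dataclass
class CandidateSite:
    """Distinct-tag counts at one mismatched position (hypothetical record)."""
    tumor_mutant_tags: int
    tumor_total_tags: int
    normal_mutant_tags: int
    normal_total_tags: int


def passes_filters(site, min_tags_exclusive=3, min_tumor_fraction=0.15,
                   max_normal_fraction=0.005):
    """Apply criteria (i)-(iii) from the text to a single candidate site."""
    # Assumption: a site without coverage in tumor or normal cannot be called.
    if site.tumor_total_tags == 0 or site.normal_total_tags == 0:
        return False
    # (i) supported by more than three distinct tags
    if site.tumor_mutant_tags <= min_tags_exclusive:
        return False
    # (ii) mutant tags make up at least 15% of all distinct tags in the tumor
    if site.tumor_mutant_tags / site.tumor_total_tags < min_tumor_fraction:
        return False
    # (iii) not present in more than 0.5% of tags in the matched normal
    if site.normal_mutant_tags / site.normal_total_tags > max_normal_fraction:
        return False
    return True


# Made-up example counts, for illustration only.
examples = {
    "well supported": CandidateSite(10, 40, 0, 50),
    "too few tags": CandidateSite(3, 10, 0, 50),
    "low tumor fraction": CandidateSite(4, 40, 0, 50),
    "seen in normal": CandidateSite(10, 40, 1, 100),
}
for label, site in examples.items():
    print(f"{label}: {passes_filters(site)}")
```

Filtering against known SNP databases would be a separate step applied to the sites that pass these criteria.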
Evaluation of Genes in Additional Tumors and Matched Normal Controls
For the TP53, NOTCH1, NOTCH2, NOTCH3, FBXW7, KIF16B, KIF21B and MYCBP2 genes, which were mutated in at least 3 tumors in the ESCC discovery screen, the coding region was sequenced in 41 additional American ESCCs and matched controls. The coding regions of TP53, NOTCH1, NOTCH2, NOTCH3 and FBXW7 were sequenced in 48 Chinese ESCCs and matched controls. PCR amplification and Sanger sequencing were performed following protocols described previously, using the primers listed in Supplementary Table 8.
Evaluation of matched BE
The confirmed mutations in EAC samples ESO01T and ESO10T were sequenced in matched BE epithelium. PCR amplification and Sanger sequencing were performed as described in the previous paragraph (61
Statistical analysis
EAC and ESCC were compared with respect to the number of somatic mutations, the presence of specific mutations (TP53, and at least one NOTCH family member) and mutation spectra. The total number of mutations and the specific mutations were compared between groups using Cochran-Mantel-Haenszel tests for general association. The mutation spectra were compared using a continuity-adjusted chi-square test to prevent overestimation of statistical significance. To examine whether there was a global trend for one subtype to have more mutations regardless of spectrum type, a Cochran-Mantel-Haenszel test stratified by spectrum type was performed. Differences between U.S. and Chinese ESCCs in type of mutations and predictors of mutations were assessed with the same tests used for the comparisons between cancer subtypes. A post hoc power calculation was performed to determine how well the study was powered to detect a relationship between mutations and region, based on the prevalence of the mutation in the Chinese population and the odds ratio of the mutation between the U.S. and Chinese populations, at a significance level of p ≤ 0.05.
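For readers unfamiliar with the two tests named above, the following sketch computes a continuity-adjusted (Yates) chi-square statistic for a single 2 × 2 table and a Cochran-Mantel-Haenszel statistic across several 2 × 2 strata. It is a standard-library illustration only: the analyses in this study were run in SAS, the function names are hypothetical, the counts are made up, and the CMH statistic is computed here without a continuity correction, which is an assumption about the intended variant.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def yates_chi_square(table):
    """Continuity-adjusted chi-square test for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero")
    # Yates correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    stat = n * corrected ** 2 / margins
    return stat, chi2_sf_1df(stat)


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables (1 df)."""
    observed = expected = variance = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than two observations adds nothing
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("no informative strata")
    stat = (observed - expected) ** 2 / variance
    return stat, chi2_sf_1df(stat)


# Made-up counts: rows are two tumor groups, columns are mutated / not mutated.
example_table = [[10, 20], [20, 10]]
yates_stat, yates_p = yates_chi_square(example_table)
cmh_stat, cmh_p = cmh_test([example_table, example_table])
print(f"Yates chi-square = {yates_stat:.3f}, p = {yates_p:.4f}")
print(f"CMH statistic = {cmh_stat:.3f}, p = {cmh_p:.4f}")
```

The continuity correction lowers the statistic relative to the uncorrected chi-square, which is why it guards against overstating significance in small samples.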
We had hoped to examine the relationship between smoking and specific mutations among the U.S. patients. Unfortunately, 7 of the 8 ESCC patients with reliable information were smokers, which made correlative comparisons difficult. To examine the relationship between NOTCH mutation and tumor stage, we fit a logistic regression model comparing stage 3 or 4 tumors with stage 1 or 2 tumors. Analyses were performed in SAS 9.2 (SAS Institute, Cary, NC).