The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues.
The semiconductor-based, non-optical sequencing technology used by the Ion Torrent sequencer (Life Technologies, Carlsbad, CA) has the potential for scalable, rapid and low-cost sequence data production1. Given the current industry standard for the density of transistors on the surface of a semiconductor, the technology has not yet reached its full possible capacity2 and has the potential to provide comparable sequencing data yields to conventional optical based sequencers in a fraction of the time and cost3,4. The technology has recently been applied to genomic sequencing1, microbial genotyping5 and targeted re-sequencing6.
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is a powerful tool for characterizing the epigenetic landscape and transcriptional network in the context of both normal physiology and disease7–9. However, Ion Torrent sequencing has not yet been used for ChIP-Seq due to challenges in using ChIP DNA samples for sequencing library preparation. First, chromatin immunoprecipitation yields relatively low amounts of DNA, while commercial ChIP-Seq protocols recommend at least 500ng to 1ug of starting material for the library construction process. This is an issue in particular for ChIP DNA samples from immunoprecipitation of transcription factors or from limiting samples such as rare cell types or clinical samples, which are often at the few nanograms range. While recent studies presented ChIP-Seq protocols with low input (low cell number) for the Illumina platform10,11, such protocols are not yet available for the Ion Torrent platform. Second, the Ion Torrent process works optimally with a tight size range of DNA molecules of ~280bp +/− 20bp, whereas ChIP DNA typically spans a range of sizes from 200–600bp.
Here, we demonstrate the utility of Ion Torrent sequencing for ChIP-seq samples with sub-nanogram amounts of DNA. Furthermore, we apply the method to profile epigenetic marks of tumor tissues from melanoma patients and show its potential for analyzing tumor progression.
Our starting point was an automated 454-library construction method we previously developed12. To overcome the low input material obtained by ChIP, we devised a low input, scalable and robust library construction protocol for ChIP DNA that increases sensitivity and minimizes operator-dependent variability by incorporating a high yielding amplification enzyme (Kapa Biosystems, Woburn, MA), which has higher yield and higher genome coverage13 than the Phusion polymerase that is commonly used in the standard Illumina protocol, low microliter volume reactions, molecularly barcoded oligonucleotide adapters, and automated fluid handling protocols (Supplementary Fig. S1 and Methods).
To address the wide size range of ChIP DNA, we first tested a standard enzymatic DNA shearing method that is routinely used with Ion Torrent genomic libraries, but it failed to generate usable ChIP-Seq libraries. To overcome this problem, we started the library construction process without shearing and then used an automated gel size-selection system (Pippin Prep, Sage Science) to select appropriately sized library molecules after adapter ligation. We note that Illumina ChIP-seq libraries are usually not sheared, as the shearing step results in significant material loss, which is of particular concern with very low input samples such as ChIP samples. Using this process, we successfully created libraries for 32 of 36 samples attempted (88.9% pass rate; success was defined as having sufficient library material to attempt at least three sequencing reactions). For comparison, the success rate of Illumina ChIP-Seq library construction following a successful ChIP is closer to 100%.
To compare results between Ion Torrent sequencing and those from Illumina sequencing for ChIP applications, we performed ChIP with antibodies to the common histone mark, Histone 3 lysine 4 tri-methyl (H3K4me3), the C terminal domain of RNA polymerase II (Pol-II) and IgG (negative control) in mouse dendritic cells stimulated with lipopolysaccharide (LPS). The resulting immunoprecipitated DNA was used as input for both our Ion Torrent and standard Illumina library construction procedures. We sequenced the libraries on the Ion Torrent 316 sequencing chips (on average, 2 million reads/library, average read length: 180 bases) and with the gold standard ChIP-Seq data production using the Illumina Hi-Seq 2000 (15 million reads/library; read length: 40 base single end)7,14,15.
Illumina has a lower percentage of unmapped bases and a significantly higher rate of well-mapped bases than Ion Torrent (Supplementary Table S1). Although the Ion Torrent reads had higher error rates for both SNPs (10 fold higher) and indels (100 fold higher), these were still below 1 in 1,000 bases (Supplementary Table S1) and thus do not impact the quality of the chromatin maps.
We found excellent agreement between the two resulting maps. The ChIP-Seq enrichment scores, defined as the ratio of observed/expected number of reads at each peak region, are highly correlated between the two samples (H3K4me3: Pearson R=0.893, polII: R=0.722, Fig. 1a, b). Despite the differences in mapped bases and error rates, Ion Torrent sequencing produced comparable enrichment peaks to Illumina5,9,10 sequencing for both H3K4me3 and polII ChIP-seq (Fig. 1a, b and Supplementary Fig. S2). Saturation analysis by sub-sampling the Ion Torrent reads indicates that the Ion Torrent library was sequenced to sufficient depth at 2 million reads (Fig. 1c). To examine the possibility that the longer read length in Ion Torrent sequencing reads contributes to this phenomenon, we extended the 40 base reads from Illumina to 180 bases. The extended Illumina reads (randomly down sampled from 15 million to 2 million reads) produced enrichment peaks that were more comparable to the 2 million Ion Torrent reads, indicating that longer read lengths may be beneficial in ChIP-seq applications independent of the sequencing platform used (Supplementary Fig. S3). Similarly, when we shortened the Ion Torrent reads from 180 bases to 40 bases and then performed alignment, the enrichment peaks were reduced, similar to Illumina reads down sampled to 2 million reads (40 bases) (Supplementary Fig. S3). As expected, we did not find enriched peaks with a negative control IgG ChIP-seq from Ion Torrent sequencing (Supplementary Fig. S4).
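The enrichment score and the correlation between platforms described above can be sketched as follows. This is a minimal illustration, not the Scripture implementation: the function names are hypothetical, and the expected read count assumes a uniform genomic background, which may differ from the background model used by the peak caller.

```python
import math

def enrichment_score(observed_reads, region_length, total_reads, genome_length):
    """Observed/expected read ratio for one peak region.

    Assumption: the expected count spreads all mapped reads uniformly
    over the genome; the actual peak caller may model background differently.
    """
    expected_reads = total_reads * region_length / genome_length
    return observed_reads / expected_reads

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def compare_platforms(scores_a, scores_b):
    """Correlate log2 enrichment scores of matched regions from two platforms."""
    log_a = [math.log2(s) for s in scores_a]
    log_b = [math.log2(s) for s in scores_b]
    return pearson_r(log_a, log_b)
```

Here `scores_a` and `scores_b` would hold the scores of the same regions in the Ion Torrent and Illumina maps, in the same order and all greater than zero.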
To determine the sensitivity of Ion Torrent sequencing for ChIP-seq, we tested a titration of ChIP DNA input amounts. We used 56 ng, 4 ng and 0.4 ng of H3K4me3 ChIP DNA from a single ChIP experiment as input for our modified library construction protocol for Ion Torrent. We obtained successful libraries from all three aliquots with comparable results (Fig. 1d–f). For example, the lowest input library (0.4 ng; equivalent to H3K4me3 immunoprecipitated DNA from 20,000 cells) was comparable to the highest input library (56 ng; equivalent to 10×10^6 cells) based on enriched peaks (Fig. 1c) and correlation in enrichment scores (R=0.753, Fig. 1f). Such low ChIP DNA input amounts are an order of magnitude lower than the Ion Torrent guidelines for library production and sequencing, and are comparable to recently developed protocols10,11 for low input ChIP-seq with the Illumina platform. We also tested 0.05 ng of ChIP DNA, but failed to produce a high quality library. In summary, using our protocol we created successful libraries from very low starting amounts of ChIP DNA, and Ion Torrent ChIP-seq gave results comparable to those from our standard optical-based Illumina sequencing while requiring an order of magnitude fewer sequencing reads (albeit longer reads).
The rapid nature and relatively low cost of Ion Torrent sequencing make it a promising candidate for diagnostic applications. As a proof of principle, we next performed H3K4me3 ChIP-seq experiments with a matched pair of primary tumor and metastasis cell lines derived from the same melanoma patient. Although the matched pair WM115 (primary tumor) and WM266-4 (metastasis) showed globally correlated enrichment of H3K4me3 in gene promoters (R=0.83, Fig. 2a and Supplementary Fig. S5), a large number of genes demonstrated increased levels of H3K4me3 on their promoters in either the primary tumor (Fig. 2a, b) or the metastasis (Fig. 2a, c). To test whether any biological functions were associated with differential levels of H3K4me3, we performed Gene Set Enrichment Analysis (GSEA) with the gene sets in the Molecular Signatures Database (MSigDB)16.
Interestingly, we found that decreased H3K4me3 levels in metastasis are significantly associated with genes whose expression is repressed in embryonic stem cells, including targets of the Polycomb Repressive Complex (PRC) and for genes enriched for the H3K27me3 histone mark in embryonic stem cells17 (Fig. 2d). Conversely, increased H3K4me3 levels in metastasis are significantly associated with interferon response and inflammatory response genes (Fig. 2e).
H3K4me3 is mostly enriched in promoters of actively transcribed genes, while H3K27me3 is usually enriched at repressive chromatin regions. In embryonic stem cells, however, many of the Polycomb targets, and especially key developmental regulators, are marked by both marks (‘bi-valent domains’)18. To explore the relationship between these two marks and metastasis, we performed H3K27me3 ChIP-seq experiments in the same cell lines. Indeed, increased H3K27me3 levels in metastasis correlate with loss of H3K4me3 and such genes are enriched for polycomb target genes in embryonic stem cells (Supplementary Fig. S6a). Interestingly, genes that have decreased H3K27me3 levels in metastasis are enriched for interferon response genes (Supplementary Fig. S6b, a few of the Interferon response gene sets are enriched in the top 150 gene sets).
To gain initial insight into the clinical applicability of the technology, we performed H3K4me3 ChIP-seq of seven metastatic tumor tissues from melanoma patients. Consistent with our observation in the progressive cell lines, genes that have increased H3K4me3 levels in the metastatic tumors are enriched for Interferon, inflammatory and immune response genes and genes that have decreased H3K4me3 levels are enriched for H3K27me3/polycomb target genes (Fig. 2f–g and Supplementary Fig. S7–10).
The repression of developmental gene sets in melanoma metastasis by an H3K27me3 gain and H3K4me3 loss is consistent with recent findings of similar embryonic stem cell signatures in aggressive tumors from other cancers19,20. Furthermore, several reports show that several histone modifying enzymes are misregulated in human cancers21. For example, EZH2, an H3K27me3 writer, is overexpressed in various solid tumors and its expression is correlated with tumor aggressiveness and metastatic progression22,23. Consistent with this, a stem cell Polycomb repression signature is also enriched in genes that gain H3K27me3 marks in metastatic prostate cancer24. Our finding that interferon and inflammatory response genes have higher levels of the H3K4me3 mark in melanoma metastasis is consistent with recent findings that link inflammation with cancer25–27.
In summary, we have demonstrated a rapid, sensitive, scalable and cost-effective semiconductor-based ChIP-seq pipeline for characterizing epigenetic signatures of metastatic human tumors from limiting samples with comparable sensitivity to recently developed protocols10,11 for ChIP-seq with low input using the Illumina platform. The technical and analytical methods for ChIP followed by Ion Torrent sequencing provide a platform for discovery and future diagnostic applications.
Mouse dendritic cells were isolated from wild type female 6–8 week old C57BL/6 mice obtained from the Jackson Laboratories and cultured in RPMI medium (Invitrogen) supplemented with 10% heat inactivated FBS (Invitrogen) and GM-CSF (20 ng/ml; Peprotech, Rocky Hill, NJ)28. Cells were cultured for 9 days and stimulated for 2 hours with LPS (100 ng/ml, rough, ultra-pure E. coli K12 strain, Invitrogen). Paired primary melanoma tumor derived cell line (WM115) and metastatic melanoma tumor derived cell line (WM266-4) from the same patient were obtained from the Wistar Institute (Philadelphia, PA). Metastatic melanoma tissues were collected from the Department of Surgical Oncology, University of Texas, MD Anderson Cancer Center with informed consent of the patients and prior MIT Committee On the Use of Humans as Experimental Subjects (COUHES) approval.
Chromatin immunoprecipitation assays were performed using a previously published protocol with some minor modifications28. Briefly, cells were fixed for 10 min with 1% formaldehyde and quenched with glycine. Cells were lysed for 10 min on ice with RIPA lysis buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0, 140 mM NaCl, 1% Triton X-100, 0.1% SDS, 0.1% DOC) (for mouse dendritic cells) or RIPA lysis buffer with 0.2% SDS and without Triton X-100 (for WM115 and WM266-4 melanoma cell lines) and then sonicated using the Branson sonicator as described in Garber et al28. Frozen human melanoma tissues (50–100 mg) were thawed on ice and chopped finely with a razor blade. Tissues were then fixed for 10 min with 1% formaldehyde in PBS and quenched with glycine. Fixed tissues were pulverized twice with a Covaris CryoPrep CP02 at setting 5 in a TT1ET tissue tube. Cells were lysed for 10 min on ice with 1% SDS lysis buffer (1% SDS, 10 mM EDTA, 50 mM Tris-HCl, pH 8.1) and then sonicated using the Branson sonicator. Immunoprecipitation was performed by incubating the sonicated cell lysate overnight at 4°C with 75 μl of protein G magnetic dynabeads (Invitrogen) coupled to the target antibody. Magnetic beads were then washed 5 times with cold RIPA buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, pH 8.0, 140 mM NaCl, 1% Triton X-100, 0.1% SDS), twice with high salt RIPA buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, pH 8.0, 500 mM NaCl, 1% Triton X-100, 0.1% SDS), twice with LiCl buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, pH 8.0, 250 mM LiCl, 0.5% NP-40, 0.5% Na-DOC) and twice with TE buffer, and then eluted in 50 μl of elution buffer (10 mM Tris-HCl, pH 8.0, 5 mM EDTA, pH 8.0, 300 mM NaCl and 0.5% SDS). The eluate was reverse crosslinked at 65°C for 6 hours and then treated with 2 μl of RNase A (Roche Applied Science) for 30 minutes and 2.5 μl of Proteinase K (Invitrogen) for 2 hours.
Finally, the decrosslinked DNA samples were cleaned up with 120 μl of solid phase reversible immobilization (SPRI) beads and eluted in 50 μl of EB buffer (10 mM Tris-HCl, pH 8.0).
ChIP-Seq libraries for Ion Torrent sequencing were created using a modified version of the manufacturer’s protocol. The process for sample preparation is outlined in Supplementary Fig. S1. Briefly, 45 μl of each ChIP DNA was added to a 96-well microtiter plate. Physical barcoding of all sample receptacles was employed at each step to ensure sample tracking integrity. Fragments were enzymatically end-repaired using an enzyme and buffer cocktail (Kapa Biosystems, Woburn, MA). Following end-repair, automated reaction cleanup was performed using a solid-phase reversible immobilization (SPRI) process with a ratio of 1.8 times beads to DNA (Ampure, Agencourt, Beckman Coulter). Samples were eluted in 30 μl and added to a 50 μl adapter ligation reaction (ligase enzyme and buffer from Kapa Biosystems), which also contained 5 μl of Ion Torrent compatible oligonucleotide adapters (Integrated DNA Technologies). Adapters were used at 8 μM concentration (1/5th of the standard amount) to minimize adapter dimer formation with the low input samples. A further 1.8X SPRI cleanup was performed after adapter ligation. All automated fluid handling steps were carried out on a Bravo Automated Liquid Handling Platform with a 96LT Disposable Tip pipette head (Agilent Technologies, Santa Clara, CA). After adapter ligation, 10 μl of loading solution was added to each sample and each sample was size selected (280 base target size) using 2% gel cartridges (Pippin Prep, Sage Science, Beverly, MA). A further SPRI cleanup (2X Bead:DNA ratio) was performed after sizing and samples were eluted in 23 μl. An amplification reaction was set up in a final volume of 50 μl with the following cycling profile: 98°C for 45 seconds; 72°C for 20 minutes; followed by 12 cycles of 98°C for 15 s, 63°C for 30 s, 72°C for 30 s; finally, 72°C for 60 s. Amplification enzymes and master mix were from Kapa Biosystems (Woburn, MA); primers were those provided in the Ion Torrent template preparation kit.
A SPRI cleanup with a 1.5X Bead:DNA ratio was performed after amplification and final libraries were eluted in 25 μl. Libraries were quantified and checked for size on an Agilent Bioanalyzer (Agilent, Santa Clara, CA).
Template preparation was conducted using the Ion PGM™ 200 Xpress™ Template Kit, following the Ion PGM™ 200 Xpress™ Template Kit protocol (version 3). Libraries were diluted and 18 μl of the 1.55×10^7 molecules/μl dilution was added to an aqueous master mix containing polymerase and Ion Sphere Particles (ISPs) at the manufacturer’s specified proportions. Emulsions were created using an IKA Ultra-Turrax Tube Drive (IKA, Wilmington, NC). After emulsion PCR, DNA positive ISPs were recovered and enriched according to standard protocols. A sequencing primer was annealed to DNA positive ISPs and the sequencing polymerase was bound prior to loading of ISPs into Ion 316 sequencing chips. Sequencing of the samples was conducted according to the Ion PGM™ 200 Sequencing Kit Protocol (version 6). One or more 316 sequencing chips were loaded and run on an Ion Personal Genome Machine for each sample. Each run was programmed to include 520 nucleotide flows to deliver 200 base read lengths, on average. Basecalling and alignment were performed by the Torrent Suite 2.0.1 software.
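As a quick check of the quantities in this paragraph, the arithmetic can be written out as below. This is only an illustrative sketch with hypothetical function names; it uses the stated 18 μl of a 1.55×10^7 molecules/μl dilution and 520 flows for reads of about 200 bases.

```python
def molecules_loaded(volume_ul, molecules_per_ul):
    """Total library molecules added to the emulsion master mix."""
    return volume_ul * molecules_per_ul

def mean_bases_per_flow(read_length, flows):
    """Average number of bases incorporated per nucleotide flow."""
    return read_length / flows

# 18 ul of a 1.55e7 molecules/ul dilution, as stated in the text
library_molecules = molecules_loaded(18, 1.55e7)

# 520 flows delivering reads of about 200 bases on average
bases_per_flow = mean_bases_per_flow(200, 520)

print(f"{library_molecules:.3g} molecules, {bases_per_flow:.2f} bases per flow")
```

The input corresponds to roughly 2.8×10^8 library molecules per template reaction, and a little under 0.4 bases are read per flow on average.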
ChIP-seq libraries for Illumina sequencing were prepared using a previously published protocol28. Briefly, enzymes from New England Biolabs were used for DNA end-repair, A-base addition and adaptor ligation, and Pfu Ultra II fusion enzyme (Agilent Technologies) was used for the enrichment step. Illumina ChIP libraries were barcoded and pooled as previously described28.
Reads were aligned to the reference mouse genome (mm9) or human genome (hg19) using the BWA aligner version 0.5.9. Sequencing metrics were extracted using the GATK tools29 to traverse the genome and qualify mapped bases as well aligned - when the reads had mapping quality greater than Q20 (phred scaled) and high quality - when the reads had mapping quality greater than Q20 and the base had base quality greater than Q20. Error rates were measured in every 100 bases. Mismatches were counted for every base that mismatched the reference sequence in the alignment. Insertions and deletions were counted as events (not by the number of bases in the events). The rate is the number of insertion or deletion events found in every 100 bases. Read length was calculated over all reads (including unmapped). Unmapped reads are all the reads that did not find a likely mapping in the mouse reference genome mm9 (this does not include mapping quality 0 reads). ChIP-seq peak calling was performed using the contiguous segmentation algorithm as part of the Scripture package28 (http://www.broadinstitute.org/software/scripture/). Pearson correlation coefficients in Figure 1 were calculated by performing Pearson correlation analysis of pairwise comparison of ChIP-seq peak enrichment scores (log2) over a 500 bases sliding window.
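The base classification and error rate definitions above can be summarized in a short sketch. It is not the GATK code used for the analysis; the function names are hypothetical and the inputs are assumed to be phred-scaled mapping and base qualities.

```python
def base_flags(mapping_quality, base_quality, threshold=20):
    """Return (well_aligned, high_quality) for one mapped base.

    well_aligned: the read's mapping quality is greater than Q20.
    high_quality: well aligned and the base quality is also greater than Q20.
    """
    well_aligned = mapping_quality > threshold
    high_quality = well_aligned and base_quality > threshold
    return well_aligned, high_quality

def rate_per_100_bases(event_count, bases_examined):
    """Error rate expressed as events per 100 bases.

    For mismatches, event_count is the number of mismatching bases.
    For insertions and deletions, it is the number of events,
    regardless of how many bases each event spans.
    """
    return 100.0 * event_count / bases_examined
```

For example, 5 indel events in 10,000 examined bases give a rate of 0.05 per 100 bases.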
To simulate longer reads, we extended the 40 base Illumina reads to 180 bases by setting the --extFactor flag to 140 in igvtools (http://www.broadinstitute.org/software/igv/igvtools_commandline). To simulate shorter Ion Torrent reads, we selected the first 40 bases from the ~180 bases of Ion Torrent reads. The sequencing data can be downloaded from the NCBI GEO database under accession number GSE49477.
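The two read length simulations can be illustrated with the sketch below. It is not the igvtools implementation: the function names are hypothetical, coordinates are assumed to be 0-based half-open intervals, and the extension is assumed to proceed toward the 3' end of the read on its own strand.

```python
def extend_read(start, end, strand, ext=140):
    """Extend a mapped read interval by `ext` bases in its 3' direction.

    A 40 base read extended by 140 bases covers 180 bases.
    Assumption: forward-strand reads grow to the right and
    reverse-strand reads grow to the left, clipped at position 0.
    """
    if strand == "+":
        return start, end + ext
    return max(0, start - ext), end

def truncate_read(sequence, qualities, length=40):
    """Keep only the first `length` bases of a read before alignment."""
    return sequence[:length], qualities[:length]
```

Extension is applied to reads that are already aligned, whereas truncation is applied to the raw reads, which are then realigned as described in the text.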
GSEA v2.2 was used to test for enrichment of each of the 3,398 gene sets in the Chemical and Genetic Perturbation (CGP) Collection of the Molecular Signature Database (MSigDB v3.1). Reads were first normalized based on the total number of reads for each sample. For each of the 20,000 NCBI human Refseq genes, the accumulated H3K4me3 ChIP-seq reads over 1 kb upstream and 1 kb downstream of the transcription start site were calculated to represent the H3K4me3 ChIP-seq signal for each gene. The residuals of the natural logarithms of the accumulated reads calculated from the linear model for each gene were used as the ranked list input for the GSEAPreranked function. Three biological repeats of H3K4me3 ChIP-seq Illumina sequencing data from normal skin melanocytes were downloaded from NCBI GEO database (GSE16368)30.
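One way to build such a ranked list is sketched below. The details are assumptions made for illustration: the linear model is taken to be an ordinary least squares fit of the log signal in one sample against the log signal in the other, a pseudocount is added to avoid taking the logarithm of zero, and all function names are hypothetical.

```python
import math

def normalize(counts, total_reads, scale=1_000_000):
    """Scale per-gene read counts by the sample's total read number."""
    return [c * scale / total_reads for c in counts]

def log_residuals(reference_signal, test_signal, pseudocount=1.0):
    """Residuals of ln(test) after a least squares fit against ln(reference).

    Assumption: a simple linear model with intercept; positive residuals
    mean more signal in the test sample than the fit predicts.
    """
    lx = [math.log(x + pseudocount) for x in reference_signal]
    ly = [math.log(y + pseudocount) for y in test_signal]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(lx, ly)]

def ranked_genes(genes, residuals):
    """Sort genes from largest to smallest residual for a preranked analysis."""
    return sorted(zip(genes, residuals), key=lambda pair: pair[1], reverse=True)
```

The per-gene inputs would be the normalized read counts accumulated over the 2 kb window around each transcription start site, and the sorted output corresponds to the ranked list passed to a preranked enrichment analysis.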
We thank R. Satija and S. Schwartz for helpful discussions. We thank J. Bochicchio for project management, A. Wysoker for the development of the data transfer pipeline and the Broad’s Genomics Platform for sequencing. We also thank T.L. Calderone for preparing and sending the melanoma tumors. We thank L. Gaffney for production of the graphical abstract. This work was supported by the Human Frontiers Science Program Career Development Award (IA), the ISF Bikura Institutional Research Grant Program (IA), ERC starting grant 309788 (IA), NHGRI (1P50HG006193-01 to IA and AR), NIH Pioneer Award (DP1OD003958-01 to AR), Klarman Cell Observatory (AR), NCI (R01 CA93947 to LC), NCI (U01 CA1411508 to LC), NHGRI (U54HG003067 to CR), NIH Ruth L. Kirschstein National Research Service awards for individual Postdoctoral Fellowships (CC).
AUTHOR CONTRIBUTIONS
C.S.C., A.R. and I.A. designed the study. C.S.C. performed the experiments and conducted data analysis. N.L., A.H., D.R., S.A., A.R. and A.T. developed and conducted the Ion Torrent ChIP-seq library construction experiments. C.S.C., A.R. and I.A. wrote the manuscript, with help from C.N., C.R. and N.H. A.R. guided data analysis. I.A. guided experimental designs. L.C. provided expertise and guidance in melanoma biology and analysis. K.R. provided fixed melanoma cell pellets from tumor derived cell lines and assisted in design of melanoma related experiments. M.G. provided analysis expertise and aided in the analysis of the data. M.O.C. carried out error rate analysis and helped with writing. R.R. carried out mouse dendritic cell culturing and fixing the cell pellets. J.E.G. and L.C. collected and provided tumor samples from melanoma patients. C.N., C.R. and N.H. helped edit the manuscript. All authors edited the manuscript.
Statement of Competing Financial Interests
The authors declare no competing financial interests.
Accession Codes: Sequencing data have been uploaded to the NCBI GEO database under accession number GSE49477.