Cell Culture
SuperTelomerase, MDA-MB-231-HOTAIR, and MDA-MB-231 HOTAIR-shEZH2 cells were maintained in DMEM (Invitrogen) supplemented with 10% FBS (HyClone) and 1% Pen/Strep (Invitrogen).
Probe Design
Morpholino probes against HOTAIR were designed by Gene-Tools LLC against three open regions detected by PARS-seq (ref) (HOTAIR Morpho-1: GAGCAGCTCAAGTCCCCTGCATCCA, HOTAIR Morpho-2: GCACCCGCTCAGGTTTTTCCAGCGT, HOTAIR Morpho-3: TACATAAACCTCTGTTCTGTGAGTGC, Mock Morpho: CCTCTTACCTCAGTTACAATTTATA). All probes were biotinylated at the 3’ end. Antisense DNA probes were designed against the full-length HOTAIR sequence using the online designer at www.singlemoleculefish.com. All probes were compared with the human genome using the BLAT tool, and probes returning noticeable homology to non-HOTAIR targets were discarded (BLAT searches through an index of non-overlapping 11-mers). 48 probes were generated and split into two sets based on their relative positions along the HOTAIR sequence, such that even-numbered and odd-numbered probes were pooled separately. A symmetrical set of probes against LacZ RNA was also generated as the mock control. All probes were biotinylated at the 3’ end with an 18-carbon spacer arm (Protein and Nucleic Acid Facility, Stanford University). 19 probes were generated against TERC RNA and 24 against roX2 by similar methods. Sequences of all probes are listed in Table S4. The absolute levels of the ncRNAs in this study, in Ct values per 100 ng of total RNA, are as follows: roX2 = 16.6; TERC = 18.4; HOTAIR = 22.95. Thus, the fly and mammalian experiments are roughly comparable, and the mammalian experiments in fact show that ChIRP is compatible with ncRNAs expressed at lower levels.
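The even/odd pooling and the Ct comparison can be illustrated with a minimal Python sketch. The probe names are hypothetical placeholders (the real sequences are in Table S4), and the fold-difference helper assumes equal RNA input and ideal doubling per PCR cycle, neither of which is stated in the text above.

```python
def split_probe_pools(probes):
    """Split probes, ordered by position along the target RNA, into
    even-numbered and odd-numbered pools (1-based probe numbering)."""
    odd = probes[0::2]   # probes 1, 3, 5, ...
    even = probes[1::2]  # probes 2, 4, 6, ...
    return even, odd


def fold_difference_from_ct(ct_a, ct_b, efficiency=2.0):
    """Approximate abundance of transcript A relative to transcript B.

    Assumption: equal input RNA and a fixed amplification factor per
    cycle (2.0 means perfect doubling)."""
    return efficiency ** (ct_b - ct_a)


# Hypothetical probe identifiers, numbered 1..48 along HOTAIR.
probes = [f"HOTAIR_probe_{i}" for i in range(1, 49)]
even_pool, odd_pool = split_probe_pools(probes)
print(len(even_pool), len(odd_pool))

# Ct values per 100 ng total RNA, as reported in the text.
ct = {"roX2": 16.6, "TERC": 18.4, "HOTAIR": 22.95}
for name in ("roX2", "TERC"):
    ratio = fold_difference_from_ct(ct[name], ct["HOTAIR"])
    print(f"{name} vs HOTAIR: ~{ratio:.0f}-fold")
```

Under these assumptions a lower Ct corresponds to a more abundant transcript, which is why HOTAIR, with the highest Ct, is the lowest-expressed of the three targets.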
Crosslinking and chromatin preparation
Cells were grown to log-phase in tissue culture plates and rinsed once with room temperature PBS. For UV crosslinking, the plates were irradiated in a UV crosslinker (Stratagene) with lids off and PBS aspirated. UV strength was titrated from 240mJ to 960mJ. For chemical crosslinking, cells were fixed on the plate with appropriate amounts of 1% formaldehyde or 1% glutaraldehyde in PBS for 10 minutes at room temperature. Crosslinking was then quenched with 0.125M glycine for 5 minutes. Cells were rinsed again with PBS, scraped into Falcon tubes, and pelleted at 800g for formaldehyde crosslinking and 2500g for glutaraldehyde crosslinking. Cell pellets were then snap frozen in liquid nitrogen and can be stored at -80C indefinitely.
To prepare chromatin, cell pellets were quickly thawed in a 37C water bath and resuspended in Swelling Buffer (0.1M Tris pH7.0, 10mM KOAc, 15mM MgOAc; add 1% NP-40, 1mM DTT, 1mM PMSF, complete protease inhibitor (GE), and 0.1U/ul Superase-in (Ambion) before use) for 10’ on ice. The cell suspension was then dounced and pelleted at 2500g for 5’. Nuclei were further lysed in nuclear lysis buffer at 100mg/ml (50mM Tris 7.0, 10mM EDTA, 1% SDS; add DTT, PMSF, P.I., and Superase-in before use) on ice for 10’, and sonicated using a Bioruptor (Diagenode) until most chromatin had solubilized and DNA was in the size range of 100-500bp. Chromatin can be snap frozen in liquid nitrogen and stored at -80C until use.
Hybridization and washing
Chromatin was diluted in 2 times volume of hybridization buffer (500mM NaCl, 1% SDS, 100mM Tris 7.0, 10mM EDTA, 15% formamide; add DTT, PMSF, P.I., and Superase-in fresh). 100pmol of probes were added to 3ml of diluted chromatin, which was mixed by end-to-end rotation at 37C for 4 hours. Streptavidin-magnetic C1 beads were washed three times in nuclear lysis buffer, blocked with 500ng/ul yeast total RNA and 1mg/ml BSA for 1 hour at room temperature, and washed three times again in nuclear lysis buffer before being resuspended in their original volume. 100ul of washed/blocked C1 beads were added per 100pmol of probes, and the whole reaction was mixed for another 30min at 37C. Beads:biotin-probes:RNA:chromatin adducts were captured by magnets (Invitrogen) and washed five times with 40x beads volume of wash buffer (2x SSC, 0.5% SDS; add DTT and PMSF fresh). After the last wash, buffer was removed carefully with a P-10 pipette so that no trace volume was left behind. Beads were then ready for different elution protocols depending on downstream assays.
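The reagent ratios in this step scale linearly, so they can be tabulated with a small helper. This is only a sketch: the function name is hypothetical, and it assumes that "2 times volume" means two volumes of hybridization buffer added to one volume of chromatin (three volumes total).

```python
def chirp_hyb_setup(chromatin_ml, n_washes=5):
    """Scale hybridization reagents from the starting chromatin volume.

    Assumption: 1 volume chromatin + 2 volumes hybridization buffer.
    Ratios used: 100 pmol probes per 3 ml diluted chromatin,
    100 ul beads per 100 pmol probes, 40x bead volume per wash.
    """
    hyb_buffer_ml = 2.0 * chromatin_ml
    diluted_ml = chromatin_ml + hyb_buffer_ml
    probe_pmol = 100.0 * diluted_ml / 3.0
    beads_ul = 100.0 * probe_pmol / 100.0
    wash_ml_each = 40.0 * beads_ul / 1000.0
    return {
        "hyb_buffer_ml": hyb_buffer_ml,
        "diluted_chromatin_ml": diluted_ml,
        "probe_pmol": probe_pmol,
        "beads_ul": beads_ul,
        "wash_ml_each": wash_ml_each,
        "wash_ml_total": wash_ml_each * n_washes,
    }


if __name__ == "__main__":
    for key, value in chirp_hyb_setup(1.0).items():
        print(f"{key}: {value}")
```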
ChIRP RNA elution
For reversible crosslinking (formaldehyde), beads were resuspended in 10x original volume of RNA elution buffer (Tris 7.0, 1% SDS) and boiled for 15min, followed by trizol:chloroform extraction and RNeasy mini column purification. For non-reversible crosslinking (UV and glutaraldehyde), beads were resuspended in 10x original volume of RNA pK buffer (100mM NaCl, 10mM Tris 7.0, 1mM EDTA, 0.5% SDS) and 0.2U/ul Proteinase K (Invitrogen). pK treatment was carried out at 65C for 45’, followed by boiling for 15’ and trizol:chloroform extraction. Eluted RNA was subjected to quantitative reverse-transcription PCR (QRTPCR) for the detection of enriched transcripts.
ChIRP Protein Elution and Dot Blot
Beads were resuspended in 3x original volume of DNase buffer (100mM NaCl and 0.1% NP-40), and protein was eluted with a cocktail of 100ug/ml RNase A (Sigma-Aldrich), 0.1U/ul RNase H (Epicenter), and 100U/ml DNase I (Invitrogen) at 37C for 30’. Protein eluent was supplemented with 0.2 volume of 5x Laemmli buffer (without bromophenol blue or glycerol), boiled for 5’, and dot blotted onto nitrocellulose membrane with a Bio-Dot apparatus (Bio-Rad). The membrane was then blotted against TCAB1 and tubulin antibodies (gifts from the Artandi lab) per normal Western protocol.
ChIRP DNA Elution
Beads were resuspended in 3x original volume of DNA elution buffer (50mM NaHCO3, 1% SDS, 200mM NaCl), and DNA was eluted with a cocktail of 100ug/ml RNase A (Sigma-Aldrich) and 0.1U/ul RNase H (Epicenter). RNase elution was carried out twice at 37C with end-to-end rotation, and eluent from both steps was combined. For formaldehyde crosslinking, chromatin was reverse-crosslinked at 65C overnight. For non-reversible crosslinking, eluted chromatin was treated with 0.2U/ul pK at 65C for 45’. In either case, DNA was then extracted with an equal volume of phenol:chloroform:isoamyl alcohol (Invitrogen) and precipitated with ethanol at -80C overnight. Eluted DNA was subjected to QPCR, dot blots, or high-throughput sequencing.
DNA Dot Blot
DNA was denatured in 0.1 volume of denaturing solution (4M NaOH, 100mM EDTA) at 95C for 5’, and then chilled on ice for 5’. An equal volume of chilled 2M NH4OAc was added to neutralize the DNA on ice, which was then dot blotted onto nitrocellulose membrane using a Bio-Dot apparatus. The membrane was immediately crosslinked at 120mJ in a Stratalinker, and pre-hybridized in Rapid-Hyb (GE) at 42C for 30’. Telomere and Alu repeats were detected using end-labeled radioactive Southern probes CCCTAACCCTAACCCTAACCCTAACCCTAA and GTGATCCGCCCGCCTCGGCCTCCCAAAGTG, respectively.
Deep Sequencing, Peak Calling, Motif and GO Term Analysis
High-throughput sequencing libraries were constructed from ChIRPed DNA according to the ChIP-seq protocol as described (Johnson et al., 2007), and sequenced on a Genome Analyzer IIx (Illumina) with a read length of 36bp. Raw reads were uniquely mapped to the reference genome (hg18 assembly for HOTAIR, TERC, LacZ and EZH2 ChIRP-seq samples, and dm3 for roX2) using Bowtie (Langmead et al., 2009).
ChIRP-seq workflow consists of three steps.
- Find concordance: from the two independent ChIRP-seq experiments, we generate a consensus track, taking the lower value of the two at each coordinate. Thus, any aberrant signal in only one of the two experiments is removed. For each sample, reads from even and odd lanes were aligned separately, and per-base coverage was normalized as if there were 10M mappable reads. For each base pair of the genome, true coverage of this base in this sample was defined as the minimum coverage of the even lane and odd lane. The resulting genome-wide signal constitutes a combined lane, from which a SAM file was generated for peak calling.
- Find peaks: Peaks of each sample were called using MACS against its corresponding input with p-value cutoff 1e-5 (Zhang et al., 2008).
- Filter peaks: For each MACS peak, we filter for peaks that share the same shape in the raw data from the two independent experiments. Only peaks with substantial correlation of the raw data profile and high coverage across the peak are accepted. For each MACS predicted peak, a window of +/-2kbp around the peak summit or the peak width, whichever is smaller, is selected. Within this window, the average coverage of the combined lane and the Pearson correlation between the normalized per-base coverage of the even lane and odd lane were calculated. MACS predicted peaks were further filtered based on peak length, fold enrichment against the input lane, average coverage, and Pearson correlation to obtain a list of true peaks. For the HOTAIR ChIRP-seq sample, thresholds of average coverage>1.5, Pearson correlation>0.3, and fold enrichment against input>2 were applied to filter MACS predicted peaks, yielding 832 true peaks. The same thresholds were used to obtain 2198 true TERC peaks. For roX2 ChIRP-seq, similar parameters were used with the additional cut-off of peak length >2300bp, based on the fact that roX2/MSL complexes cover entire genes; 308 true peaks were obtained.
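The concordance and filtering steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: coverage is represented as simple per-base lists over one peak window, the function names are hypothetical, and the fold enrichment is passed in as a number (in the text it comes from MACS).

```python
import math


def normalize(coverage, mapped_reads, target=10_000_000):
    """Rescale per-base coverage as if the lane had `target` mappable reads."""
    scale = target / mapped_reads
    return [c * scale for c in coverage]


def consensus(even, odd):
    """Combined lane: per-base minimum of the even and odd lanes."""
    return [min(e, o) for e, o in zip(even, odd)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def passes_filter(even, odd, fold_enrichment,
                  min_cov=1.5, min_r=0.3, min_fold=2.0):
    """Apply the HOTAIR/TERC thresholds from the text to one peak window."""
    combined = consensus(even, odd)
    avg_cov = sum(combined) / len(combined)
    return (avg_cov > min_cov
            and pearson(even, odd) > min_r
            and fold_enrichment > min_fold)


# Toy windows of normalized coverage (made-up numbers for illustration).
concordant_even = [1, 3, 6, 9, 6, 3, 1]
concordant_odd = [2, 4, 5, 8, 7, 2, 1]
discordant_even = [0, 0, 9, 9, 0, 0, 0]
discordant_odd = [0, 0, 0, 0, 0, 9, 9]

print(passes_filter(concordant_even, concordant_odd, fold_enrichment=3.0))
print(passes_filter(discordant_even, discordant_odd, fold_enrichment=3.0))
```

Taking the per-base minimum means a signal present in only one probe pool contributes nothing to the combined lane, so such a peak fails the average-coverage threshold regardless of how strong it is in the single lane.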
Sequences of the top 500 true peaks (ranked by fold enrichment) within +/-200bp around peak summits were extracted, and motif analysis of these 500 peaks was performed using MEME (Bailey and Elkan, 1994). Only motifs of the highest significance were reported. Enriched gene sets were obtained through GREAT (McLean et al., 2010) on all 2198 TERC true peaks and all 832 HOTAIR true peaks. Gene Ontology analysis of both gene sets was performed using DAVID (Huang da et al., 2009; Wishart et al., 2009).
roX2 ChIRP-seq Analysis
roX2 peaks and motif were obtained as described above. Of the 308 predicted true peaks, none was on an autosome, resulting in a false discovery rate (FDR) of 0. Normalized signal of the combined lane of both roX2 ChIRP-seq and MSL3-TAP ChIP-seq was obtained in a way similar to that described for the HOTAIR ChIRP-seq analysis. Only regions where normalized signal was >=10 were counted in calculating the Pearson correlation between the roX2 and MSL3-TAP samples. Genes that overlap by >=1bp with windows of +/-2kbp around true roX2 peak summits were included in the average diagram. In total, 1087 RefSeq transcripts were included in the chrX average diagram, and 4260 RefSeq transcripts were included in that of chr2L. Distance on the diagram was scaled with gene length, so that the diagram shows signal in a region from 50% gene length upstream to 50% gene length downstream.
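The gene-length scaling used for the average diagram can be illustrated with a small coordinate transform. This is a sketch only: the function name is hypothetical, and it assumes that positions are expressed relative to the transcript start in units of gene length and flipped for minus-strand genes, which the text does not spell out.

```python
def scaled_position(pos, tx_start, tx_end, strand="+"):
    """Map a genomic position to gene-length units.

    0.0 is the transcript start and 1.0 the transcript end; the window
    runs from -0.5 (50% of gene length upstream) to 1.5 (50% downstream).
    Returns None for positions outside that window.
    Assumption: tx_start < tx_end in genomic coordinates for both strands.
    """
    length = tx_end - tx_start
    if length <= 0:
        raise ValueError("tx_end must be greater than tx_start")
    if strand == "+":
        rel = (pos - tx_start) / length
    else:
        rel = (tx_end - pos) / length
    return rel if -0.5 <= rel <= 1.5 else None


# A hypothetical 1 kb gene on the plus strand.
for p in (500, 1000, 1500, 2000, 2500, 4000):
    print(p, scaled_position(p, 1000, 2000))
```

Scaling in this way lets genes of different lengths be averaged on a common axis, so that gene-body coverage lines up across transcripts.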
TERC ChIRP-seq Analysis
Reads from the “TERC ChIRP” sample and the “Input” sample were compared against the telomere sequence (CCCTAAx5) and the Alu sequence (GTGATCCGCCCGCCTCGGCCTCCCAAAGTG). Complete matches were tallied and normalized to the total number of reads in that sample to give Reads per Million (RPM). RPMs from the TERC-enriched sample were divided by those from the Input sample to give “Fold Enrichment.” We note that the odd probes yield better enrichment of telomeres than the even probes. Because the genome-wide TERC binding sites require by definition comparable pull-down by both sets of probes, this result raises the possibility that TERC interacts with telomeres and with other genomic binding sites via different mechanisms.
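The repeat tally can be sketched as below. The function names and toy reads are hypothetical, and counting matches on both strands is an assumption (the text only says that complete matches were tallied).

```python
TELOMERE = "CCCTAA" * 5
ALU = "GTGATCCGCCCGCCTCGGCCTCCCAAAGTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def repeat_rpm(reads, repeat, both_strands=True):
    """Reads per million that contain a complete match to `repeat`.

    Assumption: with both_strands=True a read also counts if it
    contains the reverse complement of the repeat.
    """
    targets = {repeat, revcomp(repeat)} if both_strands else {repeat}
    hits = sum(1 for read in reads if any(t in read for t in targets))
    return 1e6 * hits / len(reads)


def fold_enrichment(chirp_rpm, input_rpm):
    """Ratio of RPM in the ChIRP sample to RPM in the input sample."""
    return chirp_rpm / input_rpm


# Toy 36-nt reads (made up for illustration).
chirp_reads = [TELOMERE + "CCCTAA", revcomp(TELOMERE) + "TTAGGG",
               "A" * 36, "C" * 36]
input_reads = [TELOMERE + "CCCTAA", "A" * 36, "C" * 36, "G" * 36]

chirp_rpm = repeat_rpm(chirp_reads, TELOMERE)
input_rpm = repeat_rpm(input_reads, TELOMERE)
print(chirp_rpm, input_rpm, fold_enrichment(chirp_rpm, input_rpm))
```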
HOTAIR ChIRP-seq Analysis
Normalized signal within 10kb upstream and downstream of the summits of true HOTAIR peaks was extracted with a smoothing window size of 50bp. Within each 50bp window, the normalized HOTAIR ChIRP signal is calculated via:
Suz12, Ezh2 and H3K27Me3 ChIP-chip data were generated previously by Gupta et al., 2010, Tsai et al., 2010, and Rinn et al., 2007. ChIP-chip signals of Suz12, Ezh2 and H3K27Me3 within 10kb upstream and downstream of HOTAIR peak summits were also extracted in a similar way.