F9 mouse EC cell culture
EC cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum and 40 μg/ml Gentamicin. Cells were seeded in gelatine-coated tissue culture plates (0.1%) and ATRA was added to the plates in a final concentration of 1 × 10−6 M at different time points. For assays involving RAR subtype-specific agonists, cells were incubated with BMS961 (RARγ specific; final concentration 10−7 M), BMS753 (RARα specific; final concentration 10−6 M) or BMS641 (RARβ specific; final concentration 10−6 M).
ChIP and reChIP assays
Cells were fixed with 1% para-formaldehyde (Electron Microscopy Sciences) for 30 min at room temperature. ChIP assays were performed following standard conditions: chromatin sonication and immunoprecipitation in lysis buffer (50 mM Tris–Cl pH=8, 140 mM NaCl, 1 mM EDTA, 1% Triton, 0.1% Na-deoxycholate) complemented with protease inhibitor cocktail (Roche 11873580001); 2 × washes with lysis buffer; 2 × washes with lysis buffer containing 360 mM NaCl; 2 × washes with washing buffer (10 mM Tris–Cl pH=8, 250 mM LiCl, 0.5% NP-40, 1 mM EDTA, 0.5% Na-deoxycholate); 2 × washes with 1 × TE; elution at 65°C; 15 min in elution buffer (50 mM Tris–Cl pH=8, 10 mM EDTA, 1% SDS). RXRα and RARγ have been immunoprecipitated with antibodies generated by immunization of rabbits with the following peptides:
mRXRα: PB105 (MDTKHFLPLDFSTQVNSSSLNSPTGRGC),
mRARγ: PB288 (CSKPGPHPKASSEDEAPGGQGKRGQSPQPD).
Polyclonal anti-RXRα and anti-RARγ were purified from the crude serum by affinity chromatography. RNA Polymerase II (sc-9001 H-224), p300 (sc-584 N-15) and Rac3/NCoA-3 (sc-9119) antibodies were purchased from Santa Cruz biotechnology (Santa Cruz, CA). For RXRα, RARγ, p300 and RAC3 6 × 106 cells were used per ChIP, whereas for RNA polII 2 × 106 cells were used. For reChIPs, at least four ChIP assays of 6 × 106 cells were used for the first IP. For reChIPs, the first antibody (anti-RXRα) was covalently linked to the sepharose protein A (Sigma P92424) using disuccinimidyl suberate (DSS). The covalently linked Ab beads were washed with ethanolamine (0.1 M), followed by glycin at pH=2.8 to remove non-covalently linked antibodies from the beads. Beads were washed with 50 mM sodium borate at pH=8.2 and PBS, and were incubated overnight at 4°C with the corresponding whole cell extract as in a regular ChIP assay. Following standard washing, elution was performed with 10 mM DTT (30 min, 37°C). The eluates from four ChIPs were combined, diluted at least 30 times with lysis buffer (containing protease inhibitors like in a regular ChIP assay), and incubated overnight with the second antibody (anti-RARγ) and sepharose Protein A beads at 4°C. The subsequent steps were performed as for regular ChIPs.
The immunoprecipitated chromatin was validated using positive (recruitment to known targets) and negative (‘cold' region) controls and the binding was expressed as enrichment relative to the whole cell extract input control (% input) and/or relative to a ‘cold' region (fold occupancy); validation assays were performed by quantitative real-time PCR (qPCR, Roche LC480 light cycler device) using Quantitect (Qiagen).
Massive parallel sequencing
After qPCR validation, immunoprecipitated chromatin was quantified using Qubit (Quant-It dsDNA HS Assay Kit; Invitrogen). In all, 10 ng of the ChIPed material was used for preparing the sequencing library (ChIP-seq DNA sample preparation kit, Illumina). In all, 5 pmol of the library was used per flow cell in the Solexa 2G Genome analyzer (Illumina). The Illumina Pipeline v1.4.0 was used for primary data analysis (image processing: Firecrest; base calling: Bustard; alignment: Gerald) from which uniquely aligned reads with up to two mismatches relative to the mouse mm9 reference genome were kept.
Peak detection approach
MACS v. 1.3.6 (Zhang et al, 2008
) was used as peak caller at the first level of data treatment to obtain signal intensity wiggle files and annotated peak regions using a Poisson model distribution (P
-value confidence cutoff: 1 × 10−5
). Considering that a certain number of false positive regions were observed during this treatment, which in our hands cannot be decreases without compromising true positives by playing with the P
-value confidence parameter, we used subsequently an approach based on a machine-learning algorithm able to define the model distribution associated with the ChIP-seq data set under study. This approach, implemented in the R package MeDiChI, has been previously described (Reiss et al, 2008
) for analysis of ChIP-chip assays; its implementation for ChIP-seq assays will be described elsewhere (manuscript in preparation).
RT–PCR and microarrays
Total RNA was extracted from cells, treated with ATRA during different period of times, using the GENEEluteTM RNA extraction kit (Sigma). In all, 2 μg of the eluted RNA was used for reverse transcription (AMV-RTase, Roche; oligo(dT) New England Biolabs; 1 h; 42°C). The cDNA was diluted 10-fold and used for real-time qPCR (Roche LC480). Expression of the following marker genes was assessed to follow the process of F9 cell differentiation:
For early response to the differentiation inductor (ATRA):
Hoxa1 (CCCAGACGGCTACTTACCAG; CATGGGAGTCGAGAGGTTTC);
Rarb (GATCCTGGATTTCTACACCG; CACTGACGCCATAGTGGTA).
For late response to the differentiation inductor (ATRA):
Col4a1 (ATGCCCTTTCTCTTCTGCAA; ATCCACAGTGAGGACCAACC);
Lama1 (CCGACAACCTCCTCTTCTACC; TCTCCACTGCGAGAAAGTCA);
Lamb1 (TCTATGCTCGGCAGTGTGAC; CAGTGGTCTCCTGACCCAAT);
Lamc1 (GGCCGAGTGCCTACAACTT; CAGTGGCAGTTACCCATTCC).
During data analysis, all qPCR values were normalized relative to the constitutively expressed 36B4 mRNA (AATCTCCAGAGGCACCATTG; CCGATCTGCAGACACACACT).
Using 250 ng of starting total RNA, biotin-labelled cDNAs were synthesized and hybridized on Affymetrix GeneChip® Mouse Gene 1.0 ST Array (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's recommendations as described in ‘GeneChip® whole transcript sense target labelling assay manual' (P/N 701880 Rev.4). The arrays were further washed and stained with streptavidine-phycoerythrin in an Affymetrix GeneChip® Fluidics station 450 using the script protocol F450-0007, then scanned with an Affymetrix Gene Chip Scanner 3000 7G. Expression values were generated with Affymetrix software Expression Console version 1.1, using sketch quantile normalization and median polish summarization as in Robust Multiarray Analysis.
RNA polymerase II ChIP-seq data processing
RNA Polymerase II genome-wide intensity profiles were generated using MACS v. 1.3.6 (Zhang et al, 2008
) as peak caller and signal intensity profiles were processed by MeDiChI to identify chromatin regions presenting significant RNA Polymerase II enrichment (P
-value cutoff: 1 × 10−2
). MeDiChI-predicted enriched regions were further filtered according to their genomic localization by comparing them with the mm9 coding region Ref-seq annotation. Thereby coding regions presenting significant levels of RNA PolII at the TSS were identified and selected for further analysis. A Quantile normalization procedure was used to enable a comparison between different samples. Normalized profiles were compared in the context of their signal intensities to identify local changes in the Pol II occupancy relative to the control sample. Note that this approach differs from previously described linear corrections between samples based on total numbers of reads; it has been implemented in the R package POLYPHEMUS (Mendoza et al
, submitted). The final comparisons between the PolII levels at the TSS relative to the average behaviour through the gene body have been used as readouts to identify and describe genes presenting an enhancement of their transcriptional activity at the different analysed time points ().
The global RXRα and RARγ localization at different time points during ATRA-induced differentiation has been followed by ChIP-seq. To increase the confidence in binding localization assessment, we collected the mappable reads from all five time points in a single file called metaprofile. The metaprofile associated with the localization of RXRα and RARγ was processed by MACS and MeDiChI to identify significant enriched regions. Interestingly, this approach generates a higher signal/background contrast, thus increasing the sensitivity of peak calling. Taking in consideration the heterodimeric nature between RXRα and RARγ, we compared the two metaprofiles to identify the optimal confidence parameter at which the coexistence between RXRα and RARγ is maximal (see ). Indeed, for a P-value threshold of 1 × 10−4 (CT40; where CT=−10 × log (P-value)) all identified RARγ sites in the metaprofile colocalize with RXRα. These co-occupied sites were further evaluated in the context of their temporal RXRα and RARγ co-occupancy. Whereas a CT40 has been used for metaprofiles binding sites selection, the same parameters are too stringent for the analysis of time-point profiles. We have taken a CT25 per time-point profile for evaluating the RXRα and/or RARγ occupancy at the pre-identified sites ().
RXRα–RARγ co-occupied sites (metaprofiles
comparison) were annotated based on their proximity (<10 kb) to transcriptionally active coding regions (upregulation and downregulation levels relative to control sample: ratio cutoff
0.5, respectively); annotated genes are referred to as ‘putative' target genes. In order to gain temporal information about the correlation between RXRα–RARγ binding and putative target gene transcription, the RXRα and RARγ binding at a given time point was compared with the temporal mRNA levels of putative target genes. To further remove possible false positive annotations, genes presenting coexistence between RXRα–RARγ binding and differential mRNA expression levels at least at one time point were retained during selection. For classification purposes, the coexistence of several events was scored per time point following a hierarchical order of importance in the context of gene regulation:
Finally, the temporal incidence of the evaluated events was classified using an SOTA approach (Euclidean distance, Max. cycles=7, cell variability P
-value=0.01) under the open access multiExperiment Viewer (Saeed et al, 2003
; see ).
Dynamic regulatory maps and Network reconstruction
RXRα–RARγ putative target gene information as well as further TF-gene regulatory interaction annotations extracted from the NCBI database and/or predicted by MatInspector (Cartharius et al, 2005
) were combined into a binary matrix (‘1' for TF-target gene association; otherwise ‘0'). Furthermore, time-point transcriptomics data were expressed in log2 ratios relative to the 0-h control. These two data sets were integrated into the DREM (Ernst et al, 2007
) to establish dynamic regulatory maps. In addition to classifying temporal sets of gene expression data into co-expression paths, DREM predicts BPs, defined as a time point at which the co-expression behaviour of a subset of genes diverges from the common path present in a previous time point. Moreover, the association of a given TF with the corresponding BPs is estimated by evaluating the enrichment of its target genes using hypergeometric distributions relative to the genes associated with the common path.
To construct gene networks, genes associated with each predicted co-expression path were evaluated in the context of their functional co-citation relationships (Genomatix Bibiosphere PathwayEdition). To identify functional co-citation interactions between different co-expression paths, all genes used for the initial analysis were evaluated for their functional co-citations relationships as explained before. Finally, the intra- and inter-co-expression paths co-citation interactions were integrated into Cytoscape (Shannon et al, 2003
; Cline et al, 2007
). To further increase the confidence of the reconstructed network, topological information concerning the number of edges per node (Hubba; Lin et al, 2008
) were used as filter in the final representation. Briefly, we have used Hubba's Double screening scheme (DSS) approach, which combines a Maximum Neighbourhood Component (MNC) with a Density of Maximum Neighbourhood Component (DMNC) to score nodes based on the number of their interconnections. The final gene-network representation corresponds to the top 100 ranked nodes (red to green colour code) as well as their corresponding first level neighbours. Additional information, like TF annotations, BMS961 and ATRA responsiveness, number of functional co-citations between nodes (edge's broadness) and co-expression path belonging (colour coded) are also included in the final display of this RXRα–RARγ induced genes network, which is also available in Cytoscape format (Supplementary File S1
). Transcriptomics data associated with ATRA and RAR subtype-specific agonist treatments were also included as attributes for the reconstructed network (Supplementary Files S2–S6
F9 cells were transfected with siRNA oligomers directed against RNA target sequences for the following TFs: Hoxb2 (QIAGEN; FlexiTube GeneSolution: Cat. No. SI01069089, SI01069096, SI01069075, SI01069082; working concentration: 50 nM); Hoxb5 (QIAGEN; FlexiTube GeneSolution: Cat. No. SI01069173, SI01069180, SI01069159, SI01069166; working concentration: 50 nM); Foxa1 (QIAGEN; FlexiTube GeneSolution: Cat. No. SI01004493, SI01004500, SI01004479, SI01004486; working concentration: 50 nM); Foxa2 (QIAGEN; FlexiTube GeneSolution: Cat. No. SI01004528, SI02737182, SI01004514, SI01004521; working concentration: 50 nM); Gata4 (QIAGEN; FlexiTube GeneSolution: Cat. No. SI01009813, SI01009820, SI01009799, SI01009806; working concentration: 50 nM). In addition, F9 cells were transfected with an siRNA oligomer directed against the RNA target sequence of GFP (Dharmacon; P-002048-01-20; working concentration: 50 nM) as a control for the specificity of the assay. The transfection efficiency has been followed by co-transfecting the previous oligomers with the 6-FAM labelled siGLO transfection indicator (Dharmacon; D-001630-01-05; working concentration: 50 nM). F9 cells were transfected with Lipofectamine 2000 (Invitrogen; Cat. No. 11668-027) during 18 h, then medium has been changed and cells were treated either with ATRA or with ethanol during 48 h as previously described.
Affymetrix microarrays as well as Illumina platform ChIP-seq data described in this study are available from the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo
) under accession number GSE30539