|Home | About | Journals | Submit | Contact Us | Français|
A global understanding of the actions of the nuclear hormone 1α,25-dihydroxyvitamin D3 (1α,25(OH)2D3) and its vitamin D receptor (VDR) requires a genome-wide analysis of VDR binding sites. In THP-1 human monocytic leukemia cells we identified by ChIP-seq 2340 VDR binding locations, of which 1171 and 520 occurred uniquely with and without 1α,25(OH)2D3 treatment, respectively, while 649 were common. De novo identified direct repeat spaced by 3 nucleotides (DR3)-type response elements (REs) were strongly associated with the ligand-responsiveness of VDR occupation. Only 20% of the VDR peaks diminishing most after ligand treatment have a DR3-type RE, in contrast to 90% for the most growing peaks. Ligand treatment revealed 638 1α,25(OH)2D3 target genes enriched in gene ontology categories associated with immunity and signaling. From the 408 upregulated genes, 72% showed VDR binding within 400kb of their transcription start sites (TSSs), while this applied only for 43% of the 230 downregulated genes. The VDR loci showed considerable variation in gene regulatory scenarios ranging from a single VDR location near the target gene TSS to very complex clusters of multiple VDR locations and target genes. In conclusion, ligand binding shifts the locations of VDR occupation to DR3-type REs that surround its target genes and occur in a large variety of regulatory constellations.
The vitamin D receptor (VDR) is an endocrine member of the nuclear receptor superfamily, since it is activated already by sub-nanomolar concentrations of its natural ligand 1α,25-dihydroxyvitamin D3 (1α,25(OH)2D3) (1). The classical, physiological role of 1α,25(OH)2D3 is the regulation of calcium and phosphate homeostasis and bone mineralization (2), but there is both epidemiological and pre-clinical evidence that VDR ligands have also anti-proliferative and immuno-modulatory actions (3,4). VDR is nearly ubiquitously expressed (5) and many cell types [for example, bone, skin and monocytes (6)] are capable of metabolizing the main circulating form of vitamin D, 25-hydroxyvitamin D3 (25(OH)2D3), to 1α,25(OH)2D3 by the enzyme 25(OH)D3-1α-hydroxylase, encoded by the gene CYP27B1, i.e. 1α,25(OH)2D3 can have in these tissues both auto- and paracrine roles.
In recent years, an increasing amount of data has shown a physiological function of 1α,25(OH)2D3 in the immune system, both in defense and in maintenance of self-tolerance (7). VDR ligands affect the immune system in many different ways, but the most important seems to be the enhanced capacity of innate anti-bacterial defense and a more tolerogenic profile toward autoimmune phenomena (4). One important target cell of 1α,25(OH)2D3 is the monocyte/macrophage, where the nuclear hormone promotes the anti-bacterial defense system by inducing the secretion of anti-bacterial proteins, such as cathelicidin anti-microbial peptide (CAMP) (8). Importantly, decreased anti-bacterial and anti-viral defenses are obvious in vitamin D deficiency, and restoration of the vitamin D-sufficient state restores immune responses (4).
As a member of the nuclear receptor superfamily VDR acts as a transcription factor that binds to specific vitamin D response elements (VDREs) within the regulatory regions of its primary target genes (1). VDR forms a heterodimer with retinoid X receptor (RXR), another member of the nuclear receptor superfamily, and binds preferentially to VDREs that are formed as direct repeat of two hexameric core binding motifs with 3 nucleotides spacing, so-called DR3-type VDREs (9). However, also other types of VDREs, such as direct repeats with 4 intervening nucleotides (DR4) (10) or everted repeats (ERs) with a spacer of 6–9 nucleotides (ER6-9) (11,12) have been reported. Single gene analyses suggest that VDR target genes often contain multiple functional VDREs (13–16).
VDR belongs to those nuclear receptors that bind to their genomic targets already in the absence of ligand. Under this constellation VDR has been shown to associate via co-repressor proteins with histone deacetylases and to repress its target genes (17). Ligand binding induces a conformational change that leads to dissociation of co-repressors and to the association of different types of co-activator proteins that on one hand lead to chromatin opening and on the other hand are part of the mediator complex, which forms a bridge to the basal transcription machinery associated with the transcription start sites (TSSs) of primary VDR target genes (18).
Thus far, both single gene and microarray analyses in various tissues and cells from different species have suggested a long list of VDR target genes. However, the actions of VDR seem to be highly cell-specific, since there is only very little overlap of VDR targets between the different tissues (19–22). Moreover, such studies have often been performed at such late time points that they did not allow distinguishing primary VDR targets from secondary ones. Recently, Ramagopalan et al. (23) reported the first genome-wide view on VDR locations in human lymphoblastoid cell lines suggesting 2776 genomic VDR locations after ligand treatment and 229 target genes. However, also these data were obtained from a very late time point (36h after ligand stimulation) and therefore may not fully represent the primary actions of VDR.
In this study, we used undifferentiated THP-1 human monocytic leukemia cells as a model system, since they reflect reasonably well the 1α,25(OH)2D3 response of primary human monocytes (24,25). To gain a genome-wide view of the primary actions of VDR in these cells, we determined its genomic binding sites after 40min stimulation with 1α,25(OH)2D3 and in the unstimulated state and compared it with the VDR target genes after 4h ligand treatment. We chose human monocytes as cellular model out of interest in the biological role of 1α,25(OH)2D3 signaling in the initial events of immune response. However, the main focus of this study is of mechanistic nature showing that VDR is associated with chromatin even in the absence of ligand, but that the balance of VDR binding shifts from locations that are gene-centered and lacking classical VDREs to locations that are more distal to the genes and contain one or more DR3-type response elements (REs).
THP-1 human monocytic leukemia cells were grown in RPMI1640 medium supplemented with 10% fetal calf serum, 2mM l-glutamine, 0.1mg/ml streptomycin and 100 U/ml penicillin and the cells were kept at 37°C in a humidified 95% air/5% CO2 incubator. Prior to mRNA or chromatin extraction, cells were grown overnight in phenol red-free medium supplemented with charcoal-stripped fetal bovine serum. Then, for RNA extractions cells were treated for 4h with 1α,25(OH)2D3 (Sigma-Aldrich) or with solvent (ethanol, finally 0.1%) or for chromatin immunoprecipitation (ChIP) assays for 40min with 1α,25(OH)2D3 or left unstimulated.
Total RNA was extracted using the Mini RNA Isolation II Kit (Zymo Research, HiSS Diagnostics, Freiburg, Germany). cDNA synthesis was performed for 60min at 37°C using 1µg of total RNA as a template, 100pmol oligo(dT)18 primers (Eurogentec, Seraing, Belgium), 500µM dNTPs, 40U Ribolock Ribonuclease Inhibitor and 40U MMuLV reverse transcriptase (Fermentas, St Leon-Rot, Germany). Prior to real-time quantitative PCR (qPCR) reaction, the cDNA was diluted 10-fold.
qPCR was performed using an ABI 7500 Fast System (Applied Biosystems, Lennik, Belgium). The reactions were performed using 250nM of reverse and forward primers (for sequence see Supplementary Table S1), 4µl 1/10 diluted cDNA template and the ABsolute Q-PCR SYBR Green Low ROX Mix (Abgene, Westburg, Leusden, The Netherlands) in a total volume of 20µl. In the PCR reaction Hotstart Taq polymerase was activated for 15min at 95°C, followed by 40 amplification cycles of 30s denaturation at 95°C, 30s annealing at primer-specific temperatures (Supplementary Table S1) and 40s elongation at 72°C and a final elongation for 5min at 72°C. PCR product specificity was monitored using post-PCR melt curve analysis. Relative expression levels were determined with the comparative delta threshold cycle (ΔCt) method. Amplification efficiencies of the primer pairs were taken into account and have been determined on the basis of a standard curve of a cDNA dilution series (with two exceptions, all PCR efficiencies were between 80% and 120%). Relative expression levels of the target genes were normalized to the three most stable of 10 tested internal control genes (b2M, GAPDH, HPRT1). The expression stability of the control genes was determined by the geNorm algorithm (26). Briefly, the arithmetic mean of replicated Ct values for each gene is transformed to a relative quantity (setting the sample with the highest expression as calibrator to 1), using the ΔCt formula Q=EΔCt=E(calibratorCt−sampleCt) (Q=quantity sample relative to calibrator sample; E=amplification efficiency). For normalization, the relative quantities were divided by the normalization factor being the geometric mean of the three reference genes.
Total RNA was checked for RNA integrity using a Biorad Experion electrophoresis station (Biorad, Nazareth, Belgium) and analyzed on Sentrix Human-6 v2 Expression BeadChips from Illumina (San Diego, CA, USA) at the Finnish Microarray Centre (Turku, Finland) using protocols recommended by the manufacturer. Raw microarray data is available at Gene Expression Omnibus (GEO) under accession number GSE27270 (SubSeries of GSE27438). Microarray data analysis was performed using R statistical software version 2.11 (R Dev Core) with associated libraries from Bioconductor project version 2.6 (27). Data was normalized using VST transformation and RSN normalization used as standard approach for Illumina arrays with the lumi package (28). Probe sets that were not linked to any known or predicted human gene were filtered out. Linear Models for Microarray Data (limma) package (29) using linear model fitting for statistical testing with empirical Bayes variance smoothing procedure was applied to detection of differentially expressed genes. Obtained P-values were corrected for multiple testing using Benjamini–Hochberg false discovery rate (FDR) procedure (30). To identify the biological processes targeted by the 1α,25(OH)2D3-treatment, we used GeneTrail (31) to analyze the enrichment of up and downregulated VDR target genes into GO categories using default settings.
After ligand or no treatment of THP-1 cells, nuclear proteins were cross-linked to DNA by adding formaldehyde directly to the medium to a final concentration of 1% and incubating at room temperature for 15min on a rocking platform. Cross-linking was stopped by adding glycine to a final concentration of 0.125M and incubating at room temperature for 5min on a rocking platform. The cells were collected by centrifugation and washed twice with ice cold PBS (140mM NaCl, 2.7mM KCl, 1.5mM KH2PO4, 8.1mM Na2HPO4·2H2O). The cell pellets were resuspended in 1ml of lysis buffer (1% SDS, 10mM EDTA, protease inhibitors, 50mM Tris–HCl, pH 8.1) and the lysates were sonicated to result in DNA fragments of 200–400bp. Cellular debris was removed by centrifugation. Aliquots of 100µl of the lysate were diluted 1:10 in ChIP dilution buffer (0.01% SDS, 1.1% Triton X-100, 1.2mM EDTA, 167mM NaCl, protease inhibitors, 16.7mM Tris–HCl, pH 8.1) and 1µg of anti-VDR antibody (ab3508-100, Abcam, Cambridge, UK) or non-specific IgG (12-370, Millipore, Temecula, CA, USA) were added and the samples were incubated for overnight at 4°C on a rotating platform. The immunocomplexes were collected using 20µl of Magna ChIP protein A magnetic beads (Millipore) at 4°C for 1h with rotation. The beads were washed sequentially for 5min in rotating platform with 1ml of the following buffers: low salt wash buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA, 150mM NaCl, 20mM Tris–HCl, pH 8.1), high salt wash buffer (0.1% SDS, 1% Triton X-100, 2mM EDTA, 500mM NaCl, 20mM Tris–HCl, pH 8.1) and LiCl wash buffer (0.25M LiCl, 1% Nonidet P-40, 1% sodium deoxycholate, 1mM EDTA, 10mM Tris–HCl, pH 8.1). Finally, the beads were washed twice with 1ml TE buffer (1mM EDTA, 10mM Tris–HCl, pH 8.0) and the immune complexes were eluted twice using 250µl elution buffer (1% SDS, 100mM NaHCO3) at room temperature for 15min with rotation. The supernatants were combined and the immune complexes were reverse cross-linked at 64°C overnight in the presence of proteinase K (Fermentas) in a final concentration of 40µg/ml. DNA was extracted with phenol/chloroform/isoamyl alcohol (25/24/1) and precipitated with 0.1 volumes of 3M sodium acetate, pH 5.2 and 2 volumes of ethanol using glycogen as carrier. For ChIP-seq five individual VDR ChIP samples were pooled and concentrated using a SpeedVac. The ChIP templates were sequenced using a Solexa Gene Analyzer II platform (Illumina) at 36bp read length using standard manufacturer protocols at the Genomics Core Facility in Heidelberg, Germany. For the validation of the ChIP-seq data, selected genomic regions containing VDR peaks were analyzed by quantitative ChIP analysis of independent samples using Maxima™ Sybr Green qPCR Master Mix (Fermentas) and the genomic primers listed in Supplementary Table S2. In addition to VDR and IgG antibodies, immunoprecipitation was performed using 1µg of anti-RXR antibody (sc-553, Santa Cruz Biotechnologies, Santa Cruz, CA, USA). The qPCR reactions were performed with a LightCycler 480 apparatus (Roche) using the following PCR profile: 10min at 95°C, followed by 50 cycles of 30s at 95°C, 30s at 58°C and 30s at 72°C. The results were normalized with respect to input. Non-specific IgG results were subtracted by using the formula 2−(ΔCp)*100(specific antibody)−2−(ΔCp)*100(non-specific IgG), where ΔCp is Cp(immunoprecipitated DNA)−Cp(input) and Cp is the fractional cycle number. Statistical significance was calculated in reference to non-treated sample using two-tailed Student’s t-test.
Alignment of sequence reads against the human reference genome version hg19 was done using Bowtie software version 0.12.2. (32) with the following command line arguments: bowtie -n 1 -m 1 -e 70 -l 28 -k 1 -t -p 8 -q -S --best hg19 input_file_name output_file_name. Of note, one Solexa run with 1, 2 and 1 lanes of IgG, unstimulated and ligand-treated VDR ChIP sample sequencing, respectively, suffered from a minor post-sequencing processing error at the sequencing platform that had resulted in an omission of the 16th nucleotide from the large majority of reads. This was corrected for by inserting an N at the missing position in each read and using two modified settings in the Bowtie alignment (-n 2 -m 3). The short read data publically available under accession number GSE27437 (SubSeries of GSE27438) contains the corrected reads. Aligned reads from multiple lanes were combined per sample. In the case of ChIP-seq analysis, the common IgG background read set was scaled so that an individual background was generated for each VDR ChIP-seq sample by randomly removing IgG reads such that the read ratio IgG/VDR became the same for each VDR ChIP-seq sample. The scaled IgG and un-scaled VDR reads were then converted to sorted BAM format using SAM tools (33) and subsequently also to TDF format using igvtools to allow visualization in the IGV genome browser (34).
For ChIP-seq analysis statistically significant peaks were identified using MACS program version 220.127.116.11 (35) with the following command line arguments: macs --pvalue=1e-2 --nomodel --wig -t input_sample_file_name -c input_control_file_name --tsize=36 --format=BAM --name=analysis_topic --mfold=13 --shiftsize=250 --bw=250 --verbose=3. A loose initial P-value cutoff was purposeful used to enable catching all true signals with less regard to possible false positive signals at this early phase of analysis. Subsequent refinement of MACS peaks was done using PeakSplitter program version 1.0 (36) with the following command line arguments: PeakSplitter -p peak_folder_name -w aligned_read_wig_folder_name -o output_folder -c 5 -v 0.6 -f. An in-house R script was used to calculate fold enrichments (FEs) over IgG, P-values and FDR estimates for the subpeaks in an identical manner used internally in MACS 18.104.22.168. Since splitting the original peaks into several subpeaks may produce large sets of weaker flanking residual peaks, thus biasing for example the genomic and FE distributions of the peaks, we kept for further analysis only those peaks that either had FDR <1% or were the best subpeak (by FDR) within their parental MACS peak.
Further data analysis was conducted using an in-house R pipeline containing tools for identifying peaks overlapping in two samples, the analysis of genomic distribution of peaks and the integration of peak and gene expression data sets. Of note, since there are genomic positions that cannot be uniquely assigned to the selected 10 types of genomic elements used in the analysis of peak distributions therein, a prioritization scheme was employed, where the peaks were uniquely overlapped to the elements in a step-wise exclusive scheme, starting from coding elements (UTRs and coding exons), and then moving to introns and outward from the gene.
For the de novo identification of the VDR binding consensus sequence within ±100bp of the summits of all 2340 FDR <1% VDR peaks we employed the rGADEM R-package with the following settings: GADEM(Sequences, genome=Hsapiens, weightType=1, populationSize=500, numGeneration=20). Of note, the default settings did not identify the DR3-like sequence in the full peak set. The resulting candidate position weight matrices (PWMs) were then compared to the set of known PWMs in the JASPAR database using the MotIV R-package, and their locations mapped back to the original peaks using in-house R-scripts. To scan the peak sequences for PWMs representing different types of DRs and ERs with varying spacer lengths, we used ‘matrix-scan (quick)’ that is one of the regulatory sequence analysis tools [RSAT, available at http://rsat.ulb.ac.be/rsat/ (37)]. For the RSAT scans, background was modeled at Markov order 2 from the ‘upstream-noorf’ subset of the human genome and a P-value cutoff of 1×10−4 was used.
To identify VDR target genes, THP-1 cells were treated for 4h with either vehicle (ethanol) or 1α,25(OH)2D3 and collected for the extraction of total RNA that was subjected to gene expression microarray analysis. We chose the 4h time point as a compromise to obtain reasonable mRNA up or downregulation of VDR target genes, the majority of which still can be considered as primary targets. Statistically significant effect on gene expression levels (adjusted P<0.05) was detected for transcripts representing 638 separate Ensembl genes (604 separate RefSeq gene symbols), out of which 408 (384) were upregulated and 230 (221) downregulated (Supplementary Table S3). The broad expression range of these 638 genes is indicated in Supplementary Figure S1A. In addition to the larger number of affected genes, also the magnitude of change was higher for the upregulated set of VDR target genes (maximal upregulation 6.65-fold), compared to the downregulated set (maximal downregulation 1.48-fold). Validation of 43 selected genes on independent samples by qPCR showed a high correlation (r=0.889, P<1.8×10−14) to the microarray results across a wide range of fold changes (Supplementary FigureS1B). The validation results of five upregulated immunity-related genes CAMP, CD14, SP100, LRRCA8 and CD93 and a downregulated immunity-related gene HSH2D are shown in Supplementary Figure S1C.
Gene ontology (GO) enrichment analysis of the 408 upregulated VDR target genes using GeneTrail (31) highlighted the role of 1α,25(OH)2D3 in the regulation of immune response and signaling cascades and in the positive regulation of higher-level processes (Table 1). In particular, a positive effect of 1α,25(OH)2D3 on the production and secretion of the cytokine interleukin (IL) 1 was implicated. The 230 downregulated genes showed association with the negative regulation of many similar higher-level processes as the upregulated genes. In contrast to upregulated genes, GO terms enriched for downregulated genes lacked categories related to immunity and signaling, but interestingly included a category for transcription regulator activity (Table 1). Combining the up and downregulated genes for the enrichment analysis did not point to additional functions (data not shown).
In summary, we identified in THP-1 cells a set of 638 genes as primary targets of regulation by 1α,25(OH)2 D3, and thus by VDR, that were enriched for genes that are associated with the GO terms related to immunity and signaling.
To identify genomic VDR binding sites, THP-1 cells were treated for 40min with 1α,25(OH)2D3 or remained unstimulated and ChIP assays were performed with antibodies against VDR and non-specific IgGs. Purified chromatin samples were subjected to deep sequencing using a Solexa GAII platform. Analysis of the resulting sequences, using IgG-precipitated sample as a negative control, revealed the genome-wide landscape of the chromatin occupation of VDR in THP-1 cells. The reads at various stages of the analysis are summarized in Supplementary Table S4. In the 1α,25(OH)2D3-treated sample the number of high-confidence (FDR <1%) peaks indicating genomic VDR binding sites was 1820, in contrast to 1169 sites in the unstimulated sample (Figure 1A). The total number of VDR binding sites was 2340, with 649 (27.7%) sites common in both unstimulated and 1α,25(OH)2D3-treated samples and 520 (22.2%) and 1171 (50.0%) unique in unstimulated and 1α,25(OH)2D3-treated samples, respectively (Figure 1A). The genomic locations of the 2340 binding sites are listed in Supplementary Table S5.
VDR binding locations, grouped to unique and common as in the Venn diagram in Figure 1A, showed differences in their distribution to various genomic elements (Figure 1B). Within both the unstimulated and 1α,25(OH)2D3-treated peak sets, the high-confidence peaks, compared to peaks with lower confidence, were enriched to genomic regions representing proximal promoter and exons, and were depleted from distal regions and introns other than the first (Supplementary Figure S2). However, comparing the distributions of unstimulated and 1α,25(OH)2D3-treated peak sets revealed that upon ligand treatment VDR favors significantly more such binding locations that are distal to genes (>1kb upstream or >10kb downstream) and disfavors those representing coding sequences (Figure 1B). The shift of VDR binding from proximal to distal locations upon ligand treatment was also highlighted by the VDR peak proximity to TSS (Figure 1C). The density of VDR binding proximal to the TSS diminished with decreasing peak confidence, i.e. with increasing FDR (peaks with FDR >10% may be considered random-like), for peaks that were either unique in the unstimulated state (left panel) or common in both treatments (middle panel). In contrast, such enrichment was nearly completely lacking in the 1α,25(OH)2D3-treated cells (right panel). Together with Figure 1B, this suggests a non-random enrichment of VDR binding locations to gene-dense regions which, at first glance, appears weaker for the 1α,25(OH)2D3-treated peak set but may in fact indicate enrichment to a subset of genes, such as regulated genes.
Taken together, we have identified in THP-1 cells 2340 genomic VDR binding locations, a surprisingly large proportion of which display at least some binding also in the unstimulated state and not only upon 1α,25(OH)2D3-treatment.
De novo binding site search using rGADEM and MotIV R/Bioconductor packages identified the classical DR3-type RE consensus sequence for VDR–RXR heterodimers as the most highly enriched within ±100bp of the VDR peak summits (E-value 2.65×10−12 for the JASPAR consensus matrix MA0074.1 for RXRA::VDR). Of note, while the JASPAR consensus has near-exclusive T at positions 4, 12 and 13, more variability is allowed at these positions in the de novo DR3-type REs actually occupied by VDR in THP-1 cells (Figure 2A). However, only 31.7% (742) of all VDR peak summits included one or more de novo DR3-type RE. The quality of the identified de novo DR3-type REs appears high, since the more each of the de novo DR3-type REs resembled their common consensus, i.e. had lower E-value, the higher was the FE of the respective VDR peak (Pearson r=−0.217, P<3×10−15, Figure 2B). Hoping to identify any atypical VDR binding consensus sequences for the remaining 68.3% (1598) of VDR peaks, we re-ran the de novo analysis on them only, but even the most frequent consensus nrGGrGswGGrrn, corresponding best to a SP1 binding site in the JASPAR database (MA0079.2, E-value 1.03×10−7), was found in 538 peaks (23.0%). Another consensus sequence, nsAGGAAGn, was even less frequent with 274 peaks (11.7%). The latter had high similarity to the consensus for ETS transcription factor family member SPI1/PU.1 (MA0080.2; E-value 8.50×10−7). De novo analysis of even more focused subgroups, like the peaks unique to unstimulated cells, did not reveal additional consensus sequences (data not shown). Using a set of DR- and ER-type VDREs where the half sites were based on the identified de novo DR3-type RE and which covered all spacer lengths from 0 to 16 nucleotides, VDR peak sequences lacking the de novo DR3-type RE were scanned using RSAT (37). This scan failed to identify any VDRE with statistically significant enrichment to ±100bp of the peak summit versus the two 150bp flanks of that summit region (data not shown). It should be noted that we found no genome-wide evidence that VDR binds to any other type of VDRE than DR3. However, the possibility can not be excluded that a few individual non-DR3 sites exist that bind VDR.
In summary, only about a third of the genomic VDR binding sites in THP-1 cells contained a DR3-type RE and no common element was found to consistently explain the presence of VDR at locations lacking the classical VDRE.
While the de novo DR3-type RE was, at first look, apparently relatively infrequent at sites of VDR chromatin occupation in THP-1 cells (Figure 2), on closer inspection its presence strikingly correlated with the 1α,25(OH)2D3-dependent effects on VDR binding at each location (Figure 3A). The de novo DR3-type RE was present in about 20% of genomic VDR locations where the VDR binding in the unstimulated sample exceeded that in the 1α,25(OH)2D3-treated sample by >10 FE. Conversely, when the VDR binding after a short 40min 1α,25(OH)2D3 treatment superseded that of the unstimulated state, the percentage of peak locations with a DR3-type RE climbed to >90%. Furthermore, the number of de novo DR3-type REs near the peak summit tended to associate with increased peak FE within the common and 1α,25(OH)2D3-exclusive peaks, but not within the unstimulated-exclusive peaks (Figure 3B). The presence or absence of either the de novo SPI1/PU.1 or SP1-like consensus sequences, irrespective of coinciding de novo DR3-type REs, did not affect the FE of VDR peaks (data not shown).
Taken together, even without the ligand VDR occupies many chromatin sites, but most of these do not contain a DR3-type RE. In contrast, even a short ligand-treatment focuses the bulk of VDR binding to sites that do contain at least one DR3-type RE.
Combining gene expression data with the genomic locations of VDR offered an avenue to explore the mechanisms of target gene regulation by VDR. First we mapped the VDR binding locations into the neighborhoods of each gene regulated by the 4h 1α,25(OH)2D3 treatment. For the 408 upregulated genes the bulk of VDR regulation seems to occur via sites within only 30kb from the target gene TSS. However, a considerable proportion of regulation appears to be orchestrated by VDR binding sites up to 400kb away, especially when the sites are unique to the 1α,25(OH)2D3 treatment (Figure 4A, top panel and Supplementary Figure S3A, top right panel). In contrast, the 230 downregulated genes had only slight enrichment for common and 1α,25(OH)2D3-unique peaks near the TSS (Figures 4A, Supplementary Figures S3A and S3B, bottom panels). The frequency of VDR peaks around the TSS was also higher for upregulated than for downregulated genes (2.04±2.54 versus 1.05±2.16 peaks per gene, Wilcoxon rank sum test P-value 5.01×10−13). The strong enrichment of 1α,25(OH)2D3-treated peaks to the 1α,25(OH)2D3-regulated gene neighborhoods confirms regulated genes as the likeliest source to the apparent depletion of these peaks from the proximity of any TSS seen in Figure 1B and and1C.1C. Thus, VDR seems to shift from sites in the vicinity of the TSS of non-regulated genes to the proximity of VDR-regulated genes.
Since the presence of the de novo DR3-type REs strongly associated with ligand-dependency of the VDR binding (Figure 3), we investigated whether this is also reflected by the VDR target gene regulation (Figure 4B). We identified for each regulated gene the strongest VDR binding location in 1α,25(OH)2D3-treated cells within 400kb of the gene TSS and defined them as the most likely regulating VDR locations (Supplementary Table S6). These sites were further split to those in proximal (0 to±30kb of the TSS) and distal regions (±30 to ±400kb). If no VDR peak was present within the 400kb limit, we used from the outlying ‘desert’ the closest peak of any FE to represent a potential very distant regulator. Genes were divided, separately for both up and downregulated sets, into equally sized tertiles according to their stimulated fold change [4h 1α,25(OH)2D3 versus vehicle treatment]. The most likely regulating VDR locations were also divided at median FE to equally sized ‘low’ and ‘high’ VDR peak sets and further to peaks with or without any DR3-type RE near the peak summit. Gene counts in these 12 sub-categories suggest that the higher the VDR peak is, the more likely it is to strongly upregulate its target gene, especially when it contains a DR3-type RE. As already seen in Figure 4A and Supplementary Figure S3, strong regulation by VDR is possible from both proximal and distal sites, but less likely from sites beyond ± 400kb. Also among the genes for which the most likely regulating VDRE was in the ‘low’ category, the proportion of DR3-type RE containing peaks versus peaks without such sites increased toward the high fold change genes. In contrast to upregulated genes, the VDR peak FE or the presence of DR3-type REs associated much less with the fold change of downregulated target genes.
Taken together, 72% of the 408 upregulated 1α,25(OH)2D3 target genes showed VDR binding within 400kb of their TSS, while this applied only for 43% of the 230 downregulated genes.
Examples of previously unidentified VDR target genes that were upregulated by a 4h 1α,25(OH)2D3 treatment, and for which the most likely regulating VDRE could be easily proposed, are shown in Figure 5. SP100, a gene encoding SP100 nuclear antigen that has been identified as a co-repressor for transcription factors, such as ETS1 (38), is characterized by a single, strong and ligand-dependent VDR binding region with a single de novo DR3-type RE located in the first intron at+0.5kb of the gene’s TSS (a category 4 peak, Figure 5A).
Also for ELL, a gene encoding for an elongation factor of RNA polymerase II that has also been identified as a transcriptional co-factor for transcription factors, such as HIF-1 (39), the single dominant VDRE is located in the first intron at +19.3kb of the gene’s TSS, is slightly ligand-dependent although with considerable basal binding and contains a DR3-type RE (Figure 5B). The locus carries also a distal 1α,25(OH)2D3 target gene, LRRC25 encoding the protein leucine rich repeat containing 25 of unknown function, for which the ELL VDR binding region, located 105kb upstream of the LRRC25 TSS, appears the most likely regulator. Therefore, for the ELL gene the VDR peak is of category 4, while for the LRRC25 gene it is of category 5.
The third example is a complex locus containing the three 1α,25(OH)2D3 target genes thrombomodulin (THBD), encoding for the integral membrane protein THBD (also called CD141) (40), CD93, encoding for the C-type transmembrane receptor CD93 (41), and GZF1, encoding for the GDNF-inducible zinc finger protein 1 (42) (Figure 5C). However, the microarray result for GZF1 could not be validated by qPCR. The THBD gene responds most effectively to 1α,25(OH)2D3 treatment, while the strongest ligand-dependent VDR binding region of the genomic locus is 285kb upstream of the THBD TSS and contains eight distinct DR3-type REs and, i.e. it is a category 5 peak. There is also one other good genomic VDR location at 176kb upstream of the THBD TSS, but this lacks a DR3-type RE. The locus has still two additional, but weaker VDR binding regions, one with a DR3-type RE and one without, at 77 and 107kb upstream of the THBD TSS, respectively. Interestingly, there is also an unstimulated-unique VDR binding site within the short, single exon THBD gene. It is tempting to speculate that this VDR binding site could have a repressive role that could, together with the strong ligand-dependent VDR binding site in the vicinity, facilitate the relatively strong upregulation of THBD.
For all three examples the VDR binding at the sites indicated with an arrow was validated from independent chromatin samples of unstimulated and 40min 1α,25(OH)2D3-treated THP-1 cells by qPCR (Figure 5D). In addition to VDR, also the presence of the heterodimerizing partner RXR was verified from the same chromatin samples. In all cases the 1α,25(OH)2D3 treatment significantly increased the amount of both VDR and RXR at the respective VDR binding region. In addition, the upregulation by 1α,25(OH)2D3 of SP100, ELL and THBD mRNA was confirmed on independent samples (Figure 5D).
Additional examples of VDR target gene and VDR binding site scenarios are shown in Supplementary Figure S4. A strong double VDR peak with each a DR3-type VDRE was found very close to the TSS of the uncharacterized VDR target gene in chromosome 20 called open reading frame 197 (C20orf197), which also seems to serve as regulatory site for the VDR target genes C20orf177 and PPP1R3D (Supplementary Figure S4A). The gene ArfGAP with SH3 domain, ankyrin repeat and PH domain (ASAP2) is regulated by three VDR binding sites, of which one is proximal and two are distal (Supplementary Figure S4B). The location of the myosin IXB (MYO9B) gene represents a similar complex locus as that of the THBD gene, as the MYO9B gene seems to be regulated together with the genes OCEL1 and ABHD8 by two proximal VDR binding sites that both contain a DR3-type RE and four proximal sites (Supplementary Figure S4C). As examples of downregulated VDR target genes the genes glucosaminyl (N-acetyl) transferase 1, core 2 (GCNT1) (Supplementary Figure S4D) and SUMO1 pseudogene 1 (SUMO1P1) (Supplementary Figure S4E) are shown having their next VDR binding site nearly 1 Mb and more than 300kb distant from their TSS regions, respectively. Interestingly, the genomic region of SUMO1P1 also contains the 24-hydroxylase (CYP24A1) gene, which is in most cellular systems a strong VDR target gene with a number of ChIP-confirmed VDR binding locations (13,43).
In summary, the eight exemplary genomic loci showed the variation of gene regulatory scenarios ranging from a single VDR location close to the 1α,25(OH)2D3 target gene TSS, more complex ones with one VDR location distal to the TSSs of one or more 1α,25(OH)2D3 target genes and very complex ones with up to six VDR locations clustering with multiple 1α,25(OH)2D3 target genes.
For a general understanding of the mechanisms of 1α,25(OH)2D3 signaling it is essential to monitor the genome-wide location of VDR in relation to primary 1α,25(OH)2D3 target genes. In this study, we investigated the sites of VDR chromatin occupancy in THP-1 human monocytes after 40min treatment with 1α,25(OH)2D3 and in the unstimulated state using ChIP-seq, and related these to the VDR target genes identified using microarrays from the same cellular model 4h after ligand treatment. As expected, GO term analysis of the 638 primary 1α,25(OH)2D3 target genes that we identified in THP-1 cells showed a clear enrichment for immune-related genes. In particular, the positive effect of 1α,25(OH)2D3 on IL1 production and secretion was highlighted, which fits well with previous reports (44,45). However, we did not find IL1A or IL1B within our 638 1α,25(OH)2D3 target genes, but only the gene of the IL1 receptor antagonist (IL1RN) downregulated. The latter suggests that the regulation of IL1 activity by 1α,25(OH)2D3 is indirect via the block of the IL1 receptor. In addition, Kaler et al. (46) suggested that in THP-1 cells 1α,25(OH)2D3 downregulates the transcription factor STAT1, which is central in IL1 regulation.
Since the only nuclear target of 1α,25(OH)2D3 is VDR (47) and the 4h treatment is a reasonably short time, most of the genes that we found in our microarray analysis are assumed to be primary VDR target genes, i.e. VDR should bind a regulatory chromatin region that loops to the target gene’s TSS region. The majority (64%) of the 1α,25(OH)2D3 target genes were upregulated, which is typical for most VDR target tissues (48,49).
Through the choice of early time points both in microarray and ChIP-seq experiments we aimed to detect mainly primary VDR effects. This is in contrast to the recent study of Ramagopalan et al. (23), who reported VDR ChIP-seq data in HapMap lymphoblastoid cells after a long (36h) 1α,25(OH)2D3 treatment, focusing more on the association of VDR binding sites with disease SNPs and aspects of human evolution.
In this study, we identified 2340 genomic sites of VDR binding, 520 of which are exclusive to the unstimulated THP-1 cells. By size, our set of 1820 1α,25(OH)2D3-treated genomic VDR binding sites in THP-1 cells is comparable with the 2776 VDR binding sites in 1α,25(OH)2D3-treated lymphoblastoids (23), but the overlap is only 18.2%. On the level of 1α,25(OH)2D3 target genes identified by microarray analysis the overlap between the two cellular models is even lower at 5.6%. Although the poor overlap may be partly explained by differences in cutoffs when considering high-confidence peaks and target genes as well as the rather different duration of 1α,25(OH)2D3 treatment (40min or 4h versus 36h), it also suggests that VDR utilizes considerably variant binding sites in different cells. In silico screening for VDR binding showed an approximately 100-fold higher number of putative REs than actually used (50). The main regulator for the access to these sites is the organization of chromatin, which is very cell specific. Moreover, in cells of primary origin there is a significant inter-individual variation in chromatin organization and gene expression. Therefore, it is not surprising that monocytes and lymphoblasts have a rather small overlap of VDR binding sites. This is fully in line with the diversity of VDR target gene sets between different tissues (19–22).
In the absence of ligand, VDR has been shown to actively repress its target genes, likely via a mechanism involving an interaction with co-repressor proteins (17,51). Interestingly, only 14% of the 520 genomic VDR binding locations that occur uniquely in the absence of ligand contain a DR3-type VDRE. In contrast, up to 90% of the best 1α,25(OH)2D3-dependent VDR binding regions contain at least one DR3-type RE. This means that upon ligand treatment the VDR occupancy is strongly shifted from non-DR3 locations to DR3-type RE locations. It is possible that the non-DR3 locations may serve as a nuclear store of VDR to be utilized rapidly upon the introduction of the ligand, partly substituting for the need to transport VDR into the nucleus from outside. This mechanism may also apply to retinoic acid and thyroid hormone receptors, but not for the classical steroid receptors, since they do not bind to DNA in the absence of ligand.
Interestingly, for 300 (73.5%) of the 408 upregulated 1α,25(OH)2D3 target genes we found VDR binding within 400kb of their TSS, while this applied only for 104 (45.2%) of the 230 downregulated genes. This suggests that the mechanisms of downregulation of VDR target genes are rather complex and/or varied, and may require gene-specific investigations as demonstrated for the CYP27B1 gene (52). In this case the repressive function of VDR is known to result from indirect interaction with the chromatin, via transcription factor 3, also known as VDR interacting repressor (53). An additional question is whether for 1α,25(OH)2D3 target genes there is a ligand-induced derepression mechanism as described for few other members of the nuclear receptor superfamily (54). From the 408 1α,25(OH)2D3 target genes that we can explain via a VDR peak within 400kb of their TSS only 11 (2.7%) meet the derepression criteria that they have a VDR peak in the unstimulated sample and no peak in the 1α,25(OH)2D3-treated sample. Additional 32 genes (7.9%) can be called dominantly derepressed, since their main peak shows only in the unstimulated sample. This indicates that for some 10% of all 1α,25(OH)2D3 target genes a derepression mechanism may apply.
The fact that DR3-type REs are detected by our de novo motif search as the most dominating sequence in the genomic VDR peak regions confirms earlier reports (9,55) and the general belief in the field that DR3-type REs are the main type of VDREs. Nevertheless, only 31.7% of the identified VDR binding regions contain at least one DR3-type RE, i.e. the majority of genomic VDR binding sites do not use this type of VDRE. Since the most dominant ligand-induced VDR binding sites are highly enriched for DR3-type REs that, moreover, preferentially locate to the neighborhoods of upregulated 1α,25(OH)2D3 target genes, it is likely that DR3-type REs are primarily used in gene activation. However, this leaves the question about the nature of VDR binding at those 68.3% of locations that lack a DR3-type VDRE. Although we could not identify any non-DR3-type VDRE that would explain any significant proportion of the VDR binding, it does not exclude the possibility that a low number of such sites exist. Indeed, a case of repressive gene regulation via an ER9-type VDRE has been demonstrated by Polly et al. (17). Other possible explanations include sites where VDR binds indirectly to genomic DNA via other transcription factors, such as SP1, although again our de novo motif analysis did not identify specific enrichment of other transcription factor binding sites in VDR binding locations lacking a DR3-type VDRE compared to those that had it. A probably more likely explanation for both the VDR-peak-less target genes as well as the target gene-less VDR peaks comes from the recent demonstration that gene regulation by VDR is a very dynamic process with rapid changes of VDR binding site occupancy and, consequently, on the mRNA accumulation of the 1α,25(OH)2D3 target genes (13,56). Therefore, the time points chosen in this study represent only snap-shots of the early actions of 1α,25(OH)2D3 and its receptor VDR. Due to the fast dynamics of VDR binding site occupancy and resulting fluctuations in target gene mRNA amounts, it is likely that without time-course data a considerable proportion of all transient VDR binding sites remains unknown.
Although most of the upregulated genes may be explained by a more unified mechanism of one or multiple VDREs, as already suggested for various VDR target genes, such as CDKN1A (57), CYP24A1 (13,43), ALOX5 (16) and CCNC (14), we demonstrate here with three exemplary genomic regions that there is quite a variation in the gene regulatory scenarios of upregulated genes. Genome-wide there are only about 20 genes that show, as in the case of the SP100 gene, a single VDR location close to the target gene TSS. More common are situations where either one target gene has multiple VDR binding sites in its regulatory region or, as shown for ELL and LRRC25, a pair of closely located VDR target genes share one or more VDR binding sites. The first constellation we have already demonstrated for the above mentioned VDR target genes, while the second scenario applies for the members of the IGFBP gene family (15). However, the most complex regulatory situations we found in the clusters around the VDR target genes THBD and MYO9B, which each contains five or six VDR binding sites of different characteristics.
In conclusion, our genome-wide characterization of early-responding VDR binding sites close to primary 1α,25(OH)2D3 target genes identified a clear shift of VDR binding from non-DR3 sites to DR3-type REs upon ligand treatment. Furthermore, our data suggest regulatory explanations for a large majority of especially upregulated 1α,25(OH)2D3 target genes in THP-1 cells.
The gene expression microarray and sequence data have been submitted collectively to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession number GSE27438.
Supplementary Data are available at NAR Online.
Academy of Finland; Juselius Foundation, Télévie (Luxembourg); Luxembourgish National Research Fund (FNR); University of Luxembourg. Funding for open access charge: University of Eastern Finland.
Conflict of interest statement. None declared.
Kind thanks to the Finnish Microarray and Sequencing Centre in Turku, Finland for microarray services and members of the Benes team for massive parallel sequencing service. S.H., S.V. and C.C. designed the project; S.V. and S.S. performed the experiments; S.H., P.P. and V.B. analyzed the data and S.H. and C.C. wrote the paper.