|Home | About | Journals | Submit | Contact Us | Français|
Signal Transducers and Activators of Transcription (STAT) 5A/B regulate cytokine-inducible genes upon binding to GAS motifs. It is not known what percentage of genes with GAS motifs bind to and are regulated by STAT5. Moreover, it is not clear whether genome-wide STAT5 binding is modulated by its concentration. To clarify these issues we established genome-wide STAT5 binding upon growth hormone (GH) stimulation of wild-type (WT) mouse embryonic fibroblasts (MEFs) and MEFs overexpressing STAT5A more than 20-fold. Upon GH stimulation, 23827 and 111939 STAT5A binding sites were detected in WT and STAT5A overexpressing MEFs, respectively. 13278 and 71561 peaks contained at least one GAS motif. 1586 and 8613 binding sites were located within 2.5kb of promoter sequences, respectively. Stringent filtering revealed 78 genes in which the promoter/upstream region (−10kb to +0.5kb) was recognized by STAT5 both in WT and STAT5 overexpressing MEFs and 347 genes that bound STAT5 only in overexpressing cells. Genome-wide expression analyses identified that the majority of STAT5-bound genes was not under GH control. Up to 40% of STAT5-bound genes were not expressed. For the first time we demonstrate the magnitude of opportunistic genomic STAT5 binding that does not translate into transcriptional activation of neighboring genes.
Signal transducers and activators of transcription (STAT) 5A and 5B (referred to as STAT5) are latent transcription factors that are activated by cytokines, such as interleukins, erythropoietin, thrombopoietin, growth hormone (GH) and prolactin, through their respective receptors (1). Mice, from which the Stat5 genes had been deleted globally or in defined cell populations have shed light onto general and cell-specific functions. These include erythropoiesis, immune functions, liver metabolism, body growth and mammary gland development during pregnancy (2–10). In addition, humans carrying homozygous mutations in the Stat5B gene are characterized by growth defects, due to impaired IGF1 signaling and immune defects (11). STAT5 levels vary greatly between tissues and evidence is mounting that distinct levels are needed for different cellular functions (12). In turn, excessive activation of STAT5 has been linked to the progression of solid tumors and leukemia (13,14).
Genes containing GAS motifs (consensus sequence TTCnnnGAA) bound in vivo by STAT5 and whose expression is STAT5-dependent are considered direct STAT5 targets. Gene expression studies on cells ectopically overexpressing STAT5 (15) and siRNA-mediated reduction of STAT5 (16) have been employed in the identification of putative target genes. Last, comparative gene expression analyses of wild-type (WT) and STAT5-null cells and tissues have provided additional candidate target genes (17). In many cases, GAS motifs were detected within regulatory regions of these putative STAT5 target genes. However, due to the experimental design of these studies, it is not necessarily possible to distinguish between bona fide STAT5 target genes and genes whose induction is secondary to the STAT5 signaling pathway. Excessive STAT5 activation has been observed in various tumor cells and a case has been made that this occurs through the aberrant activation of STAT5 target genes (18). However, there is little experimental evidence to suggest that elevated STAT5 levels and activity are able to control otherwise dormant STAT5 targets genes in vivo.
Here, we have addressed two questions pertaining to STAT5 as a cytokine-controlled DNA binding protein and transcriptional activator. First, we determined GH-induced genome-wide binding of STAT5 and asked to what extent a 20-fold elevated concentration of STAT5 affects its binding pattern. Second, we asked to what extent the genes that bind STAT5 are controlled by STAT5 and GH. For this purpose, we have established a well controlled experimental system that permitted qualitative and quantitative analyses of STAT5 binding and corresponding transcriptional consequences. We have used mouse embryonic fibroblasts (MEFs) with three distinct genotypes; WT, Stat5−/− and Stat5−/− ectopically expressing STAT5A at a more than 20-fold elevated concentration. ChIP-seq experiments were used to identify genomic loci that bind STAT5 at normal and elevated concentrations upon GH stimulation. Subsequently, we analyzed the corresponding gene expression. This approach allowed us to address the following questions: (i) How many genomic loci bind STAT5 in the presence of GH? (ii) Does a greatly elevated STAT5A concentration alter STAT5 binding quality and quantity? (iii) Does STAT5 binding correlate with GH-dependent and independent transcription of neighboring genes?
Mice were handled and housed in accordance with the guidelines of National Institute of Diabetes, Digestive and Kidney Diseases (NIDDK) Animal Care and Use Committee. Stat5+/− mice (2) were bred into the C57BL/6 background. Stat5+/− mice were intercrossed and MEFs were isolated from 12.5–13.5-day fetuses. MEFs were expanded in DMEM with high glucose supplements and 15% fetal bovine serum (FBS). MEFs of passage 9–10 were used for experiments. After 5h of starvation in serum-free medium with 0.1% of BSA, MEFs were treated with growth hormone for indicated time period.
The retroviral-expression vector carrying a WT Stat5A gene was based on an MSCV-IRES-GFP backbone (gift from Richard Moriggl, Ludwig-Boltzmann Institute, Vienna, Austria). 293T cells were transfected with the plasmid using FuGENE (Roche, Indianapolis, IN, USA). Supernatants were collected for 48–72h after transfection and passed through a 0.45-μm filter before freezing at −80°C. For the infection, 106 Stat5−/− MEFs were seeded on a 10-cm culture dish and infected the next day with retrovirus in the presence of 8μg/ml polybrene. After infection, non-fluorescent cells and GFP-expressing cells were isolated using BD FACS Vantage Cell Sorter Flow Cytometer (Becton Dickinson, San Jose, CA, USA) and sorted directly into PBS. Sorted MEFs were maintained in Dulbecco’s Modified Eagle medium (DMEM) supplemented as described above.
After 5h starvation followed by 45min stimulation with GH, protein was extracted from primary Stat5+/+ MEFs and Stat5−/− MEFs overexpressing Stat5A. In total, 1× 106 cells were pelleted by centrifuging for 1min at 1500g and then washed once with ice-cold PBS. Cells were lysed by adding lysis buffer containing: 50mM Tris pH 7.4, 150mM NaCl, 10% glycerol, 1% IGEPAL (NP-40), 2mM EDTA, 100mM NaF and protease inhibitor cocktail (Roche) and incubating for 30min on ice. Lysates were cleared by centrifugation in at 4°C in a microfuge at maximum speed. Protein concentration was determined by Bradford reagent (Bio-Rad, Richmond, CA, USA). A quantity of 10µg of total protein was electrophoretically separated and transferred to polyvinylidene fluoride (PVDF) membranes using the NuPAGE system (Invitrogen, Carlsbad, CA, USA). Membranes were probed with anti-STAT5 (C-17), anti-STAT3 (C-20) and anti-beta actin antibodies (C-4) (Santa Cruz Biotechnology, Santa Cruz, CA, USA) and phospho-STAT5 (Y694) antibody from Cell Signaling Technology (Danvers, MA, USA) and HRP-conjugated secondary antibodies (GE Healthcare, Wauwatosa, WI, USA). Proteins were detected using enhanced chemiluminescence substrate (Pierce, Rockford, IL, USA), Kodak MR film (Kodak, Rochester, NY, USA) and Amersham Hyperfilm ECL (GE Healthcare, Wauwatosa, WI, USA).
After 5h of starvation, MEFs were cultured for 2h in the absence or presence of GH and total RNA was isolated using RNeasy mini kit (Qiagen, Valencia, CA, USA). A quantity of 1μg amounts of RNA were reverse transcribed (cDNA reverse transcription kit; Applied Biosystems, Foster City, CA, USA). Real-time quantification of mRNA transcript levels was performed using the TaqMan gene Expression Master Mix (Applied Biosystems, Foster City, CA, USA) according to the manufacturer’s instructions. Real-time PCR was carried out using an ABI Prism 7900HT (Applied Biosystems, Foster City, CA, USA). TaqMan probes for STAT5a (Mm00839861_m1), Socs1 (Mm00782550_s1), Socs2 (Mm00850544_g1), Bcl6 (Mm00477633_m1) and beta-actin (4352341E) were used (Applied Biosystems, Foster City, CA, USA) for Real-time PCR.
After 5h of starvation, cells were treated for 2h with GH. Total cellular RNA from each group of the MEFs was extracted with TRIzol reagent (Invitrogen) according to the manufacturer’s instructions. Microarray analyses were performed using Affymetrix Mouse Genome 430 2.0 GeneChips (Affymetrix, Santa Clara, CA, USA) (four groups, biological triplicates for each group). Expression values were determined with GeneChip Operating Software (GCOS) v1.1.1 software. Robust Multichip Average (RMA) signals were summarized using GeneSpring GX 10.0.1 (Agilent) and normalized by quantile normalization. All data analysis was performed with GeneSpring software GX 10.01.
Chromatin immunoprecipitation coupled by illumina sequencing (ChIP-seq) experiments were performed as described previously (19,20). In brief, after 5-h starvation, WT, Stat5−/−, or STAT5A-Stat5−/− MEFs were stimulated with or without 1µg/ml GH for 45min. MEFs were then cross-linked with 1% formaldehyde for 10min. Chromatin from 5×106 cells was used for each ChIP experiment. Antibodies against STAT5 (sc-835, Santa Cruz, CA, USA) and IgG (AB-105C, R&D System) were used. The ChIP DNA fragments were blunt-ended, ligated to the Solexa paired-end adaptors and sequenced with the Illumina Hi-seq 2000 genome analyzer.
Experimental data sets came from WT (WT-GH), Stat5−/− (KO-GH) and Stat5−/− overexpressing STAT5A (OE-GH) MEFs upon GH stimulation. The mapped tags of STAT5 antibodies and IgG samples were analyzed using the MACS program (21) with the following parameters; -mfold=10,20 (fold enrichment threshold for the detection of highly enriched regions) -g mm (mouse genome) –p 1.0e−4 (P-value cut-off). The identified peaks were further split by using PeakSplitter (22). ChIP-seq data obtained from GH-treated Stat5−/− MEFs (Transfected with an empty vector, KO-empty-GH) served as negative controls and were independently processed with the same parameters. A total of 28343 (WT-GH), 117,771 (OE-GH), 38,451 (KO-GH) and 23,404 (KO-empty -GH) peaks were identified. Since the identified STAT5 binding peaks obtained in Stat5−/− MEFs (KO-GH and KO-empty-GH) should be false-positives; we only used those peaks that did not overlap with those of the KO sets. With this approach, a total of 23827 and 111939 STAT5 peaks were identified in the WT-GH and OE-GH sets, respectively. STAT5 binds to DNA directly and its DNA binding motif (GAS motif) has been well defined. We utilized the GAS motif information to accurately identify bona fide STAT5 binding sites. The Find Individual Motif Occurrences (FIMO) DNA motif identification tool (23) was used to scan the peak regions with a STAT5 position-specific scoring matrix (P-value threshold: 1e−3). Out of the 23827 and 111939 STAT5 peaks, 13278 and 71561 peaks contained at least one GAS motif within their peak regions. These final peaks were regarded as bona fide STAT5 binding sites and used in this study. For the comparison analysis (Figures 3 and and4),4), the total number of tags was normalized to 10 million.
Genes were classified based on the quantity of STAT5 binding within a 2.5kb promoter region and/or adjacent 8kb of upstream region. STAT5 binding within each promoter region was measured by means of the maximum peak height, as well as the sum of peak heights. The maximum peak height (max_peak_height) indicates STAT5 affinity to target sites, whereas the sum of peak heights represents the overall amount of STAT5 binding. The sum of peak heights was converted to z-score for comparison (z-score_sum). To identify bona fide STAT5 target genes, the following peaks were defined as strong (or weak) peaks; either max_peak_height ≥95th percentile (70th for weak) among the heights of identified peaks or z-score_sum≥5. With the defined peaks, genes were classified as follows; class I—genes with STAT5 bindings in both WT-GH and OE-GH promoter regions, class II—genes with STAT5 bindings only in OE-GH promoter region, class III—genes with STAT5 bindings in both WT-GH and OE-GH upstream regions and class IV—genes with STAT5 bindings only in OE-GH upstream region. With the strong peaks, 29, 161, 49 and 186 genes were categorized into class I, II, III and IV, respectively.
WT, Stat5−/− and Stat5−/− MEFs overexpressing mouse STAT5A (Stat5−/−; Stat5A) MEFs were prepared (see ‘Materials and Methods’ section). The Stat5a mRNA concentration in Stat5−/−; Stat5A MEFs was more than 100-fold higher than in control cells (Figure 1A). The STAT5A protein level in Stat5−/−; Stat5A MEFs was at least 20-fold higher than in control MEFs (Figure 1B). Similarly, greatly elevated p-STAT5 levels were observed in Stat5−/−; Stat5A MEFs (Figure 1B) upon GH stimulation. This demonstrates that neither the GH receptor nor JAK2 were limiting in this system and that the elevated STAT5 levels could be activated.
To identify genetic loci readily recognized by activated STAT5 in vivo, we used ChIP-seq to analyze WT Stat5−/− (WT-GH) and STAT5A overexpressing Stat5−/−; Stat5A MEFs (OE-GH) upon GH stimulation. Stat5−/− MEFs and rabbit serum IgG served as negative controls. Control and experimental MEFs were grown to ~70% confluency, starved for 5h, followed by GH stimulation for 45min and ChIP-seq analysis (see ‘Materials and Methods’ section).
The results were analyzed as follows. Raw mapped tags of the WT-GH and OE-GH samples were processed with corresponding IgG controls by using a peak-calling program, MACS (see ‘Materials and Methods’ section for details) (21). A total of 28343 and 117771 regions were initially identified in the WT-GH and OE-GH sets, respectively. Subsequently, regions overlapping with STAT5 enriched sites in the Stat5−/− MEFs samples were discarded. We pinpointed GAS motifs within the binding regions using the FIMO motif identification tool (23) and defined peaks containing at least one GAS motif as bona fide STAT5 binding sites. These high quality peaks were used throughout this study. In sum, 13278 and 71561 STAT5 peaks were regarded as bona fide STAT5 binding sites in WT-GH and OE-GH MEFs, respectively (Figure 2). In total, 4690 peaks were shared between the two data sets and 66871 were de novo sites (Figure 2A).
To further identify bona fide STAT5 target genes, we focused on peaks located within 2.5kb of promoter sequences (−2kb to +0.5kb) or upstream sequences (−10 kb to −2kb). Based on the max_peak_height we called strong (95th percentile) and weak (70th percentile) peaks (see ‘Materials and Methods’ section). Using these criteria we defined four classes of genes; class I—genes with STAT5 binding in their promoters in both WT-GH and OE-GH MEFs; class II—genes with STAT5 binding in their promoters only in OE-GH MEFs; class III—genes with STAT5 binding in upstream regions in both WT-GH and OE-GH MEFs; class IV—genes with STAT5 binding in upstream regions only in OE-GH MEFs. Based on strong peaks, 29, 161, 49 and 186 genes were categorized into class I, II, III and IV, respectively (Figure 2A). In general, STAT5 binding to promoter regions (−2kb to +0.5kb) was much stronger than binding to upstream sequences (−10 kb to −2kb). In addition, we quantified the level of STAT5 bindings in each group using z-score. Comparison analysis reveals that the level of STAT5 binding is changed according to the context of cells. For example, 8588 STAT5 binding sites show very weak STAT5 binding (or not bound by STAT5) in the OE-GH context in terms of the fold change (Figure 2B) although these two cells were grown in very similar conditions except with the dosage of STAT5.
Using strong and weak criteria, STAT5 bound to promoter sequences of 29 and 214 genes, respectively (Supplementary Table S1). The class of 29 genes encompassed bona fide STAT5 target genes, including Socs2, Socs3, Cish and Bcl6 (Figure 3 and Table 1). Overall, both the maximum peak height and the peak sum were similar or slightly higher in STAT5A overexpressing cells compared with WT MEFs (Table 1). In general, STAT5 binding was observed only in the presence of GH.
Although higher STAT5A concentrations did not significantly alter binding at GAS motifs recognized already in WT MEFs, de novo binding to juxtaposed GAS motifs within a given gene was observed in several cases. For example, while promoter-bound GAS motifs in the Bcl6 gene were recognized by STAT5 independent of its concentration, a subset of intronic GAS sites was recognized only upon STAT5A overexpression (Figure 4). The 12 kb Bcl6 locus contains 20 GAS motifs within the STAT5 binding areas, 13 of which are conserved between mouse and opossum. Under physiological STAT5 concentrations, five conserved GAS motifs within the promoter region were occupied. De novo occupancy of a set of four highly conserved GAS motifs within the first intron but not other conserved GAS motifs was obtained at elevated STAT5 concentration (Figure 4). Within the Cish locus, de novo binding was observed over five conserved GAS motifs within the 3′ flanking region, although the degree of binding over promoter-bound GAS motifs remained unchanged (Figure 3). Similarly, elevated STAT5A levels led to the occupation of de novo binding sites within the promoter upstream region of the Socs3 gene (Figure 4). The two distal ones are highly conserved between species.
Based on global gene expression analyses of these 29 genes, all 26 represented on the array were expressed. Expression ranged over 40-fold between the lowest (Irf2) and highest (Ubc) (Table 1). However, expression of only four genes (Socs2, Socs3, Cish and Bcl6) was under GH-STAT5 control in WT MEFs (Table 1). At 20-fold elevated STAT5 concentration, one additional gene (Tle1) entered this category (Table 1). Whereas GH-induced expression of the Socs3 and Bcl6 genes was not altered by extra STAT5A, Socs2 and Cish mRNA levels increased (Table 1). Expression of selected genes was verified using RT–PCR (Supplementary Figure S1). These data demonstrate that in WT cells, GH-induced STAT5 binding to promoter sequences coincides only for some genes with GH-induced expression and STAT5 dependency.
Using strong and weak criteria, STAT5 bound to promoter sequences of 161 and 1551 genes, respectively (Supplementary Table S1). Genes in this group, which includes Esr1 and Rwdd1, exhibit STAT5 binding only in STAT5 overexpressing MEFs (OE-GH). Out of the 161 genes, 135 were represented by probes on the Affymetrix arrays. Out of the 135 genes, 18 were not expressed at detectable levels (Table 2). Among those that were expressed, mRNA levels ranged over 30-fold between genes. Using a 2-fold cut-off, only the Esr1 gene had acquired STAT5-dependent GH-induction in STAT5A overexpressing MEFs (Table 2). These results demonstrate that elevated STAT5 levels produce de novo STAT5 binding to genes that are not targeted at lower concentrations. However, this de novo binding does not place these cells under STAT5-dependent GH control. Moreover, 10% of the de novo recognized genes are silent.
STAT5 bound to the upstream region (−10kb to −2kb) of 49 genes both in WT and STAT5A overexpressing cells. Out of the 39 genes with probes on the Affymetrix chips, 7 were not expressed at detectable levels (Table 3). Two genes, Sbno2 and Mapkapk3, acquired modest GH-inducibility in GH overexpressing cells. Sbno2 belongs to a family of helicase co-repressors and its expression is induced in macrophages by IL-10 through STAT5 (24).
Upon STAT5 overexpression, de novo binding to the upstream region (−10kb to −2kb) of 186 genes was observed (Table 4). Out of the 158 genes represented on the microarray, expression of 41 was undetectable. Out of the 117 expressed genes, 4 acquired a slightly <2-fold GH induction in STAT5 overexpressing MEFs (Table 4).
A biological replicate ChIP-seq experiment with STAT5A overexpressing MEFs in the absence and presence of GH validated and substantiated the above findings (Supplementary Figure S2). The pattern of STAT5 binding peaks is very similar to the original one, and we confirmed the unique STAT5 binding sites (Figure 4) which are only observed in the OE-GH cell-context. The MEME-ChIP program (25) successfully identified the canonical STAT5 binding motif (TTC…GAA) as the most frequently occurring sequence at the top 1000 STAT5 binding regions. Among the top 10000 peaks identified in the replicate, 86% of the peaks overlapped with peaks in the original data set.
Approximately 60% of STAT5 binding is located over GAS motifs, whereas in 40% of the binding areas no classical GAS motifs are found. The identified areas of STAT5 binding were separated into two sets according to the existence of GAS motif (canonical versus non-canonical). De novo motif prediction (25) that uses public motif databases for co-factor identification (26) established bona fide GAS motifs in all peaks in STAT5 overexpressing MEFs (Supplementary Figure S3A). Peak height in WT MEFs was in general lower and convincing GAS motifs were found only in strong peaks. In search for additional transcription factor binding motifs co-localizing with GAS motifs, we identified AP1 motifs in strong and intermediate peaks of STAT5 overexpressing MEF (Supplementary Figure S3). No clear consensus motifs were identified in the non-canonical STAT5 binding areas.
A number of cell-specific STAT5 target genes have been reported. Among them are prolactin-induced milk protein genes in mammary tissue (27–29), FoxP3 in Regulatory T cells (Tregs) (10), Tbx21, Il4ra and Il17rb2 in T-cells (ref) and IL17 in TH17 cells (30). Regardless of the STAT5 concentration, no binding was observed to GAS motifs in established regulatory regions of milk protein genes, the IL17a and IL17f genes and the Tbx21 and Il12rb2 genes that are activated by IL-2 through STAT5 in Th1 cells (30). Similarly, no binding was observed in the Il4ra gene that is induced by IL-2 through STAT5 in T-cells or the Foxp3 gene. These findings demonstrate that STAT5 is unable to access GAS motifs within genuine cell-specific STAT5 target genes even at very high concentrations and levels of activation.
A total of nearly 900 genes in MEFs were induced more than 2-fold by GH (Supplementary Table S2). For each gene in the four classes, we established the correlation between the extent of STAT5 binding and GH-induced expression. Gene expression was estimated from microarray analysis (y-axis) and the degree of STAT5 bindings was calculated using maximum peak height in either promoter or upstream regions according to the classification (x-axis) (see ‘Materials and Methods’ section). Correlation was measured with Pearson’s correlation coefficient. Each spot indicates a single gene. As shown in Figure 5, in none of the four classes was a direct correlation between the extent of STAT5 binding and the level of gene expression.
We extended data analysis and included STAT5 binding sites at genomic regions associated with gene body and downstream sequences (+0.5 kb ~ +9kb) (Supplementary Figure S4). We were unable to establish a direct correlation between gene expression and STAT5 binding strength.
Although it is well established that many cytokines activate the transcription factor STAT5, evidence that STAT5 binding controls cytokine-induced gene expression is based on a reverse experimental designed. Uncovering that expression of a given gene is regulated by cytokines leads to the search for GAS motifs in the respective promoter region and establishing in vitro and in vivo STAT5 binding, followed by reporter gene experiments carrying these GAS motifs serves as a paradigm to demonstrate the link between cytokine stimulation, transcription factor binding and gene expression. However, it is not clear how many genes that are recognized by STAT5 are actually under STAT5 and cytokine control. We addressed this question by genome-wide identification of all genes that bind STAT5 and correlating their expression at different STAT5 concentrations.
The GAS motif sequence TTCnnnGAA occurs statistically once every 4096bp and therefore can be found in the promoter/upstream sequence of virtually every single gene. At physiological concentrations of STAT5, GH-induced binding was observed at 13278 sites with GAS motifs within the genome of MEFs. Using stringent conditions, 78 genes featured STAT5 binding within 10kb of promoter/upstream sequences. However, only four of these genes (Socs2, Socs3, Cish and Bcl6) were soundly under the control of GH by STAT5. A >20-fold over-expression of STAT5A resulted in 71561 GH-induced STAT5 binding peaks. Under stringent conditions, 347 genes featured de novo STAT5 binding within 10kb of promoter/upstream sequences. Up to 15% of these genes did not show any detectable basal or GH-induced expression despite strong GH-induced STAT5 binding. Out of the 234 expressed genes, only 4 acquired a modest (<2-fold) GH-induction upon STAT5 overexpression.
Analyzing genome-wide GH-induced STAT5 binding and gene expression at different STAT5 concentrations revealed novel information about the role of STAT5 as a transcription factor. First, GH-induced STAT5 binding intensity to regulatory regions in bona fide STAT5 targets, such as Socs2 and Cish is relatively concentration independent. Thus, in these cases STAT5 is not a limiting factor. However, GH-induced expression of both of these genes increased in STAT5 overexpression. It is possible, that the higher STAT5 concentration allows binding over longer time windows. In contrast, regulation of the bona fide STAT5 target Bcl6 is independent of STAT5 levels although additional binding sites within the Bcl6 gene were uncovered upon STAT5 overexpression. Second, the majority of GH-induced STAT5 binding in WT cells is associated with genes that are expressed over a wide range but are not under GH control. This finding demonstrates that STAT5 binding to promoter upstream sequences does not automatically convey STAT5 control over those genes. Third, increasing STAT5 concentrations by at last 20-fold, which is accompanied by GH-induced STAT5 phosphorylation, greatly increases the number of genes bound by STAT5. However, only a small fraction (18.6%, 30 out of 161) of these genes acquired GH-STAT5 control (>1.5 fold). Moreover, basal and GH-induced expression of a large number of genes that acquire STAT5 binding is undetectable. This finding demonstrates that at elevated concentrations, STAT5 can bind to otherwise dormant GAS motifs in promoter upstream sequences of both expressed and silent genes. However, STAT5 binding is not sufficient to convey transcriptional activation. Fourth, although STAT5 binding intensity to regulatory regions in bona fide target genes was mostly independent of the STAT5 concentration, STAT5 occupancy of a restricted set of neighboring GAS motifs was observed in several genes, including Socs3 and Bcl6. This demonstrates that dependent on the concentration, STAT5 can also bind to dormant sites within STAT5 target genes. However, this binding appears to be of little or no consequence. A similar discrepancy between transcription factor binding and transcriptional regulation has been reported for the ligand-induced glucocorticoid receptor (GR) (31). More than 8200 GR binding sites were detected in a mammary cell line but there was no clear association with transcriptional regulation of nearby genes.
Of the STAT5 binding peaks, ~50% coincide with GAS motifs, confirming the concept that STAT5 binds to specific sequence motifs. However, the nature of STAT5 binding to sequences without bona fide GAS motif is not clear. We were unable to identify a unifying sequence motif in non-canonical STAT5 binding areas. In contrast, AP1 motifs were identified in canonical STAT5 binding areas at elevated STAT5 levels, suggesting cooperativity between STAT5 and AP1.
ChIP-seq analyses have shed light onto binding patterns of different STAT family members (8,9,30–33) in several cell types. Dependent on the specific STAT member and the investigating lab, each individual STAT member binds to between several hundreds and >4000 genes. In many cases there is substantial overlap of binding between different STATs members. For example, out of the >4000 binding sites for STAT4, 50% were also targeted by STAT6 and vice versa (33). The same seems to hold true for STAT3 and STAT5 (9). This observation is particularly relevant in light that different STATs not only serve similar functions but also can serve opposite functions. Although ChIP-seq studies provide a detailed view on the genome-wide occupation by a given transcription factor, the biological relevance of these findings remains to be determined. Specifically, it is not clear why only a small fraction of genes binding any given transcription factor appear to be under its control. This enigma cannot be explained by the lack of polymerase recruitment as many genes binding STAT5 display already high basal activity. Moreover, there were no apparent differences between GAS motifs and neighboring sequences between productive and non-productive sites. It is possible that STAT5 is a weak transcription factor that has considerable activity only in specific contexts.
Raw microarray data can be accessed by GSE33688 and ChIP-seq data by GSE34896 from the NCBI Gene Expression Omnibus at http://www.ncbi.nlm.nih.gov/geo.
Supplementary data are available at NAR Online: Supplementary Tables 1 and 2 and Supplementary Figures 1–4.
The Intramural Research Programs (IRP) of NIDDK, NIAMS and NEI at the National Institutes of Health (NIH), USA; Epigenomic Research Program for Human Stem Cells (grant no. 2007-2004134); the World Class University Program, Ministry of Education, Science and Technology, through the National Research Foundation of Korea, South Korea (R31-10069); the National Basic Research Program of China (973 Program, grant no. 2012CB518501); Natural Science Fund for Colleges and Universities, Department of Education, Jiangsu, China (grant no. 11KJA360003); L.H. is an adjunct member of the Department of Nanobiomedical Science and WCU Research Center for Nanobiomedical Science, Dankook University, Chungnam, Korea. Funding for the open access charge: WCU Research Center, Dankook University National Institutes of Health.
Conflict of interest statement. None declared.
We thank G. Poy for help with microarray experiments, Y. Cui and W. Liao for technical help and Richard Moriggl for reagents.