This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Signal transducer and activator of transcription (STAT) proteins are key regulators of gene expression in response to the interferon (IFN) family of anti-viral and anti-microbial cytokines. We have examined the genomic relationship between STAT1 binding and regulated transcription using multiple tiling microarray and chromatin immunoprecipitation microarray (ChIP-chip) experiments from public repositories.
In response to IFN-γ, STAT1 bound proximally to regions of the genome that exhibit regulated transcriptional activity. This finding was consistent between different tiling microarray platforms, and between different measures of transcriptional activity, including differential binding of RNA polymerase II, and differential mRNA transcription. Re-analysis of tiling microarray data from a recent study of IFN-γ-induced STAT1 ChIP-chip and mRNA expression revealed that STAT1 binding is tightly associated with localized mRNA transcription in response to IFN-γ. Close relationships were also apparent between STAT1 binding, STAT2 binding, and mRNA transcription in response to IFN-α. Furthermore, we found that sites of STAT1 binding within the Encyclopedia of DNA Elements (ENCODE) region are precisely correlated with sites of either enhanced or diminished binding by the RNA polymerase II complex.
Together, our results indicate that STAT1 binds proximally to regions of the genome that exhibit regulated transcriptional activity. This finding establishes a generalized basis for the positioning of STAT1 binding sites within the genome, and supports a role for STAT1 in the direct recruitment of the RNA polymerase II complex to the promoters of IFN-γ-responsive genes.
Interferon-gamma (IFN-γ) is a potent pro-inflammatory cytokine that regulates a spectrum of biological processes, and is produced primarily in response to infection. IFN-γ signal transduction begins at the cell surface with the formation of a heteromeric protein complex that includes IFN-γ, IFN-γ receptor-1, and IFN-γ receptor-2. Associated with the IFN-γ receptors are members of the Janus kinase (JAK) family of tyrosine kinases, which become activated upon formation of the IFN-γ receptor complex, and in turn phosphorylate the signal transducer and activator of transcription-1 (STAT1) transcription factor [2-4]. Upon its phosphorylation, STAT1 homo-dimerizes, and is transported into the nucleus where it binds to the gamma activated sequence (GAS; consensus: TTCNNNGAA) to activate the expression of IFN-γ-responsive genes. One example of a STAT1-responsive gene is the interferon regulatory factor-1 (IRF1) gene, which encodes an important IFN-γ-responsive transcription factor and contains a functional GAS 120 bp upstream of its first exon [7,8]. In addition, STAT1 functions as a component of the IFN stimulated gene factor 3 (ISGF3) transcription factor complex, which also includes STAT2 and interferon regulatory factor-9 (IRF9) [9,10]. The ISGF3 complex is formed in response to signaling by IFN family members, including IFN-α, that associate with IFN-α receptor-1 and IFN-α receptor-2. Upon its transport into the nucleus, ISGF3 binds to IFN-stimulated response elements (ISREs; consensus: GGAAANNGAAACT) to activate the expression of IFN-α-responsive genes.
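To make the two consensus sequences concrete, the following minimal sketch scans a DNA string for GAS and ISRE matches. It is an illustration only: the function names and the toy sequence are hypothetical, N is assumed to match any of A, C, G, or T, and only the forward strand is searched.

```python
import re

# Consensus strings as quoted in the text; N is treated as any base.
GAS_CONSENSUS = "TTCNNNGAA"
ISRE_CONSENSUS = "GGAAANNGAAACT"


def consensus_to_regex(consensus):
    """Turn a consensus with N wildcards into a compiled regex."""
    pattern = "".join("[ACGT]" if base == "N" else base for base in consensus)
    # Wrapping the pattern in a lookahead lets overlapping matches be reported.
    return re.compile("(?=(" + pattern + "))")


def find_sites(sequence, consensus):
    """Return 0-based start positions of every forward-strand match."""
    regex = consensus_to_regex(consensus)
    return [match.start() for match in regex.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from any real promoter.
toy = "ACGTTTCCCCGAAATGGGAAAGTGAAACTTT"
gas_hits = find_sites(toy, GAS_CONSENSUS)
isre_hits = find_sites(toy, ISRE_CONSENSUS)
```

A consensus match alone does not establish binding in vivo, which is why the analyses reported here rely on ChIP-chip measurements rather than sequence scanning.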
How gene expression is regulated by the association of transcription factors with their target sequences is a central question in mammalian biology. Compared with lower organisms such as S. cerevisiae and D. melanogaster, the genomic regions responsible for regulating the expression of mammalian genes are much more difficult to predict, and may be located far away from a gene's transcriptional start site (TSS). Furthermore, the debate over what constitutes a gene intensified with the development of tiling arrays for the human genome, and the discovery that much of the human transcriptome is encoded by regions of the genome that lie outside of exons as they have been classically defined [12,13]. Chromatin immunoprecipitation microarray (ChIP-chip) technology provided additional insight into the regulation of the human transcriptome when it was employed to examine the transcription factor binding sites (TFBSs) of Sp1, cMyc, and p53 on chromosomes 21 and 22. Interestingly, only 22% of these TFBSs were located in the upstream regions of genes, the regions that have classically been defined as "promoter" regions. The genomic relationship between sites of transcription factor binding and sites of transcription was further elucidated in an elegant study of estrogen receptor (ER) binding across chromosomes 21 and 22. Remarkably, the RNA polymerase II (RNApolII) complex was found to associate in an estrogen-dependent manner with the majority of tested TFBSs, even those located far from the nearest TSS. Using a chromatin capture assay, an ER binding site located over 144 kbp from the NRIP-1 gene was shown to function as an enhancer of NRIP-1 transcription. These results showed that, when bound to its TFBS, the ER can act as an enhancer to regulate the expression of target genes by associating, often across large chromosomal distances, with RNApolII.
Recently, the response to IFN-γ has been the subject of microarray expression analyses as well as ChIP-chip analyses for STAT1 [16-20]. Hartman et al. compared the locations of STAT1 binding sites on chromosome 22 (as determined by ChIP-chip) with the expression level of the nearest gene, and noted that only 21% of STAT1 binding sites were within 10 kbp of the start of the nearest gene. Hartman et al. also suggested that a novel mechanism may alter the specificity of STAT1 binding, depending on whether it is induced by IFN-α or IFN-γ.
Elucidating the rationale with which transcription factor binding sites are deployed across the genome will likely be of great benefit to our understanding of the complexities of transcriptional regulation in mammals. We have examined the data associated with several recent tiling array experiments, and show here that STAT binding and mRNA transcription in response to IFN are much more closely associated than previous reports would suggest. Our results indicate that, in contrast to other mammalian transcription factors such as the ER, STAT1 binds closely to the TSSs of its target genes. Our findings also suggest that STAT1 may play an important role in recruiting the RNApolII complex to the promoters of IFN-γ-inducible genes.
To examine the relationship between STAT1 binding and mRNA transcription in response to IFN-γ, we obtained the data associated with Hartman et al.'s tiling array analysis of gene transcription and STAT1 binding for chromosome 22 following treatment with IFN-γ and IFN-α. We were particularly interested in determining whether there was any evidence for localized coupling of differential STAT1 binding and differential mRNA expression. To our surprise, there was a pronounced probe level correlation between differential STAT1 binding in response to IFN-γ, and differential mRNA expression in response to IFN-γ (Figure 1A). A similar probe level relationship between STAT1 binding, STAT2 binding, and differential expression, was also apparent for the responses of cells to IFN-α (Figure 1B,C). Furthermore, probes that detected differential STAT2 binding in response to IFN-α also detected substantial differential expression in response to IFN-γ (Figure 1D), and consistent with this, there was a high probe level correlation between the binding of STAT1 or STAT2 in response to IFN-α, and the binding of STAT1 in response to IFN-γ (Figure 1E,F). Consistent with Hartman et al.'s conclusion that the binding preference of STAT1 may differ in response to IFN-α versus IFN-γ, some differences in the binding sites of STAT1 were apparent for these stimuli (Figure 1E), although this was primarily true only of some of the relatively weak STAT1 binding sites. For relatively strong STAT1 binding sites, the binding preferences in response to IFN-γ versus IFN-α showed substantial correlation (Figure 1E).
The findings outlined above strongly suggest that there is a high degree of overlap between the positions where STAT1 and STAT2 bind in response to either IFN-γ or IFN-α, which is perhaps not surprising, since STAT2 binds DNA as part of the ISGF3 transcription factor complex, of which STAT1 is also a component. Consistent with this, correlation was also apparent between the transcriptional response of cells to IFN-γ and the transcriptional response of cells to IFN-α (data not shown). Of greater interest, however, is the clear probe level correlation between the binding of STAT1 or STAT2, and the differential expression in response to IFN-γ or IFN-α. Given that probes for the chromosome 22 arrays are 300 – 1400 bp long, our observations indicate that the locations at which STAT1 or STAT2 associate with chromosome 22 are in the majority of cases very close to regions of chromosome 22 that are transcriptionally regulated by IFN-γ or IFN-α.
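The probe level correlations described above were assessed visually using scatterplots. As a hedged illustration of how such a relationship could also be summarized numerically, the sketch below computes a Pearson correlation coefficient between two sets of per-probe log2 ratios. The use of Pearson's coefficient, the function name, and the data values are assumptions made for illustration and are not part of the original analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 ratios for five probes (illustrative values only).
chip_log2 = [0.1, 0.0, 1.2, 2.0, 0.3]  # STAT1 ChIP, stimulated vs control
expr_log2 = [0.0, 0.1, 0.9, 1.8, 0.2]  # mRNA, stimulated vs control
r = pearson(chip_log2, expr_log2)
```

Because each probe reports both measurements at the same genomic position, a high coefficient here would indicate co-localization of binding and regulated transcription at the resolution of a single probe.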
To examine STAT1 binding in response to IFN-γ at a higher resolution than is permitted by cDNA microarray technology, we obtained data associated with a recent ChIP-chip analysis of STAT1 binding in response to IFN-γ across the 30 Mb Encyclopedia of DNA Elements (ENCODE) region. A moderately conservative differential intensity cut-off of 2^0.4 (approximately 1.3-fold) was selected by inspection of a quantile-quantile plot of the differential intensities (data not shown). In total, 463 probes satisfied this cut-off. These probes all clustered to 22 distinct genomic sites, 17 (77%) of which were within 10 kbp of the nearest 1st exon of a gene (Figure 3 and Table 1). This method appears to identify fewer STAT1 binding sites than are identified on the UCSC genome browser, presumably because our significance cut-off is higher. When the alternative transcripts of three genes, ATP11A, MCF2L, and RAB11-FIP3, were accounted for, 21 of the 22 STAT1 binding sites we identified (95%) were within 10 kbp of the nearest 1st exon of a gene (Table 1). Of the 11 genes identified in this manner, three (IRF1, PRKCG, and F7) are known IFN-γ-responsive genes [7,8,23,24]. In the case of IRF1, a STAT1 binding site immediately upstream of the first exon has been previously identified by in vivo footprinting.
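The site-calling steps above (apply a cut-off, group the passing probes into sites, measure the distance to the nearest first exon) can be sketched as follows. This is a minimal illustration under stated assumptions: the cut-off is taken to be 0.4 on the log2 scale (2^0.4, about 1.3-fold), the merging distance `max_gap` is a hypothetical parameter that the text does not specify, distances are measured from the site midpoint, and all coordinates are invented.

```python
CUTOFF_LOG2 = 0.4  # assumed log2 differential intensity cut-off (about 1.3-fold)


def cluster_probes(probes, cutoff=CUTOFF_LOG2, max_gap=1000):
    """Group probes passing the cut-off into sites.

    probes: list of (start_bp, log2_differential_intensity).
    max_gap: hypothetical merging distance in bp (not from the paper).
    Returns a list of (site_start, site_end) tuples.
    """
    passing = sorted(pos for pos, value in probes if value >= cutoff)
    sites = []
    for pos in passing:
        if sites and pos - sites[-1][1] <= max_gap:
            sites[-1][1] = pos        # extend the current site
        else:
            sites.append([pos, pos])  # open a new site
    return [tuple(site) for site in sites]


def distance_to_nearest_first_exon(site, first_exon_starts):
    """Distance in bp from the site midpoint to the closest first-exon start."""
    midpoint = (site[0] + site[1]) // 2
    return min(abs(midpoint - exon) for exon in first_exon_starts)


# Hypothetical probes and first-exon coordinates (illustrative only).
probes = [(100, 0.1), (136, 0.5), (172, 0.7), (208, 0.2), (50000, 0.9), (50036, 0.6)]
first_exons = [1000, 80000]
sites = cluster_probes(probes)
proximal = [s for s in sites if distance_to_nearest_first_exon(s, first_exons) <= 10000]
fold_change = 2 ** CUTOFF_LOG2
```

Accounting for alternative transcripts, as described above, would correspond to adding their first-exon coordinates to `first_exons` before the distances are computed.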
Our finding that IFN-γ-induced STAT1 binding to chromosome 22 is closely associated with increased mRNA transcription suggested that STAT1 may bind proximally to RNApolII within the promoters of IFN-γ-responsive genes. We were therefore interested in examining the binding of RNApolII to regions near STAT1 binding sites. To this end, we obtained data associated with Bieda et al.'s recent analysis of RNApolII binding within the ENCODE region for non-stimulated HeLaS3 cells. Remarkably, the binding of RNApolII in non-stimulated cells was negatively correlated (on the log2 scale) with the binding of STAT1 in response to IFN-γ for 17 of the 22 binding sites within the ENCODE region (Figure 3A–I, K, L). This is indicative of a diminished level of RNApolII binding, relative to the average level of RNApolII binding for the entire ENCODE region. Localized suppression of RNApolII binding is likely to reflect the locations at which modifications in the chromatin state, or possibly other factors, prevent the association of RNApolII (relative to the average level of association across the ENCODE region).
Of the five STAT1 binding sites that did not exhibit suppressed RNApolII binding in non-stimulated cells (Figure 3J, M, N), three were located within 7 kbp upstream of the IRF1 gene (Figure 3N), and one was located near the start of the IFNaR1 gene (Figure 3M). In each case, STAT1 binding in response to IFN-γ was positively correlated with RNApolII binding, relative to the average level of RNApolII binding across the ENCODE region.
Our finding that IFN-γ-induced STAT1 binding to chromosome 22 is closely associated with increased mRNA transcription indicated that STAT1 binds closely to transcriptionally-active regions of the genome. Our subsequent identification of STAT1 binding sites that are bound by RNApolII near the start positions of the IRF1 and IFNaR1 genes in non-stimulated cells suggested that basal levels of mRNA transcription might be detectable at these sites. To investigate this possibility, we obtained data associated with a recent analysis of basal mRNA transcription in non-stimulated HeLa cells for 10 human chromosomes. Pleasingly, most exons of the IRF1 and IFNaR1 genes exhibited clear peaks of basal mRNA detection (Figure 4A, B). Basal mRNA transcription was apparent not only within exons, but also introns and extragenic regions (Figure 4A, B). All peaks of STAT1 binding near the IRF1 and IFNaR1 genes overlapped with peaks of basal mRNA transcription as well as peaks of basal RNApolII binding (Figure 4A, B), thus reinforcing our finding using Hartman et al.'s chromosome 22 data that STAT1 binding to chromosome 22 is closely associated with increased mRNA transcription in response to IFN-γ.
When conducting microarray analyses of cellular responses to stimuli, it is often difficult to distinguish genes that are obligatory components of the cell's response from genes that are induced only in particular cell types or under certain conditions, and may therefore be less critical to the outcome of the response. It is interesting to note that within the ENCODE region, STAT1 bound most closely to start positions of two genes with key roles in IFN signal transduction, IRF1 and IFNaR1 (Table 1). In fact, these are the only genes for which proximal STAT1 binding in the ENCODE region was observed that have known roles in IFN signal transduction. We wondered therefore whether STAT1 binding sites are also closely associated with the start positions of other "key" IFN-γ-inducible genes. To this end, we identified 10 genes for which the transcriptional response to IFN-γ is conserved between human microglial cells and mouse livers, and which are thus likely to be essential components of the responses of most cells to IFN-γ (Table 2). Remarkably, the expression levels of seven of these genes are known to be regulated by STAT1 binding sites located within just 200 bp of the start of each gene (Table 2). This close proximity with which STAT1 regulates the expression of key IFN-γ-responsive genes is in stark contrast to the relatively large distances between most STAT1 binding sites within the ENCODE region, and the start positions of the nearest genes (Table 1). This suggests that genes with STAT1 binding sites that are not located close to the start of the coding region could be more susceptible to tissue-specific mechanisms of regulation, for example by chromatin modification of the promoter.
Whilst it is clear that mammalian promoter regions are often enriched for TFBSs, much remains to be learnt about how mammalian TFBSs are distributed throughout the genome, and what determines the positioning of TFBSs. We analyzed data from Hartman et al.'s study of IFN-induced STAT binding and mRNA transcription for chromosome 22, and to our surprise, we observed a marked probe level correlation between STAT1 binding and differential expression in response to IFN-γ. Although all ChIP-chip analyses of transcription factor binding have the potential to detect some RNApolII binding sites due to interactions between RNApolII and the transcription factor of interest, our results suggest that the vast majority of STAT1 binding sites, if not all STAT1 binding sites, are proximal to transcriptionally active regions. The extent to which we observed this phenomenon indicates that STAT1 binding sites are located proximally to regions that are bound by the transcriptional apparatus.
In addition to the discovery of a probe level correlation between mRNA transcription and STAT1 binding in response to IFN-γ, our analysis of Hartman et al.'s Chromosome 22 tiling array data for STAT binding and mRNA transcription in response to IFN revealed probe level correlations for various combinations of STAT1 binding, STAT2 binding, and mRNA transcription in response to treatment with either IFN-γ or IFN-α. Together, these observations suggest that differences in the binding specificities between STAT1 and STAT2 are predominantly quantitative rather than qualitative. Consistent with this idea, IFN-α and IFN-γ were recently shown to induce the transcription of similar sets of genes. The substantial correlation we observed between the binding sites of STAT1 in response to IFN-α vs IFN-γ perhaps casts some doubt over the significance of Hartman et al.'s observation that the binding preference of STAT1 is differentially regulated in response to IFN-α as compared to IFN-γ.
Consistent with our observation of a probe level correlation between STAT1 binding and mRNA expression in response to IFN-γ for Chromosome 22, we found that STAT1 binds specifically to sites within the ENCODE region that exhibit either diminished or enhanced binding of RNApolII in non-stimulated cells. Collectively, these findings suggest that STAT1 binds proximally to regions of the genome that are involved in the initiation of RNA transcription, since this would explain both the correlation with mRNA transcription that we identified using the Chromosome 22 tiling array data, and the correlation with differential RNA polymerase binding that we identified using the ENCODE tiling array data. The primary mechanism by which STAT1 activates gene transcription may therefore be to recruit RNA polymerase II to the promoters of IFN-γ-responsive genes.
There have been several reports implicating STAT1 in the recruitment of components of the transcriptional apparatus to the promoters of particular IFN-responsive genes. In 2003, STAT1 was shown to synergize with NF-κB at the promoter of the CXCL9 gene to bind the transcriptional co-activator, CREB-binding protein, which may in turn recruit the RNApolII complex. In 2004, it was shown that IRF9 (a member of the ISGF3 complex) may recruit the RNApolII complex to the promoter of interferon-stimulated gene 54 in a histone deacetylase-dependent manner. In 2005, STAT1 was shown to recruit the DNA helicase protein mini-chromosome maintenance-5 to the promoters of the genes encoding IRF1, TAP1, guanylate-binding protein-1, and class II transactivator in response to IFN-γ. Furthermore, transcription of the aforementioned genes in response to IFN-γ was dependent on the activity of the mini-chromosome maintenance-5 protein, and the binding of the mini-chromosome maintenance-5 protein to the IRF1 gene was consistent with the binding of RNApolII to the IRF1 gene. Our findings that STAT1 binds to positions on chromosome 22 that exhibit IFN-γ-inducible mRNA transcription, and to positions within the ENCODE region that exhibit either enhanced or diminished RNApolII binding (relative to the average level of RNApolII binding), are in strong agreement with the above reports. Furthermore, our results generalize the findings of the previous reports, since they suggest that recruitment of the RNApolII complex to the promoters of IFN-γ-responsive genes is the dominant mechanism by which STAT1 activates gene transcription. At STAT1 binding sites that are normally inaccessible to RNApolII, this could involve chromatin modification to make the site more accessible to RNApolII. At STAT1 binding sites that exhibit basal levels of RNApolII binding, STAT1 may simply enhance RNApolII binding.
Such a mechanism of transcriptional regulation contrasts with that of the ER, which can regulate the transcription of genes from distant ER TFBSs by acting as an enhancer for RNApolII activity.
Interestingly, two STAT1 binding sites upstream of two well characterized mediators of IFN signal transduction, IFNaR1 and IRF1, were the closest two STAT1 binding sites to the start of any gene in the ENCODE region (both were located less than 500 bp away). This could simply be a coincidence; however, our identification of ten genes for which transcription in response to IFN-γ is conserved between different tissues of humans and mice provides an interesting alternative explanation. Whilst many genes are probably induced by IFN-γ only in certain cell types or under certain conditions, these ubiquitously-responsive genes are likely to play key roles in the response to IFN-γ of numerous cell types (just as IFNaR1 and IRF1 do). Consistent with this hypothesis, genes with important functions in cytokine signaling were identified, including SOCS-1 and -3, IRF1, and the chemokines CXCL9 and CXCL10. Remarkably, seven of the ten ubiquitously-IFN-γ-inducible genes that we identified are known to be regulated by STAT1 binding sites that are located within 200 bp of the start of each gene. Of the remaining three, little is known about the roles of GADD45γ and ID2 in response to IFN-γ, and SOCS-1 may be regulated primarily by IRF1, and not STAT1. The close proximity with which STAT1 regulates the expression of "core" IFN-γ-inducible genes suggests that many of the STAT1 binding sites identified by ChIP-chip may be involved in the regulation of genes, alternative gene transcripts, or non-coding RNAs that are induced by IFN-γ only in certain cell types or only under certain conditions. As illustrated in Figure 5, increased regulatory opportunities may be available for genes that are regulated by STAT1 from a distance, compared with genes that are regulated by STAT1 close to the start of the coding region.
This model contrasts with a recent report showing that different combinations of transcription factor binding sites in the proximal promoter region are important for determining tissue-specific gene expression. However, given that biological systems are inherently variable, it seems unlikely that the two models are necessarily mutually exclusive.
Our analysis of STAT1 binding and mRNA transcription reveals that the binding of STAT1 is closely associated with IFN-γ-inducible transcriptional activity. Furthermore, our findings implicate STAT1 in the regulation of RNApolII binding, and suggest that recruitment of the RNApolII complex to the promoters of IFN-γ-responsive genes could be a general mechanism by which STAT1 activates transcription. Finally, we propose that STAT1 may strategically associate with the start positions of key IFN-γ-responsive genes to ensure that they are ubiquitously, rather than conditionally, induced in response to IFN-γ.
Genepix files corresponding to Hartman et al.'s analysis of STAT binding and mRNA transcription in response to 120 min of IFN stimulation were obtained on-line. All experiments were performed using HeLaS3 cells. These cDNA tiling arrays are designed to detect chromosome 22 DNA using 300 – 1400 bp probes, which are aligned with chromosome 22 in an end-to-end configuration. The complete set of chromosome 22 cDNA tiling arrays comprises three different arrays, numbered 1 to 3. In all cases where plots of data from chromosome 22 arrays are presented here, the data correspond to array-2 of the three-array set, and are representative of arrays -1 and -3. Data from 4 – 6 replicates were analyzed for all treatments. We processed and analyzed Genepix data using the limma package for the R statistical programming environment, as follows. Background correction was performed using limma's "normexp" method. Within-array normalization was achieved using print-tip loess normalization. Between-array normalization was achieved using quantile normalization of mean probe intensity values, which is implemented by limma as the "Aquantile" method. Correlations in probe intensity levels ("probe level correlations") between different IFN treatments, and between either mRNA transcription or chromatin immunoprecipitation for STAT1, were visually assessed using scatterplots. Probe intensity levels for IFN-stimulated cells showed no correlation with NF-kappaB binding in response to tumor necrosis factor-alpha, indicating that the correlations observed are specific to the response to IFN (data not shown).
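To illustrate the idea behind the between-array step, the sketch below applies a generic quantile normalization to the mean log intensities (A-values) of several arrays, so that every array ends up with the same empirical distribution. It is not limma's implementation, which is written in R; the function name, the tie handling, and the example values are assumptions made for illustration.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one list per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by original order (a simplification).
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value over arrays.
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical A-values (mean log2 intensities) for two arrays, three probes.
a_values = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(a_values)
```

In a two-colour setting, normalizing only the A-values leaves the within-array log ratios (M-values) untouched, which is the behaviour the "Aquantile" method is documented to have.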
Cy5 and Cy3 probe intensity values for Rozowsky et al.'s 36-base oligonucleotide tiling array analysis of STAT1 binding in HeLaS3 cells in response to 30 min of IFN-γ stimulation were obtained from the Gene Expression Omnibus (GEO) website, series id GSE2714. These data are also available as a track on the UCSC genome browser. Experiments were performed in duplicate. These tiling arrays are designed to detect the 30 Mb Encyclopedia of DNA Elements (ENCODE) region. Probes are 36 bases in length, and are aligned with DNA from the ENCODE region in an end-to-end configuration. We used limma to perform within-array normalization (using the "loess" method) and quantile normalization of mean probe intensity values (using the "Aquantile" method). To enhance the identification of runs of probes that detect STAT1 binding, probe intensities were smoothed by taking the mean intensities across a window of 20 probes (720 bp).
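The smoothing step can be sketched as a simple moving mean over consecutive probes. This is a minimal illustration: the function name and example data are hypothetical, and because the handling of array edges is not described, windows that would run past the ends of the data are simply dropped.

```python
def moving_mean(values, window=20):
    """Mean of each run of `window` consecutive probes.

    Returns len(values) - window + 1 smoothed values; partial windows at
    the edges are dropped (an assumption, not taken from the paper).
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]  # slide the window by one probe
        smoothed.append(running / window)
    return smoothed


# Hypothetical log2 ratios: flat background with a 20-probe run of signal.
log_ratios = [0.0] * 30 + [1.0] * 20 + [0.0] * 30
smoothed = moving_mean(log_ratios, window=20)
peak_index = smoothed.index(max(smoothed))
```

Averaging over a run of probes damps isolated high-intensity probes while preserving sustained runs, which is the property needed to pick out binding sites that span many adjacent probes.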
Cy5 and Cy3 probe intensity values for Bieda et al.'s 50-base oligonucleotide tiling array analysis of RNApolII binding in non-stimulated HeLaS3 cells were obtained from the Gene Expression Omnibus (GEO) website, series id GSE4337. Experiments were performed in triplicate. These tiling arrays are designed to detect the ENCODE region. Probes are 50 bases in length, and are aligned with DNA from the ENCODE region in a 12-base overlap configuration. We used limma to perform within-array normalization (using the "loess" method) and quantile normalization of mean probe intensity values (using the "Aquantile" method). To enhance the identification of runs of probes that detect RNApolII binding, probe intensities were smoothed by taking the mean intensities across a window of 20 probes (760 bp). To perform probe level comparisons between these 50-base oligonucleotide arrays, and the 36-base oligonucleotide arrays for Rozowsky et al.'s ChIP-chip analysis of STAT1 binding within the ENCODE region, probe intensities from the 50-base overlapping format were mapped to the 36-base end-to-end format. Weighted averages were calculated where two 50-base probes overlapped with a single 36-base probe.
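The mapping between the two probe layouts can be illustrated with the sketch below, which assigns each 36-base target probe the average of the 50-base source probes that overlap it. Weighting each source probe by the number of overlapping bases is an assumption made here, since the weights are not specified above, and the function names and coordinates are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Overlap length of half-open intervals [a_start, a_end) and [b_start, b_end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def map_to_bins(probes, bin_starts, bin_len=36, probe_len=50):
    """Overlap-weighted mean intensity of source probes for each target bin.

    probes: list of (start_bp, intensity) for the 50-base probes.
    bin_starts: start coordinates of the 36-base end-to-end probes.
    Bins with no overlapping source probe get None.
    """
    mapped = []
    for b in bin_starts:
        weighted_sum = 0.0
        total_weight = 0
        for start, intensity in probes:
            w = overlap(start, start + probe_len, b, b + bin_len)
            weighted_sum += w * intensity
            total_weight += w
        mapped.append(weighted_sum / total_weight if total_weight else None)
    return mapped


# Hypothetical coordinates: 50-base probes every 38 bp (12-base overlap).
source = [(0, 1.0), (38, 2.0), (76, 4.0)]
targets = [0, 36, 72]
mapped = map_to_bins(source, targets)
```

In this toy example the first target probe lies entirely within one source probe and inherits its value, whereas the second and third straddle two source probes and receive a blend of both.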
Polyadenylated (polyA) RNA signal values for Sekinger et al.'s tiling array analysis of non-stimulated HeLaS3 cells were obtained from the GEO website, series id GSE2800. The tiling arrays used for these experiments were manufactured by Affymetrix (Santa Clara, CA), and have 25-base probes that detect DNA from the ENCODE region in a 5-base-overlap configuration. Signal values were determined using the Wilcoxon Sign Rank Scan Statistic, as outlined in the description accompanying each sample. To enhance the identification of runs of probes that detect the presence of polyA RNA, we smoothed probe intensity values by taking the mean intensity across a window of 20 probes (400 bp).
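The window sizes quoted for the three oligonucleotide platforms (720, 760, and 400 bp) are consistent with counting each probe as one step along the genome, where the step is the probe length minus the overlap with the next probe. The short check below makes that arithmetic explicit; the function name is hypothetical and the step-based convention is inferred from the quoted numbers rather than stated in the text.

```python
def window_span(n_probes, probe_len, overlap=0):
    """Genomic span of a window, counted as n_probes times the probe step."""
    step = probe_len - overlap  # distance between consecutive probe starts
    return n_probes * step


spans = {
    "36-base end-to-end": window_span(20, 36),
    "50-base, 12-base overlap": window_span(20, 50, overlap=12),
    "25-base, 5-base overlap": window_span(20, 25, overlap=5),
}
```

Under this convention the three platforms are smoothed over broadly comparable genomic distances despite their different probe designs.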
Probe intensity values for Rock et al.'s microarray analysis of the transcriptional response of human fetal microglial cells to 1 h of IFN-γ treatment  were obtained from the GEO website, series id GSE1432. All experiments were performed in quadruplicate. Affymetrix GeneChip HGU133A arrays, which detect the expression of approximately 16,000 human genes, were used for this experiment. We used limma  to perform within-array (loess) and between-array (quantile) normalization of probe intensity data, and to identify significantly differentially expressed genes.
Affymetrix MGU74Av2 CEL files for our previously reported timecourse analysis of the transcriptional response of IFN-γ-/- mouse livers to intraperitoneal injection with IFN-γ (GEO series id GSE4232) were processed by the affy package [38-40] for the R statistical programming environment. Background correction, quantile normalization, and expression summaries were performed using affy's "robust multiarray average" method, with default parameters. Samples corresponding to 0 h, 0.5 h, 1 h, 2 h, 4 h, 12 h, 16 h, 24 h and 48 h after stimulation with IFN-γ were included in the analysis.
SW designed the study, performed data analysis, interpreted results, and wrote the manuscript. GKS contributed to data analysis, interpretation of results, and assisted with writing the manuscript. DJH and TPS contributed to the design of the study, interpretation of results, and revision of the manuscript.
We wish to thank Dr Angeline Goh for assistance with proofreading the manuscript. This work was supported by a transitional institute grant from the National Health and Medical Research Council of Australia.