A model of the TBP regulatory network
The building block for modeling our global biochemical gene regulatory network is an elementary reaction step such as:
where a protein (P) whose identity is 'x' (for example, TBP) binds to DNA (D) located in the promoter of gene 'y' to form a protein-DNA complex (Px
), as demonstrated previously [9
]. The forward flux of P and D through the reaction is governed by a gene-specific flux constant k1
may be coupled to a second reaction step such as:
exemplified by a TBP·DNA complex binding the TBP regulator called negative cofactor 2 (NC2). Examples of other types of reaction steps include protein-protein assembly, protein-DNA disassembly, rearrangements within a complex, and chemical catalysis.
In principle, hundreds of different transcription regulatory proteins can act upon each other and upon thousands of genes, giving a nearly infinite combination of potential reaction steps. In reality, biologically relevant interactions have specificity, which keeps the number of reaction steps to a finite but nevertheless large number. In constructing a biochemical gene regulatory network, we employed only reaction steps that have strong experimental support. TBP regulatory mechanisms are among the best characterized eukaryotic gene regulatory systems [12
], and thus are ideally suited for integrative modeling studies.
Figure illustrates the prototype TBP regulatory network upon which our studies are based. The model attempts to assimilate a variety of individual TBP regulatory mechanisms into a common regulatory network that is potentially applicable to all genes. This represents the first description of an integrated TBP regulatory mechanism. The model serves as a visual framework for interpreting in vivo promoter occupancy and gene expression data. While parts of the model could be wrong or incomplete, it serves as a useful starting point to evaluate whether potential gene regulatory mechanisms defined in vitro with purified components can be integrated into a network of coupled reactions that account for transcription factor occupancy and gene expression profiles in vivo on a genome-wide scale. The goal here is to take a step towards bridging a myriad of in vitro biochemical mechanisms with genome-scale in vivo regulatory processes, rather than to establish a rigorous mathematical model for regulatory networks.
Figure 1 An integrated prototype of the TBP regulatory network. Three parallel assembly pathways proceed temporally from left to right along line segments that represent promoter DNA. Two pathways, directed by the multisubunit TFIID complex ('D', in cyan) and (more ...)
In this model, TBP resides as a self-inhibited dimer when not bound to DNA (segments 2 and 3 in Figure ) [9
]. Dissociation into monomers, as directed by promoter-specific regulators, is required for DNA binding. When TBP resides in the multisubunit TFIID complex, it may also engage in interactions with other TFIID subunits such as the TBP-associated Factor 1 (TAF1) amino-teminal domain (TAND) (segment 1 in Figure ) [16
]. TAND has the potential to act negatively by blocking TBP's DNA binding surface, and positively by tethering TBP to TFIID. Although not shown in the model, promoter-bound regulators modulate the assembly of TFIID-TBP at promoters [18
], giving rise to promoter-specific control of the network.
The model shows TBP assembling onto promoter regions via three possible pathways. One pathway involves TFIID (segment 4) and a second involves the Spt-Ada-Gcn5-Acetyltransferase complex termed SAGA (segments 5,6). Both lead to formation of a PIC containing TBP, pol II, and many other transcription proteins (segments 12 and14). The PIC produces a transcribing pol II from which RNA is made (segments 13 and 15). The third pathway, outlined in more detail below, is a nonproductive one.
TFIID and SAGA are compositionally and functionally related complexes [20
]. In principle, a given promoter can utilize either the TFIID or SAGA pathway [22
]. The SAGA pathway is tailored towards TATA-containing promoters, whereas the TFIID pathway plays a greater role at TATA-less promoters [23
Inhibiting the SAGA pathway (but not the TFIID pathway) are two negative regulators of TBP, termed NC2 and Mot1 (Figure , segments 8, 9, and 10) [22
]. NC2 binds to a TBP-DNA complex and blocks PIC assembly [25
]. Mot1 uses the energy of ATP hydrolysis to dissociate TBP from DNA [11
]. Mot1 can also dissociate TBP from DNA in the presence of NC2 [24
], and NC2 stimulates TBP-Mot1 interactions [29
]. Since the genome-wide gene expression profile of an NC2 mutant is very similar to the expression profiles of Mot1 mutants (Figure S1 of Additional data file 1) [30
], we make the simplifying assumption that the two largely work together. Consistent with the notion that Mot1 dissociates TBP-NC2 complexes, Mot1 mutants result in the accumulation of TBP-NC2 complexes in vivo
]. However, we do not exclude the possibility that Mot1 might act in the absence of NC2 at some promoters.
The third TBP assembly pathway is nonproductive (Figure , segment 7), as has been proposed by Dasgupta et al
]. This nonproductive pathway has the potential to compete with the two productive pathways and inhibit transcription. At genes where this nonproductive pathway limits transcriptional output, removal of the inactive TBP by the coordinate action of NC2 and Mot1 (Figure , segments 11, 9, and 10) would, therefore, result in positive regulation [33
]. At genes where the SAGA pathway dominates, NC2 and Mot1 would act negatively. Thus, the model has the capacity to account for positive and negative regulation by NC2 and Mot1, both of which have been reported for these proteins [30
]. The model is limited in that it does not include other types of regulatory functions that SAGA (Spt3), Mot1, or other network components might be involved in [34
], but for which there is no direct biochemical support.
Biological systems are a balance of assembly and disassembly processes that typically move along different pathways. Thus, PIC disassembly and RNA degradation do not proceed by an exact reversal of PIC assembly and transcription. Nevertheless, proteins and RNA are recycled, allowing gene expression to reach a steady-state in a stable environment. Therefore, our modeled biochemical gene regulatory network is designed to reach steady-state, where the rate of PIC assembly and RNA production equals the rate of PIC disassembly and RNA degradation (Figure , segment 16).
Within the cell, the TBP regulatory network is embedded within the larger global gene regulatory network involving hundreds of functionally diverse proteins, such as activators, chromatin remodeling complexes, mediator, and general transcription factors. They are not an explicit part of this modeling study (and thus not shown in Figure ) because they are assumed to make contributions to PIC assembly and RNA output that are approximately equivalent in the mutant and wild-type strains used here. For example, a TBP mutation that impairs TBP-NC2 interactions is assumed to have a negligible effect on other types of interactions within chromatin and the transcription machinery.
Computational simulation of a TBP regulatory network
The core mechanism, as defined by the model, is tailored to specific genes by adjusting the flux constants. For example, the flux constant governing the association of TBP with one promoter may be different to the flux constant governing TBP association with a different promoter as dictated by promoter-bound activators [35
]. In our modeling studies, promoter-specific factors such as bound regulatory proteins and chromatin structure are not individually defined because their individual contributions are not being tested here. The flux constants reflect their combined contributions.
The model illustrated in Figure was converted into a series of explicitly defined reaction segments (Figure ) that were suitable for computational modeling using the freely available KinTekSim simulator [38
]. KinTekSim is a derivative of KINSIM, which is used to model enzyme reaction mechanisms [39
]. For any set of similarly behaving genes, the core mechanism contains 16 segments governed by 16 forward and reverse 'flux constants'. Although written as explicit reaction steps, these segments are intended to approximate relevant portions of the modeled network that have been biochemically demonstrated. The flux constants provide a means for governing the relative distribution of components in the pathway when the simulation has reached steady-state.
Figure 2 Simulation of the TBP regulatory network. (a) The reaction mechanism as entered into the KinTekSim program. Each numbered line represents a reaction segment governing TBP regulation and PIC assembly shown in Figure 1. The '= =' denotes forward and reverse (more ...)
The distinct arrangement of promoter elements and bound proteins ranging from activators to chromatin is expected to control the flow of components through every step in the network (that is, flux constants) in a way that is unique to every gene or set of co-regulated genes. This includes segments of the mechanism that are not explicitly marked in Figure as involving promoter-bound regulators (for example, segments 1-3).
One challenge is fitting the model to experimental data within the constraints imposed by the experimental system. We required our computational model to report outputs that had experimental counterparts such as promoter occupancy from ChIP-chip assays or mRNA levels. Experimentally, such absolute levels obtained on a genome-wide scale are inherently noisy. Noise is reduced substantially by measuring changes in occupancy or changes in mRNA levels. Therefore, we set up our model to report both steady-state and changes in steady-state levels when a portion of the network is shut down. Computationally, we shut down a portion of the network by reducing the flux constant governing that portion of the network. Mutagenesis is used to achieve the physiological counterpart (see below). Further noise reduction is achieved by combining data within a set of similarly behaving genes so that hundreds of measurements contribute to the modeled data. Combining data also reduces the computational complexity from approximately 5,700 core mechanisms for approximately 5,700 genes to a manageable small number.
Figure (left panel) demonstrates how initialization of the flux parameters governing one gene (or group of co-regulated genes) leads the simulator to produce promoter occupancy intermediates and RNA outputs (in arbitrary units) that have experimental counterparts measurable with ChIP-chip and expression microarrays. By altering flux constants that govern the progress of the coupled reactions, we control the steady-state levels of these components.
Changing the values of the flux constants, mimicking either gene control or genetic mutations that impair an interaction, leads to changes in factor occupancy and RNA output. Figure (right panel) plots simulated changes in RNA output (log2 scale) when one or more flux values are altered. In some cases the impact is substantial, while in others the impact is minimal, reflecting the extent to which a TBP regulatory interaction contributes to the expression of that gene.
Using this approach, we tested the simulator and the model by creating mutant yeast strains having single and multiple defects in biochemically defined TBP regulatory interactions (Figure , and section 1 of Additional data file 1). These mutations provide the physiological counterpart to perturbations of the computational network. The experimental design has several constraints. First, we wanted to minimize potential indirect effects from physiological adjustments arising from long-term exposure to the mutations. Therefore, we placed all TBP mutants under control of the GAL10 promoter and analyzed gene expression changes throughout the genome after induction of TBP with galactose. TBP levels pre- and post-induction along with endogenous wild-type TBP are shown in Additional data file 1 (section 1 of Figure S2). The reference state in all experiments is a strain where a 'null' TBP is induced. Induction of wild-type TBP had no significant effect on the system, indicating that the measured effects are unlikely to be inherent in the experimental design and thus are likely to be caused by the mutations.
Second, because TBP is essential for cell growth it was necessary to include endogenous wild-type TBP in all strains. Competition between the mutant and wild-type TBP is likely to diminish, but not eliminate, the impact of the mutants. Indeed many mutants displayed dominant growth phenotypes (Figure S3 of Additional data file 1). Validating the use of TBP mutants in the context of wild-type TBP, a nonlethal TBP mutation generates similar genome-wide expression profiles in the presence or absence of endogenous wild-type TBP (Figure S4a, subpanel 4, of Additional data file 1). Therefore, the presence of wild-type TBP in the background is not expected to substantially impact the correlations between simulated and actual data.
It was not feasible to eliminate all TBP regulatory interactions via mutagenesis of TBP. First, since TBP's interaction surface with SAGA remains ill-defined, this interaction was eliminated by deleting SPT3
, which encodes a TBP regulatory subunit of SAGA [40
]. Second, TBP mutations that impair interactions with TFIID do not generate the same phenotype as deletion of TFIID's main TBP interaction domain, the TAF1 TAND domain [16
]. Since this region of TBP interacts with other factors as well, we opted to delete the TAND domain as a means to diminish TBP-TFIID interactions. Third, we cannot be certain that the mutations affect only the interactions being modeled, and thus is a caveat of any in vivo
analysis. However, the mutated residues employed in our study have been biochemically characterized for their defective interactions [27
]. In the context of our study, we sought to provide further evidence on a genome-wide scale that the mutants were generally having the intended effects in vivo
. These validation studies are presented in section 2 of Additional data file 1, and support the assertion that the mutants are defective in the modeled interactions.
Identifying genes that are sensitive to the TBP regulatory network
Changes in mRNA output at more than 99% of all known S. cerevisiae genes were measured in 63 wild-type or mutant strains (Additional data file 2). Of these, 2,903 genes met stringent filtering criteria for changes in gene expression (see Materials and methods) and thus were chosen for the modeling studies. Most of the remaining approximately 3,000 genes were not significantly expressed (cluster 0 in Table , entry 12), which may be due to repressive mechanisms (for example, repressors, chromatin, and so on) that lie outside of the TBP regulatory network. Their lack of expression can be modeled by setting the forward flux constants to zero. The TBP regulatory network may be applicable to these genes under conditions where they are expressed.
Significance values relating genomic properties to clusters defined by the TBP regulatory network*
Individually applying the core mechanism to each of the 2,903 filtered genes was not practical. Instead we sought to identify major themes in the network. This was achieved by combining data from similarly behaving genes. Data consolidation enhanced the robustness of modeling since any one consolidated value is derived from as many as several hundred data points that were not appreciably different from each other (20% average standard deviation; see Additional data file 2). We used a K-means algorithm to assist in clustering the data, and identified ten as the maximum number of nonredundant clusters (Figure ). Since all genome-wide measurements naturally fall into a continuum of values rather than well-partitioned clusters, the choice of a cluster number is necessarily subjective. Partitioning the data into more clusters generated increasingly subtle quantitative differences rather than major qualitatively distinct patterns. Genes within a cluster were consolidated by calculating median changes in gene expression for each mutant in each cluster.
Figure 3 Genome-wide changes in expression upon perturbation of the TBP regulatory network. We grouped 2,903 ORFs (rows) that changed expression by at least 1.5-fold in at least one of the 63 conditions (columns) examined into 10 clusters using the K-means algorithm (more ...)
Of the ten clusters, six (clusters 3-6 and 8-9) were deemed appropriate for modeling studies. The properties of the remaining four clusters (clusters 1, 2, 7 and 10) suggested that they arose from indirect effects. The complex interconnectivity of biological systems ensures that some changes in gene expression in response to a genetic perturbation will necessarily cause other genes to change in expression, with the latter being largely an indirect consequence of the primary perturbation. Therefore, indirect responders are not accurate metrics of a perturbed network. Detailed justifications for their exclusion are presented in section 3 of Additional data file 1, which is briefly summarized here.
Cluster 1 values were inversely correlated with cluster 8 values (Figure S5 in Additional data file 1). The behavior of cluster 8 could be accounted for by well known biochemical interactions that TBP engages in. Accounting for the pattern in cluster 1 as a direct effect of TBP required TBP to possess biochemical activities that are the opposite of its known activities. Since there is no basis for such supposition, we interpret the inverse relationship between clusters 1 and 8 as cluster 8 genes being the predominant inhibitor of cluster 1 genes, rather than TBP reversing its activity. Strikingly, cluster 8 genes encode predominantly ribosomal proteins whereas cluster 1 genes are involved predominantly in ribosomal biogenesis (Table , rows 3 and 4), suggesting that ribosomal protein genes or their products down-regulate ribosomal biogenesis genes. In support of this notion, cluster 1 genes are highly enriched with genes that are up-regulated when ribosomal protein genes are deleted (Table , row 26).
Cluster 2 genes were up-regulated only in spt3Δ
strains. However, ChIP-chip occupancy measurements indicated that Spt3 was not present at these genes in wild-type strains (data not shown), suggesting that the changes in gene expression are likely to be an indirect consequence of physiological adjustments to the constitutive absence of Spt3. Cluster 7 genes were down-regulated only in spt3Δ
strains. Cluster 7 was enriched with genes involved in Ty transposition (Table , row 10), which is particularly Spt3 dependent [49
]. Because of the possibility of unique regulation of Ty elements and the potential for long-term physiological adjustments to the constitutive spt3
deletion, cluster 7 genes were not modeled.
Almost all of cluster 10 genes were restricted to chromosomes XI and XII (Figure S6 in Additional data file 1), suggesting that the highly mutated strains that demarcate this cluster underwent duplications of chromosomes XI and XII. Therefore, the apparent increase in expression is likely due to an increase in gene copy number, and thus was excluded from the analysis.
Testing the TBP regulatory network through genome-wide occupancy assays and expression profiling
Clusters 3-6 and 8-9 represent six classes of genes that display dependencies on different parts of the TBP regulatory network. The challenge is to determine whether the flux constants governing the network can be fit with values that recapitulate these dependencies. A proper fit requires a match between actual changes in gene expression, when a section of the network is genetically impaired, and changes that are simulated when the relevant flux constant(s) is/are altered in value. An important caveat here is that a good fit does not necessarily indicate that the model is correct since there are more degrees of freedom than there are observations to constrain the model. Nevertheless, the modeling provides a tool for integrating and interpreting massive genomic data sets into a plausible conceptual model.
To enhance the accuracy of the modeling and assist in identifying suitable values for the flux constants, we employed additional types of data. First, the flux constants were empirically set so as to recapitulate the relative transcription frequencies that have been experimentally measured [3
] for genes in each of the six clusters. Next, we turned to measured promoter occupancy levels of individual components of the network using genome-wide ChIP-chip data [50
]. We examined SAGA (Spt3 subunit), NC2 (Bur6 subunit), Mot1, TFIID (TAF1, TAF5, TAF6, and TAF9 subunits), TBP, and pol II (Rpb1, Rpb2, and Rpb7 subunits). For a particular cluster, the flux constants governing the 16 reaction segments shown in Figure were adjusted such that a single set of values optimally recapitulated the relative occupancy level of each of these components as well as transcriptional output. Figure compares simulated versus actual levels of RNA output and factor occupancy. The average correlation coefficient for these relationships was 0.8, which indicates a good fit between the model and the experimental data.
Figure 4 Testing the TBP regulatory network through genome-wide occupancy and expression profiling. (a) Correlations in RNA and factor occupancy levels, comparing simulated and experimental data. The flux constants governing the core reaction mechanism in Figure (more ...)
To simulate the experimental effect of each mutation that perturbs the TBP regulatory network, the flux constants for the relevant reaction segments were altered, and steady-state RNA output reported (see Additional data file 3). We allowed only those flux constants that are justifiably governed by the interaction being mutated to change, and only in the direction expected of the mutation, thereby constraining the degrees of freedom. For example, the impact of the NC2-defective TBP(F182V) mutation was simulated by reducing the flux constant for reaction segments 8 and 11, which govern the association of NC2 with TBP (illustrated in Figure ). For multiple mutations, the relevant flux constants were set at or near to the values for the single mutations allowing an approximate three-fold tolerance. This tolerance attempts to accommodate the possibility that one mutation might influence the effect of other mutations.
Figure compares simulated changes in RNA output upon network perturbation to experimentally measured changes when the network is perturbed by mutations. There was a remarkable correspondence (R = 0.8), indicating that the modeled changes can largely account for the experimentally observed changes. To assess the specificity of modeling we randomly assigned each set of cluster-specific values to a different cluster and re-ran the analysis. As expected, no substantial correlation was observed (data not shown), indicating that the flux parameters for each cluster are likely to generate a unique or limited number of solutions.
As a further test of the core mechanism illustrated in Figure , we examined whether a simulated perturbation to the TBP regulatory network could predict changes in TBP occupancy at promoters measured by ChIP-chip. Figure compares the simulated and experimental data for changes in TBP occupancy in a taf1(ΔTAND) strain harboring both wild-type TBP and the TBP(F182V) mutation. With the exception of cluster 5, the simulation correlated with the experimental data (R = 0.56). The alternative outcome for cluster 5 illustrates the value of the simulation, in that it suggests that the mechanism and/or initialized flux constants need to be adjusted for cluster 5 genes, and thus requires further investigation. Deviation of the expression data from the simulated data for cluster 5 is also apparent in Figure (red symbols). Taken together, the correlations between the simulated and experimental data for transcription frequency, ChIP-chip occupancy, and expression profiling suggest that the core mechanism illustrated in Figure is a plausible prototype of the genome-wide TBP regulatory network.
A TBP regulatory network tailored to sets of genes
Given the ability of the core mechanism to act as a template for simulating the behavior of each cluster, we next tailored the basic network model shown in Figure to reflect actual behavior. In Figure , each model is drawn to reflect the 'flow' of TBP regulation that leads to PIC assembly and RNA production. Thicker arrows and thicker lines denote higher levels of flux through that portion of the network. We interpret these models as follows: PIC assembly at cluster 9 is dominated by the TFIID assembly pathway, and so the arrow governing TFIID association and the line representing the promoter are drawn thicker. The SAGA pathway also participates in cluster 9 but contributes little to RNA production because NC2 and Mot1 shunt this pathway in a nonproductive direction. Cluster 6 on the other hand is dominated by the SAGA pathway, possibly due to the presence of a TATA box at these genes (Table , row 17) that is linked to the SAGA pathway [23
]. Cluster 8 is dominated by the nonproductive TBP pathway. Here, NC2 and Mot1 function positively in transcription by dismantling this inhibitory pathway. Cluster 4 genes are lowly expressed due in part to blockage of TBP by dimerization and TAF1 TAND interactions.
Figure 5 The 'flow' of PIC assembly through the TBP regulatory network at different gene clusters. Each assembly model corresponds to the indicated cluster number in Figure 3. The same prototype model described in Figure 1 is applied to all clusters. Line and (more ...)
This TBP regulatory network accommodates much of the biochemical, genetic, and genomic data on TBP, making it particularly compelling and useful for rationalizing the functional significance of the clustering patterns and predicting experimental outcomes. In particular, the network has the following features. First, it allows the three assembly pathways (TFIID, SAGA, and nonproductive) illustrated in Figure to operate in competition and, therefore, influence each other. Hence, regulation of the SAGA pathway can influence contributions by the TFIID pathway. Second, when one pathway is diminished, as in an spt3Δ
strain, the other pathway is allowed to compensate, providing a limited degree of functional redundancy, as has been proposed for these two assembly pathways [21
]. Third, inhibitors of TBP, such as Mot1 and NC2, can produce a net negative or net positive effect (or no effect) on transcription, as has been observed [25
], depending upon which of the three assembly pathways predominately governs RNA output. Importantly, the model provides predictive power for designing, conceptualizing and analyzing genome-wide expression and ChIP microarray data as it relates to the TBP regulatory network.
Design logic behind the TBP regulatory network
As demonstrated here, the flow of the TBP regulatory network is not uniform for all genes. Rather, it is tailored to specific genes. The question arises as to whether this tailoring is randomly placed or whether genes involved in a specific process utilize the TBP regulatory network in the same way. To address whether genes within a cluster have related functions, we examined if any of approximately 1,500 genome-wide properties that characterize yeast genes were particularly enriched within each cluster (Additional data file 4). A selected subset of these relationships is presented in Table . Most clusters were enriched with genes that belonged to a particular cellular process (rows 3-11). For example, cluster 4 was enriched with sporulation genes and cluster 8 with ribosomal protein (RP) genes. Thus, genes involved in the same process tend to be regulated by the TBP regulatory network in the same way. Below, we summarize some insights into the design logic.
RP genes encode physical components of the ribosome and are quintessentially TFIID-regulated and TATA-less [22
]. However, computer modeling of RP-enriched cluster 8 genes (Figure , as well as the data in Figure ) suggest that they are not only positively regulated by SAGA (Spt3) but are also positively regulated by the TBP inhibitors NC2 and Mot1. To assess the validity of these conclusions, we turned to published genome-wide ChIP-chip occupancy data of these factors and genome-wide expression profiles in NC2 and Mot1 mutants. Genome-wide promoter occupancy data of Spt3 (SAGA), Bur6 (NC2), Mot1, TAF1 (TFIID), and TBP [50
] were transformed into percentile scores (ranging from 0 to 1) so that their relative contribution across clusters could be assessed. For each cluster from Figure as well as for RP and ribosomal biogenesis (RB) genes, the bar graphs in Figure display deviations from the genome-wide median (defined as 0.5). Strikingly, compared to all other clusters, the RP genes possessed the highest occupancy levels of SAGA, NC2, and Mot1 (as well as TFIID). However, RP genes displayed a rather modest transcriptional dependency on SAGA, NC2, and Mot1 (Figure ). Thus, the SAGA pathway assembles at RP genes but contributes modestly to transcription of these genes, which is consistent with the modeling of RP-enriched cluster 8 and 9 (Figure ).
Figure 6 Properties of clusters 1-10 and RP and RB genes. (a) Bar graph comparison of promoter occupancy levels for Spt3, Bur6, Mot1, and Taf1 [50,52]. Values correspond to median percentile scores for each indicated gene group or cluster after subtracting the (more ...)
Clusters 3, 5, and 6 were enriched with genes involved in distinct metabolic processes, including amine metabolism, alcohol/sterol metabolism, and energy production, respectively. Genes in these clusters tended to rely on the more tightly controlled SAGA pathway and were negatively regulated by NC2 and Mot1 to varying degrees, resulting in distinct flow patterns in their assembly pathways (see Figure ). The distinction between pathways 5 and 6 might indicate the presence of a regulatory step between TBP loading and PIC assembly (see asterisks in Figure ), which is in line with previous suggestions of post-recruitment regulation [53
Cluster 4 genes are very lowly expressed during exponential growth as might be expected of their enrichment with mid-phase sporulation genes (Table , rows 9 and 12). These genes appear to be inhibited in part by the functionally redundant action of TBP dimerization and the TAF1 TAND domain, and thus are related to cluster 4 in [42
]. They tend to reside in the repressive environment of subtelomeric regions, where they are generally inhibited by the Hda1 histone deacetylase, the Ssn-Tup1 complex, and the methyltransferase Set1 (Table , rows 31-33). Cluster 4 genes, therefore, are likely to be maintained in a low transcriptional state by a variety of repression mechanisms, including those in the TBP regulatory network. The multiple levels of repression associated with cluster 4 genes might reflect the critical need to keep potentially detrimental sporulation genes inactive during vegetative growth. Consistent with their potential toxicity, any combination of mutants used in this study that resulted in elevated expression of cluster 4 was also found to be toxic to cell growth (Figure S3 in Additional data file 1).
The boundaries of SIR-mediated (Silent Information Regulator) repression extending from telomeric into subtelomeric regions along the chromosome are thought to be defined in part by the counteraction of bromodomain factor Bdf1 [55
]. Since cluster 4 genes typically reside in these regions [42
], we were surprised to find that they were not particularly repressed by SIR proteins or activated by Bdf1. They were, however, highly dependent upon BDF2
(Table , row 24), which is the paralog of BDF1
. Little is known about BDF2
, except that it is nonessential when BDF1
is present, and loss of this gene affects the expression of very few genes [55
]. The linkage between cluster 4 and genes that tend to be more Bdf2-dependent provides a novel distinguishing feature between Bdf1 and Bdf2. While Bdf1 might be specialized to counteract SIR-mediated repression [55
], in addition to its function with TFIID, Bdf2 might be particularly suited to operate in repressive environments that do not involve SIR repression.