We assess the role of DNA breathing dynamics as a determinant of promoter strength and transcription start site (TSS) location. We compare DNA Langevin dynamic profiles of representative gene promoters, calculated with the extended non-linear PBD model of DNA, with experimental data on transcription factor binding and transcriptional activity. Our results demonstrate that DNA dynamic activity at the TSS can be suppressed by mutations that do not affect basal transcription factor–DNA contacts. We use this effect to establish the separate contributions of transcription factor binding and DNA dynamics to transcriptional activity. Our results argue against a purely ‘transcription factor-centric’ view of transcription initiation, suggesting that both DNA dynamics and transcription factor binding are necessary conditions for transcription initiation.
RNA polymerases require access to a locally denatured single-strand DNA segment (2,3) at the transcription start site (TSS) in order to initiate transcription. It has been demonstrated that introduction of an artificial bubble at the TSS of a viral promoter, via insertion of a 5-bp mismatched segment, is sufficient for the polymerase to initiate transcription in the absence of basal transcription factors (4,5). Use of a negatively supercoiled DNA template (5–8) can also obviate the requirement by polymerase II for basal transcription factor binding (5–7) and helicase activity (8). It has been suggested that under natural conditions in vivo, formation of the transcriptional bubble is seeded by transient, thermally induced strand separation motions of the DNA double helix, commonly known as DNA breathing (9). To investigate this possibility, we have been studying the sequence dependence of breathing dynamics with the non-linear Peyrard–Bishop–Dauxois (PBD) model of DNA (10,11). In support of a link between spontaneous DNA strand separation and transcription initiation, we found that mammalian promoter sequences frequently exhibit a breathing dynamics maximum (bubble) coinciding with the TSS (4,9). We introduced Langevin molecular dynamics (LMD) simulations and three dynamic criteria (bubble length, bubble amplitude and bubble lifetime) that can be extracted from the simulated dynamic trajectories of experimentally identified TSSs (4). Bubble length is defined as the number of consecutive base pairs that are simultaneously separated from their hydrogen bond minima by more than a given distance threshold (the bubble amplitude). Simulations of several mammalian core promoters demonstrated that a relatively large (length: ~10 bp; amplitude A: >1.5 Å) and stable (lifetime τ: >5 ps) bubble forms frequently at the examined TSS (4). We reported that A/T-rich regions such as TATA boxes exhibit faster, lower amplitude motions than TSS regions (4,12). G/C-rich promoters, however, display less obvious bubble-forming motifs in the simulations (4).
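The bubble length criterion defined above can be illustrated with a minimal sketch. It assumes that one simulation frame is available as a list of base-pair displacements in Å; the function name `find_bubbles` and the example values are hypothetical and are not taken from the simulation code used in this work.

```python
from typing import List, Tuple


def find_bubbles(displacements: List[float], amplitude: float) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of consecutive base pairs
    whose displacement exceeds the amplitude threshold in a single frame."""
    bubbles = []
    start = None  # index where the current open run began, if any
    for i, y in enumerate(displacements):
        if y > amplitude:
            if start is None:
                start = i
        elif start is not None:
            bubbles.append((start, i - start))
            start = None
    if start is not None:  # an open run that reaches the end of the sequence
        bubbles.append((start, len(displacements) - start))
    return bubbles


# Hypothetical displacements (in Å) for a 12-bp stretch, for illustration only.
frame = [0.1, 0.3, 1.7, 2.0, 1.9, 0.4, 0.2, 1.6, 1.8, 2.2, 2.5, 1.6]
bubbles = find_bubbles(frame, amplitude=1.5)
print(bubbles)
```

With the 1.5 Å threshold quoted above, this hypothetical frame contains a 3-bp bubble starting at index 2 and a 5-bp bubble starting at index 7.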
Structural and dynamic heterogeneity in G/C-rich sequences presumably originates mainly from a dramatic difference in the stacking interaction between GG/CC steps on the one hand and CG/CG and GC/GC steps on the other (1,13,14). However, the original PBD Hamiltonian does not account for the sequence dependence of the stacking potentials and performs poorly at reproducing the melting transitions of G/C-rich DNA. For accurate analysis of G/C-rich DNA, we recently derived an extended PBD (EPBD) Hamiltonian that includes sequence-dependent base-stacking potentials, and calibrated the model with DNA melting studies of short repeats and homopolymers (1). Monte Carlo simulations with the resulting EPBD model faithfully reproduce the melting behavior of highly homogeneous and repetitive sequences, e.g. the well-known 10 °C difference in melting temperature (Tm) between poly(dG).poly(dC) and poly(dGdC) (13). Consistent with such differences in melting behavior and with NMR studies of the millisecond-scale dynamics of G/C-rich DNA (15), the EPBD simulations predict significant heterogeneity in pre-melting (breathing) dynamics of various G/C-rich DNA sequences.
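For orientation, the PBD Hamiltonian as it is commonly written in the literature is sketched below; the explicit form and symbols are not stated in this text and are given here as an assumption. Here y_n is the displacement of the n-th base pair from its equilibrium separation, D_n and a_n are the depth and inverse width of the Morse potential for an A–T or G–C pair, and ρ and β control the non-linear stacking term. In the original PBD model the stacking constant is a single uniform value k, whereas in the extended model it is taken to depend on the dinucleotide step, written here as k_{n,n-1}.

```latex
H = \sum_{n} \left[ \frac{p_n^{2}}{2m} + V(y_n) + W(y_n, y_{n-1}) \right],
\qquad
V(y_n) = D_n \left( e^{-a_n y_n} - 1 \right)^{2},
\qquad
W(y_n, y_{n-1}) = \frac{k_{n,n-1}}{2}
  \left( 1 + \rho\, e^{-\beta (y_n + y_{n-1})} \right)
  \left( y_n - y_{n-1} \right)^{2}
```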
Here we examine the EPBD breathing dynamics of two representative mammalian promoters with high G/C content. We aim to establish whether DNA breathing dynamics profiles at the TSS are merely coincidental, or a necessary factor for transcription. We use EPBD LMD simulations, gene transcription and gel shift assays to explore the relationship between DNA dynamics, efficiency of transcription and basal transcription factor binding at the core promoter. Our hypothesis is that a TSS-specific dynamic signature is a necessary feature of transcription initiation. As model systems, we chose a fully characterized ‘classical’ promoter, the SCP1 promoter (16), and the CpG island promoter of the mouse thymidylate synthase (TS) gene (17).
SCP1 is a single start site promoter artificially constructed from functionally established eukaryotic promoter elements. SCP1, also called the ‘superpromoter’ (16), exhibits one of the strongest known basal activities and is a classical promoter in the sense that it contains the well-known TATA, Initiator (Inr), downstream promoter element (DPE) and motif ten element (MTE) sequences. It was shown that the Inr, MTE and DPE sequences are sufficient to recruit the TFIID basal transcription factor complex and initiate transcription. This was established by mutations in these promoter elements, correlating TFIID binding with transcriptional activity (16). To separate the effects of transcription factor binding from DNA dynamic properties, we first identified mutations in close proximity to the TSS that do not affect TFIID binding but do change the dynamic signature of DNA. We conducted EPBD simulations on various sequences mutated outside the regions involved in direct contacts with the TFIID complex and chose two of the mutant variants, m1SCP1 and m2SCP1, with silenced and intact TSS dynamics, respectively.
Analysis of the dynamic trajectories and the bubble probability and bubble lifetime calculations were performed as previously reported (4).
pUC119 plasmid containing the wtSCP1 promoter sequence insert from −36 to +45 (relative to the TSS) (16) was used as a template to construct the m1SCP1 and m2SCP1 mutant variants. Mutations were introduced by QuikChange Mutagenesis Kit (Stratagene) following the protocol of the supplier.
The reactions were assembled as previously described (5,6). TFIID was isolated from HeLa cells (18). TFIIF, TFIIB, TFIIE and TFIIA were purified from Escherichia coli (6). Oligonucleotide probes were labeled with [γ-33P]ATP by T4 polynucleotide kinase (Invitrogen). The sequences of the probes used in the gel shift reactions are as follows: wtSCP1: GGGGCGCGTTCGTCCCAGTCGCGATCGAACACTCGA; m1SCP1: GGGGCGCGTTCGCGCCAGTCGCGGTCGAACGCTCGA; non-specific gel shift reaction competitor: TTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTC.
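As a quick consistency check, the two probe sequences listed above can be compared position by position. The sketch below is illustrative only: the helper name `point_mutations` is hypothetical, whitespace in the printed sequences is treated as a typesetting artifact, and the reported offsets are 0-based positions within the probe, not promoter coordinates relative to the TSS.

```python
from typing import List, Tuple

# Probe sequences as printed in the text; embedded spaces are stripped.
WT_SCP1 = "GGGGCGCGTTCG TCCCAGTCGC GATCG AACACTCGA".replace(" ", "")
M1_SCP1 = "GGGGCGCGTTCGCG CCAGTCGCGG TCGAACGCTCGA".replace(" ", "")


def point_mutations(reference: str, variant: str) -> List[Tuple[int, str, str]]:
    """List (offset, reference_base, variant_base) for every mismatch
    between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i, r, v)
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


diffs = point_mutations(WT_SCP1, M1_SCP1)
for offset, ref_base, var_base in diffs:
    print(f"probe offset {offset}: {ref_base} -> {var_base}")
```

For these two probes the comparison returns four single-base substitutions, at probe offsets 12, 13, 23 and 30.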
HeLa cells (1 × 10⁵) were transfected with 2 μg of pUC119-SCP1 variants or empty pUC119 plasmid DNA using Magus reagent (BIDMC) and electroporation. Cells were then cultivated in DMEM/10% FBS. RNA was harvested 19 h later and subjected to Q-PCR.
Total RNA was extracted from cells with the RNeasy® kit (Qiagen), following the manufacturer’s instructions. First-strand cDNA synthesis was performed using RETROscript® (Ambion). PCR was performed using cDNA synthesized from 1.5 μg total RNA in an Mx3000P QPCR system (Stratagene) with pUC119-specific primers downstream of the SCP1 TSS and SYBR Green PCR Master Mix (Stratagene). The sequences of the primers are as follows: AGCGGATAACAATTTCACACAGGA and ATCGAACACTCGAGCCGAG. All Q-PCR data are expressed as mean ± SD.
The dynamical trajectories of the SCP1 variants are shown in Figure 1. The wild-type promoter sequence revealed the characteristic dynamic pattern observed previously for other mammalian promoters (4), containing two major sites of dynamic activity, the Inr/TSS region and the TATA region upstream of the TSS. The TSS bubble is distinguished by the longest lifetimes at high amplitudes (Figure 1A). The T/A-rich region, in contrast, exhibits higher bubble probability but lower lifetimes at high bubble amplitudes (Figure 1B). These data closely match our previous observations of long-lived bubbles with high amplitudes at the TSS and short-lived bubbles at TATA and TATA-like sites for several promoters (4).
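The distinction drawn above between bubble probability and bubble lifetime can be made concrete with a small sketch. It assumes that the displacement of one base pair has been sampled at a fixed time step and that a lifetime is the duration of an uninterrupted run above the amplitude threshold; the precise definition used in (4) may differ, and the function names and both time series are hypothetical.

```python
from typing import List


def open_durations(series: List[float], amplitude: float, dt_ps: float) -> List[float]:
    """Durations (in ps) of each uninterrupted run of samples above the threshold."""
    durations = []
    run = 0  # number of consecutive open samples in the current run
    for y in series:
        if y > amplitude:
            run += 1
        else:
            if run:
                durations.append(run * dt_ps)
            run = 0
    if run:
        durations.append(run * dt_ps)
    return durations


def open_fraction(series: List[float], amplitude: float) -> float:
    """Fraction of samples above the threshold (a simple bubble probability)."""
    return sum(y > amplitude for y in series) / len(series)


# Hypothetical displacement series (Å), one sample per picosecond.
tss_like = [0.2, 0.3, 1.8, 2.1, 2.4, 2.0, 1.9, 1.7, 0.5, 0.3, 0.2, 0.1]
tata_like = [1.6, 1.7, 0.4, 1.7, 1.6, 0.3, 1.8, 1.6, 0.2, 1.9, 1.7, 0.1]

for name, series in (("TSS-like", tss_like), ("TATA-like", tata_like)):
    print(name, open_fraction(series, 1.5), max(open_durations(series, 1.5, 1.0)))
```

In this constructed example the TATA-like site is open more often (8 of 12 samples versus 6 of 12), yet its longest opening lasts 2 ps compared with 6 ps for the TSS-like site, mirroring the qualitative pattern described for Figure 1.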
The dynamic profile of the SCP1 TSS is very similar to that of other studied promoters despite the fact that its Inr sequence has significantly higher (50%) G/C content than the previously investigated Inr sequences (G/C ~ 30%) (4). As discussed previously by us (12) and Dornberger et al. (15), this underscores the non-trivial dependence of DNA breathing dynamics on G/C content and the effect of adjacent sequences. Remarkably, the m1SCP1 mutant, which differs from the wild-type sequence by four point mutations located outside of the TSS, exhibits a dramatically different dynamic profile (Figure 2A). The mutations suppress the dynamic activity of the TSS, clearly silencing the TSS bubble, while preserving the sequence of the TSS itself. The second mutant, m2SCP1, displayed a dynamic profile essentially identical to the wild type (Figure 2A). In both mutants, the Inr, MTE and DPE motif sequences were preserved.
The role of the Inr sequence element in transcription is generally attributed to binding of transcriptional initiator factors such as the large TFIID complex (16,19) and/or YY1 (5). These proteins serve to recruit the polymerase and the other basal transcription factors at the TSS (6,8). The m1SCP1 sequence retains the original Inr element, but lacks the characteristic TSS dynamic signature, providing a test case to establish whether binding of the basal transcription factors is sufficient for transcription, or whether the TSS dynamic signature is necessary. To confirm that transcription factor binding was, as intended, unaffected by the m1SCP1 mutations, we performed a gel shift assay with TFIID and basal transcription factors (Figure 2B). As a positive control, we conducted gel shift reactions with the wtSCP1 promoter oligo fragment. Reactions were assembled with equal protein amounts of TFIID (3 ng/reaction) and TFIIB (2 ng/reaction) alone and together with transcription factors TFIIF (4 ng), TFIIE (3 ng) and TFIIA (3 ng). The results suggest that both the wild-type and the m1SCP1 oligos form nearly identical complexes with TFIID and the tested basal transcription factors. The observed complexes result from sequence-specific recognition, since the presence of unlabeled wtSCP1 oligo in the reactions competes equally well for protein binding with both radioactively labeled wtSCP1 and m1SCP1 fragments. To verify the composition of the protein–DNA complexes in crude nuclear extract, we performed gel shift reactions with HeLa extract and anti-TFIIF basal transcription factor-specific antibodies (Supplementary Data). The results suggest that the selected mutations in m1SCP1 only result in suppression of bubble dynamics at the TSS without affecting the binding of the basal pre-initiation transcription complex. The m2SCP1 promoter variant displays both intact dynamics and protein binding (data not shown).
The effect of the mutations on promoter strength was assessed by transiently transfecting wtSCP1, m1SCP1 and m2SCP1 promoter templates into HeLa cells. The transcriptional activity of the promoter variants was determined by measuring the cellular levels of specific RNA transcripts in real-time PCR reactions (Q-PCR). As expected, wtSCP1 and m2SCP1 support a high level of transcription in HeLa cells, resulting in accumulation of specific RNA (Figure 2C). In comparison, m1SCP1 showed a 4-fold decrease in the level of RNA transcripts. The results of these experiments suggest that suppression of TSS bubble dynamics leads to a decrease in promoter activity, independent of basal transcription factor binding to the core promoter.
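The arithmetic behind a "mean ± SD" summary and a fold decrease can be sketched as follows. The replicate values are invented for illustration and are not the measured data, the helper names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev
from typing import List, Tuple


def summarize(levels: List[float]) -> Tuple[float, float]:
    """Return (mean, sample SD) of replicate transcript levels."""
    return mean(levels), stdev(levels)


def fold_decrease(reference: List[float], sample: List[float]) -> float:
    """Ratio of the reference mean to the sample mean."""
    return mean(reference) / mean(sample)


# Hypothetical relative transcript levels (arbitrary units), three replicates each.
wt_levels = [8.0, 9.0, 7.0]
m1_levels = [2.0, 2.5, 1.5]

wt_mean, wt_sd = summarize(wt_levels)
m1_mean, m1_sd = summarize(m1_levels)
print(f"wtSCP1: {wt_mean:.2f} ± {wt_sd:.2f}")
print(f"m1SCP1: {m1_mean:.2f} ± {m1_sd:.2f}")
print(f"fold decrease: {fold_decrease(wt_levels, m1_levels):.1f}")
```

These invented replicates were chosen so that the ratio of means equals 4, matching the magnitude of the decrease reported for m1SCP1.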
To further establish the requirement for strong DNA dynamics in determining a TSS, we conducted EPBD Langevin dynamic simulations on the mouse TS promoter (17,20). The TS promoter is a CpG island promoter that does not contain any of the known elements present in SCP1. It has been suggested that CpG island promoters are commonly associated with constitutively expressed housekeeping genes and may be regulated differently than the other known classes of promoters, such as promoters containing the TATA and Inr elements. TS is a ‘dispersed’ promoter (19,21), by virtue of having multiple TSS dispersed over a 100-bp region (17). This is in contrast to ‘focused’ promoters such as SCP1, which display one or several clearly defined start sites. Most of the TSSs of the TS promoter are known to be regulated by the Ets family of transcription factors in collaboration with Sp1 (20).
The simulations of the wild-type TS promoter revealed dynamic activity that is evenly distributed and noticeably weaker than that of SCP1 (Figure 3A). Conspicuously lacking are the characteristic long-lived bubbles observed for SCP1 (Figure 2A) and other previously studied ‘focused’ promoters (4). This result is again consistent with the notion that a relatively stable, well-defined bubble is required to define a strong, localized TSS.
It was previously reported (17) that insertion of an Inr sequence in the promoter, without changing the dispersed transcriptional window or the transcription factor binding sites, leads to the appearance of a new strong TSS at the Inr. The transcriptional activity of the wild-type sequence upstream of the insert was also altered. The mutant TS exhibited a more focused character, with the appearance of another pronounced TSS in addition to the Inr site.
The EPBD simulations of the TS-Inr mutant (Figure 3) revealed a significantly changed dynamic profile, compared with wtTS, with the appearance of a new and strong TSS located within the Inr insert, and a more ‘concentrated’ dynamic activity in the original wild-type sequence upstream of the insert (Figure 3). This dynamic pattern is in striking agreement with the reported (17) pattern of TSSs.
The EPBD LMD simulations capture transient DNA openings on the picosecond time scale, a range that is to be expected from the strength of the hydrogen bond potential (1–20 meV) and is consistent with experimental studies of the vibrational frequencies of biomolecules (22–25). While the accuracy of the predicted sequence dependence of DNA breathing frequencies has not been directly verified experimentally, the PBD simulations faithfully reproduce a variety of phenomena directly linked to hydrogen bond stretching and unstacking (26–29). These include accurate prediction of unzipping force measurements (28), bubble nucleation size (29) and melting profiles (1,26,27). Given the striking accuracy of the melting temperature predictions of EPBD, including homopolymers and repeats that are difficult to model (1), we anticipate that the pre-melting dynamic trajectories produced by EPBD simulations are at least qualitatively correct.
The results reported here are consistent with our previous findings that core promoter TSSs exhibit specific breathing dynamic signatures (4). Further, the new data suggest that the observed transient collective openings in the DNA double helix are not simply a coincidental property of the TSS sequence, but actually play an important role in determining the TSS strength and position. Such collective openings may seed the formation of the transcriptional bubble needed for transcription, or they may represent another, as yet unknown but important conformational feature of promoter DNA (3). In any event, the data rule out the possibility that the introduced mutations in the SCP1 promoter directly disrupt specific DNA–transcription factor contacts. These mutations were chosen to avoid points of contact with the pre-initiation complex, as identified in thorough DNase I footprinting studies of SCP1 and SCP1 mutants (16). Our gel shift experiments confirm that transcription factor–DNA interactions are indeed unperturbed.
The data indicate that the dynamics at the TSS are as important as the binding of the basal transcription factors for determining the strength of transcription initiation, and that they thus act as a ‘localizer’ of the TSS position. Mutations in the MTE and DPE elements that prevent TFIID binding (12) result in a decrease in SCP1 activity (~3- to 4-fold in a luciferase reporter assay) comparable with the impact of silencing the transcriptional bubble without perturbing TFIID binding (4-fold in the Q-PCR assay). Curiously, a mutation inside the Inr motif of SCP1 was reported to suppress transcription more potently (15-fold in a luciferase reporter assay) (16). EPBD dynamic simulations revealed that, in addition to the reported inhibition of TFIID binding, this Inr mutant has significantly reduced dynamic activity (Figure 1C), consistent with our hypothesis that both DNA dynamics and transcription factor binding are determinants of transcription initiation.
The lack of sharply defined TSS-specific dynamics observed in the TS promoter could explain the lack of a ‘focused’ TSS in this gene. Moreover, implanting a segment with localized TSS-like dynamics resulted in the appearance of a new strong TSS coinciding with the location of the dynamic maximum. Surprisingly, changes in the dynamic profile of the original promoter sequence resulting from the insert also reflect the experimentally observed transcriptional activity of this sequence. Considering that the known transcription factor binding sites for the TS promoter were intact in the Inr-insert variant, the reported experimental differences between the TS and the TS-Inr promoters can be readily explained from considerations of DNA dynamics. In contrast, according to a purely ‘transcription factor-centric’ view of transcription initiation, insertion of an Inr segment out of context in terms of transcription factor binding sites may introduce a new start site at the Inr but should not otherwise affect the original TSS distribution.
In conclusion, we propose that transcription factor binding and dynamic activity are both necessary for cellular gene transcription and are interdependent. The EPBD dynamic model appears to be uniquely capable of describing the sequence dependence of DNA dynamic features that are functionally relevant to transcription. In cells, the TSS-specific DNA breathing dynamics are likely to depend not only on DNA sequence, but also to be regulated by transcription factor binding, chromatin and DNA methylation.
Supplementary Data are available at NAR Online.
National Institutes of Health (GM073911 to A.U.); National Nuclear Security Administration of the US Department of Energy at Los Alamos National Laboratory (Contract DE-AC52-06NA25396). Funding for open access charge: DE-AC52-06NA25396.
Conflict of interest statement. None declared.
We thank Prof. James Kadonaga for kindly providing the wtSCP1 construct.