The importance of DNA supercoiling in transcriptional regulation has been known for many years, and more recently, transcription itself has been shown to be a source of this superhelicity. To mimic the effect of transcriptionally induced negative superhelicity, the G-quadruplex/i-motif-forming region in the c-Myc promoter was incorporated into a supercoiled plasmid. We show, using enzymatic and chemical footprinting, that negative superhelicity facilitates the formation of secondary DNA structures under physiological conditions. Significantly, these structures are not the same as those formed in single-stranded DNA templates. Together with the recently demonstrated role of transcriptionally induced superhelicity in maintaining a mechanosensor mechanism for controlling the firing rate of the c-Myc promoter, we provide a more complete picture of how c-Myc transcription is likely controlled. Last, these physiologically relevant G-quadruplex and i-motif structures, along with the mechanosensor mechanism for control of gene expression, are proposed as novel mechanisms for small molecule targeting of transcriptional control of c-Myc.
The c-Myc gene product is a potent oncoprotein and a transcription factor that plays an essential role in the control of cell growth, as well as in cell fate determinations related to the induction of apoptosis.1-3 The activation of c-Myc is known to induce transcription of growth-stimulating genes in many types of human cancer, including breast, stomach, lung, prostate, colorectal, and pancreatic cancers and leukemia.4,5 c-Myc overexpression can be caused by different mechanisms, including gene amplification,6-8 translocation,9,10 and simple upregulation of transcription.4,5 In a very recent paper, the validity of c-Myc as a molecular target in oncology has been demonstrated.11
Polypurine/polypyrimidine tracts are overly represented sequences in the mammalian genome and are often found in the proximal promoter region of many growth-related genes, including many oncogenes.12-19 Previous studies have shown that DNase I or S1 nuclease hypersensitive sites are often found within the regions of DNA that contain these tracts in both chromatin and negatively supercoiled plasmid DNA.13-16 These studies also suggest that these tracts are structurally dynamic and can be converted easily into secondary structures. The structural transition of B-DNA to non-B-DNA secondary structures must be preceded or accompanied by localized unwinding or melting of the double helix.20-23 Local unwinding or melting of duplex DNA is known to be facilitated by negative supercoiling stress, which is naturally generated behind the translocating RNA polymerase complex during gene transcription.24,25
The promoter of the c-Myc oncogene contains a number of elements that have either been shown to assume,26 or are suspected to assume, non-duplex regions under negative superhelical stress. We have been particularly interested in a polypurine/polypyrimidine tract known as the nuclease hypersensitive element III1 (NHE III1) or CT element (Figure 1A),27 because under conditions when both strands are single-stranded, they assume secondary DNA structures, namely, a G-quadruplex on the purine-rich strand28-30 and an i-motif on the pyrimidine-rich strand.31 In vivo, the single-stranded DNA-binding proteins CNBP32 and hnRNP K33 bind to these regions to affect transcription, suggesting that the NHE III1 can assume a single-stranded form. However, several important questions remain unanswered. Are these elements capable of forming specific secondary DNA structures, and if so, are they the same as those found in the single-stranded DNA templates? Can negative superhelicity provide the torsional stress necessary to induce these structures? These questions have been addressed in this contribution.
G-quadruplexes have become of considerable interest in medicinal chemistry because they represent alternative DNA structures that are folded and as such assume globular structures and are thus potentially targeted by small molecules. In this respect they mimic protein structures that are specifically assembled into their precise folding patterns determined by the primary amino acid sequence. Consequently a 30-base sequence can assume a specific folding pattern and globular structure determined by the precise DNA sequence in the promoter element. In eukaryotic promoter sequences of such genes as c-Myc, Bcl-2, c-Kit, RET, VEGF, Hif-1α, PDGFA, c-Myb, and KRAS,34 a variety of G-quadruplex folding patterns and structures have been shown to form from single-stranded DNA, and there is accumulating evidence that these structures can be targeted by small molecules to modulate gene transcription.34 The C-rich strand has received much less attention because it would not be expected to form i-motif structures under physiological conditions. Thus the questions posed at the end of the previous paragraph become critical for future drug discovery efforts.
In a recent paper by Levens et al., the effect of transcriptionally generated negative supercoiling in vivo has been demonstrated in the c-Myc promoter.26 This supercoiling is not relieved by topoisomerase I or II and directs the melting of the Far Upstream Element (FUSE) 1.4 Kb upstream of the P1 promoter, which in turn binds to the transcriptional activating and extinguishing factors FUSE Binding Protein (FBP) and FBP Interacting Repressor (FIR), a process that is likened to a “cruise control” mechanism, in which undulating superhelical stress controls the rate of promoter firing through a feedback loop. The NHE III1 is much nearer the P1 promoter and thus is likely to be subjected to greater torsional stress than the FUSE element. To mimic this transcriptionally induced negative superhelicity, we have used a supercoiled plasmid system in which either the wild-type (Del4) or mutant (Del4-DM) NHE III1 element has been included (Figure 1B). The supercoiled plasmid used in this study retained the native level of DNA supercoiling. The footprinting agents used in this study include S1 nuclease, DNase I, dimethyl sulfate (DMS), KMnO4, and bromine (Br2), and the reactivity of these probes is very sensitive to the conformation of DNA molecules.35-39 Thus, these reagents have often been used to probe the structural transition from B-DNA to non-B-DNA secondary structures, such as melted DNA, hairpins, G-quadruplexes, or i-motifs.36-39 The results of this study, alongside the data from the Levens laboratory, provide compelling evidence for a dynamic equilibrium between duplex, unwound duplex, single-stranded DNA, and secondary DNA structures within and around the NHE III1 that determines the transcriptional state of the c-Myc promoter in cells.
S1 nuclease preferentially cleaves single-stranded regions over locally unwound regions, normal duplex regions, or secondary structures, such as cruciform DNA.37,40 Thus, S1 nuclease footprinting experiments were carried out to examine the possibility of secondary structures forming in the NHE III1 within the c-Myc promoter under the influence of negative supercoiling. Our conditions were optimized so that there was one hit or cleavage event per DNA molecule. Once cleavage occurred, the experiment was over, but this would not invalidate the experiment under the conditions used. As shown in Figure 2, A and B, the G-rich and C-rich strands of the wild-type NHE III1 both show overall increased reactivity to S1 nuclease on the 5′-side of the NHE III1, particularly on the C-rich strand, but reduced reactivity within the NHE III1. The increased reactivity to S1 nuclease on the 5′-side could be explained either by a highly dynamic region exhibiting single-stranded characteristics or by the formation of hairpin structures, which are commonly formed under negative supercoiling conditions.20,21 Examination of the sequence of the flanking region did not reveal any palindromic sequences, ruling out hairpins as the basis for this enhanced reactivity to S1 nuclease. The effect of 100 mM KCl on both strands of the NHE III1 and the flanking regions is to reduce the reactivity to S1 nuclease (compare lanes 2 and 3 in Figure 2A). Since KCl is known to stabilize G-quadruplex structures, this suggests that the reduced reactivity to S1 nuclease within the NHE III1 region is due to the enhanced stability of these structures, which also affects both the C-rich strand in the NHE III1 region and the flanking regions. 
We propose that the G-quadruplex and an accompanying structure on the C-rich strand (the i-motif) act as a buffer for the dynamic stress induced by negative supercoiling and that this then dampens the dynamic nature of the flanking regions on both strands, but particularly on the 5′-sides of both strands.
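The sequence inspection for palindromes mentioned above can be illustrated with a short script. This is only a minimal sketch, not the procedure used in the study: the function names, the minimum stem length of 6, the maximum loop of 8, and both test sequences are hypothetical choices made for illustration, and only perfect inverted repeats are reported.

```python
# Sketch: scan a DNA sequence for perfect inverted repeats (hairpin-capable palindromes).
# All parameters and sequences here are illustrative assumptions, not values from the study.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, min_stem=6, max_loop=8):
    """Return (start, stem_length, loop_length) for each perfect inverted repeat.

    A hit means seq[start:start+min_stem] is followed, after loop_length
    unpaired bases, by its own reverse complement.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        left_arm = seq[start:start + min_stem]
        for loop in range(max_loop + 1):
            right = start + min_stem + loop
            if right + min_stem > len(seq):
                break  # the right arm would run off the end of the sequence
            if seq[right:right + min_stem] == reverse_complement(left_arm):
                hits.append((start, min_stem, loop))
    return hits


# Hypothetical sequence containing one 6-bp stem separated by a 4-base loop.
hairpin_like = "AAACGTTGTTTTCAACGTAA"
print(find_inverted_repeats(hairpin_like))  # [(2, 6, 4)]

# Hypothetical purine-rich tract: no inverted repeat is possible without cytosines.
purine_tract = "GGGAGGGTGGGAGGG"
print(find_inverted_repeats(purine_tract))  # []
```

An empty result for a flanking sequence would be consistent with the argument that hairpins are not the basis of the enhanced S1 nuclease reactivity, although imperfect repeats would need a more permissive search.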
To further investigate the potential significance of the formation of G-quadruplex structures in the structural transition of the NHE III1 region, particularly in the presence of KCl, S1 nuclease footprinting experiments were carried out using both linear DNA and a mutant plasmid, where a single GC-to-AT mutation (Figure 2, C and D) was introduced into the NHE III1 of the wild-type plasmid by site-directed mutagenesis to abolish the capability of the G-rich strand to form G-quadruplex structures.28 For linear DNA, S1 nuclease activity in the Del4 DNA was extremely low compared to supercoiled DNA, suggesting the importance of negative supercoiling in promoting local unwinding of DNA. To more accurately define the effect of negative supercoiling on the NHE III1 and surrounding region, the mutant plasmid was used. The S1 nuclease cleavage pattern shows selectively enhanced reactivity toward the G-rich strand of the mutant NHE III1 in the presence of KCl (Figure 2C, left panel, and Figure 2D, upper trace). This result suggests that this sequence exists partly in a single-stranded form rather than the normal double-stranded form or as a secondary structure. The 5′ flanking region shows reduced reactivity to S1 nuclease in comparison to the wild-type (Figure 2A), suggesting that this region is far less dynamic. In contrast, the complementary C-rich strand showed reduced reactivity to S1 nuclease, which was maintained in the presence of KCl (Figure 2C, right panel, and Figure 2D, lower trace), suggesting that secondary structures may still exist in this region and are less sensitive to the C-to-T mutation. However, the identity of these structures cannot be ascertained from these experiments; it is addressed later in the sections on chemical footprinting.
DNase I is known to preferentially cleave locally unwound or normal duplex regions over single-stranded regions or secondary structures, such as the hairpin.40 In this study, we performed in vitro footprinting experiments with DNase I to probe the possible structural transition of the NHE III1 of the c-Myc promoter in supercoiled plasmid DNA and the nature of the flanking regions (Figure 3). Since the S1 nuclease experiments (Figure 2) had shown a partially single-stranded character of the 5′ flanking sequence, which was dampened in the presence of KCl, we used DNase I to determine whether this single-stranded region was in equilibrium with partially unwound or duplex DNA. The results (Figure 3) show that strong cleavage of the 5′ flanking region occurs, which is significantly dampened in the presence of 100 mM KCl. Taken together with the S1 nuclease results, this strongly suggests that the 5′ flanking region on the C-rich strand is highly dynamic in nature and consists of an equilibrium between duplex DNA, partially unwound duplex, and single-stranded DNA. Furthermore, the addition of 100 mM KCl, which further stabilizes the G-quadruplex, severely dampens the enzymatic cleavage in the flanking regions by both DNase I and S1 nuclease. This suggests that the stabilization of the secondary DNA structures in the NHE III1 acts as a dynamic buffer to absorb the torsional stress associated with the negative superhelicity.
DMS footprinting is particularly useful for probing G-quadruplexes in vitro, because these structures require N7 of guanine and there is protection of the guanine residues involved in G-quadruplex formation from N7 methylation by DMS, as we have previously demonstrated.28 Thus, DMS footprinting is useful for fine-mapping the presence of G-quadruplex structures or single-stranded regions within the NHE III1 of the c-Myc promoter. As shown in Figure 4, A and B (lanes 1 and 2), under supercoiled conditions DMS shows significantly reduced reactivity within the four 5′-end guanine tracts on the G-rich strand of the NHE III1, where G-quadruplex structures are proposed to be present. Significantly, the immediate flanking regions of the proposed G-quadruplex-forming sequence show enhanced reactivity to DMS (Figure 4, A and B), presumably because of the presence of locally unwound structures at the junctions between normal duplex regions and stable secondary structures. It is the relative reactivity of the protected guanine tracts (3−6) to those in the adjacent region (tracts 1−2) that is critical in making the conclusion as to which guanines are involved in G-quadruplex formation. It is important to note that some adenine residues at these flanking regions also show unusual reactivity toward DMS, suggesting that the N3 position of adenine is sterically accessible to electrophilic attack by DMS, as has been previously observed at or near B-DNA–Z-DNA junctions.41
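The comparison of protected tracts with the adjacent reference tracts described above can be expressed as a simple ratio. The following is a minimal sketch under stated assumptions: the band intensities are invented placeholders (not measurements from this study), the function name is hypothetical, and the 0.5 cutoff for calling a tract "protected" is an arbitrary illustrative threshold.

```python
# Sketch: express DMS reactivity of each guanine tract relative to reference tracts.
# Intensities are hypothetical placeholders in arbitrary units, NOT data from the study.


def relative_reactivity(tract_intensity, reference_tracts):
    """Scale each tract's band intensity to the mean intensity of the reference tracts."""
    ref_mean = sum(tract_intensity[t] for t in reference_tracts) / len(reference_tracts)
    return {tract: value / ref_mean for tract, value in tract_intensity.items()}


# Hypothetical integrated intensities for guanine tracts 1-6 in one supercoiled lane.
example_lane = {1: 120.0, 2: 80.0, 3: 25.0, 4: 20.0, 5: 30.0, 6: 25.0}

# Tracts 1 and 2 serve as the adjacent reference region, as in the text.
ratios = relative_reactivity(example_lane, reference_tracts=(1, 2))

# Assumed cutoff: a tract counts as protected if it reacts less than half as much
# as the reference region.
protected = [tract for tract, ratio in ratios.items() if ratio < 0.5]
print(protected)  # [3, 4, 5, 6]
```

Normalizing within a lane in this way makes the call independent of loading differences between lanes, which is why the relative rather than the absolute reactivity is the quantity of interest.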
We were initially surprised to observe that DMS reactivity toward guanine or adenine residues within the G-rich region was not affected by the presence of KCl (compare lanes 1 and 2 in Figure 4, A and B), suggesting the ability of the c-Myc promoter region in supercoiled form to form G-quadruplexes in vitro irrespective of the presence of KCl. This contrasts with the differential reactivity of the same region to enzymatic probes in the absence or presence of KCl (see Figures 2 and 3). The reactivity of DMS toward the guanine residues within the complete G-rich region is normal in the linearized form of the wild-type plasmid (Figure 4A and B, lanes 3 and 4), showing that negative supercoiling is required to drive the local unwinding of the NHE III1 region, allowing a specific G-rich region (tracts 3−6) to form a G-quadruplex.
The mutant plasmid was used to further investigate the potential significance of the formation of G-quadruplex structures by the G-rich strand and the unusual reactivity of the NHE III1 region toward DMS under negative supercoiling stress by carrying out a footprinting experiment. As noted previously, we had demonstrated that the single-base mutation in this plasmid is able to dramatically reduce the propensity of the sequence to form a G-quadruplex structure.28 As shown in Figure 4, C and D, the DMS reactivity pattern on the G-rich sequence of the mutant plasmid is very different from that of the wild-type plasmid and more closely resembles that associated with duplex or single-stranded DNA. This observation strongly suggests that the secondary structure formed under negative superhelicity is a G-quadruplex, because it involves just four runs of guanines, and a single-base mutation known to destabilize the G-quadruplex results in loss of this DMS protection.
KMnO4 and Br2 were used in further footprinting experiments to probe for secondary structures that can be formed specifically in the C-rich strand in the NHE III1 region of the c-Myc promoter. KMnO4 is a chemical probe that reacts with the C5–C6 double bond of thymidine and is specific for thymine residues in single-stranded regions in which the C5–C6 bond is more exposed than usual.38-40 Thus we would expect thymine residues in the 5′ flanking region to be more reactive to KMnO4 than thymines in the NHE III1 if they are contained in a secondary structure such as an i-motif. Furthermore, their reactivity to KMnO4 in both the flanking region and NHE III1 element should be reduced by stabilization of the G-quadruplex upon addition of KCl. The results shown in Figure 5 are precisely in accord with these expectations. While there is more pronounced KMnO4 cleavage of the thymine residues in the flanking region relative to the potential i-motif region, both regions of cleavage are suppressed upon addition of KCl. The lower reactivity of thymines in the polypyrimidine tract to KMnO4 in both the absence and presence of KCl suggests that this region adopts a secondary structure that may be in part stabilized by stacking of the thymidine residues into the secondary structure. Significantly, the 5′ thymine between cytosine tracts 1 and 2 (asterisk in Figure 5) showed no reactivity to KMnO4, suggesting that it is sterically inaccessible to KMnO4, in contrast to its 3′ neighbor.
Next Br2 protection experiments were used to probe for secondary structures that can be formed specifically by the C-rich strand in the NHE III1 region of the c-Myc promoter (Figure 6A). Br2 is known to react selectively with the C5–C6 double bond of the pyrimidine base within DNA, resulting in 5-bromocytosine and 6-bromothymidine.39 In particular, the Br2 reaction with the cytosine residues in a single-stranded region is at least 10-fold higher than that in duplex DNA.39 Therefore, we presumed that the cytosine residues in the loop regions of i-motif structures would be more reactive to Br2 than the cytosine residues involved in base pairing and intercalation, allowing us to deduce specific cytosine residues required for base pairing and intercalation in these structures.19,42 Indeed, we have previously shown that the Br2 reactivity of the i-motifs in VEGF42 and RET19 shows a clear pattern of protection for cytosines involved in these structures. Several cytosines that are more reactive to Br2 occur within four of the six cytosine tracts in the NHE III1 (arrows in Figure 6A).
This same region and its reactivity to Br2 are examined in more detail in Figure 6, B and C. First, only in the tracts of four cytosines is there enhanced cleavage of the 3′-cytosine by Br2, as was previously shown for similar cytosine tracts in the RET and VEGF promoters that form i-motifs.19,42 The two tracts of three cytosines (3 and 5 in Figure 6, B and C) do not show 3′-enhanced cleavage. Second, the extent of cleavage of the cytosine tracts is biased toward the 3′-end of the NHE III1, with cytosine tracts 1, 2, and 3 showing noticeably less cleavage than the other three tracts. The pattern of cleavage of cytosine tracts 4 and 6 is more complex than the other cytosine tracts. Tracts 4 and 6, which have four bases each, show pronounced cleavage not only at the 3′-ends but also at intermediate bases, suggesting that they may reside in loop regions. Tract 5, which has three bases, is also moderately cleaved but without a bias toward the 3′-end. Overall, this analysis implies that the three 5′ tracts of cytosines (1, 2, and 3) are clearly preferred for providing three of the four tracts of cytosines necessary for i-motif formation. A fourth tract of three cytosines is needed to complete an intramolecular i-motif formed from six cytosine–cytosine base pairs. The relatively uniform and moderate cleavage of cytosine tract 5 and the uneven cleavage patterns of tracts 4 and 6 both suggest that this fourth tract is most likely provided by tract 5, leaving tract 4 in a loop region and tract 6 at a junction site. We speculate that the 3′ cytosine in run 6 is much more accessible than any of the other cytosines, because that residue might be located at the junction between normal duplex and stable secondary structures formed by the C-rich sequence of the NHE III1 region. 
We believe that the loop regions of the i-motif structure proposed in Figure 6D could form hydrogen-bonded capping structures to further stabilize i-motif structures, thus inhibiting cleavage by single-strand-specific probes such as Br2.
This overall pattern of cytosine protection and cleavage is best described by the i-motif 6:2:6 loop isomer shown in Figure 6D. This model predicts the marked Br2 cleavage of the 3′ cytosines in tracts 1 and 2 and the relatively even but modest cleavage of tracts 3 and 5. Tracts 4 and 6 and the five-base sequence between tracts 1 and 2 then provide the loop regions. The less predictable Br2 cleavage patterns of tracts 4 and 6 and the two cytosines between tracts 1 and 2 (protected) are most likely explained by either stable capping structures or loop slippage, the latter of which would give rise to alternative loop isomers. Last, the relatively protected cleavage of thymines to KMnO4 in the loop regions also suggests that these bases are involved in stable capping structures. Significantly, we observed that 5 mM bromine did not produce any cleavage above background levels of the NHE III1 in a linearized plasmid DNA, even after 1 h incubation (see Supporting Information). This confirms the critical importance of negative supercoiling in inducing transient single-stranded regions and the secondary DNA structures in duplex DNA, as shown in Figure 7A.
The enzymatic (S1 nuclease and DNase I) and chemical (DMS, KMnO4) footprinting experiments provide a picture of dynamic regions that extend well beyond the polypurine/polypyrimidine region of the NHE III1. In both flanking regions, S1 nuclease and DNase I show evidence for a dynamic equilibrium between duplex, partially unwound duplex, and single-stranded regions. As anticipated, addition of 100 mM KCl, which is expected to stabilize the G-quadruplex, dampens the dynamics of these flanking regions, presumably because the more stable secondary structures act as a buffer for the absorption of the negative superhelicity. However, these enzymatic probes do not provide direct insight into the identity of the secondary DNA structures.
Significantly, G-quadruplex formation was shown to be dependent on both negative superhelicity and the wild-type sequence; however, the polypurine/polypyrimidine tract still present in the mutated NHE III1 shows evidence of a partially single-stranded region under conditions of negative superhelicity. In this case, the addition of KCl does not have nearly as significant an effect on the S1 nuclease cleavage of the flanking regions as with the wild-type sequence, again supporting the role of the secondary DNA structures in absorption of the negative superhelicity (see Figure 2, C and D). Figure 7A summarizes the equilibrating species in the NHE III1 with the requirements for transition to a partially single-stranded form (iii) and the secondary DNA structures (iv).
In sharp contrast to the enzymatic probes S1 nuclease and DNase I, the reactivity of the chemical probes DMS, KMnO4, and Br2 toward the NHE III1 region of the c-Myc promoter was little affected by the presence of potassium ions, suggesting that secondary structures already exist within this region in supercoiled form in vitro, even without additional potassium ions (compare Figures 2A and 3 with Figures 4A, 5, and 6A). Enzymatic nucleases are known to bind to and interact with DNA to induce large conformational changes in both the enzyme and the DNA, bringing the scissile phosphate backbone in closer proximity to the enzyme-active site.40,43 Therefore, the enhanced stability of the secondary structures in the presence of KCl is anticipated to have more influence on the extent and sites of DNA cleavage by enzymatic nucleases than by chemical DNA-cleaving or DNA-modifying agents such as DMS, KMnO4, and Br2, which do not require conformational changes of the DNA for cleavage and, correspondingly, are predicted to be less sensitive to KCl-dependent stabilization.
DMS and Br2 footprinting experiments provide direct insight into the G-quadruplex and i-motif folding patterns in the NHE III1. These footprinting experiments suggest that there is a single G-quadruplex isomer (1:2:1 loop isomer) and also one major form of the i-motif (6:2:6 loop isomer) (Figure 7B). These two structures are offset on both strands, showing a common use of only three purine/pyrimidine tracts (2−4) and leaving 14- and 5-base overhangs on the purine and pyrimidine strands, respectively (Figure 7B). Significantly, these single-stranded overhang regions are extensively cleaved by DMS (purine strand; see Figure 4, A and B) and Br2 (pyrimidine strand; see Figure 6, A and B). As anticipated, a specific G-to-A mutation of the central tetrad of the G-quadruplex (asterisk in Figure 7B) results in destabilization of this structure, as shown in Figure 4, C and D. While we had previously correctly identified the five guanine tracts at the 5′-end of the NHE III1 as being involved in the biologically relevant G-quadruplex structure,28 the DMS footprinting analysis of the single-stranded G-rich strand had identified tracts 2−5 rather than tracts 3−6 as forming the more stable G-quadruplex. In the context of duplex DNA under negative superhelicity, it is perhaps not too surprising that the result might be different from that in a single-stranded substrate.
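The loop-isomer notation used above (for example, 1:2:1) simply lists the lengths of the three loops separating the four tracts that build the structure. As a minimal sketch of how alternative loop isomers arise when a tract contains more bases than are needed, the script below enumerates every way of choosing four non-overlapping runs of three identical bases. The function name and both sequences are hypothetical illustrations rather than the NHE III1 sequence, and no constraint on loop length or stability is applied.

```python
# Sketch: enumerate loop isomers (loop lengths between four runs of a given base).
# Sequences are hypothetical; no thermodynamic or loop-length filtering is applied.
from itertools import combinations


def loop_isomers(seq, base="G", run=3, n_runs=4):
    """Return the sorted set of loop-length tuples for all valid run selections."""
    seq = seq.upper()
    # Every position where a run of `run` identical bases begins (overlaps allowed).
    starts = [i for i in range(len(seq) - run + 1) if seq[i:i + run] == base * run]
    isomers = set()
    for combo in combinations(starts, n_runs):
        # Loop length = number of bases between the end of one run and the next start.
        gaps = tuple(b - (a + run) for a, b in zip(combo, combo[1:]))
        if all(gap >= 1 for gap in gaps):  # runs must not overlap or touch
            isomers.add(gaps)
    return sorted(isomers)


# Four runs of exactly three guanines: only one isomer is possible.
print(loop_isomers("GGGTGGGAAGGGTGGG"))  # [(1, 2, 1)]

# Two of the tracts have four guanines, so slippage yields several isomers.
print(loop_isomers("GGGGAGGGTGGGGAGGG"))
```

The same function applies to the pyrimidine strand by passing `base="C"`, which is one way to see how loop slippage in four-cytosine tracts could generate alternative i-motif loop isomers.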
One of the most significant conclusions of our study is that the i-motif is able to form under neutral pH conditions, provided negative superhelicity is maintained. This is in direct contrast to single-stranded DNA, where an acidic pH is required to form the hemiprotonated cytosine–cytosine base pair. In a previous study using a single-stranded C-rich 33-mer corresponding to the full NHE III1 sequence at a pH of 5.0−6.0, CD titration analysis showed eight cytosine–cytosine base pairs, which must arise from cytosine tracts 1, 2, 4, and 6.31 Presumably the acidic pH–determined stability is maximized in this form where all four tracts of four cytosines are utilized. In contrast, under negative superhelicity, the 6:2:6 isomer utilizes four tracts of three cytosines, thereby increasing the loop sizes relative to those found under acidic conditions. This is perhaps why a 14-base overhang on the G-rich strand is required to permit simultaneous formation of a specific 6-base capping loop in the i-motif. The partial or complete protection of many of the cytosine and thymine bases in the loops to chemical reagents is also supportive of the idea that the specific capping structures are an important component of the conformational stability of the i-motif induced under negative superhelicity. This important result implies that i-motifs as well as G-quadruplexes can form in promoter regions under conditions of transcriptionally induced negative superhelicity and may therefore displace transcriptional factors, such as CNBP or hnRNP K, that bind to single-stranded CT elements to activate transcription of c-Myc.32,33
A long-neglected aspect of transcriptional regulation is the potential importance of torsional stress, which is propagated through DNA as a consequence of the forward movement of polymerase relative to the active promoter.44 The previously described mechanism for control of the rate of transcriptional firing from the c-Myc promoter involves transcription-generated supercoiling together with melting of the FUSE 1.7 Kb upstream of the P2 promoter, which has been termed a “molecular servomechanism.”45 This servomechanism, which is controlled by the magnitude of ongoing transcription, has been aptly called a “cruise control,” but this mechanism does not constitute a “starter” mechanism, since the FUSE is unable to modulate transcriptional output until transcription is initiated.45 Two questions arise from this observation: what are the elements that constitute the starter and how can these control mechanisms coordinate their activities? The local topology of the c-Myc promoter with P1 and P2 promoters, together with the NHE III1 and the FUSE, is shown in Figure 8A. The P1 promoter is 117 base pairs downstream of the NHE III1 and 1.5 Kb downstream of the FUSE. As demonstrated here, supercoiling has the ability to generate the formation of G-quadruplex and i-motif structures within the NHE III1. These structures have been shown to be silencer elements in a luciferase reporter system, because when destabilized in a mutant plasmid, they enhance transcriptional activity.28 Conversely, stabilization by G-quadruplex-interactive compounds reduces transcriptional activity.28 In the serum-starved condition, we propose that these structures exist in NHE III1, which would prevent transcriptional factors from binding to this region. These structures would also provide a dynamic buffer able to absorb transcriptionally induced negative superhelicity, thus maintaining the FUSE in a nucleosomal state (Figure 8A). 
Additionally, the superhelicity stored in the G-quadruplex and i-motif structures might be used for DNA melting at the start point and then to assist in polymerase escape. Removal of the negative superhelicity by topoisomerase I provides conditions more amenable to conversion back to duplex and, subsequently, single-stranded DNA (Figure 8B). Both forms are binding sites for transcriptional factors (Sp1 for duplex and hnRNP K and CNBP for single-stranded DNA), which are known to bind to this element to activate transcription (Figure 8C).32,33 Protein-facilitated removal of the secondary DNA structures, which are buffers for absorption of negative superhelicity, also provides a permissive topology for this transcriptionally induced supercoiling to propagate and move the FUSE into a nonnucleosomal configuration (Figure 8C).26 The FUSE “mechanosensor” function then takes over to provide a real-time feedback from the P2 promoter involving the competitive binding of FBP (promoter firing; Figure 8D) and FIR (promoter extinction; Figure 8, E and F) until a time when starvation triggers the FUSE once more to become nucleosomal, P1 transcriptionally induced torque rejects the transcriptional factor, and the G-quadruplex and i-motif are reinstated to silence transcription (Figure 8A).
In a more general sense, putative G-quadruplex-forming elements have been found to be concentrated in regions immediately upstream of the transcription initiation site,12 and conservation of the sequences that form stable intramolecular G-quadruplexes has been noted in a variety of human oncogenes, including c-Myc, Bcl-2, VEGF, Hif-1α, Ret, c-Kit, PDGF-A, KRAS, and c-Myb (reviewed in refs. 34 and 46). Although Sp1 has been found to be a prime factor that binds to G-quadruplex-forming elements,46 other transcriptional factors, such as hnRNP K and CNBP, that bind to single-stranded elements have also been described.32,33 In addition, proteins that recognize and bind to G-quadruplexes may also play an important role.47 As has been previously noted, “...transcription is inevitably coupled with dynamic perturbation of the double helix [and] the mechanisms [mechanosensor] discussed here may prove to be widespread if not universal.”26 This statement may also be equally applicable to transcriptional activation/silencing mechanisms involving G-quadruplexes and i-motifs, since sequences that can form these structures are found in high abundance in regions upstream of the transcription start sites in eukaryotic organisms.12
The insight provided by the model (Figure 8) for the mechanosensor mechanism and activation/silencing of c-Myc that involves the NHE III1 and the FUSE provides a number of opportunities for small molecule–mediated external control of gene expression. As alluded to earlier, the globular structures of the G-quadruplex and i-motif determined by the primary base sequence of the NHE III1 afford a unique opportunity to target, in this case, a 31-base-pair sequence with small drug-like molecules. Evidence from several laboratories28,48-50 already demonstrates that targeting G-quadruplexes has an effect on transcription of c-Myc, c-Kit, KRAS, and VEGF, but in the majority of cases selectivity has yet to be determined. The abundance of these putative G-quadruplex-forming units in eukaryotic promoters12,51 suggests that drug selectivity may be a problem, but the diversity of folding patterns and loop sizes offers opportunities to distinguish between these structures. Only sparse information is available on drug targeting of i-motif structures,52 and here the folding patterns may diverge considerably between single-stranded templates at acidic pH levels and structures induced by negative supercoiling (see above). This may be less of an issue for G-quadruplexes, but in the case of c-Myc there is a subtle difference between the G-quadruplex that predominates in a single-stranded state and that found in this study. Of particular interest is the possibility of drug targeting the complex structure that exists within a combined G-quadruplex/i-motif complex as shown in Figure 8A.
In addition to the NHE III1, the mechanosensor mechanism involving the FUSE/FBP/FIR system also provides a unique opportunity to modulate gene transcription. Indeed, in a collaborative effort between scientists at NCI and Abbott Laboratories, the interface between the FUSE and FBP has been targeted with benzoylanthranilic acid–type molecules identified through SAR by NMR.53 Although this approach was not carried forward into cellular studies, it remains a viable approach that complements targeting of the NHE III1.
We believe the potentially “widespread if not universal”26 molecular features of this transcriptionally induced negative superhelicity provide an enormous opportunity in drug discovery and development, perhaps equivalent to that of drug targeting of kinases. Quarfloxin, a first-in-class drug now in phase 2 clinical trials, is believed to act by targeting G-quadruplexes.54,55 Much remains to be learned in this area, and in particular, proof of the existence of these structures in the promoter elements of cells and of their involvement in modulation of gene transcription are important objectives of this lab and others involved in this emerging area for drug discovery.
T4 polynucleotide kinase, DNase I, and S1 nuclease were purchased from Promega. PAGE-purified oligonucleotides were obtained from Sigma Genosys.
For in vitro footprinting of the c-Myc promoter region, we used the supercoiled form of the human c-Myc reporter Del4, which was kindly provided by Kenneth W. Kinzler (Johns Hopkins University, Baltimore, MD). The Del4 plasmid contains ~850 bp of c-Myc sequence, including the two major promoters (P1 and P2) and the NHE III1 region.28 The mutant reporter plasmid Del4-DM was constructed by introducing point mutations at specific guanine residues within the NHE III1 region of Del4 using a QuikChange mutagenesis kit (Stratagene) to destabilize the G-quadruplex-forming ability of this region.28
In vitro plasmid footprinting experiments were performed as described previously.16 In brief, supercoiled plasmid (1 μg of Del4 or Del4-DM) was incubated in 20 μL of Tris-HCl buffer (20 mM, pH 7.6) with or without 100 mM KCl at 37 °C overnight and then treated with DNase I (0.5 U) or S1 nuclease (200 U) for 2 min. DNA was precipitated with ethanol and resuspended in double-distilled water after vacuum drying. For chemical footprinting, plasmid DNA was treated for 2 min with 0.5% DMS and 10 min with 10 mM KMnO4 or 1 mM molecular Br2 (KBr with KHSO5) as previously described.39 The reactions were then terminated by adding 50 μL of stop mix containing 0.6 M Na-acetate (pH 5.2), and unreacted chemicals were removed by ethanol precipitation. Following ethanol precipitation, the DNA pellet was dried and resuspended in 100 μL of double-distilled water, and the samples were heated at 90 °C for 30 min to induce strand cleavage. Following thermal treatment, the DNA samples were completely dried and resuspended in double-distilled water. To map cleavage sites on the plasmid DNA, linear amplification by PCR was performed using a Thermo Sequenase Cycle Sequencing kit (USB), following the manufacturer's instructions, with the 32P-labeled gene-specific primers GGGCCGGTGGGCGGAGATTAGCG and CGCGCGTAGTTAATTCATGCGGC to amplify the top and bottom strands of the plasmid DNA, respectively. PCR was carried out using cycling conditions consisting of an initial 4-min denaturation step at 94 °C, 1 min at 60 °C, and 1 min at 72 °C, for a total of 40 cycles. After primer extension, the samples were dried under vacuum, and the DNA pellet was resuspended in 15 μL of formamide dye and electrophoresed in an 8% urea-acrylamide gel for 8 h at 1400 V. Dideoxy sequencing reactions were carried out using the same labeled primers, according to the protocol provided by the manufacturer. The dried gel was exposed to a phosphorimager screen and quantified using ImageQuant software.
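As a quick sanity check on the two gene-specific primers listed above, their length, GC content, and reverse complement can be computed directly from the sequences. The Python sketch below is a minimal illustration and not part of the original protocol; the helper names and the "top"/"bottom" dictionary keys are hypothetical labels, and only the primer sequences themselves come from the text.

```python
# Lookup table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the Methods; keys are illustrative labels
# for the strand each primer was used to map.
PRIMERS = {
    "top": "GGGCCGGTGGGCGGAGATTAGCG",
    "bottom": "CGCGCGTAGTTAATTCATGCGGC",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_count(seq):
    """Count the G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_fraction(seq):
    """Return the fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


if __name__ == "__main__":
    for strand, primer in PRIMERS.items():
        print(
            f"{strand}: length={len(primer)} nt, "
            f"GC={gc_fraction(primer):.1%}, "
            f"reverse complement={reverse_complement(primer)}"
        )
```

The reverse complement of each primer is the sequence to look for on the opposite strand when locating its annealing site in the plasmid.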
This research was supported by the National Institutes of Health (CA94166 and CA109069). We thank David Levens, Shankar Balasubramanian, Andrew Travers, and Danzhou Yang for insightful discussions. We are grateful to David Bishop for preparing, proofreading, and editing the final version of the manuscript and figures.