Gene expression during lytic development of bacteriophage Mu occurs in three phases: early, middle, and late. Transcription from the middle promoter, Pm, requires the phage-encoded activator protein Mor and the bacterial RNA polymerase. The middle promoter has a −10 hexamer, but no −35 hexamer. Instead Pm has a hyphenated inverted repeat that serves as the Mor binding site overlapping the position of the missing −35 element. Mor binds to this site as a dimer and activates transcription by recruiting RNA polymerase. The crystal structure of the His-Mor dimer revealed three structural elements: an N-terminal dimerization domain, a C-terminal helix-turn-helix DNA-binding domain, and a β-strand linker between the two domains. We predicted that the highly conserved residues in and flanking the β-strand would be essential for the conformational flexibility and DNA minor groove binding by Mor. To test this hypothesis, we carried out single codon-specific mutagenesis with degenerate oligonucleotides. The amino acid substitutions were identified by DNA sequencing. The mutant proteins were characterized for their overexpression, solubility, DNA binding, and transcription activation. This analysis revealed that the Gly-Gly motif formed by Gly-65 and Gly-66 and the β-strand side chain of Tyr-70 are crucial for DNA binding by His-tagged Mor. Mutant proteins with substitutions at Gly-74 retained partial activity. Treatment with the minor groove- and GC-specific chemical chromomycin A3 demonstrated that chromomycin prevented His-Mor binding but could not disrupt a pre-formed His-Mor·DNA complex, consistent with the prediction that Mor interacts with the minor groove of the GC-rich spacer in the Mor binding site.
Bacteriophage Mu, the prototype for a large family of transposable phages, has two alternate lifestyles, lytic and lysogenic (1). Gene expression during the lytic cycle occurs in three phases: early, middle, and late (2, 3). The early promoter Pe has the characteristics of a typical bacterial promoter with both recognizable −10 and −35 hexamers, the promoter recognition elements for the Escherichia coli RNA polymerase (RNAP) with the “housekeeping” sigma factor, σ70 (4, 5). The middle promoter Pm possesses a −10 hexamer but lacks similarity to the consensus −35 element TTGACA (Fig. 1A). As a result, gene expression from Pm requires a phage-encoded transcription activator protein, the middle operon regulator, Mor (6). Likewise, transcription from the Mu late promoters requires activation by the Mu C protein (7), which shares significant sequence similarity with Mor (6–9) (Fig. 1B).
Mor and C are the best-studied members of this Mor/C family of transcription factors. They are characterized by an acidic N terminus and a basic C terminus that contains a helix-turn-helix (HTH) DNA-binding motif (6). Mor is a homodimeric, sequence-specific DNA-binding protein with a monomer length of 129 amino acids (Fig. 1B) (6, 10). Consistent with its role as a transcription activator, Mor binds to the inverted repeats of a pseudo-palindromic DNA element located between −51 and −36 base pairs upstream of the transcription start site +1 and overlapping the region normally occupied by the missing −35 recognition element (Fig. 1A) (11). In addition to Mor binding, activation of the middle promoter requires the C-terminal domains of the α and σ subunits of the E. coli RNA polymerase (12–14). Thus, the working model for middle promoter activation involves binding of Mor to the promoter, and its recruitment of RNAP to the otherwise nonfunctional middle promoter through Mor-RNAP interactions (13, 14).
The high-resolution crystal structure of a histidine-tagged Mor (His-Mor) dimer showed the locations of Mor amino acids 27 to 120; the His tag and Mor amino acids 1–26 and 121–129 were not visible (15). The dimer structure has three domains: an N-terminal dimerization domain and two C-terminal DNA-binding domains, each with a classical HTH motif (Fig. 2A) (15). The N-terminal α-helices, α1 and α2, of each His-Mor subunit intertwine with each other to form a four-helix bundle, creating a single central dimerization domain (Fig. 2A). The C-terminal three-helix bundle in each monomer forms the canonical HTH DNA-binding motif in which helix α3 forms the “scaffolding helix,” and helices α4 and α5 together with the intervening loop form the HTH motif (Fig. 2A) (16, 17). The DNA-binding domains of the His-Mor dimer flank the dimerization domain, putting them at opposite ends of the structure (Fig. 2A). Comparisons of the HTH domain structure with other proteins identified the trp repressor TrpR and region 4.0 of the σ subunit of Thermus aquaticus RNAP as close structural neighbors (15, 18, 19). These proteins use a perpendicular “ends-on” mode of DNA binding, inserting their recognition helices into the major groove using only the first two turns of the recognition helix for DNA contacts. This led us to propose that His-Mor might bind similarly (15); however, the recognition helices of the His-Mor dimer are located too far apart (63 Å) to interact with two adjacent major grooves of B-DNA, which reach a maximum distance at the outside edges of 54 Å, indicating that conformational changes in both His-Mor and Pm are probably required to achieve DNA binding by His-Mor (15). 
Consistent with this prediction, circular permutation gel-shift assays with His-Mor bound to Pm sequences revealed DNA curvature with a ~45° bending angle. Earlier we proposed that conformational changes in His-Mor would involve movement of the DNA-binding domains up and away from the dimerization domain to contact the DNA (Fig. 2, C and D).
The structure of the His-Mor dimer revealed an additional novel secondary structure element that might play a role in DNA binding (15). The anti-parallel β-strands at the top of the molecule (Fig. 2, A and B) form a β-ribbon in a 12-amino acid linker that connects the dimerization and DNA-binding domains. The residues in the β-strand and flanking loops of this linker are highly conserved, with 7 of 12 amino acids identical between Mor and C (Fig. 1, B and C). The β-strand amino acid side chains of Gln-68 and Tyr-70 (numbered as in native Mor) extend away from the protein (Fig. 2B). When the two HTH motifs make contacts in the major groove of the inverted repeats in Pm (Fig. 1A), the surface-exposed side chains of Gln-68 and Tyr-70 are ideally positioned to interact with the minor groove of the GC-rich spacer between the inverted repeats (Fig. 1A).
In the crystal structure the side chains of residues Val-69, Ile-71, and Pro-72 are buried and involved in hydrophobic interactions in the dimer interface; they form a “cap-like” structure that seals the hydrophobic interior from the solvent (15).
This linker region contains three highly conserved glycines (Fig. 1C) that are predicted to form hinges that contribute to the conformational flexibility of Mor (15) (Fig. 2, B-D). The N-terminal amino acids of this hinge region form a “Gly-Gly-Gly” motif in which two of the glycines, Gly-65 and Gly-66, are highly conserved among Mor/C family members (Fig. 1C); whereas the C-terminal amino acids include only one conserved glycine, Gly-74 (Fig. 1C). Given their location between the two domains and the tolerance of glycine to extreme bond angles, we proposed that glycine hinges might act as pivot points for the conformational changes in Mor required for it to interact with the DNA (15).
In this study, the roles of the potentially key linker amino acids in His-Mor function were investigated by codon-specific site-directed mutagenesis of a his-tagged mor gene (his-mor). A two-plasmid transcription-activation phenotypic screening assay (Fig. 1D) for Pm-lacZ function was employed to identify candidate mutants with different levels of Pm activity, and their his-mor genes were sequenced. The mutant His-Mor proteins were then characterized for their overexpression, solubility, DNA binding, and transcription activation. The results presented here clearly demonstrate that multiple conserved amino acids in the β-strand and flanking loops of this linker are crucial for DNA binding by His-Mor.
Protein overexpression and routine cell growth were done in LB medium (20), whereas cultures for β-galactosidase assays were grown in minimal medium with casamino acids (M9CA) (7). MacConkey lactose plates containing 25 g/liter of MacConkey agar and 25 g/liter of MacConkey agar base (Difco), and thus only half the normal amount of lactose, were used for plate phenotyping. When necessary, chloramphenicol and ampicillin were used at 25 and 40 μg/ml, respectively. Oligonucleotides for mutagenesis, sequencing, and probe preparation were purchased from Integrated DNA Technologies, Inc.; their sequences are given in Table 1. Chromomycin A3 was obtained from AG Scientific Inc. Sources for the remaining chemicals and enzymes are given in previous publications (10, 12, 15).
The host strain background for most of the plasmid constructions and in vivo assays was E. coli K-12 strain MH13312 (mcrA Δpro-lac thi gyrA96 endA1 hsdR17 relA1 supE44 recA/F′ pro+ lacIQ1 ΔlacZY), a derivative of JM109 (21) carrying an F′ plasmid deleted for both lacZ and lacY and expressing higher than normal levels of Lac repressor (11). Strain MH13435 was made by transforming MH13312 with the Pm-lacZ fusion plasmid pIA14; it was used as a host for phenotypic assays of the ability of mutant Mor proteins to activate transcription from Pm. Strain MH13422 is a derivative of MH13312 containing both pIA14 and pIA69 and was used as the positive control in the MacConkey plate and β-galactosidase assays. Strain MH18211 was derived from MH13435 by transformation with the Mor deletion plasmid pMUT10; it was used as the negative control in the MacConkey plate and β-galactosidase assays. Strain MH13355 was derived from JM109(DE3) and contains the same F′ plasmid as MH13312. The λDE3 in MH13355 is a derivative of λD69 containing the T7 RNA polymerase gene under control of the IPTG-inducible PlacUV5; it also carries imm21 and nin5 mutations (21). This strain was used for protein overexpression and detection of wild-type and mutant proteins.
Plasmid pIA14 contains wild-type Pm sequences from −62 to +10 (the base at −62 is provided by the vector) cloned between the EcoRI and BamHI sites in the lacZ reporter plasmid (Fig. 1D) such that β-galactosidase levels provide an indicator of Pm activity (11). Plasmid pIA69 contains sequences encoding His-Mor with introduced silent restriction sites; relevant sites are shown in Fig. 1D. His-Mor expression in pIA69 is under the control of both a T7 promoter and PlacSYN (Fig. 1D), a slightly altered form of PlacUV5 (15). Amino acid changes were made in this his-mor gene; then the ability of the mutant protein to activate Pm was assayed by using the Pm-lacZ fusion in pIA14. The His-Mor deletion plasmid, pMUT10, is a derivative of pIA69 deleted from PstI to SphI, missing Mor amino acids 8–116, and in the wrong reading frame after 116.
Plasmid pIA69 was used as the template for the PCR-based mutagenesis of his-mor and as the vector for cloning the PCR products. Degenerate mutagenic primers (Table 1) were designed to introduce base substitutions at all three positions of a single codon by using an equimolar mixture of bases (NNN or NNG+C) at the targeted codon (22). In most cases the degenerate primers were used for PCR with wild-type primers MUT13 or ZAO3 flanking the his-mor gene to create a library of mutant DNA fragments containing part of the his-mor gene with single codon mutations. The libraries were combined with overlapping wild-type PCR products containing the remaining part of the his-mor gene and were used as templates for overlapping PCR. In two cases both strands were mutagenized in separate PCR reactions and then combined and used as templates for overlapping PCR. After gel purification, the resulting mutagenized cassettes were cloned (using standard procedures; 23) into pIA69 between the PstI and either HindIII or BamHI sites, in and following his-mor, respectively. The ligation mixtures were transformed as described previously (15) into strain MH13435 containing the Pm-lacZ fusion reporter plasmid, pIA14 (Fig. 1D) (11). In early constructions, the transformants were spread on LB plates with ampicillin (40 μg/ml) and chloramphenicol (25 μg/ml), and the resulting colonies were screened for Pm activity by picking and stabbing 48 to 96 colonies onto MacConkey lactose indicator plates with ampicillin, chloramphenicol, and different concentrations of IPTG (10, 50, and 100 μM) to induce His-Mor expression. In later experiments transformants were plated directly onto the MacConkey lactose plates containing 50 μM IPTG. The direct plating gave much better resolution of distinct single colony phenotypes.
Representative mutant plasmids were chosen based on their plate phenotypes, and the mutations were identified by automated DNA sequencing of their his-mor genes by the UTHSC Molecular Resource Center. For protein overexpression, mutant plasmids were transformed into MH13355 for His-Mor protein overproduction and purification.
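The coding capacity of the degenerate codons described above can be illustrated with a short sketch. It enumerates every codon encoded by a degenerate triplet and counts the amino acids and stop codons represented. The sketch assumes that "NNG+C" denotes G or C at the third position (IUPAC code S) and that all bases are incorporated with equal frequency; the function names are hypothetical and are not taken from the original protocol.

```python
from collections import Counter
from itertools import product

# IUPAC symbols needed for the two degenerate designs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "S": "CG"}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate):
    """Return every concrete codon encoded by a degenerate triplet."""
    return ["".join(p) for p in product(*(IUPAC[ch] for ch in degenerate))]


def summarize(degenerate):
    """Count codons, distinct amino acids, and stop codons for a design."""
    counts = Counter(CODON_TABLE[c] for c in expand_codon(degenerate))
    stops = counts.pop("*", 0)
    return {
        "codons": sum(counts.values()) + stops,
        "amino_acids": len(counts),
        "stop_codons": stops,
    }


if __name__ == "__main__":
    for design in ("NNN", "NNS"):
        print(design, summarize(design))
```

Under these assumptions both designs encode all 20 amino acids, but restricting the third position to G or C halves the number of codons and leaves a single stop codon, so fewer clones are wasted on truncated proteins.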
Overexpression of His-Mor proteins was achieved by using the T7 promoter in pIA69 and the T7 RNA polymerase provided by the MH13355 host. The expression levels and solubility of the mutant proteins were assessed by SDS-PAGE of supernatants from crude cell extracts that were made from those cells. Small scale His-Mor protein purification was accomplished using His tag affinity chromatography of the above supernatants. The ability of the mutant proteins to activate transcription was based on in vivo assays of β-galactosidase produced from a Pm-lacZ fusion plasmid. Detailed descriptions of these assays can be found in Ref. 15.
The DNA-binding ability of the mutant His-Mor proteins was analyzed by using a gel retardation assay. The DNA probe was a PCR product made with 32P-labeled oligonucleotide MLK7 and unlabeled oligonucleotide IRI21 using pIA14 (containing Pm sequence −62 to +10) as template. After purification with a QiaQuick PCR purification kit (Qiagen), the probe concentration was estimated on a 2% agarose gel by comparison with a low mass DNA ladder (Invitrogen).
A 20-μl reaction volume containing 400 pg of probe, 50 ng of calf thymus DNA, and 200, 400, or 800 ng of His-Mor protein in binding buffer (20 mM Tris-HCl, pH 7.9, 50 mM NaCl, 5% glycerol and 1 mM DTT) was incubated at room temperature for 20 min, then resolved on a 10% nondenaturing acrylamide gel containing 0.5 × TBE and run in 0.5 × TBE buffer at 260 V for 3 h at 4 °C. Initial exposure of the gels to X-Omat Bio-Max MR film was done without drying.
Chromomycin interference and disruption assays were performed by using the probe preparation described above and chromomycin A3 from A.G. Scientific Inc. For the interference assay, different concentrations of chromomycin (final concentrations 0.5–50 μM) were incubated with the probe (~400 pg) in a 20-μl reaction volume at room temperature for 15 min followed by addition of 800 ng of purified His-Mor (1.18 μM dimer) and incubation at room temperature for an additional 20 min. For the disruption assay, the effect of chromomycin on preformed Mor-DNA complexes was assayed similarly except that the order of Mor and chromomycin addition was reversed. The reaction mixtures were resolved on a 10% native polyacrylamide gel as described above, and the gels were exposed to X-Omat-MR film without drying but with an intensifier screen for 16 h.
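The conversion from protein mass to dimer concentration in these reactions can be checked with a small calculation. The sketch below is illustrative only: the dimer molar mass of about 33,900 g/mol is an assumption back-calculated from the stated values (800 ng in 20 μl giving 1.18 micromolar dimer) and is not reported in the text, and the function name is hypothetical.

```python
def micromolar(mass_ng, volume_ul, molar_mass_g_per_mol):
    """Convert a mass in ng dissolved in a volume in microliters to micromolar."""
    moles = (mass_ng * 1e-9) / molar_mass_g_per_mol
    liters = volume_ul * 1e-6
    return moles / liters * 1e6


# ASSUMPTION: His-Mor dimer molar mass, back-calculated from the stated
# 800 ng per 20 microliters = 1.18 micromolar; not a value given in the text.
ASSUMED_DIMER_MOLAR_MASS = 33_900.0

if __name__ == "__main__":
    for mass_ng in (200, 400, 800):
        conc = micromolar(mass_ng, 20, ASSUMED_DIMER_MOLAR_MASS)
        print(f"{mass_ng} ng in 20 ul: {conc:.2f} uM dimer")
```

With the same assumed molar mass, the 200 and 400 ng reactions used in the gel retardation assays would correspond to roughly one quarter and one half of the dimer concentration in the 800 ng reaction.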
To demonstrate that a particular amino acid is essential for protein function, one could construct a set of mutants, one for each of the other 19 amino acid substitutions, assay them, and show that none retain protein function. The mutants could be made by brute force, one mutant at a time, or more efficiently by making a library of substitutions at a single codon by using a primer with degeneracy at all three positions of the codon, and then sequencing clones until the 19 mutants have been found and tested for function. An even more efficient approach is to use an indicator plate assay for screening the library, where functional and nonfunctional phenotypes are easily discriminated. Sequencing of the target gene from phenotypically functional colonies will identify substitutions that retain protein function. If the library is diverse and all the phenotypically functional colonies have the wild-type protein sequence, then one can conclude that the amino acid at that position is essential. To demonstrate that the library is sufficiently diverse to contain all 19 substitutions, the genes from seven to 10 nonfunctional colonies are sequenced; they should all contain different single amino acid substitutions.
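As a rough guide to how many clones such a screen must sample, the expected number of distinct amino acid substitutions among a set of sequenced clones can be estimated by linearity of expectation. The sketch below assumes an idealized library in which every codon of the degenerate design is equally represented and clones are drawn independently; real libraries are biased by primer synthesis and cloning, so the result is best read as an optimistic estimate. The function and its parameters are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expected_distinct_substitutions(n_clones, wild_type_residue, third_position="ACGT"):
    """Expected number of distinct non-wild-type amino acids among n clones.

    ASSUMPTION: each codon allowed by the design is equally likely and clones
    are independent draws. third_position="ACGT" models NNN; "CG" models a
    third position restricted to G or C.
    """
    codons = [a + b + c for a, b, c in product("ACGT", "ACGT", third_position)]
    counts = Counter(CODON_TABLE[codon] for codon in codons)
    total = len(codons)
    expected = 0.0
    for residue, n_codons in counts.items():
        if residue in ("*", wild_type_residue):
            continue  # stop codons and wild type are not substitutions
        # Probability that this residue appears at least once in n draws.
        expected += 1.0 - (1.0 - n_codons / total) ** n_clones
    return expected


if __name__ == "__main__":
    for n in (10, 50, 100, 300):
        print(n, round(expected_distinct_substitutions(n, "G"), 2))
```

Residues encoded by a single codon, such as Met and Trp, dominate the number of clones needed for full coverage, which is one reason a phenotypic pre-screen followed by sequencing of a few defective colonies is more economical than exhaustive sequencing.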
To determine the importance and roles of the inter-domain linker amino acids in His-Mor function, we carried out site-directed mutagenesis by PCR with oligonucleotide primers containing degeneracy at all three positions of a single codon, i.e. oligonucleotides designed to introduce all possible amino acid substitutions at a given position (numbered as in native Mor). The mutated DNA fragments were cloned into the His-Mor expression plasmid pIA69 (Fig. 1D) and transformed into strain MH13435 containing the reporter Pm-lacZ fusion plasmid pIA14 (Fig. 1D). The resulting libraries of transformants containing mutant His-Mor proteins were screened on MacConkey lactose plates containing small amounts (50 μM) of IPTG to induce low levels of His-Mor expression. The colony color then reflected the ability of its mutant His-Mor protein to activate transcription from Pm. Transformants were initially scored in a “pick and stab” assay as defective (white or very light pink) or functional (red) by comparison to the phenotypes of strain MH18211 containing a deletion in His-Mor (Δhis-mor) and strain MH13422 with wild-type His-Mor, respectively. Subsequent re-testing by streaking for single colonies, and by direct plating of the libraries on the MacConkey plates allowed us to identify two additional intermediate phenotypes, diffuse pale pink colonies and white colonies with red centers. The phenotypes being reported here (Table 2) were derived from this isolated colony assay. Libraries with substitutions in the most highly conserved positions Gly-65, Gly-66, and Tyr-70 produced 91, 94, and 71% white colonies, respectively. DNA sequencing showed that the plasmids from these colonies contained his-mor genes encoding single amino acid substitutions at the targeted positions. When the his-mor genes in the red colonies were sequenced, they had no mutations.
In contrast, libraries with substitutions in the less highly conserved positions Gln-68 and Gly-74 had only 63 and 56% mutant phenotypes and exhibited a range of mutant colony colors including white, pale pink, and white with red centers. For these positions we chose representatives from each phenotype for sequencing. The Gly-74 mutants were unusual in that one red colony and one white colony with a red center each contained single amino acid substitutions, demonstrating that Gly-74 is not essential. In addition, multiple mutants from the Gly-74 library made white colonies, but those colonies acquired red centers upon continued incubation; these colonies made His-Mor mutant proteins that retained partial function. The amino acid substitutions and colony phenotypes for the above mutants are given in Table 2. Finally, to validate this genetic approach we also made libraries for the nonconserved position Gly-67. For these libraries only 8% of the colonies had a mutant phenotype. Beyond sequencing to demonstrate the presence of mutations, analysis of these mutants was not pursued.
Amino acid substitutions in a protein can cause misfolding and lead to aggregation, precipitation, and increased sensitivity to proteolytic degradation. These consequences should be detectable by a reduction in levels of His-Mor protein in supernatants from crude cell extracts, as observed previously for several mutant proteins with substitutions in the dimerization domain (15). To ensure that the severe defects of these inter-domain linker mutants were not caused by precipitation or degradation of the mutant proteins, we assayed for the presence of overexpressed mutant protein in the supernatants from sonicated cell extracts by SDS-polyacrylamide gel electrophoresis. The gels in Fig. 3 show that the mutant proteins were overexpressed and present at levels comparable with that of wild-type His-Mor. Given the chemical diversity of the substitutions (Table 2) and their negligible effects on protein solubility and degradation (Fig. 3), we favor the interpretation that the defective phenotypes caused by the majority of these mutations are due to their interference with protein function.
Representative mutant His-Mor proteins were purified by His tag affinity chromatography and tested for their ability to bind to a wild-type middle promoter DNA fragment by using a gel retardation assay (Fig. 4). Substitutions at the most highly conserved positions Gly-65, Gly-66, and Tyr-70 abolished DNA binding (Fig. 4), making it most likely that their defects in transcription activation are due to their inability to bind to promoter DNA. Substitutions in Gln-68 also prevented DNA binding with the exception of Q68R, which retained some ability to bind DNA but failed to activate transcription. Mutations in the less conserved Gly-74 position conferred less defective phenotypes, and almost half of their mutant His-Mor proteins retained at least partial function for DNA binding and for transcription activation as observed in the colony color assay. Intriguingly, two substitutions at Gly-74 that exhibited significant DNA-binding activity were aromatic substitutions, G74W and G74F, whereas those with other substitutions, G74Y and G74S, failed to bind DNA.
To quantify the transcription activation ability of the mutant His-Mor proteins in vivo, we performed liquid β-galactosidase assays with the Pm-lacZ fusion as the reporter. The results are shown in Table 3. Consistent with their DNA-binding defects, mutants with substitutions at the most conserved positions Gly-65, Gly-66, and Tyr-70 were extremely defective, with β-galactosidase levels comparable with those of the Mor deletion (<10 units). The majority of Gln-68 mutants were similarly defective. Interestingly, for position Gly-74, the same two mutant proteins with aromatic substitutions, G74W and G74F, that retained some DNA-binding activity also exhibited more than 50% of wild-type levels of transcription activation. A summary of the β-galactosidase and DNA binding properties for the mutant proteins is given in Table 3.
A second complementary approach to identifying amino acids that are likely to be important for protein function is to examine the degree to which they are conserved in members of a protein family. In 2003 a BLAST similarity search (24) of the GenBank protein data base (25) identified 15 Mor/C family members (Fig. 1C, 15). In late 2010 a similar single-round BLAST search identified 90 family members that were then directly aligned on the NCBI server by using COBALT (26). The majority of the family members were located in complete prophages or prophage remnants in bacterial genome sequences. Although it is not feasible for us to determine which of these proteins function, it is possible to tally the amino acids at each position and identify those that occur frequently and others that are relatively rare. The rationale for this analysis is that amino acids that occur frequently are reasonably likely to retain function; whereas amino acids present rarely would more likely occur in mutant proteins that do not function.
Examination of the tally results shown in Table 4 reveals that the amino acids that were highly conserved in the earlier 15-member family (Fig. 1C) are extremely highly conserved in the current 90-member family (Table 4). They are Gly-65, Gly-66, Tyr-70, and Pro-72; at these positions there are only 3, 1, 0, and 1 nonconserved amino acids, respectively. For Gly-65 two of the three exceptions were found in the initial mutant isolation and produced white colonies and <10 units of β-galactosidase, supporting our hypothesis that rare amino acid changes result in nonfunctional mutant proteins. These exceptions also lead us to suspect that amino acids present three or fewer times are likely to be in defective proteins. The BLAST results for Gly-74 show that it is highly conserved, with Gly in 75 of 90 members, and the other 15 distributed among 7 different amino acids, with a maximum of four times for a single amino acid. These data alone would have led us to predict that these other 15 proteins would be entirely nonfunctional, but the colony colors in Table 2 and β-galactosidase values in Table 3 indicate that most of them retain partial Mor function. For Gln-68, Asn is the most frequent amino acid, being present 27 times, with Gln found 24 times and Ser found 14 times; the remainder of the substitutions were present from 1 to 6 times and contain Leu, Gly, and Ile substitutions, which were also found among the highly defective mutants (Table 2). Because Asn occurred so frequently, we constructed an Asn substitution mutant at position 68 of His-Mor; in contrast with the other mutations it made white colonies with red centers, retaining partial His-Mor function. Positions Val-69, Ile-71, and Pro-72 were not mutated because of their role in the hydrophobic cap on the dimerization domain. The BLAST results show that other hydrophobic amino acids are frequently present at positions Val-69 and Ile-71, consistent with that role.
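The per-position tally used in this conservation analysis can be sketched as follows. The code counts residues in each column of a set of pre-aligned sequences and flags residues that occur at or below a chosen cutoff. The toy alignment in the usage section is invented purely for illustration and does not represent real Mor/C family sequences; the function names and the cutoff parameter are hypothetical.

```python
from collections import Counter


def tally_columns(aligned_sequences):
    """Return one Counter of residues per alignment column.

    The sequences must already be aligned and of equal length.
    """
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [Counter(column) for column in zip(*aligned_sequences)]


def rare_residues(column_counts, max_count=3):
    """List residues seen max_count times or fewer, ignoring gap characters."""
    return sorted(
        residue
        for residue, count in column_counts.items()
        if count <= max_count and residue != "-"
    )


if __name__ == "__main__":
    # Toy alignment, made up for illustration only.
    toy_alignment = ["GGQVY", "GGNVY", "GGNIY", "AGSVY"]
    for position, counts in enumerate(tally_columns(toy_alignment), start=1):
        print(position, dict(counts), "rare:", rare_residues(counts, max_count=1))
```

A frequency cutoff of this kind is only a heuristic. As the Gly-74 results show, residues that are rare in the family can still support partial function, so the tally complements rather than replaces the mutational data.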
Modeling studies demonstrated that the side chains of β-strand amino acids Gln-68 and Tyr-70 are favorably placed to interact with the minor groove of the GC-rich spacer between the inverted repeats in the Mor binding site. To investigate the significance of the proposed minor groove contacts to Mor-Pm interactions, we used gel mobility shift assays to test whether His-Mor could bind to the middle promoter in the presence of the GC-specific minor groove binding ligand, chromomycin A3, at concentrations known to inhibit binding of other proteins (27, 28). In the first experiment chromomycin was incubated with the promoter DNA and then followed by addition of His-Mor. Chromomycin at a concentration as low as 2 μM completely inhibited His-Mor binding to the promoter DNA and reduced Mor binding at 0.5 and 1 μM; only the chromomycin-DNA complex was present at, and above, 2 μM (Fig. 5A).
To ask whether chromomycin would be able to access the minor groove and displace His-Mor from a His-Mor·DNA complex, as observed previously for an EGR1-DNA complex (28), we conducted a second experiment in which the order of addition of His-Mor and chromomycin was reversed. When chromomycin was added to preformed His-Mor·DNA complexes, it was unable to completely disrupt the complexes even at concentrations as high as 50 μM (Fig. 5B), suggesting that binding of chromomycin and His-Mor is mutually exclusive, as it would be if they both interact with the minor groove. Similar results were obtained when the same experiments were performed with short oligo duplexes containing Pm sequences from −33 to −54 only, and thus including only the single minor groove and two flanking major grooves of the His-Mor binding site (data not shown). These results are consistent with our hypothesis that His-Mor interacts with the intervening GC-rich minor groove.
The crystal structure of His-Mor revealed a novel structural element, a 12-amino acid inter-domain linker containing a β-strand, which we proposed serves as a hinge needed for DNA binding of His-Mor (15). The results from this genetic analysis demonstrate that the highly conserved Tyr-70 residue in the β-strand and the Gly-65, Gly-66, and Gly-74 residues in the flanking loops play a significant role in DNA binding and, thus, transcription activation by His-Mor.
Docking of the crystal structure of His-Mor onto a 16-bp long B-DNA, permitting the HTH motifs to contact the DNA major grooves, revealed that the side chains of β-strand residues Gln-68 and Tyr-70 would be brought into close proximity to the minor groove and, therefore, might participate in DNA binding by His-Mor. The severe DNA-binding and transcription-activation defects of the mutant proteins with most substitutions at Gln-68 and Tyr-70 are consistent with this hypothesis. One possibility is that the Tyr-70 side chain intercalates into the DNA, causing the observed DNA bend. Tyrosine intercalation is found in multiple DNA-binding proteins, including bovine pancreatic DNase I (29), human 3-methyladenine DNA glycosylase (30), and MutY of Bacillus stearothermophilus (31) leading to DNA bends of 20, 22, and 55 degrees, respectively. It seems more likely that the amide side chain of Gln-68, and its chemically conserved substitution Asn-68, play a stabilizing role in interactions with the negatively charged sugar-phosphate backbone of the DNA.
In the DNA-free crystal structure of His-Mor, the tips of the predicted recognition helices of the HTH motif are unfavorably positioned to interact with two adjacent major grooves of DNA (15) (Fig. 2C). In particular, they are 63 Å apart, which is ~29 Å farther than the distance between the centers, and ~9 Å farther than the outermost edges, of two adjacent major grooves of B-form DNA. Structural changes that would allow His-Mor to assume an altered DNA-bound conformation were predicted to stem from the flexibility provided by the three highly conserved glycines flanking the β-strands (15). The mutational analysis presented here demonstrates that these glycines, Gly-65, Gly-66, and Gly-74, are important for His-Mor function. It also revealed the unexpected finding that some substitutions at Gly-74 retain partial His-Mor function. Given the lack of side chains in glycines and their known roles in the flexibility of other proteins (32–35), it is more likely that they play a role in the conformational change in His-Mor to allow it to bind to DNA than in making direct DNA contacts.
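The geometric mismatch described above reduces to simple arithmetic, shown here as a minimal sketch. The 63 Å helix separation and the 54 Å outer-edge span come from the text; the 34 Å spacing between the centers of adjacent major grooves is the value implied by the stated ~29 Å difference and matches the commonly cited pitch of canonical B-form DNA. The names are hypothetical.

```python
# Values in angstroms.
RECOGNITION_HELIX_SEPARATION = 63.0   # His-Mor crystal structure (from the text)
MAJOR_GROOVE_OUTER_EDGE_SPAN = 54.0   # adjacent major grooves, outer edges (from the text)
# ASSUMPTION: center-to-center spacing of adjacent major grooves, taken as one
# helical turn of canonical B-DNA; implied by, not stated in, the text.
MAJOR_GROOVE_CENTER_SPACING = 34.0


def excess_separation(helix_separation, groove_distance):
    """Distance by which the helices overshoot a given groove spacing."""
    return helix_separation - groove_distance


if __name__ == "__main__":
    print("beyond groove centers:",
          excess_separation(RECOGNITION_HELIX_SEPARATION, MAJOR_GROOVE_CENTER_SPACING))
    print("beyond outer edges:",
          excess_separation(RECOGNITION_HELIX_SEPARATION, MAJOR_GROOVE_OUTER_EDGE_SPAN))
```

Either difference is too large to be absorbed without rearrangement, which is why a hinge motion in the protein, DNA bending, or both are invoked.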
Sequence-specific DNA-binding proteins generally make base-specific contacts in the major groove due to its better accessibility and greater information content (36–38). Other proteins such as TATA-binding proteins and proteins with high mobility group domains, e.g. sex-determining region Y and Lef-1, exclusively contact bases in the minor groove (39–41). Finally, some transcription regulators make simultaneous contacts with both the major and minor grooves (17, 42–44). For example, proteins containing winged HTH motifs use their HTH motifs for major groove binding, and their “wings” to probe the flanking minor grooves (45–48). Similarly, transcription regulators in the LacI family utilize an HTH motif for major groove interactions and a hinge region connecting the domains to contact the intervening minor groove (43, 49). Despite the variation in the secondary structure elements making the minor groove contacts, these proteins induce DNA unwinding and bending toward the major groove by inserting amino acid side chains into the minor groove. In the case of DNase I (29) and human 3-methyladenine DNA glycosylase (30), it is tyrosine intercalation into the minor groove that results in widening of the minor groove, narrowing of the major groove and bending of the DNA toward the major groove.
Based on the crystal structure of the DNA-free His-Mor dimer, we proposed that His-Mor also uses two elements for DNA binding: the HTH motifs for major groove interactions and the inter-domain linker for binding to the intervening minor groove on the same face of the DNA. To address this possibility we assayed the effect of the GC-specific minor groove binding drug chromomycin A3 on His-Mor·DNA interactions. Consistent with our prediction, chromomycin was able to prevent His-Mor from binding to the DNA, but it is not clear whether this occurred because of minor groove binding by His-Mor or because of narrowing of the adjacent major grooves as a consequence of chromomycin binding. Chromomycin did not inhibit DNA binding by the HSV ICP4 (infected-cell polypeptide 4) (earlier called IE175) major transcriptional regulator (50, 51), which from bioinformatic analyses is predicted to bind in the major groove via a helix-turn-helix DNA binding motif (52). However, chromomycin did prevent binding of proteins EGR1 and WT1 (Wilms tumor 1) known to bind in the major groove (28). It is not clear whether this is due to groove width changes from the widened minor groove causing narrowing of the major groove or because of structural changes it causes in the DNA immediately flanking its binding site (27, 53).
When we tested whether preformed His-Mor·DNA complexes could be disrupted by chromomycin, they were not, i.e. chromomycin was unable to access the GC-rich minor groove in preformed His-Mor·DNA complexes even at very high concentrations (50 μM). Just the opposite was observed for the zinc finger, major groove binding transcription factor EGR1 (28). When chromomycin was added to preformed EGR1·DNA complexes, it was able to access the minor groove and disrupt the preformed complexes, even at the very low chromomycin concentrations (~1 μM) that blocked binding of EGR1 to its DNA-binding site (28). We suggest that the inability of chromomycin to disrupt the preformed His-Mor·DNA complexes indicates that chromomycin cannot access the minor groove when His-Mor is bound and thus supports the hypothesis that Mor interacts with the minor groove as well as the two flanking major grooves. Providing additional support, binding of the Mu C protein to the late promoter, Pmom, blocked access of the minor groove-specific chemical nuclease 1,10-phenanthroline to the minor groove (54). Taken together, these results support our hypothesis that His-Mor interactions with the minor groove of the GC-rich spacer in the Mor-binding site are essential for His-Mor·DNA binding and that the minor groove becomes inaccessible upon His-Mor binding.
In the crystal structure of the His-Mor dimer the hydrophobic residues Val-69, Ile-71, and Pro-72 were found in a “cap-like” structure at the top of the dimerization interface with their side chains extending down into the hydrophobic core (15). We thought that these residues would serve the same role in DNA-bound His-Mor, so we did not include them in the mutational analysis. The natural hydrophobic substitutions observed in the BLAST results for Val-69 and Ile-71 are consistent with this hypothesis. The essential nature of Pro-72, however, forces us to consider whether Pro-72 plays a different role in the DNA-bound form of His-Mor. Proline is unique among naturally occurring amino acids in the cyclization of its side chain to the backbone amide and its ability to adopt two conformations, cis and trans. It is known to function as a molecular hinge (55), mediating conformational transitions in a number of proteins (55–58). In His-Mor, Pro-72 is located at the C-terminal end of the β-strand (15), and its isomerization could act as a molecular switch inducing structural changes in His-Mor to confer, in combination with the conserved glycines, the DNA-bound form of His-Mor. Alternatively, the above structural changes in His-Mor could be independent of Pro-72, but its side chain might participate directly in His-Mor·DNA interactions, either by intercalation between bases or by interactions with the sugar-phosphate DNA backbone. Proline intercalation and the resulting ~90° DNA bend have been well documented in the bacterial histone-like proteins, integration host factor and HU, and also in eukaryotic high mobility group proteins (36, 59, 60). The smaller, ~45°, bend observed for His-Mor might occur if there is only partial intercalation. On the other hand, interactions between the proline side chain and the sugar-phosphate backbone do occur, for example, in DNA binding by the E. coli transcription regulator PutA (61).
Thus, similar interactions between the side chain of Pro-72 and the sugar-phosphate DNA backbone might participate in His-Mor·DNA interactions and play a role in stabilizing the bending angle generated by His-Mor binding. Finally, it is also possible that Tyr-70 and Pro-72 both interact within the minor groove. There is precedent for such interactions; for example, in 3-methyladenine DNA glycosylase, Tyr-162 intercalates between the bases, resulting in widening of the minor groove, which is then filled by Met-164 and Tyr-165 (30).
Results from this mutational and homolog identification analysis favor a model for middle promoter binding by His-Mor in which the conformational changes in His-Mor stemming from the inter-domain linker region and the structural changes in DNA due to His-Mor·minor groove interactions are crucial for DNA binding and transcription activation by His-Mor. Given the high degree of sequence conservation in the β-strand linker region among the Mor/C family members, this mode of DNA binding is likely to be common in this family.
We acknowledge the Molecular Resource Center of the University of Tennessee Health Science Center for determining the DNA sequences reported herein.
*This work was supported, in whole or in part, by National Institutes of Health Grant GM-35642, the Harriett S. Van Vleet Chair of Excellence in Virology, Harriett S. Van Vleet Chair of Excellence in Microbiology and Immunology, the College of Medicine at the University of Tennessee Health Science Center, and National Science Foundation Grants MCB-0318108 and MCB-0418108.
4Y. Mo and M. M. Howe, unpublished results.
3The abbreviations used are: