The transcription-repair coupling factor (TRCF, the product of the mfd gene) is a widely conserved bacterial protein that mediates transcription-coupled DNA repair. TRCF uses its ATP-dependent DNA translocase activity to remove transcription complexes stalled at sites of DNA damage, and stimulates repair by recruiting components of the nucleotide excision repair pathway to the site. A protein/protein interaction between TRCF and the β-subunit of RNA polymerase (RNAP) is essential for TRCF function. CarD (also called CdnL), an essential regulator of rRNA transcription in Mycobacterium tuberculosis, shares a homologous RNAP interacting domain with TRCF and also interacts with the RNAP β-subunit. We determined the 2.9-Å resolution X-ray crystal structure of the RNAP interacting domain of TRCF complexed with the RNAP-β1 domain, which harbors the TRCF interaction determinants. The structure reveals details of the TRCF/RNAP protein/protein interface, providing a basis for the design and interpretation of experiments probing TRCF, and by homology CarD, function and interactions with the RNAP.
Transcribing RNA polymerase (RNAP) molecules stalled at sites of DNA damage elicit preferential repair of the DNA in a process called transcription-coupled repair (TCR) (1,2). Paradoxically, the stalled RNAP molecules are inhibitory to DNA repair in vitro (3), pointing to the role of additional factors mediating TCR in vivo. In bacteria, transcription-repair coupling factor (TRCF), the product of the mfd gene (4,5), was shown to be necessary and sufficient for TCR in vivo and in vitro. TRCF plays two key roles in mediating TCR: (i) relief of transcription-dependent inhibition of nucleotide excision repair (NER) by recognition and ATP-dependent removal of a stalled RNAP covering the damaged DNA and (ii) stimulation of DNA repair by recruitment of the Uvr(A)BC endonuclease (3,6,7).
TRCF is a large (130 kDa), evolutionarily conserved, multi-functional protein with a complex structure/function relationship. The 3.2-Å resolution X-ray crystal structure of Escherichia coli (Eco) TRCF comprises a compact arrangement of eight structured domains (D1a, D1b and D2–D7) linked by flexible linkers [(8); Figure 1A]. These domains are arranged in functional modules that perform the various TRCF functions. Recruitment of the NER machinery through binding of the NER component UvrA is accomplished by the TRCF UvrB homology module (D1a, D1b and D2) (6,8–10). RNAP binding is mediated by the TRCF RNAP interacting domain [RID; (8)]. The ATP-dependent double-stranded DNA translocase activity is due to the translocation module [TD1 and TD2; (8)], which contains the seven signature sequence motifs of superfamily 2 helicases/ATPases (11). All of these activities are repressed in isolated TRCF through an interdomain interaction between D2 and the C-terminal D7 domain (12,13).
Critical among the TRCF functional modules is the RID, a Tudor-like domain that mediates protein/protein interactions between TRCF and RNAP that are essential for the RNAP release function (8,14) (Figure 1A). The interaction between the TRCF–RID and RNAP: (i) recruits TRCF to the site of the stalled RNAP; (ii) is thought to trigger the conformational changes associated with derepression of the TRCF activities (8) and (iii) provides an anchor for TRCF on RNAP, allowing the TRCF DNA-tracking translocation activity to exert forces on the DNA that are thought to cause collapse of the transcription bubble within the RNAP ternary elongation complex (TEC), resulting in RNAP and transcript release (15).
Taken together, biochemical, as well as yeast and bacterial two-hybrid analyses have identified an RNAP β-subunit segment, Eco β residues 19–142 (β19–142), as being sufficient for TRCF–RID interaction (8,14,16). Amino acid substitutions in both the TRCF–RID (Eco TRCFL499R) and in the RNAP β-subunit (Eco β I117A, K118A or E119A) disrupt the TRCF/RNAP protein/protein interaction and cause defects in the RNAP release activity of TRCF (8,14).
The TRCF–RID is homologous to a domain widely distributed in bacteria (17) and grouped as the CarD_TRCF protein family (Pfam: PF02559). CarD (also called CdnL) (18) is an essential regulator of rRNA transcription in Mycobacterium tuberculosis (Mtb) that is upregulated in the general stress response and plays a key role in persistence and pathogenesis (19). The CarD N-terminal domain (NTD) shares striking sequence homology with the TRCF–RID and interacts with the RNAP β-subunit in the same manner (18,19).
In this work, we determined a 2.9-Å resolution X-ray crystal structure of a complex between the Thermus thermophilus TRCF–RID and the T. aquaticus (Taq) RNAP β-subunit β1 domain (which harbors the TRCF–RID interaction determinant). The details of the protein/protein interface explain the effects of amino acid substitutions in both the TRCF–RID (8) and the RNAP-β1 domain (8,14) that cause defects in the protein/protein interaction, and guide the design of new substitutions tested herein to further elucidate the interaction. The structure also reveals a local conformational change in the β1 domain upon binding the TRCF–RID. This work provides a basis for the design and interpretation of experiments probing TRCF function and interactions with the RNAP, as well as the function and RNAP interactions of other members of the CarD_TRCF protein family.
The DNA encoding Tth HB27 TRCF–RID (residues 321–387) was amplified by the polymerase chain reaction (PCR) using primers that appended NdeI and HindIII sites at the 5′- and 3′-ends, respectively. The PCR-amplified DNA fragment was cleaved with NdeI and HindIII and cloned between the NdeI and HindIII sites of a pET28a-derived plasmid, creating pET28a TthHB27(His)6MfdRID. PCR was used to amplify and fuse the DNA encoding Taq β1a (Taq β17–139) and β1b (Taq β334–395). Primers were designed to introduce a –Gly–Gly– linker between the two β1 segments and NdeI and BamHI restriction endonuclease sites were appended to the 5′- and 3′-ends of the fused DNA fragment, respectively. The resulting DNA fragment was cleaved with NdeI and BamHI and cloned between the same sites of pET21a (Novagen), creating pET21a Taq β1. All DNA manipulations were confirmed by DNA sequencing.
The plasmids pET21a Taq β1 and pET28aTth HB27Mfd-RID were transformed simultaneously into Eco BL21 (DE3) cells (Novagen) and transformants were grown at 37°C in Luria–Bertani media supplemented with ampicillin (200µg/ml) and kanamycin (50µg/ml) to an A650nm between 0.6 and 0.8. Subsequently, ampicillin (100µg/ml) and isopropyl-β-d-thiogalactopyranoside (1mM final concentration) were added to the culture. After incubation at 37°C for 3h, the cells were harvested by centrifugation, resuspended in buffer A [20mM Tris–HCl (pH 8.0 at 4°C), 200mM NaCl, 5% (v/v) glycerol, 0.5mM β-mercaptoethanol], lysed using a continuous-flow homogenizer (Avestin), and then centrifuged to remove insoluble debris. The clarified cell lysate was applied to a Ni2+-charged HiTrap column (GE Healthcare) equilibrated in buffer A+5mM imidazole. The column was washed with five column volumes (cv) of buffer A+20mM imidazole, 5cv buffer A+40mM imidazole, 5cv buffer A+60mM imidazole and finally 5cv buffer A+80mM imidazole. Proteins bound to the column were eluted with buffer A+250mM imidazole. After overnight cleavage with PreScission protease (GE Healthcare) to remove the (His)6-tag and dialysis against buffer A+5mM imidazole, a subtractive Ni2+-chelating chromatographic step removed uncleaved (His)6TRCF–RID and the cleaved (His)6-tag. The sample was concentrated and applied to a Superdex 75 gel filtration column (GE Healthcare). Finally, the purified sample was concentrated to 10.6mg/ml and exchanged into storage buffer (10mM Tris–HCl [pH 8.0 at 4°C], 100mM NaCl, 5mM DTT). The purity of the complex was judged to be >95% as analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis and Coomassie blue staining (data not shown).
Crystals were grown by hanging-drop vapor diffusion by mixing 2µl protein solution (10–12mg/ml in storage buffer) with 1µl crystallization solution (0.1M Tris–HCl [pH 7.5 at 22°C], 1.6M di-potassium ammonium phosphate), and incubating over a well containing crystallization solution. Crystals (10–20µm octahedra) grew in ~1 week. The crystals were prepared for cryo-crystallography by stepwise transfer into 0.1M Tris–HCl [pH 7.5 at 22°C], 100mM NaCl, 7M ammonium formate, then frozen in liquid nitrogen. X-ray diffraction data sets (Table 1) were collected at the Advanced Photon Source (Argonne National Laboratory) NE-CAT 24-ID-E beamline, using an MD-2 microdiffractometer with a 20µm aperture. Because of the small size of the crystals, the data quality degraded before a full data set from a single crystal could be collected, but partial data sets were collected from nine separate crystals. The crystals appeared to belong to space group P43212, but subsequent analysis indicated the crystals were hemihedrally twinned and belonged to space group P43. One data set was chosen as a reference (crystal a, Table 1). The additional data sets were processed in two orientations (h, k, l and −h, k, −l), and were then combined, one at a time, and kept or discarded, depending on whether the overall data improved, resulting in the final combined data set from partial data sets collected from three separate crystals (Table 1).
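The accept-or-discard merging of partial data sets described above can be sketched as a greedy loop. This is only an illustration under stated assumptions: the data representation (a dictionary from Miller index to a list of intensities), the function names and the quality score are all hypothetical stand-ins, not the programs or statistics actually used to produce Table 1.

```python
from typing import Callable, Dict, List, Tuple

HKL = Tuple[int, int, int]
DataSet = Dict[HKL, List[float]]  # Miller index -> observed intensities


def reindex(data: DataSet) -> DataSet:
    """Apply the alternative indexing (h, k, l) -> (-h, k, -l)."""
    return {(-h, k, -l): list(obs) for (h, k, l), obs in data.items()}


def combine(a: DataSet, b: DataSet) -> DataSet:
    """Pool the observations of two data sets, index by index."""
    merged = {hkl: list(obs) for hkl, obs in a.items()}
    for hkl, obs in b.items():
        merged.setdefault(hkl, []).extend(obs)
    return merged


def toy_score(data: DataSet) -> float:
    """Hypothetical quality score: count reflections measured more than
    once and subtract the spread between their repeated measurements."""
    repeated = [obs for obs in data.values() if len(obs) > 1]
    return len(repeated) - sum(max(o) - min(o) for o in repeated)


def greedy_merge(reference: DataSet,
                 candidates: List[DataSet],
                 score: Callable[[DataSet], float]) -> Tuple[DataSet, List[int]]:
    """Add candidates one at a time, trying both orientations, and keep a
    candidate only if the score of the combined data improves."""
    current, kept = reference, []
    for i, cand in enumerate(candidates):
        options = [combine(current, c) for c in (cand, reindex(cand))]
        best = max(options, key=score)
        if score(best) > score(current):
            current = best
            kept.append(i)
    return current, kept
```

With a realistic merging statistic substituted for toy_score, the same loop structure would reproduce the procedure of choosing one reference crystal and then testing each further crystal in both indexing orientations.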
Using the structure of the Taq RNAP-β1 (Taq RNAP-β[17–195 and 334–395]) as a search model, a molecular replacement solution containing four copies in the asymmetric unit was obtained using CNS (20). Additional molecular replacement searches to locate the TRCF–RID, using a homology model of the Tth TRCF–RID based on the Eco TRCF–RID (8) as a search model, using either CNS or BRUTEPTF (21) were unsuccessful. Nevertheless, maps phased from the Taq RNAP-β1 molecular replacement solution alone, combined with density modification using CNS, revealed clear electron density for the TRCF–RIDs (Supplementary Figure S1A). After iterative rounds of building and minimization to 2.9Å, the final model was refined to an R/Rfree of 0.228/0.250 using Refmac5 (22). Initial rounds of refinement incorporated tight non-crystallographic symmetry (NCS) restraints (with two NCS groups, the TRCF–RID and RNAP-β1). Final rounds kept loose NCS restraints and also utilized TLS refinement (with two TLS groups, the TRCF–RID and RNAP-β1).
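For readers unfamiliar with the R/Rfree statistic quoted above, the conventional crystallographic R-factor is the summed absolute difference between observed and calculated structure-factor amplitudes, divided by the summed observed amplitudes; Rfree is the same quantity computed over a test set of reflections excluded from refinement. The sketch below assumes the amplitudes are already on a common scale and ignores twinning, so it is not the twinned R computed by Refmac5; the function names are illustrative.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).
    Assumes both amplitude lists are already on the same scale."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs: Sequence[float],
                    f_calc: Sequence[float],
                    free_flags: Sequence[bool]) -> Tuple[float, float]:
    """Split reflections into working and free (test) sets and return
    (R_work, R_free). Both sets must contain at least one reflection."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free
```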
Plasmids pAC λCI-Eco β, pAC λCI-Eco TRCF, pAC λCI-Tth β and pAC λCI-Tth TRCF encode residues 1–236 of the bacteriophage λCI protein fused to residues 19–142 of the Eco RNAP β-subunit, residues 472–603 of Eco TRCF, residues 10–133 of the Tth RNAP β-subunit or residues 314–444 of Tth TRCF, respectively, under the control of the IPTG-inducible lacUV5 promoter. Plasmids pBRα-Eco β, pBRα-Eco TRCF, pBRα-Tth β and pBRα-Tth TRCF encode residues 1–248 of the Eco RNAP α-subunit fused to residues 19–142 of the Eco RNAP β-subunit, residues 472–603 of Eco TRCF, residues 10–133 of the Tth RNAP β-subunit or residues 314–444 of Tth TRCF, respectively, under the control of tandem lpp and IPTG-inducible lacUV5 promoters (8). Substitutions Tth RNAP-β1Q99R, Tth TRCF–RIDR341L, Eco TRCF–RIDL499R, Eco RNAP-β1 R101Q and R101E were introduced into the appropriate plasmid by PCR. Plasmid pBRα encodes wild-type α and plasmid pACλCI encodes the bacteriophage λCI protein (23).
FW102 OL262 reporter strain cells were transformed with the indicated plasmids. Individual transformants were selected and grown in LB supplemented with carbenicillin (100µg/ml), kanamycin (50µg/ml), chloramphenicol (25µg/ml), and the indicated concentration of IPTG. β-galactosidase assays were performed as described earlier (24) using microtitre plates and a microtitre plate reader. β-Galactosidase activity (reported as Miller Units) was calculated as described earlier (24).
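As a reminder of what a Miller Unit expresses, the classic cuvette-based Miller formula normalises the o-nitrophenol absorbance at 420 nm (corrected for light scattering at 550 nm) by reaction time, culture volume and cell density. The sketch below implements that classic formula only; the microtitre-plate protocol of reference (24) may use a different normalisation, and the function and parameter names are illustrative.

```python
def miller_units(a420: float, a550: float, a600: float,
                 minutes: float, volume_ml: float) -> float:
    """Classic Miller formula:
        1000 * (A420 - 1.75 * A550) / (t * V * A600)
    a420, a550 : absorbances of the reaction mixture
    a600       : cell density of the culture at the time of assay
    minutes    : reaction time in minutes
    volume_ml  : volume of culture used in the assay, in ml
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("time, volume and A600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)
```

For example, an A420 of 0.9 with no scattering correction, measured after 10 min on 0.1 ml of a culture at A600 = 0.5, corresponds to 1800 Miller Units.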
The TRCF/RNAP protein/protein interaction was discovered and characterized studying the Eco system (8,10,14). Nevertheless, biochemical experiments suggested that the Eco TRCF–RID/RNAP-β1 complex was relatively unstable and might not be suitable for structural studies. For instance, while we could detect and study the Eco TRCF–RID/RNAP-β1 interaction using the highly sensitive bacterial two-hybrid system, we did not observe a stable interaction in vitro using typical biochemical methods, such as affinity isolations (8). However, in other experiments, we observed evidence for robust interaction between Tth TRCF and Tth RNAP (L.F. Westblade, B.T. Chait and S.A. Darst, unpublished data), suggesting that the Thermus TRCF–RID/RNAP-β1 interaction is suitably stable for X-ray crystallographic studies. The two protein domains are well conserved; the Tth TRCF–RID is 40% identical in sequence to the Eco TRCF–RID and the Thermus RNAP-β1 domain is 50% identical to Eco RNAP-β1 (Figure 2A). We chose to crystallize the complex of only the protein domains required for the TRCF/RNAP protein/protein interaction (i.e. the Tth TRCF–RID and Taq RNAP-β1 domains) to increase the chances of obtaining crystals of the complex that diffract to high resolution.
While the RNAP-β1 domain is a well-folded structural domain with a distinct hydrophobic core, a complication for our strategy was that the β1 structural domain is not contiguous in the RNAP β-subunit sequence. In Thermus RNAP, the β1 domain comprises β residues 17–139 (β1a) and 334–395 [β1b; (25,26)]. The RNAP mutants that disrupt the TRCF/RNAP protein/protein interaction (Eco β117–119/Tth β108–110) (8,14) lie in the larger β1a, but this segment alone is unlikely to form a stable, well-folded domain suitable for structural studies. Fortunately, within the RNAP structure, the C-terminus of β1a (residue 139) and the N-terminus of β1b (residue 334) are only 5.4Å apart (Cα–Cα distance), allowing them to be connected by a –Gly–Gly– linker. We therefore sub-cloned the Tth TRCF–RID (Tth TRCF321–387) into a pET28a-based co-expression cassette (27) along with the –Gly–Gly– linked segments of the Taq RNAP-β1 domain (Taq RNAP-β[17–130]–GG–[334–395]), or expressed the two proteins simultaneously from separate plasmids. The Taq and Tth RNAP-β1 domains are 93% identical/98% homologous over 185 residues, and all of the β1 residues that interact with the TRCF–RID are identical between Taq and Tth (Figure 2A).
Upon co-expression, both the Tth TRCF–RID and the –Gly–Gly– linked segments of the Taq RNAP-β1 domain assembled into a stable, soluble complex that was purified and crystallized as small (15–20µm width) octahedra. SDS–PAGE and mass-spectrometric analyses confirmed that the crystals contained both proteins. Subsequent optimization procedures yielded slightly larger crystals (25µm width). Despite the small size of the crystals, X-ray diffraction to ~5-Å resolution was observed at various synchrotron sources, where the smallest achievable beamsize was ~50µm in diameter. Diffraction data to better than 3-Å resolution were collected at a microdiffractometer beamline, using a collimated 20µm diameter beam. Individual crystals succumbed to radiation damage before complete data sets could be collected, but a data set to 2.9-Å resolution was obtained by combining partial data sets from three separate crystals (Table 1).
The hemihedrally twinned crystals belong to the space group P43. The structure was solved by molecular replacement using the Taq RNAP-β1 fragment as a search model. Four solutions were eventually found, consistent with four copies of the heterodimeric complex in the asymmetric unit. Additional molecular replacement searches including a homology model of the Tth TRCF–RID (20,28), and even ‘brute-force’ phased translation searches (21) failed to position the TRCF–RID. Nevertheless, electron density maps calculated from the RNAP-β1 molecular replacement solution alone showed good density corresponding to the TRCF–RID (Supplementary Figure S1A). An atomic model of the TRCF–RID/RNAP-β1 complex was built and refined to a twinned R/Rfree of 0.228/0.250 at 2.9-Å resolution (Table 1 and Figure 1A).
The TRCF–RID/RNAP-β1 structure reveals the expected 1:1 heterodimer (Figure 1). The four crystallographically independent heterodimers are all very similar in structure; the maximum root-mean-square-deviation (rmsd) in α-carbon positions when comparing 190 positions within well-defined regions of the structure (excluding flexible loops) was 0.41Å.
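The comparison of the four independent heterodimers above amounts to computing the rmsd over matched α-carbon positions for every pair of copies and reporting the largest value. A minimal sketch follows, assuming each copy is given as a list of (x, y, z) coordinates that have already been optimally superimposed; the superposition step itself is omitted and the function names are illustrative.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two matched coordinate lists.
    Assumes the two sets are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def max_pairwise_rmsd(copies: List[Sequence[Coord]]) -> float:
    """Largest rmsd over all pairs of copies (six pairs for four copies)."""
    return max(rmsd(a, b) for a, b in combinations(copies, 2))
```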
The two proteins engage in a complex involving interfaces containing the residues found previously to be important for the protein/protein interaction. This includes Tth TRCF–RIDR341 [corresponding to Eco TRCF–RIDL499; (8)] and Taq RNAP-β1I108/K109/E110 [corresponding to Eco RNAP-β1I117/K118/E119; (8,14)] (Figures 1, 2A and B). Formation of the complex results in the burial of a modest 555 Å² of otherwise exposed surface area.
The structural core of the RNAP-β1 domain can be described as a four-stranded antiparallel β-sheet buttressed on one face by α-helices (Figure 1). The TRCF–RID comprises a Tudor-like fold, a highly bent, five-stranded antiparallel β-sheet that folds into a barrel-like roll (8). The C-terminal β-strand of the β1 domain (β1 β-strand 4, Figure 1B) and the penultimate TRCF–RID β-strand (TRCF–RID β-strand 4, Figure 1B), are exposed at the edge of the individual domain structures (Figure 1). In the TRCF–RID/RNAP-β1 complex, these two edge-exposed β-strands interact to form an antiparallel, intermolecular β-sheet that extends across the two separate proteins (Figure 1).
The Tth TRCF–RID in the TRCF–RID/RNAP-β1 complex is very similar in structure to the Eco TRCF–RID, except that the N-terminal and C-terminal β-strands (TRCF–RID β-strand 1 and β-strand 5, Figures 1B and 3A) are disordered in all four crystallographically independent copies of the Tth TRCF–RID. Thus, the extended, intermolecular β-sheet in the TRCF–RID/RNAP-β1 complex comprises seven β-strands, rather than the expected nine (Figures 1B and 3A). Analysis of the TRCF–RID/RNAP-β1 structure indicates that the intermolecular crystal packing is incompatible with the presence of the TRCF–RID β-strand 1 and β-strand 5 (Supplementary Figure S2), suggesting that crystal packing forces induced partial unfolding of the TRCF–RID. This may explain the small size of the crystals. Superimposition of the ordered portion of the Tth TRCF–RID with the Eco TRCF–RID yields an rmsd of 1.15Å over 32 α-carbon positions (Figure 3A).
Although the overall structure of the RNAP-β1 domain in the TRCF–RID/RNAP-β1 complex is similar to the RNAP-β1 domain in the context of RNAP (rmsd of 0.97Å over 114 well-defined α-carbon positions; Figure 3B), a significant, local conformational change was observed (Figure 3B–E). The conformational change entails a ‘register shift’ of β1 β-strand 4 with respect to β-strand 3 (Figure 3B–E). In all available bacterial RNAP structures, the register of the β1 domain β-sheet is such that L98 pairs with E110 (L98:E110) and L100:I108 (Figure 3D and E, left). In the β-side view of the RNAP (looking down on the β-subunit from outside the active site channel, the perspective seen in Figure 3D and E), the side-chains of I108 and E110 point down away from the viewer, while the side-chain of K109 points up towards the viewer (Figure 3D and E, left). On the other hand, in the complex with the TRCF–RID, the register is such that L98:D111 and L100:K109. The side-chains of I108 and E110 now point up towards the viewer, while K109 points down away from the viewer (Figure 3D and E, right). The overall register shift involves residues 103–111. In the RNAP without TRCF, E110 or D111 is ‘pinched out’ of β1 β-strand 4, while in the complex with TRCF, these residues are incorporated into the β-strand (Figure 3D and E). The shift in β1 β-strand 4 is accommodated in the flexible loop connecting β-strand 3 and β-strand 4 (residues 103–105). Since this is an unusual conformational change and the resolution of our analysis (2.9Å) is modest, we performed test refinements in which the RNAP-β1 β-strand 4 register was modeled as in the available RNAP structures. The results of these tests confirmed that the RNAP-β1 β-strand 4 register shift was not the result of a mistracing (Supplementary Data).
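The residue pairings quoted above follow a simple rule for antiparallel strands: the residue numbers of each cross-strand pair sum to a constant, and that constant changes by one when the register shifts by one residue. The sketch below is a bookkeeping model of this description only. The register constants come from the pairings stated in the text, while the up/down assignment is a parity convention fitted to the side-chain orientations described for the β-side view, not a quantity computed from coordinates.

```python
# Register constants taken from the pairings stated in the text.
REGISTER_FREE_RNAP = 98 + 110   # L98:E110 in RNAP structures without TRCF
REGISTER_COMPLEX = 98 + 111     # L98:D111 in the TRCF-RID complex


def partner(strand3_residue: int, register_sum: int) -> int:
    """Residue on beta-strand 4 paired with a residue on beta-strand 3.
    In an antiparallel pairing, the residue numbers sum to a constant."""
    return register_sum - strand3_residue


def side_chain_direction(residue: int, register_sum: int) -> str:
    """Parity model of alternating side-chain orientation along a strand.
    The mapping of even parity to 'down' is an assumed convention chosen
    to match the orientations described for the beta-side view."""
    return "down" if (residue + register_sum) % 2 == 0 else "up"


if __name__ == "__main__":
    for label, reg in (("free RNAP", REGISTER_FREE_RNAP),
                       ("TRCF-RID complex", REGISTER_COMPLEX)):
        pairs = {r: partner(r, reg) for r in (98, 100)}
        dirs = {r: side_chain_direction(r, reg) for r in (108, 109, 110)}
        print(label, pairs, dirs)
```

In this model, L100 pairs with I108 in the free RNAP register and with K109 in the complex, and the one-residue shift inverts the orientation of all three side chains (I108, K109 and E110), consistent with the description of Figure 3D and E.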
Although this conformational change is very localized, it must play an important role in the TRCF–RID interaction with the RNAP β1 domain, since the residues involved in the shift and reorientation are exactly the residues of RNAP that make critical interactions with the TRCF–RID, corresponding to Thermus I108, K109 and E110. We presume that the RNAP-β1 fluctuates normally between these two conformational states, and that the TRCF–RID binds to and stabilizes the state observed in the TRCF–RID/RNAP-β1 structure (Figure 3D and E, right).
The predominant interaction in the TRCF–RID/RNAP-β1 interface occurs across the extended, intermolecular β-sheet and involves van der Waals as well as polypeptide backbone hydrogen bonding between RNAP-β1 β-strand 4 and TRCF–RID β-strand 4 (Figures 1 and 2). On the β1 domain, this interface extends from Taq RNAP-β1 G106 to E110, and includes the three RNAP residues shown to be important for the TRCF–RID/RNAP protein/protein interaction, corresponding to Thermus I108/K109/E110 (8,14). On the TRCF–RID, this interface extends from Tth TRCF–RID G359 to P364. TRCF–RID residues 361–364 comprise a well-conserved motif, f-o-f-P, where ‘f’ stands for hydrophobic aliphatic residue (I, L or V), and ‘o’ stands for aromatic (Y or F) (Figure 2A) (8). TRCF–RID β-strand 2 and β-strand 3 extend past the end of TRCF–RID β-strand 4 and arch over the top of the RNAP-β1 domain, affording additional interactions. Y350 (from TRCF–RID β-strand 3) is highly conserved as an aromatic residue (Figure 2A), and makes van der Waals contacts with RNAP-βI108. Finally, TRCF–RIDR341 (from TRCF–RID β-strand 2) makes polar contacts with RNAP-β E110 as well as with Q99 (Figure 2). Interestingly, Tth TRCF–RIDR341 corresponds to Eco TRCF–RIDL499, which is critical for the TRCF/RNAP protein/protein interaction (8), but is poorly conserved, as is one of the residues it interacts with, RNAP-βQ99 (Figure 2A). Examination of the sequence alignment, however, reveals a strong correlation between the identity of these two residues (corresponding to Tth TRCF–RIDR341 and Tth RNAP-βQ99) across phyla (Figure 2A, columns highlighted in cyan and green). Figure 2A shows a limited set of 21 sequences, but the rules denoted below were determined from a more extensive alignment containing more than 100 sequences from actinobacteria, cyanobacteria, deinococcus-thermus, firmicutes, fusobacteria, planctomycetes, proteobacteria and spirochaetes:
This suggests that, conceptually, the TRCF–RID/RNAP-β1 interface can be thought of as bipartite. A central set of contacts across the intermolecular β-sheet, primarily between TRCF–RID β-strand 4 and RNAP-β1 β-strand 4, is relatively conserved across all phyla. On the other hand, a peripheral interaction between residues corresponding to Tth TRCF–RIDR341 and Tth RNAP-β1Q99 may occur in a phylum-specific manner.
This hypothesis makes four predictions for the results of single amino acid substitutions in the Thermus or Eco proteins:
We tested these predictions using the bacterial two-hybrid assay with the individual protein domains (TRCF–RID and RNAP-β1; Figure 4A). The bacterial two-hybrid assay previously established the minimal domains required for the TRCF/RNAP protein/protein interaction, and demonstrated that single amino acid substitutions in either the TRCF–RID or the RNAP-β1 domains disrupt the interaction (8). The results show that predictions 1–3 are fulfilled (Figure 4B–D, Supplementary Figure S3B and C), and prediction 4 is partially fulfilled (Supplementary Figure S3D):
The specific protein/protein interaction between TRCF and RNAP is key to TRCF function and regulation in bacterial TCR. TRCF does not recognize DNA damage per se; it recognizes sites of DNA damage by proxy through the protein/protein interaction with elongating RNAP stalled at sites of damage. Furthermore, the TRCF/RNAP protein/protein interaction is thought to trigger the conformational changes in TRCF that are necessary for derepression of key TRCF activities, UvrA binding and ATP-dependent DNA translocase activity (8,12,13). Finally, the TRCF/RNAP protein/protein interaction provides an anchor point for TRCF on RNAP, allowing the TRCF DNA-tracking translocation activity to exert forces on the DNA that are ultimately responsible for releasing the RNAP from the DNA template and RNA transcript, enabling NER (15). In this work, the structural basis for the TRCF/RNAP protein/protein interaction has been elucidated with the 2.9-Å resolution X-ray crystal structure of a complex between the interacting domains (the TRCF–RID and the RNAP-β1 domain) necessary and sufficient for the protein/protein interaction.
The structural details of the complex explain many aspects of previous mutagenesis data for both binding partners. An amino acid substitution was identified within the Eco TRCF–RID, L499R, that abolished an activity of TRCF essential for the displacement of stalled elongating RNAP, but did not abolish DNA binding, ATP binding or ATP-hydrolysis (8). The simplest interpretation of these results was that the L499R substitution disrupted the protein/protein interaction between the TRCF–RID and the RNAP. This interpretation was supported by previous bacterial two-hybrid experiments (8) as well as by those in this work (Figure 4C, compare lanes 4 and 5). In the structure of the Tth TRCF–RID/Taq RNAP-β1 complex, Tth TRCF–RIDR341 (corresponding to Eco TRCF–RIDL499; Figure 2A) forms a hydrogen-bond with RNAP-β1Q99 as well as salt-bridge/hydrogen-bond interactions with RNAP-β1E110 (Figure 2B and C). Although the residue at this position of the TRCF–RID (corresponding to Tth TRCF–RID residue 341) is not conserved across species, its identity is correlated with the RNAP-β1 residue corresponding to Tth RNAP-β1 residue 99 (Figure 2A).
Amino acid substitutions were identified in three consecutive residues of RNAP-β1 that abrogated the RNAP displacement function of TRCF. The effect of these substitutions was interpreted to mean that these three residues participate in the TRCF/RNAP protein/protein interaction (14). This interpretation was supported by bacterial two-hybrid experiments (8). In the structure of the Tth TRCF–RID/Taq RNAP-β1 complex, these three RNAP-β1 residues are central to the protein/protein interface (Figure 2B and C). In addition to participating in backbone hydrogen bonding that mediates the formation of the intermolecular β-sheet (Figures 2B and C), the side chain of I108 makes extensive van der Waals contacts with TRCF–RID Y350 and Y362, the side chain of K109 makes van der Waals contacts with TRCF–RID residues (primarily TRCF–RIDL361), and the side chain of E110 participates in polar interactions, primarily with TRCF–RIDR341 (Figure 2B and C). Thus, qualitatively, the finding that substitutions at these three positions of the RNAP-β1 domain cause defects in the TRCF/RNAP protein/protein interaction (8,14) is explained by the structure.
Smith and Savery (14) found that substitutions in the Eco RNAP β-subunit corresponding to Thermus RNAP-β1 I108A and K109A had only mild effects on the TRCF/RNAP protein/protein interaction, while the substitution corresponding to E110A had a more severe effect. Quantitatively, these results are more complicated to reconcile with the structure. Given the extensive participation of RNAP-β1I108 in van der Waals contacts with TRCF (Figure 2B and C), one might expect the I108A substitution to have a more severe defect. Moreover, it is difficult to see from the structure why lysine is so relatively well conserved at position 109. Also, despite the apparent importance of RNAP-β1E110, it is relatively poorly conserved (Figure 2A). These observations may be reconciled, however, by the finding that the RNAP-β1 undergoes a local conformational change in the complex with the TRCF–RID (Figure 3B–E). The conformational change involves a register shift of RNAP-β1 β-strand 4 and includes I108–E110. Thus, amino acid substitutions in this region of RNAP-β1 may affect the direct interaction with the TRCF–RID, but can also influence the interaction through their effect on the conformational equilibrium of RNAP-β1 β-strand 4. An amino acid substitution in RNAP-β1 that favored the conformational state of RNAP-β1 seen in the RNAP structures (Figure 3D and E, left) over the state seen in the complex with TRCF (Figure 3D, right) would introduce an apparent defect in the interaction with TRCF by reducing the population of RNAP competent to bind TRCF, even if that residue did not directly contact TRCF. In this case, it could be misleading to interpret the results of mutagenesis studies on the TRCF/RNAP protein/protein interaction by considering only the final TRCF–RID/RNAP-β1 structure.
The observed conformational change in RNAP-β1 raises another, more interesting issue; whether the conformational state of RNAP-β1 could be controlled allosterically as a consequence of the functional state of the RNAP. Could the TRCF-binding conformation of RNAP-β1 be favored in stalled elongating RNAPs, marking it as a target for TRCF function? A priori, marking stalled elongating RNAPs is not necessary if the rate of TRCF-mediated RNAP release is much slower than elongation. In general, this appears to be the case; TRCF-mediated RNAP release occurs on a time scale of minutes (8,14,16), while nucleotide addition by RNAP can occur on a time scale of milliseconds (29). As a rule, then, TRCF would not be kinetically competent to disrupt actively elongating RNAPs in vivo unless they were stalled.
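The kinetic argument above can be made concrete by treating release and nucleotide addition as two competing first-order processes, in which case the probability that release occurs before the next nucleotide is added is k_release / (k_release + k_elongation). The rate constants below are illustrative assumptions chosen only to match the stated time scales (about one minute for release, tens of milliseconds per nucleotide); they are not measured values from this work.

```python
def release_probability(k_release: float, k_elongation: float) -> float:
    """Probability that TRCF-mediated release wins a competition between
    two first-order processes, release and nucleotide addition."""
    if k_release <= 0 or k_elongation < 0:
        raise ValueError("k_release must be positive, k_elongation non-negative")
    return k_release / (k_release + k_elongation)


# Illustrative (assumed) rate constants, in s^-1.
K_RELEASE = 1.0 / 60.0      # release on a time scale of about one minute
K_ELONGATION = 1.0 / 0.02   # one nucleotide added every 20 ms

if __name__ == "__main__":
    p_active = release_probability(K_RELEASE, K_ELONGATION)
    p_stalled = release_probability(K_RELEASE, 0.0)
    print(f"actively elongating: {p_active:.2e}")
    print(f"stalled:             {p_stalled:.2f}")
```

Under these assumed rates the elongating complex escapes release at almost every step, whereas a fully stalled complex (elongation rate of zero) is always released, which is the sense in which no separate marking of stalled complexes is required.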
Several crystal structures of elongating Tth RNAP are available. In one, the elongation complex was in the post-translocated state, but transcription elongation was stalled by nucleotide deprivation (30). In principle, such an RNAP elongation complex stalled by nucleotide deprivation would be subject to TRCF release (16), but the conformational state of RNAP-β1 in both crystallographically independent complexes is essentially identical to that observed in all other bacterial RNAP structures (to date, more than 30 crystallographically independent complexes), where RNAP-β1 is not in the TRCF-binding conformation. Taken together, consideration of kinetic parameters and available structural evidence suggests that the conformational change in the RNAP-β1 domain does not serve as an allosteric signal for stalled RNAPs, but such a scenario cannot be completely ruled out. There may be situations in vivo, such as at regulatory transcriptional pause sites (31), where the kinetics of TRCF-mediated RNAP release could compete with RNAP elongation, and TRCF-mediated release of these paused elongation complexes might be disadvantageous for the cells. From a structural viewpoint, the conformational change in RNAP-β1 could conceivably occur in an elongation intermediate that has not been trapped in any crystal structures. The TRCF–RID/RNAP-β1 structure presented here provides a basis for designing experiments to address this question.
In addition to the previous mutagenesis studies of Smith and Savery (14), we tested additional amino acid substitutions at two positions (corresponding to Tth TRCF–RID residue 341 and Thermus RNAP-β1 residue 99) that are not conserved but appear to be correlated with each other in a phylum-specific manner (Figure 2A). As a rule, the cross-species interactions of the wt Eco and Tth proteins were relatively weak compared to the correct binding partners (Figure 4B–D, compare lanes 1 and 4). In three cases (Tth TRCF–RIDR341L, Figure 4B; Eco TRCF–RIDL499R, Figure 4C; Tth RNAP-β1Q99R, Figure 4D), the predicted mutations at the correlated positions improved the cross-species interactions (Figure 4B–D, compare lanes 1 and 2). In one case (Eco RNAP-β1 R101Q or R101E), the predicted mutation had little effect (Supplementary Figure S3). In this case, the effects of this substitution in the Eco RNAP-β1 on the TRCF–RID/RNAP-β1 interaction may be complicated by possible effects of the substitution on the RNAP-β1 conformational change that occurs upon TRCF–RID binding.
Using the apo-TRCF crystal structure, a model of the RNAP TEC, and additional constraints, a preliminary model for the TRCF/TEC assembly was constructed (8). The model was preliminary since it was concluded that the conformation of TRCF observed in the crystal structure may not correspond to the active conformation (8). Indeed, it is now established that the conformation of the apo-TRCF observed in the crystal structure corresponds to a repressed state in which the UvrA binding determinants are occluded, ATPase activity is very low, and DNA translocase activity is essentially nonexistent (12,13). Derepression of these TRCF activities is expected to be associated with profound conformational rearrangements of the TRCF domains with respect to each other. It is thus not surprising that the orientation of the TRCF–RID with respect to the RNAP-β1 domain in the TRCF–RID/RNAP-β1 crystal structure is quite different from that in the previous TRCF/TEC model (Supplementary Figure S4). Because the conformation of the derepressed TRCF associated with the TEC is expected to be very different from the repressed conformation of the TRCF crystal structure, it is not fruitful to update the TRCF/TEC model by superimposing the RID in the repressed TRCF crystal structure onto the RID from the TRCF–RID/RNAP-β1 structure.
CarD has been identified as an essential Mtb protein that is induced by DNA damage and starvation, and controls rRNA transcription through a direct interaction with the RNAP (19). CarD is a widely conserved, two-domain protein (17) with an NTD with striking sequence similarity to the TRCF–RID (19), and a C-terminal domain of unknown structure. Moreover, the TRCF–RID-like CarD-NTD is sufficient for RNAP interaction, which, like the TRCF–RID, is targeted to the RNAP-β1 domain (19). Thus, the Tth TRCF–RID/Taq RNAP-β1 structure serves as an excellent model for understanding the CarD/RNAP protein/protein interaction.
A host of accessory factors directly interact with the RNAP to modulate every step of the transcription cycle. As the structural delineation of accessory factor/RNAP interactions progresses, a handful of RNAP structural features have emerged as regulatory ‘hot spots’, such as the β-flap [NusA (32); σ factors (33–35); bacteriophage T4 AsiA (36); T4 gp33 (37); bacteriophage λ Q (38)], the secondary channel [Gre-factors (39–41); DksA (42,43)], and the β′ clamp helices [σ (34,35); RfaH (44)]. The interaction of the RNAP-β1 domain with the TRCF–RID detailed here, along with the analogous CarD–RID interaction, points to the RNAP-β1 domain as another hot spot for RNAP regulation.
The TRCF–RID/RNAP-β1 crystal structure presented here reveals the structural details of the TRCF/RNAP protein/protein interaction, which is key for TRCF function and regulation. Structural analysis also reveals a local conformational change in the RNAP-β1 in the TRCF–RID-bound state. The effects of amino acid substitutions in RNAP-β1 on this conformational change must be taken into account when interpreting the results of protein interaction assays. This structure provides a basis for the design and interpretation of experiments probing TRCF and CarD function and interactions with the RNAP.
Structure coordinates and structure factors have been deposited in the Protein Data Bank under ID code 3MLQ.
Supplementary Data are available at NAR Online.
NE-CAT beam lines of the Advanced Photon Source (APS), supported by award RR-15301 from the National Center for Research Resources at the National Institutes of Health; use of the APS is supported by the United States Department of Energy, Office of Basic Energy Sciences, under contract No. W-31-109-ENG-38; National Institutes of Health (RR00862 and RR022220 to B.T. Chait; GM073829 to S.A.D.). Funding for open access charge: Laboratory funds.
Conflict of interest statement. None declared.
The authors thank M. Glickman, K. Rajashankar, N. Savery and C. Stallings for helpful discussion and advice on the manuscript, P. G. Devi for help with plasmid construction and B. T. Chait for access to mass spectrometry facilities.