To investigate how MTERF1 recognizes specific sequences in the mitochondrial genome, as well as the mechanism by which it modulates transcription, we crystallized the full-length human protein (minus the N-terminal mitochondrial localization sequence) bound to the leu-tRNA sequence responsible for termination of transcription, to which MTERF1 binds with high affinity (Nam and Kang, 2005). We expressed and purified human MTERF1 from E. coli cells and obtained crystals of the protein in complex with a 22-mer double stranded substrate (residues 3232 to 3253 of the human mitochondrial DNA). To solve the structure by Multiwavelength Anomalous Dispersion phasing (Hendrickson and Ogata, 1997), we also crystallized MTERF1 in complex with a double stranded DNA containing eight 5-bromocytosine residues (see Experimental Procedures and the data collection and refinement statistics). The native crystals diffracted to 2.2 Å and contained one molecule of MTERF1 bound to DNA in the asymmetric unit. The final density was of sufficient quality to build most of the protein (residues 73 to 396; Figure 1S) and all 22 base pairs of DNA.
Data collection, phasing and refinement statistics
MTERF1 adopts a fold that is very different from previously proposed models (Fernandez-Silva et al., 1997). Consistent with predictions based on its primary sequence (Fernandez-Silva et al., 1997; Roberti et al., 2006), MTERF1 displays a modular architecture. Dali (Holm et al., 2008) and DejaVu (Kleywegt and Jones, 1997) were unable to identify significant structural homology with any protein in the PDB. MTERF1 is an all-alpha-helical protein composed of 19 α-helices and 7 3₁₀ helices that is structured around a motif (two α-helices followed by a 3₁₀ helix; henceforth referred to as the mterf motif) that is repeated throughout the structure. This structural arrangement is remarkably similar to that seen in other all-alpha-helical regions of proteins, such as repeats of armadillo (ARM), HEAT (Groves and Barford, 1999) and PUM/PUF motifs (Lu et al., 2009). In all these cases the protein fold consists of a number of repeats of more or less conserved α-helical motifs. Interestingly, PUM/PUF motifs are the basis of RNA binding in PUMILIO proteins (Lu et al., 2009), while the HEAT motifs in DNA-PKcs have been suggested to play a role in double stranded DNA binding (Sibanda et al., 2010).
The MTERF1 fold contains 8 mterf motifs and an additional distorted motif in the C-terminus. Within each motif, several hydrophobic residues create a hydrophobic core between the two α-helices. The abundance of hydrophobic residues is what makes the primary sequence of MTERF1 resemble that of a leucine-zipper protein, but no association could be found between the predicted motifs (Fernandez-Silva et al., 1997; Roberti et al., 2006) and any differential structural feature. Hydrophobic interactions are also observed between mterf motifs, but are far less abundant than within them. This suggests a certain rigidity of the individual motifs but relative flexibility between motifs and thus of the overall fold.
DNA binding and unwinding
MTERF1 binds to the double-stranded substrate containing the termination sequence (Figure 2S) as a monomer and adopts a binding mode resulting in a remarkably large footprint on the DNA molecule. MTERF1 covers twenty base pairs even though binding does not strongly alter the curvature of the DNA duplex. Taking advantage of its specialized architecture, MTERF1 binds along the major groove of DNA. DNA binding by MTERF1 imposes a slight bend (25°) in the DNA duplex. More importantly, while the ends of the DNA duplex remain in a mostly B-DNA conformation, the DNA structure of the central part of the recognition sequence is heavily distorted. Binding by MTERF1 appears to decrease the twist of the DNA duplex, resulting in significant DNA unwinding. Moreover, MTERF1 binding promotes partial duplex melting: the central part of the DNA contains three nucleotides that are everted from the double-helix.
Each of the nucleotides everted from the double-helix is stabilized by the protein via a stacking interaction and by hydrogen bonds to the base and phosphate. To test whether these stacking interactions are required for nucleotide eversion, we constructed and purified a triple R162A/F243A/Y288A mutant in which all stacking interactions that presumably stabilize the conformation observed in the structure would be eliminated. We crystallized this mutant in complex with the termination sequence. We obtained crystals that diffracted to 2.8 Å and were able to solve the structure by molecular replacement using the wt MTERF1 structure as a model (see Experimental Procedures). Inspection of the structure showed that the triple mutant adopts the same fold as wt MTERF1 (rmsd of 0.6 Å for 324 C-α atoms). Importantly, the protein is bound to the termination sequence in a manner identical to the wt protein: the DNA backbone in the mutant structure presents the same bend observed in the wt structure and the double-helix is equally unwound. However, removal of the stacking interactions altered the conformation of all three everted nucleotides. Two of them, the adenine and thymine (shown in green and blue in the corresponding figure), are now flipped back into the double helix. A third one, the cytosine (shown in yellow), while not in the same conformation as in the wt structure, is still everted from the double helix. It now occupies the position that the Arg162 side chain occupies in the wt structure. The cytosine base appears to be stabilized by contacts to protein backbone atoms but, unlike in the wt structure, it does not take advantage of π-π stacking interactions and therefore this conformation does not appear as favorable.
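The rmsd quoted above can be illustrated with a minimal sketch. It assumes the two sets of C-α coordinates are already superposed and paired one to one; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures. The count of 324 atoms is consistent with the modeled residue range 73 to 396.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Residues 73 to 396 inclusive give 324 C-alpha positions.
n_ca = 396 - 73 + 1

# Toy data (hypothetical, not from the structures): every atom shifted by 0.6 Å along x.
wt = [(float(i), 0.0, 0.0) for i in range(n_ca)]
mutant = [(x + 0.6, y, z) for x, y, z in wt]
print(n_ca, round(rmsd(wt, mutant), 3))
```

In practice the superposition step (for example a least-squares fit) would precede this calculation; it is omitted here to keep the sketch short.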
These observations demonstrate that stacking interactions are essential to stabilize the everted nucleotides and that the protein is actively promoting base-flipping. Furthermore, the fact that backbone distortion and unwinding are still present in the mutant structure indicates that MTERF1 binding by itself (at least in a sequence-specific context) unwinds and kinks the DNA duplex. This backbone distortion is independent of base-flipping, although by destabilizing the central base pairs it is likely essential to facilitate it. It is interesting to note that Arg162 is the only one of the three stacking residues that is universally conserved in MTERF1 proteins. This suggests that differences must exist in the way MTERF1 associates with DNA in some of these species. It also perhaps suggests that the roles of the three stacking residues in promoting base-flipping are not equivalent.
DNA backbone interactions and unspecific DNA binding
MTERF1 establishes numerous interactions with the DNA duplex. The backbone of both strands is bound along positively charged grooves in the protein surface. Most of the interactions that are observed in the structure are electrostatic in nature and are established with the phosphate groups of the DNA strands. This type of interaction does not impart any sequence specificity. Each of the mterf motifs contributes DNA backbone interactions. The interactions with the light strand appear to be much more numerous, consistent with the reported stronger affinity of the protein for the mitochondrial light strand (Nam and Kang, 2005).
The number of sequence-independent interactions suggested that MTERF1 should be able to bind double stranded DNA regardless of sequence. To investigate this point, we performed isothermal titration calorimetry (ITC) experiments. As expected, MTERF1 was able to bind to its specific recognition sequence and did so with near 1:1 stoichiometry, further indicating specific binding. Binding was significantly affected by the salt concentration, consistent with the number of protein-DNA electrostatic interactions. Because of the tendency of the protein to aggregate at lower salt concentrations in the absence of DNA, we performed all subsequent experiments at 200 mM KCl, even though at this salt concentration MTERF1 exhibits weaker binding (see Experimental Procedures). Our measurements also demonstrate that MTERF1 is capable of binding a double stranded DNA of arbitrary sequence, although with significantly lower affinity than for its specific recognition sequence (Figure 3SA). Importantly, the lower DNA to protein ratio of binding indicates that MTERF1 does not preferentially associate with the DNA duplex in a particular conformation, consistent with the lack of sequence specificity. Instead, it can bind to different regions of the duplex and as a consequence more than one MTERF1 molecule can simultaneously bind the same substrate molecule. To eliminate the possibility that accidental sequence similarity with the wt sequence could result in residual specific binding and affect the measurements, we repeated them with two additional substrates: a homopolymeric DNA (Figure 3SC) and a completely random sequence (not shown). The results led in both cases to identical conclusions.
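To make the link between affinity and occupancy concrete, the following is a minimal sketch of a simple 1:1 binding equilibrium. It is an illustration only and is not the model used to fit the ITC data; the function names and all concentrations and dissociation constants are hypothetical.

```python
import math

def complex_concentration(protein_total, dna_total, kd):
    """Equilibrium concentration of a 1:1 protein-DNA complex (same units throughout).

    Solves (P_total - x) * (D_total - x) = Kd * x for the physically valid root x.
    """
    s = protein_total + dna_total + kd
    return (s - math.sqrt(s * s - 4.0 * protein_total * dna_total)) / 2.0

def fraction_dna_bound(protein_total, dna_total, kd):
    """Fraction of DNA molecules carrying a bound protein."""
    return complex_concentration(protein_total, dna_total, kd) / dna_total

# Hypothetical values in micromolar, chosen only to illustrate the trend.
dna = 10.0
for label, kd in (("tight site", 0.1), ("weak site", 5.0)):
    curve = [round(fraction_dna_bound(ratio * dna, dna, kd), 2) for ratio in (0.5, 1.0, 2.0)]
    print(label, curve)
```

With a tight site the occupancy saturates close to a 1:1 molar ratio, whereas a weak site requires a clear excess of protein, which is the qualitative behaviour that distinguishes specific from unspecific binding in a titration.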
Nucleotide eversion is essential for stable DNA binding
The difference in binding affinity between unspecific and specific binding can be rationalized from the crystal structure. In a sequence specific context as observed in the structure, MTERF1 binds DNA in a distorted conformation, where each of the three nucleotides everted from the double-helix is stabilized by the protein via a stacking interaction and, in addition, by hydrogen bonds to the base and phosphate. It was therefore tempting to speculate that while MTERF1 can bind to any double stranded DNA molecule by virtue of its electrostatic surface, the distorted conformation observed in the crystal structure is only adopted upon recognition of a specific sequence and is responsible for stable specific binding. To investigate this, we performed DNA binding measurements using the triple R162A/F243A/Y288A mutant to analyze its ability to bind to the termination sequence. As expected from the structure, the affinity for DNA of the triple mutant is significantly reduced and is similar to that of wt MTERF1 for an unspecific DNA sequence, indicating that nucleotide eversion is essential for stable binding. Furthermore, the triple mutation does not seem to affect binding to the oligonucleotide of arbitrary sequence in a significant way (Figure 3SB), strongly supporting the idea that base eversion is only a feature of sequence-specific binding. Importantly, since the triple mutant structure indicates that bending and unwinding of the DNA still take place, the increase in affinity can be entirely attributed to the ability of the protein to promote base-flipping.
Mechanism of sequence recognition
The observed binding stoichiometry when titrating the triple mutant implies that, despite lower binding affinity, the mutant still conserves site-specificity (i.e., the stoichiometry is still close to 1). Moreover, the structure indicates that it binds DNA just like the wt protein. This is also consistent with the fact that the stacking residues would appear to contribute little to sequence recognition. Because most of the protein-DNA interactions observed in the structure are to the DNA backbone, there are only a handful of interactions that appear capable of discriminating between sequences. Sequence recognition thus seems to be mediated by specific hydrogen bonding to the DNA major groove. Five arginine residues establish base interactions that are likely to determine sequence-specificity. Three of these (Arg169, Arg350 and Arg387) simultaneously hydrogen-bond to N7 and O6 of a guanine base. Arg202 hydrogen bonds to two adjacent guanines in opposite strands, while Arg251 hydrogen bonds to O6 of a single guanine and to N7 of the adjacent adenine. These arginine residues are conserved in MTERF1, as are the nucleotides that they recognize in the mitochondrial DNA. Interestingly, these arginines are not conserved in other mterf proteins, consistent with their different sequence specificity.
To assess the role of these residues in determining sequence specificity, we constructed individual arginine to alanine substitutions. ITC measurements allowed us to conclude that these residues play a role in sequence recognition. Interestingly, only R387A appears to have completely lost specific binding. All other mutants appear to preserve some sequence specificity but, with the possible exception of R350A, they all show lower binding affinity for the termination sequence than the wt protein. This suggests that some of these residues might play an important role in determining the conformations observed in the crystal structure, and also indicates that the importance of each of these residues for DNA binding and sequence recognition is not equal, prompting us to analyze their ability to promote transcriptional termination.
Implications for transcriptional termination
To analyze the functional importance of the different protein-DNA interactions we studied the ability of the different proteins to promote transcriptional termination in a reconstituted in vitro system. We adapted the assay utilized by Asin-Cayuela and colleagues (Asin-Cayuela et al., 2005) and generated a substrate for run-off transcription from the HSP where the termination sequence (the 22-mer sequence used for crystallization) has been inserted 100 bp downstream from the promoter (see Experimental Procedures). TFAM, TFB2M and POLRMT generate a unique run-off transcription product on this substrate, but addition of MTERF1 results in the appearance of a specific termination product. Our results are essentially equivalent to previously published data (Asin-Cayuela et al., 2005), suggesting that our purified MTERF1 is active and confirming that, at least in vitro, MTERF1 only displays moderate termination activity. It has however been previously shown that the termination activity of MTERF1 is substantially stronger when the termination sequence is inverted so as to simulate termination of LSP-initiated transcription (Asin-Cayuela et al., 2005). We decided to analyze transcriptional termination in this orientation as well. Termination in this orientation is indeed much more robust, perhaps in agreement with the observed pattern of interactions and the stronger affinity of MTERF1 for the light strand (Nam and Kang, 2005). In contrast, as expected from the binding data, no termination was observed with the triple mutant (RFY) even when a substantial excess of protein was added to the reaction. Unexpectedly, none of the five arginine mutants were able to promote termination. The higher signal observed in the reverse orientation allowed us to quantify the termination activity. The results indicate that while wt MTERF1 achieved an average of nearly 75% termination over the course of a twenty-minute reaction, the triple mutant only supported residual termination (<5%). The arginine mutations, irrespective of DNA binding affinity, were equally unable to support termination (<5%), with only some termination activity observed for R350A (17.7%).
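One common way to express termination activity is as the share of the total transcript signal found in the terminated product. The sketch below assumes that definition, which is not stated explicitly in the text; the function name and the band intensities are hypothetical and were chosen only to fall in the ranges reported above.

```python
def percent_termination(terminated_signal, runoff_signal):
    """Share of transcripts that stopped at the termination site, in percent.

    Assumed definition: terminated / (terminated + run-off) * 100.
    """
    total = terminated_signal + runoff_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * terminated_signal / total

# Hypothetical band intensities (arbitrary units), not measured values.
lanes = {"wt": (75.0, 25.0), "triple mutant": (4.0, 96.0)}
for name, (terminated, runoff) in lanes.items():
    print(name, round(percent_termination(terminated, runoff), 1))
```

If transcripts are labeled internally, the two products would also need to be normalized for their different lengths before applying this ratio.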
Termination activity of WT and mutant MTERF1
This supports the functional importance of the main interactions suggested by the crystal structure. Since the triple mutation solely affects nucleotide eversion, our data clearly demonstrate the critical importance of base-flipping for transcriptional termination. They also indicate that while there is a correlation between DNA binding affinity and termination activity, the role of some residues appears to be more important for termination than for binding. Because all arginine-guanine interactions appear to be essentially equivalent, this suggests that, beyond the importance of the interaction itself, some or all of these residues might also be essential to enable the conformation observed in the wt structure. Additional studies will be needed to establish their individual roles.
A model for DNA binding
Based on our results, we can recapitulate the events leading to specific MTERF1 binding and propose a model for how this protein promotes transcriptional termination. We have shown that MTERF1 is able to interact with DNA in a sequence-independent manner, suggesting that the protein probes the mitochondrial DNA randomly in search of its recognition sequence. Several of our observations suggest that unspecific DNA binding is likely to be structurally different from specific binding. Our data demonstrate that base-flipping determines the higher binding affinity observed for the specific termination sequence. This strongly suggests that base-flipping does not take place when binding to an unspecific sequence. Moreover, the stacking interactions that stabilize base-flipping do not appear to be responsible for discriminating against unspecific sequences. Thus the wt protein would be likely to promote base-flipping regardless of sequence if the DNA duplex were destabilized as in the wt structure. This would imply that sequence-unspecific binding is unlikely to lead to the DNA conformation observed in the wt structure, suggesting a need for MTERF1 to adopt different conformations. Finally, our termination results suggest that the five arginine residues involved in sequence recognition might play an important role in determining these events. We therefore propose a model for MTERF1 binding wherein binding to the correct sequence results in the six key arginine-guanine interactions being established (perhaps sequentially), which leads to a concurrent protein conformational change that bends and unwinds the DNA duplex. This in turn promotes the unstacking of three nucleotides, which are then stabilized in an extrahelical conformation by three stacking interactions. Our results indicate that base-flipping is essential for stable binding, suggesting that MTERF1 takes advantage of this mechanism to increase the stability of its interaction with DNA.
Since we have shown that base-flipping is essential to promote transcriptional termination, this suggests that this activity is simply related to the ability of MTERF1 to successfully prevent the RNA polymerase from displacing it from the termination sequence. Similarly, acting as a temporary roadblock might be the mechanism by which MTERF1 participates in controlling mitochondrial replication.
Pathogenic mutations in the leu-tRNA
Our structures allow us to precisely determine which nucleotides of the mitochondrial DNA contact MTERF1 when it binds to its termination sequence. In addition, the mechanism by which MTERF1 recognizes its binding sequence implies that the identity of bases far away from the center of the target sequence is critical for sequence recognition. Based on these observations, we surveyed a collection of pathogenic mitochondrial DNA mutations that occur in the MTERF1 binding sequence of the leu-tRNA and examined their potential to interfere with binding and transcriptional termination. Nine mutations (two of them on the same nucleotide) fall in the sequence recognized by MTERF1. It is important to note that the pathogenic effects of these mutations might be simply due to an alteration of the leu-tRNA structure, and this appears indeed to be the case for the A3243G mutation (Sasarman et al., 2008). A3243G is frequently identified in MELAS patients and has been previously shown to slightly reduce MTERF1 binding (Hess et al., 1991) and proposed to interfere in vitro with transcriptional termination (Hess et al., 1991; Chomyn et al., 1992). We can understand these effects based on the crystal structure. A3243 is one of the three everted nucleotides and a transition mutation would maintain the basic structure of the purine ring, likely allowing the same conformation observed with the wt termination sequence. However, the mutation would replace an A:T base pair with a stronger G:C base pair, which might explain the slight decrease in binding. In addition, the O6 of the guanine base would be located in close proximity to a phosphate oxygen (O2P in A3242), perhaps contributing to destabilization of the wt conformation. Our measurements confirm that A3243G results in slightly decreased binding (Figure 3SE). A similar binding affinity was observed for the A3243T mutation. In this case the A:T base pair is preserved, but the larger purine ring is now located in the heavy strand, which would likely lead to a steric clash with Asn199. None of the remaining mutations would be expected to severely conflict with MTERF1 sequence recognition, except for G3242A and G3249A. G3242 interacts in the structure with Arg251, while G3249 forms a double hydrogen bond with Arg387. Both of these mutations would eliminate a guanine-arginine interaction: the hydrogen bond between O6 of the guanine and the amino group of the arginine could not be formed in the mutant DNA. Consistent with the mild decrease in binding observed in the R251A mutant, G3242A resulted in weak or no effect on binding. However, the G3249A mutation appears to completely eliminate specific DNA binding (Figure 3SE), consistent with the effect of the R387A mutation.
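The bookkeeping behind this survey can be sketched in a few lines: each mutation label is split into its reference base, mtDNA position and new base, placed within the 22-mer (residues 3232 to 3253), and classified as a transition or a transversion. The helper names `parse_mutation` and `describe` are hypothetical and serve only as illustration.

```python
import re

SUBSTRATE_START, SUBSTRATE_END = 3232, 3253  # mtDNA coordinates of the 22-mer
PURINES = {"A", "G"}

def parse_mutation(name):
    """Split a label such as 'G3249A' into (reference base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", name)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {name}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

def describe(name):
    """Return (1-based index within the 22-mer, 'transition' or 'transversion')."""
    ref, pos, alt = parse_mutation(name)
    if not SUBSTRATE_START <= pos <= SUBSTRATE_END:
        raise ValueError(f"{name} lies outside the 22-mer")
    # A transition keeps the base class (purine to purine, pyrimidine to pyrimidine).
    kind = "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"
    return pos - SUBSTRATE_START + 1, kind

for name in ("G3242A", "A3243G", "A3243T", "G3244A", "G3249A"):
    print(name, *describe(name))
```

This reproduces the distinction drawn in the text: A3243G preserves the purine ring at the everted position, whereas A3243T swaps the purine and pyrimidine between strands.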
Pathogenic DNA mutations in the mitochondrial termination sequence
To further analyze the functional consequences of these mitochondrial DNA mutations, we carried out termination assays using both the forward and reverse orientations of the termination sequence. The results are mostly consistent with the DNA binding data. As expected, only residual termination is observed in both orientations with the G3249A mutation. In addition, consistent with what was observed with the R251A mutant, the G3242A mutation, while not substantially altering binding, nevertheless results in a strong decrease in termination. The A3243G mutation moderately reduces termination, while termination on A3243T is only marginally weaker than for the wt sequence. The reduction in termination observed for A3243G is consistent with what was previously reported (Hess et al., 1991). Unexpectedly, one additional mutation appears to moderately affect termination: G3244A, while not affecting DNA binding, results in a moderate but reproducible effect on termination. Interestingly, no specific interaction is observed in the crystal structure with any of the nucleotides in this base pair. However, it is adjacent to the extrahelical A:T base pair and displays a high base pair buckle (see Experimental Procedures). It is possible that the strength of the G:C base pair in the wt sequence is important to maintain this conformation and that the G3244A mutation thus leads to a slight destabilization of the bound MTERF1 that is sufficient to make it more readily displaced by the RNA polymerase.
We have reported the structure of human MTERF1. Our structure suggests that the mterf proteins constitute a family of dedicated DNA binding proteins, as can be readily concluded from inspection of the protein fold and the relatively high conservation between family members. In addition, our results suggest a unique mechanism of sequence recognition and binding where sequence-independent binding is followed by sequence recognition, resulting in a conformational change leading to unwinding of the DNA double-helix and base flipping. Base flipping appears to be critical to confer on MTERF1 the ability to stably bind to a specific site in DNA. Stable DNA binding, in turn, is necessary for the ability of this protein to promote transcriptional termination. Since base flipping does not affect the conformation of the DNA duplex, our results suggest that MTERF1 promotes termination by interfering with the elongation machinery. In this respect it is interesting to stress that, as was previously observed (Asin-Cayuela et al., 2005), MTERF1 appears to be much more efficient at promoting termination from the LSP than from the HSP. This might reflect the need to prevent transcription from the LSP from interfering with rRNA synthesis, and is consistent with the stronger affinity for the light strand (Nam and Kang, 2005) and the large number of interactions established with this strand. While our structure does not address if or how MTERF1 can form the mtDNA loop implicated in rRNA synthesis, the extensive protein-DNA interaction surface observed in the structure indicates that MTERF1 would not be able to simultaneously associate with two DNA duplexes. This therefore implies that it is unlikely that the loop can be mediated by a single MTERF1 molecule. On the other hand, the partial duplex melting observed in the structure, together with the preferential binding to the light strand and relatively low number of interactions with the heavy strand, suggests a mechanism by which MTERF1 or other mterf proteins could facilitate transcriptional initiation by contributing to initial duplex melting. In this respect it is important to note that Phe243 appears to be conserved in MTERF2 and MTERF3 (Linder et al., 2005), and that MTERF2 has been suggested to positively regulate transcriptional initiation (Wenz et al., 2009). Finally, it is easy to imagine how this binding fold can be utilized to recognize different sequences given that only five guanine residues appear to be essential for sequence recognition. An entirely different specificity could perhaps be obtained by alteration of the handful of residues responsible for sequence recognition. In this respect, it is interesting to note that both the modular architecture of MTERF1 and its sequence recognition mechanism are highly reminiscent of the PUF family of RNA binding proteins, where recognition of individual bases can be modulated by mutation of residues at key positions (Lu et al., 2009). These similarities also perhaps suggest that the mterf fold is not necessarily restricted to binding double stranded DNA.
Base flipping is a key feature of the interaction of MTERF1 with DNA. Base-flipping was identified as a mechanism used by HhaI methyltransferase to access its substrate (Klimasauskas et al., 1994). Since then, a large number of base-flipping enzymes have been identified that take advantage of this mechanism (Reinisch et al., 1995; Roberts and Cheng, 1998; Huffman et al., 2005). It is commonly employed by different DNA repair and replication proteins that require access to a specific base in the DNA duplex (Huffman et al., 2005). Because nucleotide eversion is not energetically favorable, in these proteins, as in MTERF1, the everted base is usually stabilized by π-stacking interactions. Base-flipping is usually utilized either to help recognize specific features of the DNA molecule or to assist in the catalytic mechanism, usually to gain access to a substrate that is otherwise buried in the DNA double-helix. In the case of MTERF1, base-flipping appears to be employed solely to stabilize the protein on DNA and make it more difficult to displace. The fact that this mechanism can be successfully utilized for a completely different purpose further illustrates the frequent recycling of structural motifs and solutions that has occurred throughout evolution.
An important conclusion derived from the crystal structure is that two pathogenic mutations that interfere with arginine-guanine interactions involved in sequence recognition by MTERF1 result in a strong impairment in the ability of MTERF1 to terminate transcription. G3242A does not substantially affect binding, but very strongly reduces the ability of MTERF1 to terminate transcription, while G3249A appears to completely eliminate specific binding by MTERF1 and consequently severely impairs termination. Their strong effect on termination suggests that the pathogenic effects of the G3249A and G3242A mtDNA mutations might be at least partially mediated by interfering with the activity of MTERF1 at the leu-tRNA site. The G3249A mutation leads to the development of a variant of Kearns-Sayre syndrome (Seneca et al., 2001), a mitochondrial myopathy, while the G3242A mutation has been associated with an uncharacterized mitochondrial disorder (Mimaki et al., 2009). It has been previously suggested that transcriptional deregulation might be one of the mechanisms leading to mitochondrial dysfunction. This is consistent with the fact that mice deficient in MTERF2 develop features of mitochondrial myopathies (Wenz et al., 2009). Further experiments will be needed to address whether the G3249A and G3242A mutations result in transcriptional alterations in vivo and if this effect is sufficient to explain their clinical phenotype.