Overall Structure of DIM-5
We used recombinant DIM-5 protein (residues 17 to 318 of Protein Data Bank accession number AF419248) for crystallographic studies (see Experimental Procedures). Electron density maps were calculated using multiwavelength anomalous diffraction data from three intrinsic zinc ions (). A model of DIM-5 was built and refined to 1.98 Å resolution with a crystallographic R factor of 0.205 and Rfree value of 0.258. The final model includes 1913 protein atoms (with mean B values of 26.9 Å²), 3 zinc ions, and 103 water molecules, with rms deviations of 0.008 Å and 1.5° from ideality for bond lengths and angles, respectively.
Table 1. Summary of X-Ray Diffraction Data Collection
Our structural determination on DIM-5 allowed us to perform a structure-guided sequence alignment of SET proteins () that includes human SUV39 family proteins, all verified active HKMTs reported so far, and three bacterial SET proteins. The 318 residue DIM-5 protein is the smallest member of the SUV39 family. It contains four segments: (1) a weakly conserved amino-terminal region (light blue), (2) a pre-SET domain (yellow) containing nine invariant cysteines, (3) the SET region (green) containing signature motifs of NHXCXPN and DY (magenta), and (4) the post-SET region (gray) containing three invariant cysteines. The 9 Cys pre-SET region is unique to the SUV39 family, while the post-SET region is also present in many members of SET1 and SET2 families (Kouzarides, 2002), and even in one bacterial SET protein from Xylella fastidiosa (). Two active human HKMTs contain neither pre- nor post-SET regions: SET7 (Wang et al., 2001a) (also called SET9 [Nishioka et al., 2002a]) methylates lysine 4 of histone H3 and SET8 (Fang et al., 2002) (also called PR-SET7 [Nishioka et al., 2002b]) methylates lysine 20 of H4.
The pre-SET residues (yellow) form a 9 Cys cage enclosing a triangular zinc cluster (). The SET residues (green) are folded into six β sheets surrounding the catalytic methyl transfer site (magenta), with a helical cap (αF) above the β sheets. The amino-terminal residues (light blue) appear to be critical to the structural integrity of the molecule: the 38 residue segment extends through nearly the entire back of the molecule in the orientation shown (), providing an edge strand (β1, β2, or β3) to three separate β sheets and a 1 turn helix αA connecting to the pre-SET triangular zinc cage. The overall dimensions of the molecule are 60 × 50 × 30 Å. The triangular zinc cluster and the cofactor binding site are approximately 38 Å apart, located at opposite ends of the molecule along the longest dimension (). A cleft can be seen running across from the cofactor binding site to the zinc cluster ().
The Pre-SET Domain Forms a Triangular Zinc Cluster
The pre-SET domain contains nine invariant cysteine residues that are grouped into two segments of five and four cysteines separated by various numbers of amino acids (46 in DIM-5). These nine cysteines coordinate three zinc ions to form an equilateral triangular cluster (). Each zinc ion is coordinated by two unique cysteines (six total), and the remaining three cysteine residues (C66, C74, and C128) are each shared by two zinc atoms, thus serving as bridges to complete the tetrahedral coordination of the metal atoms. The distance between zinc atoms is ~3.9 Å, and the Zn-S distance is ~2.3 Å. A similar metal-thiolate cluster can be found in metallothioneins that are involved in zinc metabolism, zinc transfer, and apoptosis (reviewed in Vasak and Hasler, 2000). Metallothioneins often have two metal clusters: a (Me)₃Cys₉ and a (Me)₄Cys₁₁, where Me can be Zn²⁺, Cd²⁺, Cu²⁺, or another heavy metal. The tri-zinc cluster of DIM-5 can be superimposed perfectly upon the (Zn₂Cd)Cys₉ cluster of rat metallothionein (Robbins et al., 1991) (not shown). The significance of this apparent similarity is unclear.
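The ligand count and distances quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that both Zn-S bonds at a bridging sulfur have the same ~2.3 Å length, and the function name is hypothetical. It confirms that six terminal and three bridging cysteines supply the 12 Zn-S bonds needed for three tetrahedral zinc ions, and it estimates the Zn-S-Zn angle at a bridging sulfur from the isosceles Zn-S-Zn triangle.

```python
import math

ZN_ZN = 3.9  # Å, approximate Zn-Zn distance quoted in the text
ZN_S = 2.3   # Å, approximate Zn-S distance quoted in the text


def bridging_angle_deg(metal_metal, metal_sulfur):
    """Zn-S-Zn angle at a bridging sulfur.

    Assumes an isosceles triangle with two equal Zn-S bonds
    (an assumption made here, not a value reported in the paper).
    """
    half_sine = (metal_metal / 2.0) / metal_sulfur
    return math.degrees(2.0 * math.asin(half_sine))


# Ligand bookkeeping: each zinc is tetrahedral, so it needs four sulfur ligands.
n_zinc = 3
terminal_cys = 6   # each bound to a single zinc
bridging_cys = 3   # C66, C74, and C128, each shared by two zinc ions
bonds_needed = n_zinc * 4
bonds_supplied = terminal_cys * 1 + bridging_cys * 2

angle = bridging_angle_deg(ZN_ZN, ZN_S)
print(f"Zn-S bonds needed: {bonds_needed}, supplied: {bonds_supplied}")
print(f"Estimated Zn-S-Zn bridging angle: {angle:.0f} degrees")
```

Under these assumptions the nine cysteines exactly satisfy the coordination of three zinc ions, and the bridging angle comes out at roughly 116°.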
The SET Domain Forms the Active Site
The SET domain resembles a square-sided β barrel topped by a helical cap (αF, αG, αH, and αI). Four β sheets—(1↑ 5↑ 6↓), (7↑ 16↓), (4↓ 14↑ 15↓ 8↑), and (3↑ 9↑ 11↓ 10↑)—form the sides of the barrel and one sheet—(2↓ 12↓)—forms one end (). In the middle of the open end of the barrel is a crossover structure (magenta) formed by threading the β17-loop through an opening formed by a short loop between strands β13 and β14. This brings together the two most-conserved regions of the SET domain: the αJ-β13-loop (N241HXCXPN247) and β17-loop (DY283) (). The side chains of these two highly conserved segments are involved in (1) hydrophobic structural packing (I240 of αJ and L279 and F281 of β17), (2) intramolecular side chain-main chain interactions (after a sharp turn at P246, the side chain of N247 interacts with the main chain carbonyl oxygen of E278 and the main chain amide nitrogen of T280), (3) AdoMet binding site and active site formation (R238 and F239 of αJ, N241:E278 pair, H242:D282 pair, and Y283). These invariant residues are clustered together, via pair-wise interactions such as the interactions between N241 and E278 and between H242 and D282, forming an active site in a location immediately next to the AdoMet binding pocket and peptide binding cleft (see below).
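The sheet composition listed above can be written down as a small data structure, which makes it easy to check which strands take part in the barrel. This is only a bookkeeping sketch: the sheet labels are hypothetical, and it assumes the strands are numbered β1 through β17 (β17 being the highest number mentioned in the text).

```python
# Strand membership of each sheet, copied from the description in the text.
# Arrow directions are omitted; only membership is tracked.
sheets = {
    "side 1": [1, 5, 6],
    "side 2": [7, 16],
    "side 3": [4, 14, 15, 8],
    "side 4": [3, 9, 11, 10],
    "end": [2, 12],
}

N_STRANDS = 17  # assumption: strands are numbered beta-1 to beta-17

listed = [strand for members in sheets.values() for strand in members]

# No strand should be assigned to two different sheets.
assert len(listed) == len(set(listed))

# Strands that are not part of any listed sheet.
unlisted = sorted(set(range(1, N_STRANDS + 1)) - set(listed))
print("strands in listed sheets:", len(listed))
print("strands not in any listed sheet:", unlisted)
```

With this numbering, the only strands left out of the five listed sheets are β13 and β17, the two strands associated with the signature motifs and the crossover structure.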
Enzymatic Properties of DIM-5
The DIM-5 protein is a very active HKMT in vitro. We noticed several rather unusual properties of DIM-5. (1) Under our laboratory conditions, the enzyme is most active at ~10°C and nearly inactive at 37°C (). (2) DIM-5 is extremely sensitive to salt, e.g., 100 mM NaCl inhibited its activity by about 95% (). (3) The enzyme has a high pH optimum. DIM-5 showed maximal activity at ~pH 9.8 (), although it showed strongest crosslinking to AdoMet around pH 8 (). Neither HKMT activity nor AdoMet binding was observed below pH 6.0.
Cofactor Binding Pocket
All known HKMTs use AdoMet as the methyl donor. The most common conformation of AdoMet, or its reaction product AdoHcy, is found in the so-called consensus MTases. These MTases are built around a mixed seven-stranded β sheet, and they include more than 20 structurally characterized MTases acting on carbon, oxygen, or nitrogen atoms in DNA, RNA, protein, or small molecule substrates (Cheng and Roberts, 2001). DIM-5 does not share structural similarity to any of these AdoMet-dependent proteins and appears to use a completely different means of interaction with its cofactor.
A difference electron density is observed in an open pocket on one end of the DIM-5 molecule opposite from the triangular zinc cluster ( and ). We interpret this density as the cofactor product, AdoHcy, which was present during crystal growth (see Experimental Procedures). Although part of the AdoHcy can be fit into the density (not shown), it is difficult to fit the entire molecule, particularly because there is no recognizable density for the adenine ring of AdoHcy. This could potentially reflect flexibility of the cofactor bound to DIM-5. Unlike the “consensus” MTases where AdoMet/AdoHcy binds in a relatively closed pocket with hydrophobic stacking on both sides of adenine ring (Fauman et al., 1999), the density we observe is located in an open pocket, sitting above the antiparallel strands β5 and β6 and against the short helix αJ (). This environment may contribute to its flexibility or allow multiple conformations in the absence of substrate. The flexibility may also result from low pH during crystallization (pH 5.4–5.6), a condition in which no UV crosslinking of AdoMet to the protein was observed (). At low pH the adenine ring might not interact stably enough with DIM-5 to be crosslinked to the protein or observed in the structure.
The significance of this density is further enhanced by the highly conserved residues with which it is surrounded. Two conserved arginines (R155 of β5 and R238 of αJ) and three aromatic residues (W161, Y204, and F239) directly contact the density (). The side chains of these two arginines are locked in place by other conserved residues: the guanidino group of R155 is parallel to the plane of the W161 indole ring and ion pairs with D35; and the guanidino group of R238 is surrounded by three aromatic rings, F43, F239, and Y204, and its two terminal nitrogen atoms (Nε and Nη2) form hydrogen bonds to the main chain carbonyl oxygen atoms of G230 and E231, respectively ().
We made conservative substitutions for several of the residues surrounding this density: R155H, W161F, Y204F, and R238H (see Experimental Procedures). The enzymatic activities of all the mutants were reduced, ranging from a 75% reduction (W161F) to nearly inactive (R238H) (). The ability of these mutants to bind AdoMet, as measured by crosslinking, was also reduced but not abolished (). It appears that the reduced AdoMet binding alone could account for the reduction in HKMT activity for the R155H, W161F, and Y204F mutations. The R238H mutation, however, caused a much greater reduction in HKMT activity than in AdoMet binding, suggesting that R238 may also play roles in other aspects of catalysis (see below). In SUV39H1 and SUV39H2, a histidine is in the position of R238 in DIM-5; changing this histidine to an arginine resulted in at least a 20-fold increase of activity in SUV39H1 (Rea et al., 2000), consistent with the greatly reduced activity in the converse R238H mutants of DIM-5.
Putative Peptide Binding Cleft
The cleft along the surface emanating from the presumed cofactor binding site is the likely binding site for the substrate polypeptide (). One side of this cleft is formed by strand β10 (green in )—the outermost strand of the β sheet (3↑ 9↑ 11↓ 10↑)—and the other side is formed by the loop after strand β17, which is the beginning of the disordered carboxy-terminal residues (286–299).
Structural studies have shown that heterochromatin protein HP1 binds to a methylated histone H3 peptide by inserting it as an antiparallel β strand between two HP1 strands, forming a hybrid three-stranded β sheet (Jacobs and Khorasanizadeh, 2002; Nielsen et al., 2002). Encouraged by the fact that one side of the DIM-5 cleft is a strand (β10), we superimposed the HP1 β strand (Drosophila HP1 residues 60–62) onto DIM-5 strand β10 (residues 205–207) (). The superimposition placed the H3 peptide (e.g., Q5-S10 as observed in HP1) in the DIM-5 cleft () and residues Y283-V284-N285 following strand β17 on the other side of the peptide (). An induced-fit mechanism is used in HP1, in which the amino-terminal tail of the free HP1 adopts a β strand-like conformation upon interacting with the H3 peptide (Nielsen et al., 2002). In a similar way, binding of the H3 peptide may induce residues Y283-V284-N285 of DIM-5 and subsequent disordered residues to adopt a more stable β strand conformation that interacts with the peptide to form a hybrid sheet.
Target Lysine Binding Site
The most interesting result of the docking experiment is the placement of the target K9 immediately next to the presumed cofactor binding site () with the target nitrogen atom occupying the position of a water molecule (site 2 in ). We propose that water site 2 is the likely active site of DIM-5, where the terminal amino group (NH₃) of the substrate lysine would form a hydrogen bond with the main chain carbonyl oxygen atom of R238. Many highly conserved residues, mainly from the two signature motifs (magenta), surround this site. Side chains of N241, H242, Y283, and Y204 form an inner circle immediately around site 2 (). Residues E278, D282, and Y178 form an outer circle via interactions with the inner-circle residues: E278 interacts with N241, D282 interacts with H242, and Y178 interacts with Y283 via a water molecule (site 4) (). Unlike protein arginine MTases or small molecule glycine N-MTase, which use acidic residue(s) to neutralize the positive charge on the substrate amino group (Fu et al., 1996; Zhang et al., 2000), no acidic residue is immediately present in the proposed active site of DIM-5. Nevertheless, the combination of the negative dipole moment at the carboxyl end of helix αJ (R238 and F239), the negatively polarized main chain carbonyl oxygen atoms (I240 and W161), the side chain hydroxyl oxygen atoms of Y178, Y204, and Y283, and the asparagine oxygen atom of N241 might increase the nitrogen electron density enough to allow a nucleophilic attack on the AdoMet methylsulfonium group. The proton elimination step in conjunction with the methyl transfer is likely accomplished through a charge relay system involving H242 and D282, much as in protein arginine MTases (Zhang et al., 2000).
One observation consistent with this mechanism is the unusually high optimal pH (~10) of DIM-5 (), despite the fact that AdoMet binding is much more favorable in solutions of lower pH (). At pH 10, the amino group of target lysine (with a typical pKa value of 10) may be partially neutralized and the conserved tyrosines Y283, Y204, and Y178 (also with typical pKa values of 10) near the active site may be deprotonated; both deprotonations would facilitate methyl transfers.
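The pH argument above can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a titratable group that is deprotonated at a given pH. The sketch below is only a rough illustration: it uses the typical pKa of 10 quoted in the text for both the lysine amino group and the tyrosine hydroxyls, whereas the actual pKa values in the DIM-5 active site are not known and may well be shifted. The function name is hypothetical.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group in its deprotonated form.

    Henderson-Hasselbalch: [A-]/([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


TYPICAL_PKA = 10.0  # typical value quoted in the text; an assumption for DIM-5

# pH values taken from the text: crystallization (~5.5), strongest AdoMet
# crosslinking (~8), maximal activity (~9.8), and the nominal optimum (~10).
for ph in (5.5, 8.0, 9.8, 10.0):
    frac = fraction_deprotonated(ph, TYPICAL_PKA)
    print(f"pH {ph:4.1f}: {100.0 * frac:8.4f}% deprotonated")
```

With these assumed values, about half of the groups are deprotonated at pH 10, roughly 39% at pH 9.8, and only about 1% at pH 8, in line with the suggestion that activity rises steeply toward the high end of the pH range.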
The importance of the proposed active site residues is supported by site-directed mutagenesis experiments. Conservative changes at three residues (N241Q, H242K, and Y283F) immediately surrounding water site 2 essentially abolished HKMT activity (). Y283F has the lowest residual activity, suggesting that the hydroxyl group of Y283 is critical; it is hydrogen bonded to the backbone amide nitrogen of I240 and immediately adjacent to water site 2. Y283 is also one of the most conserved residues of the SET domain, being invariant in most of the SET-containing proteins in the Pfam database. Mutations of the two residues proposed to be involved in proton elimination, H242K and D282N, abolished and reduced HKMT activity, respectively. As expected, both mutants retained AdoMet crosslinking, though at reduced levels (). The complete loss of AdoMet crosslinking in N241Q and Y283F mutant proteins is somewhat unexpected. For N241Q, perhaps the longer glutamine side chain prevents the two hydrogen bonds forming between the side chain amino group of N241 and both the backbone carbonyl of W161 and the side chain of E278 (). Interrupting the W161-N241-E278 interactions probably disrupts local structure, having a more deleterious effect than the replacement of side chain in the W161F mutant. It is also possible that both N241 and Y283 interact with the adenine ring of AdoMet, which is likely involved in the UV crosslinking, although not observed in the crystal.
The presumptive active site of DIM-5 is reminiscent of the consensus NPPY motif involved in the amino-methylation of adenine or cytosine in DNA (Blumenthal and Cheng, 2001; Goedecke et al., 2001; Gong et al., 1997) and of the glutamine in peptide release factor (Heurgue-Hamard et al., 2002; Nakahigashi et al., 2002). Remarkably, the invariant N241 and Y283 of DIM-5 are superimposable onto the first and the last amino acids of NPPY in TaqI DNA adenine MTase (). This suggests a potential similarity in the catalytic mechanism between histone lysine MTases and DNA amino-MTases. In the latter case, the amino group (NH₂) that becomes methylated is positioned for an in-line attack on AdoMet by hydrogen bonding to the backbone carbonyl connecting the two inflexible prolines (Goedecke et al., 2001). The equivalent backbone carbonyl in DIM-5 is probably that of R238. Since this particular carbonyl needs to be relatively immobile to hinder the free rotation of the amino group bound at site 2, the great reduction of HKMT activity in the R238H mutant could be the result of a small change or flexibility in the position of the backbone carbonyl oxygen atom that fails to interact properly with the target amino group.
Mono-, di-, and trimethylated lysines have been observed in histones (Duerre and Chakrabarty, 1975), but very little information is available about the methylation status of individual residues. Nevertheless, we have found that DIM-5 efficiently methylates dimethylated lysine 9 of histone H3 peptide, and DIM-5 is capable of adding 1–3 methyl groups to K9 of histone H3 peptide (unpublished data of H.T., X.Z., D. McMillen, J. Nakayama, P. Singh, D. Allis, S. Grewal, X.C., and E.U.S.). We propose that water sites 1 and 3 (), which are hydrogen bonded to site 2, may accommodate the methyl group(s) on mono- and dimethylated lysine substrates. These additional interactions may help position the nitrogen atom and enhance its reactivity.
The Post-SET Domain
The C terminus, including the post-SET region, is mostly disordered in the crystal except for the segment between residues 299 and 308 (gray in ). This 10 residue segment, identified through M303 in selenomethionine-substituted DIM-5 protein (see Experimental Procedures), was stabilized in the interface between two crystallographically related molecules. We hypothesize that this segment (along with the adjacent disordered residues) will adopt a different structure upon binding to substrate. The post-SET region contains three conserved cysteine residues that appear to be essential for HKMT activity in the SUV39 family. Changing all three cysteines to serines (3C-S) abolished DIM-5 activity (), as did a Cys to Tyr substitution at C1279 in SETDB1 (Schultz et al., 2002), which corresponds to C306 of DIM-5. While the exact role of the three post-SET cysteines cannot be determined from the current structure, one intriguing possibility is that, when coupled with the fourth cysteine from the loop formed by the signature motif N241HXCXPN247 (C244 in DIM-5), these form an additional metal binding site. Several observations are consistent with this hypothesis. (1) Of the more than 50 SET protein sequences that we have examined to date, there appears to be an absolute correlation between the presence of the post-SET and a cysteine corresponding to C244 of DIM-5 (see for examples). In addition, replacing the Cys corresponding to C244 with alanine in SUV39H1 or SETDB1 abolished HKMT activity (Rea et al., 2000; Schultz et al., 2002). (2) The total zinc content of DIM-5 protein is 3.51 (), indicating that more than three zinc ions are present. (3) Incubation of metal chelators, phenanthroline or EDTA, with DIM-5 protein inhibited its activity and significantly reduced AdoMet binding (). Interestingly, even when EDTA completely abolished DIM-5 activity, the protein still retained approximately three (2.9) zinc ions (). As the triangular zinc cluster is quite stable, it is conceivable that the chelated zinc was coordinated by the three post-SET cysteines and C244 (yellow in ), which is near the active site. (4) Like the metal chelators, simultaneous mutation of the three cysteines (3C-S) also caused a complete loss of DIM-5 activity and AdoMet crosslinking (), consistent with the idea that the post-SET cysteines are involved in AdoMet binding. Perhaps the observed disorder of the post-SET is partly, or fully, responsible for the poor density of the AdoHcy in the current structure.
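The zinc stoichiometry in observations (2) and (3) reduces to simple arithmetic. The sketch below is a back-of-the-envelope reading, not an analysis from the paper: it assumes the triangular cluster is fully occupied (three zinc ions per protein) and that any excess zinc sits in a single additional site. All variable names are hypothetical.

```python
# Values quoted in the text (zinc ions per DIM-5 protein).
zinc_total = 3.51       # untreated protein
zinc_after_edta = 2.9   # after EDTA treatment that abolished activity

CLUSTER_SITES = 3  # assumption: the triangular pre-SET cluster is fully occupied

# Zinc in excess of the triangular cluster; under the single-site assumption
# this is the apparent occupancy of the proposed fourth site.
excess_over_cluster = zinc_total - CLUSTER_SITES

# Zinc removed by the chelator.
removed_by_edta = zinc_total - zinc_after_edta

print(f"zinc in excess of the cluster: {excess_over_cluster:.2f} per protein")
print(f"zinc removed by EDTA:          {removed_by_edta:.2f} per protein")
```

Under these assumptions, roughly half a zinc ion per protein lies beyond the cluster, and EDTA removes a similar amount (about 0.6), which is the quantitative basis for suggesting that the chelated zinc came from a site other than the stable triangular cluster.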