S-adenosyl-l-methionine (AdoMet) dependent methyltransferases (MTases) are involved in biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing. Five different structural folds (I–V) have been described that bind AdoMet and catalyze methyltransfer to diverse substrates, although the great majority of known MTases have the Class I fold. Even within a particular MTase class the amino-acid sequence similarity can be as low as 10%. Thus, the structural and catalytic requirements for methyltransfer from AdoMet appear to be remarkably flexible.
‘There are many paths to the top of the mountain, but the view is always the same.’…Chinese proverb
[(1996) The Columbia World of Quotations, New York Columbia University Press]
Following ATP, S-adenosyl-l-methionine (AdoMet) is the second most widely used enzyme substrate. The majority of AdoMet-dependent reactions involve methyltransfer, leaving the product S-adenosyl-l-homocysteine (AdoHcy). The huge preference for AdoMet over other methyl donors, such as folate, reflects favorable energetics resulting from the charged methylsulfonium center: the ΔG° for (AdoMet + Hcy → AdoHcy + Met) is −17 kcal mol⁻¹ – over double that for (ATP → ADP + Pi). Methylation substrates range in size from arsenite to DNA and proteins, and the atomic targets can be carbon, oxygen, nitrogen, sulfur or even halides [2,3].
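The energetic comparison above can be made concrete with a short calculation. The following is a minimal sketch, not part of the original analysis: the −17 kcal per mole value is taken from the text, whereas the −7.3 kcal per mole figure for ATP hydrolysis is a commonly quoted textbook standard value that is assumed here, and the function name `equilibrium_constant` is hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15   # 25 degrees C, assumed standard temperature

DG_ADOMET = -17.0   # kcal/mol, AdoMet + Hcy -> AdoHcy + Met (value quoted in the text)
DG_ATP = -7.3       # kcal/mol, ATP -> ADP + Pi (ASSUMED textbook value, not from the text)


def equilibrium_constant(dg_kcal, temp_k=T_KELVIN):
    """Convert a standard free-energy change into K = exp(-dG / RT)."""
    return math.exp(-dg_kcal / (R_KCAL * temp_k))


# Ratio of the two free-energy changes; "over double" means this exceeds 2.
ratio = DG_ADOMET / DG_ATP

if __name__ == "__main__":
    print(f"dG ratio (AdoMet / ATP): {ratio:.2f}")
    print(f"K for methyl transfer:   {equilibrium_constant(DG_ADOMET):.2e}")
    print(f"K for ATP hydrolysis:    {equilibrium_constant(DG_ATP):.2e}")
```

Because K depends exponentially on ΔG°, a roughly twofold difference in free energy corresponds to a difference of many orders of magnitude in the equilibrium constant, which is one way to read the "favorable energetics" argument.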
The first structure of an AdoMet-dependent methyltransferase (MTase), determined in 1993, was for the DNA C5-cytosine MTase M.HhaI. For several years thereafter, a variety of additional MTases, with a wide range of different substrates, were found to share the same basic structure. More recently, however, AdoMet-dependent methylation has been found to be a case of functional convergence, catalyzed by enzymes with remarkably distinct structures. The Protein Data Bank (PDB) currently includes >100 structures for 50 distinct AdoMet-dependent MTases from 31 different classes of enzymes as defined by the Enzyme Commission (EC) classification system (Table 1; for a more extensive list see Ref. ).
The purpose of this review is to compare and contrast the five known structurally distinct families of AdoMet-dependent MTases (Classes I–V). The phenomenon of enzymes from distinct structural families catalyzing the same reaction, termed enzyme analogy, has been noted for several decades. Many enzymes exhibit ‘catalytic promiscuity’ resulting in a pluripotency that can be shaped by mutation and selection [7,8]. This can lead to a given protein structure playing several distinct catalytic roles, but also results in distinct protein structures playing a common catalytic role. Perhaps such flexibility is particularly easy where highly exergonic reactions are involved. As ATP is the only enzyme substrate more widely used than AdoMet, it seems logical that the current champion for greatest number of analogous families is the ATP-dependent protein phosphoryltransferases (protein kinases), with seven known structurally distinct families. Nonetheless, this degree of analogy appears to be quite rare, and the five known families of AdoMet-dependent MTases also provide an impressive example of functional convergence.
Starting with the M.HhaI DNA-MTase structure in 1993, a continuing string of structures for AdoMet-dependent MTases has been reported. These structures are remarkably similar, comprising a seven-stranded β sheet with a central topological switch-point and a characteristic reversed β hairpin at the carboxyl end of the sheet (6 ↑ 7 ↓ 5 ↑ 4 ↑ 1 ↑ 2 ↑ 3 ↑; Fig. 1a). This sheet is flanked by α helices to form a doubly wound open αβα sandwich, and is henceforth referred to as the Class I MTase structure (Fig. 1a). The first β strand typically ends in a GxGxG (or at least GxG) motif – the hallmark of a nucleotide-binding site – bending sharply underneath the AdoMet to initiate the first α helix. The only other strongly conserved position is an acidic residue at the end of β2 that forms hydrogen bonds to both hydroxyls of the AdoMet ribose. The spatial conservation of this class of MTase is so pronounced that superimposing the cores (150 amino acids on average) of twenty MTases gives root-mean-square deviations for the Cα atoms of just 3.6 ± 0.5 Å when all 190 pair-wise comparisons are averaged. Figure 2 shows the alignment of 14 unique structures, displaying their β sheet and first α helix. Several groups have noted that, with the exception of strand seven, this structure is quite similar to that of the NAD(P)-binding Rossmann-fold domains. In both groups of proteins, the central topological switch-point results in a deep cleft in which the AdoMet or NAD(P) binds.
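The figure of 190 pair-wise comparisons follows directly from the number of structures: twenty structures give 20 × 19 / 2 = 190 unique pairs. The sketch below illustrates that count and how a mean pair-wise Cα root-mean-square deviation could be computed. It assumes, for simplicity, that the coordinate sets are already superimposed and contain matched atoms in the same order; the superposition method used for the published values is not described in the text, and all function names here are hypothetical.

```python
import math
from itertools import combinations


def n_pairs(n_structures):
    """Number of unique pair-wise comparisons among n structures."""
    return math.comb(n_structures, 2)


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the two coordinate sets are already superimposed and that
    atom i in one list corresponds to atom i in the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def mean_pairwise_rmsd(structures):
    """Return (mean, standard deviation, number of pairs) over all unique pairs."""
    values = [rmsd(a, b) for a, b in combinations(structures, 2)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance), len(values)


if __name__ == "__main__":
    print(n_pairs(20))  # twenty structures, as in the text
    # Toy coordinates (two "atoms" per structure), purely illustrative.
    toy = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
        [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    ]
    print(mean_pairwise_rmsd(toy))
```

In practice each pair would first be optimally superimposed over structurally equivalent core residues before the deviation is measured; that step is deliberately left out of this sketch.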
Although highly conserved, the Class I family contains several members with variant structural features. Some Class I MTases are homodimeric or even tetrameric, but most are monomers. One of the smallest MTases – catechol O-MTase – consists of just the consensus structural core, whereas most MTases contain auxiliary domains that are inserted throughout the MTase fold and appear to play roles in substrate recognition. Strands six and seven are reversed in the primary sequence of protein isoaspartyl MTase, and are absent from protein arginine (R) MTase (PRMT). Finally, circular permutation of the overall topology has been proven for some DNA MTases [16,17] and is predicted for some RNA MTases. Together, these differences suggest structural flexibility, but it (incorrectly) appeared for a time that AdoMet-dependent MTases were variations on a single basic theme.
Many of the known Class I MTases act on DNA to regulate gene expression, to repair mutations or to protect against bacterial restriction enzymes. Initially, it was a mystery as to how MTases acted on nucleotides that are held inside the DNA duplex by base pairing and stacking – seemingly inaccessible to the active site of an enzyme. The answer came from the M.HhaI DNA C5-MTase complexed with a synthetic DNA duplex. In a process termed ‘base flipping’, the enzyme simply rotates the target DNA base ~180° on its flanking phosphodiester bonds such that the base projects into the catalytic pocket. This strategy helps explain the tremendous diversity of substrates accommodated by MTases sharing the Class I structure.
As early as 1996, there was a hint that not all AdoMet-dependent MTases would follow the same structural theme. The Escherichia coli cobalamin (vitamin B12)-dependent methionine synthase, MetH, generates methionine from homocysteine, transferring a methyl group from a folate derivative to the bound cobalamin and thence to homocysteine. Periodically, the B12 cobalt is oxidized to a dead-end form, and reactivation requires reductive methylation using AdoMet, flavodoxin and an additional structural domain. The MetH reactivation domain (Class II MTase) – dominated by a long, central, antiparallel β-sheet flanked by groups of helices at either end (Fig. 1b) – looks nothing like the Class I MTases either in overall architecture or interactions with AdoMet. AdoMet is bound in an extended conformation to a shallow groove along the edges of the β strands, forming hydrogen bonds to a conserved RxxxGY motif. Large conformational changes are required to position the reactivation domain near the cobalamin substrate within the main catalytic domain [21,22]. However interesting from a biochemical standpoint, the flavodoxin requirement and subordinate role of this reactivation domain led many to underestimate its relevance to understanding independent MTases.
Two years later the MTase field was again surprised, this time by the homodimeric structure of CbiF – an MTase that acts on ring carbons of the large, planar precorrin substrates during cobalamin biosynthesis. The active site in this structural family (Class III MTases) is tucked into a cleft between two αβα domains, each containing five strands and four helices (Fig. 1c). A GxGxG motif occurs at the C-terminal end of the first β strand, similar to the Class I enzymes, but surprisingly does not contact AdoMet (at least in the absence of the precorrin substrate). Instead, the AdoMet is tightly folded, and binds between the two domains. Based on sequence similarity, diphthine synthase – which acts on a protein His residue – is also predicted to adopt the Class III structure, which might indicate a broader substrate range for this class of MTase than was initially appreciated.
This past year has provided two more disparate MTase structures. The SPOUT family of RNA MTases provides the only known cases of Class IV structure [24–26]. These enzymes are unique in three ways: (1) they include a six-stranded parallel β sheet flanked by seven α helices, of which the first three strands form half of a Rossmann fold (Fig. 1d); (2) their active site is located near the subunit interface of a homodimer, and might encompass residues extending from both subunits; and (3) the topology of the structure is such that a significant portion of the C terminus is tucked back into the structure in a ‘knot’. This rare substructure is formed by the last ~30 residues, including the last α helix, and contains several catalytic residues that confirm its structural importance [25,26]. The structure of a SpoU homolog, Haemophilus influenzae (HI0766) YibK, has been determined in the presence of AdoHcy, and reveals the AdoHcy to be bound above strands four to six in a bent conformation (Fig. 1d).
The most recently described structural family of AdoMet-dependent MTases is the SET-domain proteins [28–33]. Amino-acid sequence comparison suggests that there are hundreds of these proteins, and several have been shown to methylate lysines in the flexible tails of histones or in Rubisco (ribulose-1,5-bisphosphate carboxylase). These Class V MTases contain a series of eight curved β strands forming three small sheets (Fig. 1e), with the C terminus tucked underneath a surface loop forming a knot-like structure similar to the Class IV MTases, but constructed on a totally different topology [28–33]. AdoHcy bound to the SET domain is kinked in a manner similar to that of the Class III CbiF-bound AdoHcy, and binds on a concave surface of the enzyme near an invariant tyrosine residue that has been implicated in the catalytic reaction. Flanking the SET domain are diverse sequences termed the pre- and post-SET regions, which are often essential for MTase activity and might participate in substrate recognition and specificity.
The bound AdoMet or AdoHcy ligand exhibits significantly different conformations in the five structural classes, which emphasizes its flexibility. Figure 3a compares the prototypical AdoMet and AdoHcy conformations of each structural class by aligning the molecules via their ribose moieties. The ribose ring of AdoMet in Class I adopts an envelope 2′-endo conformation, with its base in the anti position at ~135° (defined by the O4′–C1′–N9–C4 dihedral; Fig. 3b). The ribose rings of the other AdoMet classes adopt an envelope 2′-exo conformation with the base in the anti position at ~180°. These differences, although significant, are small compared with the differences in the O4′–C4′–C5′–Sδ dihedral angle, which begins to define whether the nucleotide is extended or folded. An extended trans conformation (~180°) is adopted by the Class I MTases, an angle of approximately −90° is adopted by Classes II, III and V, whereas the AdoHcy twists in the opposite direction with an angle of 80° in Class IV. The next dihedral angle (C4′–C5′–Sδ–Cγ; Fig. 3b) further separates the classes such that, overall, the Class I and Class II ligands are relatively extended, but Class III–V MTases bind the methyl donor in tightly folded conformations (Fig. 3a). Because AdoMet is an exchanged substrate, its solvent accessibility might be directly related to the rate of catalytic turnover, or could at least provide an indication of structural flexibility of the catalytic core. The classes of MTase surround their methyl donor to differing degrees, such that in catechol-O-MTase (Class I) <1% of the AdoMet surface area is exposed to solvent, whereas this exposure is 8–21% in the other classes.
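The dihedral angles discussed above are each defined by four consecutive atoms (for example O4′–C4′–C5′–Sδ). As an illustration only, the following sketch computes a signed dihedral from four Cartesian coordinates and then bins the result into the coarse categories quoted in the text (~180°, about −90°, about +80°). The function names and the ±45° bin boundaries are assumptions introduced here for illustration; they are not taken from the review.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for atoms p0-p1-p2-p3.

    The result lies in (-180, 180]; 0 is cis (eclipsed), +/-180 is trans.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal to the plane of atoms 0, 1, 2
    n2 = _cross(b2, b3)          # normal to the plane of atoms 1, 2, 3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def bin_c5_torsion(angle_deg):
    """Coarse, hypothetical binning of the O4'-C4'-C5'-Sd torsion.

    Bin centres follow the values quoted in the text; the 45-degree
    half-widths are an arbitrary choice made for this sketch.
    """
    if abs(angle_deg) >= 135.0:
        return "extended trans, ~180 (as described for Class I)"
    if -135.0 < angle_deg < -45.0:
        return "approximately -90 (as described for Classes II, III and V)"
    if 45.0 < angle_deg < 135.0:
        return "approximately +80 (as described for Class IV)"
    return "other"


if __name__ == "__main__":
    # Toy geometry: central bond along z, first atom along +x, last along +y.
    angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
    print(round(angle, 1), bin_c5_torsion(angle))
```

Applied to real ligand coordinates, the four points would be the atom positions read from a structure file; a single torsion is only a first indicator, since the text notes that the next dihedral (C4′–C5′–Sδ–Cγ) is also needed to separate extended from folded ligands.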
Substrate-bound complexes have been determined mainly for Class I structures, although recently a Class V (SET) MTase in complex with a lysine-containing peptide has also been determined. All MTases are thought to proceed with direct transfer of the methyl group to substrate with inversion of configuration in an SN2-like mechanism [37,38]. This reaction also requires that a proton be removed before, concurrent with, or after methyl transfer. Even within the structurally conserved family of Class I MTases, a wide variety of mechanisms have evolved to activate the catalytic nucleophile, dependent on the polarizability of the target atom.
A common mode of substrate binding is the use of a [D/N/S]PP[Y/F] motif (DPPY, for brevity), employed by DNA N6-adenine and DNA N4-cytosine MTases and by the newly described protein N5-glutamine MTase, PrmC/HemK. This set of diverse substrates indicates that the DPPY motif is not nucleotide specific, but is selective for nitrogens conjugated to a planar system such as an amide moiety or a nucleotide base. DPPY motifs extend from the C terminus of β-strand four in these Class I structures (Fig. 1a), in which the di-proline bends the polypeptide towards the surface of the active site. Two hydrogen bonds are formed between the nucleophilic nitrogen and both the oxygen atom of the [D/N/S] side-chain and the carbonyl oxygen of the first proline (Fig. 4a,b). These hydrogen bonds position the substrate such that the lone-pair electrons on the nitrogen nucleophile point towards the incoming methyl group. A charged methylamine intermediate results after methyl transfer, but resolves into the neutral sp2-hybridized amide upon proton loss to solvent during product release [38,40,42]. Other Class I N-MTases – including protein arginine MTase, small molecule glycine-N MTase and phenylethanolamine-N MTase – do not contain the DPPY motif, but instead use acidic residue(s) to neutralize the positive charge on the substrate amino group.
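The sequence motifs named in this review are written in a pattern notation that maps naturally onto regular expressions. As a rough illustration, the sketch below scans a protein sequence for the [D/N/S]PP[Y/F] motif and for the GxGxG (or at least GxG) motif described earlier for the first β strand. The example sequence is invented for demonstration, the function name is hypothetical, and because such short patterns occur frequently by chance, a hit says nothing by itself about fold or function.

```python
import re

# [D/N/S]PP[Y/F]: one of D, N or S, then two prolines, then Y or F.
DPPY_MOTIF = re.compile(r"[DNS]PP[YF]")

# GxGxG, or at least GxG, where x is any residue.
GXG_MOTIF = re.compile(r"G.G(?:.G)?")


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched text) for non-overlapping hits."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented toy sequence, not a real protein.
    toy_sequence = "MKLVDLGCGTGAFSIENPPYAK"
    print(find_motifs(toy_sequence, GXG_MOTIF))
    print(find_motifs(toy_sequence, DPPY_MOTIF))
```

A realistic search would combine such pattern hits with positional context, for example whether the GxGxG match falls at the end of a predicted first β strand, or whether the DPPY match follows β-strand four.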
Using entirely different structural scaffolding, the Class V SET-domain MTases bind to a kinked AdoMet molecule on the opposite side of a small channel from the Nζ nitrogen (the side-chain ε-amino group) of a peptide lysine substrate (Fig. 4c). The C-terminal tail of the domain forms a pseudo knot and provides an integral part of the hydrophobic active-site pocket, including tyrosine residues implicated in the catalytic mechanism.
Many substrate-bound structures are known for DNA C5-cytosine MTases, including M.HhaI [19,44]. A ProCys dipeptide is universally conserved within the active site of C5-cytosine MTases, and structurally resembles the PY portion of the DPPY motif, whereas an aspartic acid residue from a neighboring portion of the enzyme functionally replaces the [D/N/S] residue of the DPPY motif. In M.HhaI, the N4 nitrogen of the cytosine residue is positioned further away from the AdoMet by hydrogen bonds, such that the C5 atom is presented as a methylation target (Fig. 4d). Because methylation on carbon atoms is more difficult than on polarizable nitrogens, the nucleotide must first be activated by covalent-bond formation between the conserved Cys thiol and carbon C6. This generates a negative charge on C5 that facilitates methyl transfer.
The design flexibility of the Class I structure is further illustrated by a large family of RNA C5-cytosine MTases. Our threading analysis (not shown) suggests that these enzymes adopt a Class I fold; but although a cysteine nucleophile is predicted to be in the active site similar to M.HhaI, it would be contributed by the end of β-strand five.
Several O-MTase structures have been determined, including catechol O-MTase (COMT), the dimeric structures of two plant MTases and the glutamate O-MTase CheR. A Mg²⁺ ion is required to bind and orient the two catechol hydroxyls in COMT (Fig. 4e). Molecular simulation and pKa studies suggest that Mg²⁺ acts primarily to organize the substrate-binding site and not as a general base [50,51]. Instead, a nearby lysine residue appears to deprotonate the substrate hydroxyl before attack on the AdoMet methyl group. The chalcone (Fig. 4f) and isoflavone O-MTases do not require a metal ion, but do require a histidine residue to deprotonate the hydroxyls of plant metabolites.
Glutamic acid O-methylation by CheR differs in that the methyl group is transferred to a carboxylic acid rather than to a hydroxyl moiety. The active site of CheR contains the side chains of arginine and tyrosine residues, which might position the glutamic acid substrate and facilitate the methyl-transfer reaction. The negatively charged carboxyl group might attack the methyl group unassisted. The isoaspartate O-MTase recognizes damaged asparagine residues as part of an essential repair process. This enzyme binds a VYP(L-isoAsp)HA peptide by forming a series of hydrogen bonds to the surrounding peptide main-chain, as if the substrate were a β strand within the enzyme itself (Fig. 4g). The extended polypeptide chains of other protein MTases (e.g. PRMT, CheR, PrmC and the Class V SET-domain lysine N-MTases) might form similar interactions.
Evolution has independently achieved AdoMet-dependent MTase activity at least five times, producing five unique structural MTase classes. Most of the other examples of analogous enzyme families also use substrates, such as ATP or NAD, that include a nucleotide ‘handle’ for binding. The Class I and Class IV MTases are plausibly derived from Rossmann-fold proteins, and even the Class III CbiF structure contains a GxGxG nucleotide-binding motif, but uses it unconventionally. The Class II and Class V MTases do not appear to have structural analogs, so their evolutionary history is not yet clear. If the Class I MTases are actually derived from the ubiquitous Rossmann-fold proteins, then multiple independent evolutionary sublineages might explain the predominance of the Class I enzyme family. The limited sequence similarity between Class I proteins could even be consistent with independent evolution from a generic GxGxG-containing nucleotide-binding domain.
Catalysis of AdoMet-dependent methyltransfer does not appear to be rigidly restricted by tertiary structure or local spatial requirements. Even within the structural constraints of the Class I family, many different methods of binding substrate and activating a nucleophile have been described. One important question that remains is whether chance or functional constraints define which reactions are carried out by which classes of MTase. It could be argued that precorrin MTases have adopted a novel conformation because of their rigid, planar substrates; however, it appears that not all tetrapyrrole biosynthetic MTases are in Class III. Precorrin-C6 MTase CbiT has, and protoporphyrin IX O-MTase is predicted to have, a Class I structure. Similarly, not all histone-lysine N-MTases contain the Class V SET domain; the Dot1 histone H3-Lys79 N-MTase belongs to Class I.
New methyltransferase activities are still being described, and although the genome projects are providing large lists of enzymes orthologous to the five known structures, it would not be a great shock to find the members of MTase Classes VI and up among uncharacterized open reading frames. In fact, given what we now know about enzyme evolution, these new MTase classes might already have been (mis)annotated based on their similarity to characterized enzymes. The challenge will be for biochemists to test annotation claims, or to discover the real identity of these new genes, whereas the structural biologists can begin to address substrate recognition and catalytic mechanisms, particularly in the newer structural classes.
We thank Osnat Herzberg and Steve Gamblin for early release of coordinates. H.L.S. was supported by grants from NIH (GM56775 and DK02794), R.M.B. was supported by a grant from the U.S. National Science Foundation (MCB-9904523), and X.C. was supported by NIH (GM49245 and GM61355) and the Georgia Research Alliance.