Methylation of DNA, protein, and even RNA species are integral processes in epigenesis. Enzymes that catalyze these reactions using the donor S-adenosylmethionine fall into several structurally distinct classes. The members in each class share sequence similarity that can be used to identify additional methyltransferases. Here, we characterize these classes and in silico approaches to infer protein function. Computational methods such as hidden Markov model profiling and the Multiple Motif Scanning program can be used to analyze known methyltransferases and relay information into the prediction of new ones. In some cases, the substrate of methylation can be inferred from hidden Markov model sequence similarity networks. Functional identification of these candidate species is much more difficult; we discuss one biochemical approach.
Recent years have witnessed greatly expanded attention to the enzymes that catalyze the transfer of methyl groups from S-adenosylmethionine (AdoMet) to DNA, RNA, proteins, lipids, and small molecules . The central role of methyltransferases in epigenetics was first realized with enzymes modifying DNA [2, 3]; subsequent work has demonstrated the importance of a number of enzymes that modify lysine and arginine residues in histones [4-7]. RNA methyltransferases also appear to play a role. For example, microRNA species can be modified by methyltransferases such as HEN1 to affect DNA methylation in paramutation [8-11]. It is likely that additional methyltransferases for protein, RNA, DNA, and perhaps even lipids or small molecules, may be involved in epigenetic phenomena. For the human proteome, only a fraction of the potential methyltransferases has been functionally identified and enzymes of importance to epigenetics may be lurking among the unknown species. Thus, it is of interest to be able to identify the complete “cast” of methyltransferases and their substrates in the proteomes of various organisms.
In this paper, we review recent progress on the identification and characterization of new methyltransferases. We have focused our discussion largely on the situation in the budding yeast Saccharomyces cerevisiae, where the “methyltransferasome” has been most fully characterized [12, 13] (Table I). The successful identification of yeast methyltransferases will hopefully help pave the way to the identification of new enzymes in higher plants, mammals, and other organisms.
We will first introduce the different methyltransferase families and what bioinformatic methods have helped reveal about them, along with the limitations of these methods. We will then describe one biochemical approach to determining the function of candidate methyltransferases identified using bioinformatics methods.
The success of bioinformatic methods in identifying novel methyltransferases ultimately rests on the information known about previously discovered methyltransferases. Each topologically distinct family of methyltransferases is described along with computational methods for the identification and characterization of new family members. These topological classes have been identified in references , , and .
Seven beta strand enzymes (also referred to as “Class I” methyltransferases) appear to make up the majority of methyltransferases in organisms [14, 16]. This group includes the mammalian de novo and maintenance DNA methyltransferases [3, 17, 18], the Dot1 histone lysine methyltransferase , and the HEN1 microRNA methyltransferase [8, 10], all known enzymes that play roles in epigenesis. Remarkably, sequence similarity is shared between methyltransferases ranging from the Saccharomyces cerevisiae enzyme active on small molecules (Tmt1), the Mycoplasma arthritidis enzyme active on DNA (HhaI), the Arabidopsis thaliana enzyme active on lipids (UbiE), the human enzyme active on protein (PCMT1), to even the Bos taurus enzyme active on inorganic arsenite (AS3MT). Despite vastly different substrates of methylation, primary sequence similarity was found in small regions of these proteins before any structural information was available .
In Fig. 1a, we give the histone lysine methyltransferase Dot1 as an example of this class of enzyme. Enzymes in this class of methyltransferases share a common seven strand twisted beta sheet with a C-terminal beta hairpin, sandwiched between alpha helices [15, 16]. Four signature motifs are present (I, Post I, II, and III; [13, 21]). Residues of Motif I and Motif Post I contact AdoMet. The conserved aspartate residues in these motifs are key in stabilizing charged AdoMet species as well as hydrogen bonding to two different locations of the cofactor to position the methyl group for transfer. The last residues of β4 and β5, which make up portions of Motifs II and III, respectively, form part of the catalytic domain and can bind the methyl-accepting substrate. A few enzymes in this methyltransferase class deviate from this structural core, most notably the protein arginine methyltransferase PRMT1, which lacks β6 and β7 [15, 16], and the plant DRM enzymes, which have circularly permuted motifs.
Some proteins in this superfamily contain conserved sequences between Motifs II and III that are methyl-acceptor substrate specific. For instance, the “DPPY” motif is seen in several N-methyltransferases active on sp2-hybridized nitrogen atoms in adenosine or glutamine residues [15, 22], while the “EE” motif is present in protein arginine methyltransferases . Inserts and deletions to the core structure have also been found to reflect substrate identity . Yeast histone H3K79 methyltransferase Dot1, originally discovered for its role in telomeric silencing , has several basic residues in the N-terminal domain which bind nucleosomes [19, 25]; the same stretch is seen in the C-terminal domain of human Dot1 . To date, Dot1 is the only non-SET histone lysine methyltransferase (see below) and interestingly the only histone lysine methyltransferase which methylates in the globular domain of histones .
Initially, in silico searches for novel methyltransferases were performed using known methyltransferase sequences as probes against protein databases with BLAST. The discovery of the Hmt1/Rmt1 protein arginine methyltransferase is an example of the success of this approach.
The shift from whole sequence comparisons to motif-based searches has led to the generation of a comprehensive list of putative seven beta strand (Class I) methyltransferases [21, 29]. Katz et al. used MEME to build position-based amino acid frequency matrices, or profiles, of Motifs I and Post I from multiple alignments of known methyltransferases and utilized these profiles in a comprehensive MAST search of the genome. As a result, the search is based on information from multiple methyltransferases rather than simply on amino acid similarity (as in the 20×20 matrix “BLOSUM 62” used in BLAST searching). Methyltransferase domain identification was further refined by Ansari et al., who aligned sequences using additional secondary structure information. In their database search, the authors used hidden Markov model (HMM) profiles that take into account not only the log-odds amino acid frequencies but also the frequencies of inserts and deletions to account for gaps in the alignment. HMM profiles can be created from large superfamily reference sets (such as all Class I methyltransferases) to identify a general list of proteins, or alternatively can be generated from a specific subclass of proteins to restrict the search. Ansari et al. specifically identified O-, N- and C-methyltransferases from a non-redundant database using HMM profiles from the methyltransferase domain of polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS). However, such global search profiles spanning the entire methyltransferase domain assign penalties for mismatches between motifs that may leave true, previously unidentified methyltransferases undetected.
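To make the idea of a position-based profile concrete, the following is a minimal sketch of a log-odds matrix built from ungapped motif instances and used to scan a sequence. It is not the MEME or MAST implementation: the uniform background frequency, the pseudocount, and the function names are assumptions made for illustration, and the motif instances in the usage note are invented rather than taken from real enzymes.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds_profile(aligned_motifs, pseudocount=1.0):
    """Build a position-specific log-odds matrix from ungapped motif instances.

    aligned_motifs: list of equal-length strings (one motif instance each).
    Assumption: a uniform background frequency of 1/20 per amino acid.
    """
    length = len(aligned_motifs[0])
    if any(len(m) != length for m in aligned_motifs):
        raise ValueError("all motif instances must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        column = [m[pos] for m in aligned_motifs]
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        profile.append(scores)
    return profile


def best_window(profile, sequence):
    """Return (score, start) of the highest-scoring ungapped window, or None."""
    width = len(profile)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20 standard amino acids contribute a score of 0.
        score = sum(profile[i].get(aa, 0.0) for i, aa in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best
```

For example, a profile built from the invented instances `["VLDLGCGTG", "VLDVGCGSG", "ILDLGSGTG"]` places the best window of `"MSKKVLDLGCGTGAAA"` at index 4, where the sequence matches the consensus of the three instances. Because every column is scored independently, the profile captures which residues are tolerated at each position, which a single pairwise substitution matrix cannot do.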
More recently, a new approach combining motif-based searches with HMM profile-profile local alignments was used to solve some of the computational hurdles of the past. Independent matrices describing all of the motifs, including II and III, were developed through better motif identification from either solved methyltransferase structures or from HMM profile primary and secondary structure prediction alignments using the program HHpred. Additionally, a novel program, Multiple Motif Scanning (MMS), was used to rank the proteins in the yeast database by their sequence similarity to the methyltransferase motif profiles. Here, the position-based matrices were entered into MMS, which includes a parameter for the conserved number of amino acids between the motifs and outputs the overall highest scoring combinations of multiple best-fit motifs. The success of this program relies on the input matrices; matrices derived from different methyltransferase reference sets yield slightly different rankings and putative methyltransferases. MMS is advantageous for proteins containing multiple ungapped motifs, such as methyltransferases, because, unlike HMM profiles, it does not allow for inserts within a motif. For example, when we used HHpred to search the yeast proteome with HMM profiles constructed from the identical motif sequences employed in Ref. 13, we did not detect a number of putative yeast methyltransferases (YMR209C, YLR063W, YKL155C and YNL092W) that were identified using MMS (see Table I). On the other hand, as shown in Table I, HHpred did find YKL162C and YLR137W, which were overlooked by MMS and were not reported previously. Together, these results highlight the importance of combining these methodologies to create a comprehensive list of putative methyltransferases.
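The combination of ungapped motif matrices with a constraint on the spacing between motifs can be sketched as a small dynamic program. This is only an illustration of the general idea and should not be read as the MMS algorithm itself: the function names, the representation of each matrix as a list of per-position score dictionaries, and the `(min_gap, max_gap)` form of the spacing parameter are all assumptions.

```python
def window_scores(profile, sequence):
    """Score every ungapped window of the sequence against one motif matrix."""
    width = len(profile)
    return [
        sum(profile[i].get(aa, 0.0) for i, aa in enumerate(sequence[s:s + width]))
        for s in range(len(sequence) - width + 1)
    ]


def best_motif_chain(profiles, gap_ranges, sequence):
    """Find the best-scoring ordered placement of several ungapped motifs.

    profiles:   list of motif matrices (each a list of {residue: score} dicts).
    gap_ranges: gap_ranges[k] = (min_gap, max_gap) residues allowed between
                the end of motif k and the start of motif k + 1 (hypothetical
                stand-in for a spacing parameter).
    Returns (total_score, [start positions]) or None if no placement fits.
    """
    per_motif = [window_scores(p, sequence) for p in profiles]
    # best[k][s]: best (score, starts) for motifs 0..k with motif k starting at s
    best = [{s: (sc, [s]) for s, sc in enumerate(per_motif[0])}]
    for k in range(1, len(profiles)):
        min_gap, max_gap = gap_ranges[k - 1]
        prev_width = len(profiles[k - 1])
        layer = {}
        for s, sc in enumerate(per_motif[k]):
            candidates = [
                chain
                for p, chain in best[k - 1].items()
                if min_gap <= s - (p + prev_width) <= max_gap
            ]
            if candidates:
                prev_score, prev_starts = max(candidates, key=lambda c: c[0])
                layer[s] = (prev_score + sc, prev_starts + [s])
        best.append(layer)
    if not best[-1]:
        return None
    return max(best[-1].values(), key=lambda c: c[0])
```

With two toy matrices that reward `GG` and `DD`, the sequence `"AGGAAADDAGGADD"` and a required spacing of exactly three residues, the best chain starts at positions 1 and 6; changing the spacing to exactly one residue moves it to positions 9 and 12. No insertions are permitted inside a motif, so all flexibility is confined to the stretches between motifs.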
To determine the potential biological function of the candidate methyltransferases, identification of the methyl-accepting substrate would be valuable. Bioinformatic approaches for identifying substrates have so far had a mixed record of success. A widespread approach has been comparing protein sequences across species to reveal homologs. In fact, databases of phylomes such as PhylomeDB contain compilations of phylogenetic trees which can be used to assess the orthologous and paralogous relationships of a given protein. More recently, sequence similarity networks have been used as a high-throughput method for substrate predictions. Sequence similarities in or outside of methyltransferase motifs that reflect substrate recognition allow enzymes acting on similar substrates to cluster with each other. HMM profile-profile comparisons with a stringent E value cutoff (<10⁻²⁰) separated yeast methyltransferases into protein arginine methyltransferases, protein glutamine methyltransferases, wybutosine-forming transferases, 2'-O-ribose methyltransferases, cytosine 5-methyltransferases, and small molecule/lipid methyltransferase clusters. Although not all yeast methyltransferases clustered in this protocol, one can gain useful information when a putative methyltransferase is found grouped with enzymes of known catalytic activity.
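The clustering step behind a sequence similarity network can be sketched as finding connected components in a graph whose edges are pairwise comparisons that pass the E value cutoff. The sketch below assumes the pairwise E values have already been computed elsewhere (for example by profile-profile comparison); the function name and the input format are illustrative assumptions, and the protein labels in the usage note are placeholders rather than real data.

```python
from collections import defaultdict


def cluster_by_evalue(proteins, pairwise_hits, cutoff=1e-20):
    """Group proteins into clusters linked by hits with E value below the cutoff.

    proteins:      iterable of protein identifiers.
    pairwise_hits: iterable of (protein_a, protein_b, evalue) tuples.
    Returns a list of clusters, each a sorted list of identifiers.
    """
    neighbors = defaultdict(set)
    for a, b, evalue in pairwise_hits:
        if evalue < cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for protein in proteins:
        if protein in seen:
            continue
        # Depth-first walk over everything reachable through significant hits.
        stack, component = [protein], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

Given placeholder proteins `P1` to `P6` and invented hits `P1-P2` (1e-35), `P2-P3` (1e-25), `P3-P4` (1e-5) and `P4-P5` (1e-40), the function returns three clusters: `P1, P2, P3`, then `P4, P5`, then the singleton `P6`. A candidate that lands in a cluster with enzymes of known activity inherits a testable hypothesis about its substrate, while singletons remain uninformative.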
Sequence similarity between the plant Rubisco protein lysine methyltransferase and three Drosophila proteins involved in epigenetics - Suppressor of position-effect variegation 3-9 (Su(var) 3-9), Enhancer of zeste (E(z)), and Trithorax (Trx) - led to the discovery of the family of SET methyltransferases. This family includes a number of histone lysine methyltransferases involved in transcriptional control through chromatin structural modification. These proteins contain the SET domain consisting of eight curved beta strands arranged into three sheets and a characteristic pseudoknot structure. An example of this domain is shown for the MLL1 histone lysine methyltransferase in Fig. 1b. The SET proteins share sequence similarity in the N-terminal (N-SET) and C-terminal (C-SET) domains that contain residues responsible for catalysis, cofactor-binding, and substrate interaction. The first two motifs reside in N-SET while the last two motifs lie in C-SET and form the knot-like structure (Fig. 1b) [26, 40, 41]. AdoMet binds to Motif I, N-terminal residues of Motif III, and a tyrosine in Motif IV. Interestingly, the “GxG” sequence of Motif I interacts with AdoMet as does the seven beta strand Motif I “GxGxG” sequence despite the lack of any overall structural similarity between SET and seven beta strand methyltransferases. The catalytic site is located on the opposite side of the enzyme and includes a key catalytic tyrosine residue in Motif II. Interactions with the lysine substrate occur in the hydrophobic pocket formed by the remaining portions of Motifs III and IV [26, 40]. It has been hypothesized that variability within this domain defines the substrate and, in fact, residues in C-SET have been integral in determining whether the enzyme catalyzes mono-, di-, or trimethylation. Point mutations in the “Y/F switch” have proved successful in converting SET7/9 to a di-/tri-methyltransferase, SET8 to a dimethyltransferase, and Dim5 to a mono-methyltransferase.
Amino acid sequences between the N-SET and C-SET domains are highly variable among the SET superfamily and have been dubbed the I-SET region. I-SET residues can interact with the substrate [46, 47]. In fact, several non-histone methyltransferases have a “SRA” motif in the I-SET domain . I-SET is not always indicative of the binding ligand; two pairs of enzymes – SUV39H1 and SETDB1, and SET7/9 and MLL – have non-homologous I-SET despite sharing identical substrates [46, 47]. Many SET proteins also have a Pre-SET and Post-SET domain composed of several conserved cysteine residues that coordinate zinc ions in triangular clusters. The function of these domains is not clear, although Post-SET seems to shape the channel for the lysine substrate. In enzymes that lack the cysteine-rich Post-SET, such as SET7/9 and Rubisco, additional alpha helices are oriented to create this channel [26, 40]. Several SET methyltransferases also have additional domains such as PWWP, PHD, and SANT that appear to recruit the chromatin substrate .
Yeast candidate SET-domain methyltransferases, including YHL039W and YBR030W (now Rkm3), were identified by Porras-Yakushi et al. through reiterative PSI- and PHI-BLAST searches. However, the inherent nature of BLAST searching does not lead to a single list of SET methyltransferases; instead, two self-contained “subfamilies” of proteins were found. When we now search with HHpred using SET protein sequences compiled from the SMART 6 database, we find that we can produce a single list of all of these methyltransferase proteins (Table I). Additionally, profiles obtained from the same reference dataset using only MEME-derived matrices of Motifs I-IV in MMS also identified all of these proteins, confirming their identification through the SET domain (data not shown).
The “subfamilies” described by Porras-Yakushi ultimately differentiated between what has been described as Class I-VI histone and Class VII non-histone methyltransferases based on their substrate specificity (Table II). Initially, four classes of SET proteins were discovered through BLASTP searches and ClustalW clustering analysis of the Arabidopsis genome using the Drosophila genes E(z) (Class I, H3K27), Ash1 (Class II, H3K36), Trx (Class III, H3K4), and Su(var)3-9 (Class V, H3K9). Springer et al. expanded this analysis to include other genomes and an updated SET protein list. These authors identified Class IV, whose proteins contain a PHD finger but lack Pre-SET and Post-SET, and several proteins whose I-SET domain was extended, dubbed the disrupted S-ET proteins. The S-ET proteins were later divided into two classes: Class VI histone methyltransferases and Class VII non-histone methyltransferases, whose substrates include Rubisco, cytochrome c, and ribosomal proteins [52, 53]. This classification of SET proteins based on the families or substrates of methylation may need to be expanded in the future upon the discovery of new methylation sites and SET proteins. In fact, methylation of H1K26 and H4K20 has recently been discovered in mammalian cells.
We have now created individual HMM profiles of each SET Class from the reference set of proteins in Ref. 53 and performed an HHpred search against the complete yeast protein database (Table II). Every one of the twelve yeast SET methyltransferases fits into its appropriate class: H3K36-methylating Set2 in Class II, H3K4-methylating Set1 in Class III, Set3 and Set4 (which contain PHD domains) in Class IV, the interrupted domain Set5 and Set6 in Class VI, and lastly ribosomal methyltransferases Rkm1-4, YHL039W and Ctm1 in Class VII (Table II). Interestingly, the substrates of the yeast Set3, Set4, Set5, and Set6 proteins are not known. It appears likely that these will be histone lysine methyltransferases, but it will be important to confirm this tentative identification experimentally.
The SPOUT methyltransferase family was first described based on the primary sequence and predicted secondary structural similarities of bacterial SpoU and TrmD methyltransferases . This new topology of methyltransferases became apparent with the solved structures of RrmA and RlmB [56, 57] revealing a characteristic knot distinct from SET methyltransferases. To date, SPOUT methyltransferases have been found to exclusively methylate RNA. Members of this family may thus possibly methylate RNA species involved in epigenesis. The core structure consists of a beta sheet with five parallel beta strands in a 5-3-4-1-2 orientation between two layers of helices. An example of this structure is given for TrmH in Fig. 1c . A partial Rossmann-like fold similar to that in the seven beta strand (Class I) methyltransferases is formed by the first two N-terminal strands; variability can exist with additional alpha/beta units in this region. Unlike the seven beta strand (Class I) enzymes, AdoMet binds to the C-terminal alpha-beta “trefoil” knot that characterizes the SPOUT superfamily .
Primary sequence similarity is not very strong among the members of this superfamily, which is largely defined by its tertiary structure [12, 58, 59]. Nonetheless, common motifs have been described. Motif 1 is not widely conserved among all subclasses of SPOUT methyltransferases but contains amino acids integral for tRNA binding, the release of AdoHcy, and catalysis. The latter residues of β3 bind AdoMet (here termed Motif Post 1; Fig. 1c). Although the topology of SPOUT methyltransferases is distinct from that of the seven beta strand (Class I) and SET enzymes, Motif 2 of the SPOUT domain has several shared residues with both of these classes: the glycine rich coil proceeding β4 binds both the tRNA substrate and AdoMet, and the glutamyl residue is catalytic much like the asparagine/aspartate in the seven beta strand (Class I) β4 and the asparagine in the SET Motif III [12, 42]. Motif 3, originally described as the coil preceding β5, can be expanded to include an extended helix with a catalytic tyrosine, and is involved in AdoMet-binding and catalysis (Fig. 1c). The active site is created upon dimerization, and additional catalytic residues for SPOUT methyltransferases are family specific and lie on the antiparallel or perpendicular mode of dimerization. Like the SET superfamily, several SPOUT methyltransferases have additional domains flanking the SPOUT domain including, not surprisingly, THUMP, OB-fold, L30e, and PUA domains that are associated with nucleic acid binding or modification.
Tkaczuk et al. have also used similar computational techniques to identify new SPOUT methyltransferases. Crystal structures of known SPOUT methyltransferases were collected and used to search the PDB with DALI to find proteins with similar structures. PSI-BLAST searches using different members of COG families were performed on a non-redundant database to discover previously unidentified putative SPOUT methyltransferases, which were corroborated by secondary structural predictions. HMM profiles of aligned sequences were created and searched by HHpred to identify as many protein families as possible with even remote similarities to the SPOUT domain; these proteins were further validated by reciprocal searches and fold-recognition methods. These methods identified the known yeast methyltransferases Trm10, Mrm1, and Trm3 as well as the putative methyltransferases Emg1, YGR283C, and YMR310C. The crystal structure of Emg1 later confirmed these predictions [60, 61]. We have also used these methods to predict YOR021C as an additional yeast putative SPOUT methyltransferase (Table I).
The pairwise PSI-BLAST searches performed by Tkaczuk et al. revealed a core “supercluster” of five COG families along with four satellite clusters that are all 2'-O-methyltransferases. Therefore, proteins such as Escherichia coli YibK, LasT, and YfiF were predicted to be 2'-O-ribose methyltransferases. Interestingly, we find that yeast Tan1, currently annotated as a putative tRNA acetyltransferase, has high similarity by HHpred to one of these satellite clusters (COG1818; E = 1.6 × 10⁻²⁰, P = 2.8 × 10⁻²⁴), indicating that it may be a 2'-O-ribose methyltransferase as well (Table I). Additionally, enzymes responsible for m1G and m3U methylation form independent clusters which were distinct from the other COG groups. These analyses may thus reveal the substrate specificity of a putative methyltransferase.
Although most methyltransferases are found in the seven beta strand, SET domain, and SPOUT families, a number of these enzymes have other types of structural folds. Interestingly, the crystal structure of a single enzyme, cobalamin-dependent methionine synthase (MetH), has given insight into three additional distinct classes of AdoMet-binding methyltransferases. These classes include the MetH-reactivation domain, the homocysteine methyltransferases, and radical SAM methyltransferases. MetH uses the methyl group of N-5-methyl-tetrahydrofolate to produce methionine from homocysteine through a methylcob(III)alamin intermediate.
AdoMet binds to the reactivation domain; its methyl group is then transferred to the oxidized B12 cofactor on a separate domain. The unique arrangement of this AdoMet-binding domain can be best described as a twisted center beta strand surrounded by several shorter antiparallel beta strands forming two perpendicular sheets (Fig. 1d). AdoMet binds to the helices and coils in the middle of this C-shaped structure, specifically the acidic residue of α2, RLAEAF in α6, the RPAPG coil following α7, and a C-terminal aromatic residue. Interestingly, we find that the AdoMet-binding domain of MetH does not show homology to any protein in yeast by sequence analysis using HHpred (Table I). It is presently unclear whether this domain architecture is utilized in any other methyltransferase reactions; we did not find any homologs using fold recognition programs based on automated modeling (MODELLER) or threading approaches (PHYRE) (Table I).
The second methyltransferase domain illuminated by methionine synthase is the homocysteine-binding domain. Our HHpred searches using this N-terminal domain of MetH as a probe against the yeast protein database detect the yeast homocysteine methyltransferase family proteins Mht1 and Sam4 with very high sequence similarity (Table I). Additional searches with the homocysteine COG group against the yeast proteome confirm this observation (Table I). Mht1 and Sam4 catalyze the same homocysteine to methionine reaction as MetH but utilize AdoMet or S-methylmethionine as methyl donors. The similarity in the sequences of these enzymes suggests a similarity in overall structures as well. The homocysteine-binding domain of MetH is composed of a beta-barrel of eight parallel strands (Fig. 1e). A zinc ion is also bound to the structure and functions in MetH to draw the cobalamin closer to the catalytic domain as well as to activate the thiol for nucleophilic attack. The metal is coordinated with tetrahedral geometry by three cysteines following β6 (GXNC) and β8 (GGCC), with the last binding partner being either the substrate homocysteine or a nitrogen/oxygen containing side-chain residue of β7 (N in the case of MetH) [67, 68]. Interestingly, the latter half of this domain is homologous to YMR321C. It is unclear whether YMR321C is a putative methyltransferase; the AdoMet-binding domain of these proteins remains to be determined.
Finally, the cobalamin-binding domain of MetH is often present in proteins that also include the “radical SAM” domain. Radical SAM enzymes generally form methionine and the deoxyadenosine radical from AdoMet, and crystal structure determinations have demonstrated a TIM barrel domain (Fig. 1f). These proteins are distinguished by their CxxxCxxC motif, which is used to bind an iron sulfur cluster necessary for radical generation. Although many of these family members catalyze non-methyltransferase reactions (typically involving the deoxyadenosyl radical formed by a one electron transfer to AdoMet), there are at least several members that are known to participate in methylation reactions despite the fact that the mechanisms of these transfers are still unclear [71, 72]. These include the florfenicol/chloramphenicol resistance protein (Cfr), the fortimicin methyltransferase (fmrO), and the fosfomycin methyltransferase (Fom3). Radical SAM methyltransferases are difficult to distinguish from other radical SAM enzymes by sequence analysis. This was highlighted by our HHpred searches against the yeast proteome using multiple alignments of radical SAM methyltransferases found in the RefSeq database (Table I). This search identifies apparent non-methyltransferases including the Bio2 biotin synthase, the C-terminal portion of Elp3 (a histone acetyltransferase thought to also be involved in histone demethylation initiated by the 5′-deoxyadenosyl radical), Lip5, involved in the biosynthesis of lipoic acid, and Tyw1 in the wybutosine pathway (Table I). Further work will be needed to ask if there are additional methyltransferases in the radical SAM family in other organisms.
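A short signature such as CxxxCxxC can be located in a protein sequence with a regular expression, as in the minimal sketch below. The function name is an assumption, and, in keeping with the caveat above, a match only flags a possible iron sulfur cluster binding site; it cannot by itself separate radical SAM methyltransferases from other radical SAM enzymes.

```python
import re

# C, any three residues, C, any two residues, C. The lookahead lets
# overlapping occurrences be reported as separate matches.
CX3CX2C = re.compile(r"(?=(C.{3}C.{2}C))")


def find_cx3cx2c(sequence):
    """Return a list of (zero-based start, matched residues) for each motif hit."""
    return [(m.start(), m.group(1)) for m in CX3CX2C.finditer(sequence.upper())]
```

For the toy sequence `"MKACAAACAACGG"`, the function reports a single hit, `CAAACAAC`, starting at zero-based position 3. The same pattern-based approach applies to other short signatures mentioned in this review, provided the spacing between the conserved residues is fixed.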
Cobalamin is a link to yet another structurally distinct methyltransferase, this time not as a partner in methylation but as a necessity in its own biosynthesis. The structure of CbiF, a precorrin-4-C11 methyltransferase, revealed two asymmetric domains of a five beta-strand, four-helix structure – the first domain containing strands in parallel while the second are in antiparallel orientation (Fig. 1g). The GxGxG motif at the end of β1 residues on the beta sheet does not bind AdoMet in the absence of precorrin. Unlike the other classes of methyltransferases, AdoMet is distorted to 82° between the two beta sheets; it is thought that this orientation is favorable for the transfer of the methyl group to the bulky precorrin substrate in the active site. However, other less bulky substrates are methylated by this superfamily of enzymes. Dph5, involved in diphthamide biosynthesis, shows sequence similarity to CbiF (Table I), yet interestingly does not share the six amino acids known to bind the substrate precorrin-4 in CbiF.
Although their three-dimensional structures are currently unknown, membrane-bound methyltransferases share no sequence homology to structurally solved methyltransferases. Biochemical studies of the isoprenylcysteine carboxyl methyltransferase Ste14 have led to a topology model describing its structure as six membrane spans, two of which form a helical hairpin. The conserved region A contains the motif RHPxYxG, which is trailed by a hydrophobic stretch ending in two conserved adjacent glutamates in region B. This C-terminal domain, where five of six point mutations lead to a loss-of-function, is conserved not only in isoprenylcysteine carboxyl methyltransferases but also in phospholipid methyltransferases. Interestingly, searches with Ste14 through BLAST and HHpred yield the yeast phospholipid methyltransferases Opi3 and Cho2 as well as several fatty acid/steroid reductases and the C-terminal residues of the ergosterol biosynthetic enzymes Erg4p and Erg24p. However, when we searched the database using multiple alignments of the proteins in the PEMT family present in the Pfam database, this list only included Opi3, Ste14, and Cho2 (Table I).
Evidence has been presented for a final class of methyltransferases represented by enzymes that modify the N-6 position of adenosine in mRNA. The yeast Ime4 protein appears to be in this group. Weak sequence similarity is found with the “DPPY” motif between Motifs II and III of some Class I seven beta strand N-methyltransferases. Our HMM analysis using the Ime4 protein sequence as a probe against the yeast database indicates that this family includes the members Kar4 and YGR001C (Table I). Further work will be needed to confirm the methyltransferase activity of these proteins.
Although computational methods are very powerful, the functional identification of a methyltransferase can only be made with biochemical evidence. In some cases, bioinformatic approaches can suggest one or more specific functions that can be directly tested. Often, however, this is not the case. Thus, general biochemical approaches that can at least confirm the binding of AdoMet become more important. A useful procedure here is to take advantage of the fact that many methyltransferases can be covalently linked to [3H-methyl]AdoMet after UV treatment and detected as a crosslinked product on SDS gels. An example of this is shown in Fig. 2, where the crosslinked proteins are detected by fluorography. Here AdoMet-binding was confirmed for the yeast YHR209W protein. It is also possible to cut out the Coomassie-stained band, dissolve the gel with hydrogen peroxide, and directly measure the radioactivity associated with the protein.
This method could be adapted to include high throughput methods by separating proteins from cell extracts, cross-linking to AdoMet, and separating by two-dimensional gel electrophoresis for identification of radioactive species . Advancements in proteome chip technologies can lead to the identification of methyltransferases by these crosslinking methods if sensitivity and resolution of tritium fluorography can be achieved . Recently, a number of new approaches in this area have been described .
Not all enzymes that bind and catalyze reactions with the cofactor AdoMet or its derivatives are methyltransferases. AdoMet or its decarboxylated derivative can also be used in reactions of adenosyltransfer, formation of deoxyadenosyl radicals, aminotransfer, aminobutyryltransfer, and aminopropyltransfer [14, 16]. For example, two clear yeast seven beta strand Class I “methyltransferases” are actually aminopropyltransferases - spermidine synthase (Spe3) and spermine synthase (Spe4).
Clearly, the crucial indicator of methyltransferase function is the sure identification of the methyl-accepting substrates and products. However, such identification can be difficult because most enzymes are very specific and substrates can be unique to each methyltransferase reaction. Some clues can be surmised from mutant phenotypes. However, many knockouts have either no phenotype or ones that are not readily interpreted in terms of specific methylation events. There is a large amount of high-throughput data in yeast, including localization and expression profiles; only in rare cases has this information been useful to date in identifying new methyltransferases. In fact, it is often hard to rationalize the data available for known methyltransferases. In the end, there is probably no substitute for direct biochemical assays of methylation!
It is of course hard to predict how this field will evolve in the next five to ten years. However, at the present rate of discovery, it seems likely that we will know the function of most of the methyltransferases of yeast and a good fraction of the human enzymes in this time frame. From the success of the bioinformatic approaches, the rate-limiting step in fully characterizing the biological function of the candidate methyltransferases is soon likely to be biochemical analysis. One question is whether high-throughput approaches will achieve more success than they have to date. Unfortunately, the information content derived from these approaches, even in species such as yeast that have been intensively studied, has been limited and has generally not permitted assignments of functions to candidate methyltransferases. However, advances here may allow identification of these roles in the next few years; alternatively the tried and true biochemical approaches on individual or small groups of proteins may remain the best approach.
This work was supported by National Institutes of Health Grant GM026020. T.C.P. was supported by the UCLA Chemistry-Biology Interface Training Grant GM008496. We are grateful to Professor Christopher Lee for his comments on this work.