Results 1-11 (11)

1.  Type II restriction endonuclease R.Eco29kI is a member of the GIY-YIG nuclease superfamily 
The majority of experimentally determined crystal structures of Type II restriction endonucleases (REases) exhibit a common PD-(D/E)XK fold. Crystal structures have also been determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI), and bioinformatics analyses supported by mutagenesis suggested that some REases belong to the HNH fold. Our previous bioinformatic analysis suggested that REase R.Eco29kI shares sequence similarities with one more unrelated nuclease superfamily, GIY-YIG; however, so far no experimental data were available to support this prediction. The determination of a crystal structure of the GIY-YIG domain of homing endonuclease I-TevI provided a template for modeling of R.Eco29kI and prompted us to validate the model experimentally.
Using protein fold-recognition methods we generated a new alignment between R.Eco29kI and I-TevI, which suggested a reassignment of one of the putative catalytic residues. A theoretical model of R.Eco29kI was constructed to illustrate its predicted three-dimensional fold and organization of the active site, comprising amino acid residues Y49, Y76, R104, H108, E142, and N154. A series of mutants was constructed to generate amino acid substitutions of selected residues (Y49A, R104A, H108F, E142A and N154L) and the mutant proteins were examined for their ability to bind the DNA containing the Eco29kI site 5'-CCGCGG-3' and to catalyze the cleavage reaction. Experimental data reveal that residues Y49, R104, E142, H108, and N154 are important for the nuclease activity of R.Eco29kI, while H108 and N154 are also important for specific DNA binding by this enzyme.
Substitutions of residues Y49, R104, H108, E142 and N154 predicted by the model to be a part of the active site lead to mutant proteins with strong defects in the REase activity. These results are in very good agreement with the structural model presented in this work and with our prediction that R.Eco29kI belongs to the GIY-YIG superfamily of nucleases. Our study provides the first experimental evidence for a Type IIP REase that does not belong to the PD-(D/E)XK or HNH superfamilies of nucleases, and is instead a member of the unrelated GIY-YIG superfamily.
PMCID: PMC1952068  PMID: 17626614
2.  Complete Cap 4 Formation Is Not Required for Viability in Trypanosoma brucei  
Eukaryotic Cell  2006;5(6):905-915.
In kinetoplastids spliced leader (SL) RNA is trans-spliced onto the 5′ ends of all nuclear mRNAs, providing a universal exon with a unique cap. Mature SL contains an m7G cap, ribose 2′-O methylations on the first four nucleotides, and base methylations on nucleotides 1 and 4 (AACU). This structure is referred to as cap 4. Mutagenized SL RNAs that exhibit reduced cap 4 are trans-spliced, but these mRNAs do not associate with polysomes, suggesting a direct role in translation for cap 4, the primary SL sequence, or both. To separate SL RNA sequence alterations from cap 4 maturation, we have examined two ribose 2′-O-methyltransferases in Trypanosoma brucei. Both enzymes fall into the Rossmann fold class of methyltransferases and model into a conserved structure based on vaccinia virus homolog VP39. Knockdown of the methyltransferases individually or in combination did not affect growth rates and suggests a temporal placement in the cap 4 formation cascade: TbMT417 modifies A2 and is not required for subsequent steps; TbMT511 methylates C3, without which U4 methylations are reduced. Incomplete cap 4 maturation was reflected in substrate SL and mRNA populations. Recombinant methyltransferases bind to a methyl donor and show preference for m7G-capped RNAs in vitro. Both enzymes reside in the nucleoplasm. Based on the cap phenotype of substrate SL stranded in the cytosol, A2, C3, and U4 methylations are added after nuclear reimport of Sm protein-complexed substrate SL RNA. As mature cap 4 is dispensable for translation, cap 1 modifications and/or SL sequences are implicated in ribosomal interaction.
PMCID: PMC1489268  PMID: 16757738
3.  Phylogenomic analysis of the GIY-YIG nuclease superfamily 
BMC Genomics  2006;7:98.
The GIY-YIG domain was initially identified in homing endonucleases and later in other selfish mobile genetic elements (including restriction enzymes and non-LTR retrotransposons) and in enzymes involved in DNA repair and recombination. However, to date no systematic search for novel members of the GIY-YIG superfamily or comparative analysis of these enzymes has been reported.
We carried out database searches to identify all members of known GIY-YIG nuclease families. Multiple sequence alignments together with predicted secondary structures of identified families were represented as Hidden Markov Models (HMM) and compared by the HHsearch method to the uncharacterized protein families gathered in the COG, KOG, and PFAM databases. This analysis allowed for extending the GIY-YIG superfamily to include members of COG3680 and a number of proteins not classified in COGs and to predict that these proteins may function as nucleases, potentially involved in DNA recombination and/or repair. Finally, all old and new members of the GIY-YIG superfamily were compared and analyzed to infer the phylogenetic tree.
An evolutionary classification of the GIY-YIG superfamily is presented for the very first time, along with the structural annotation of all (sub)families. It provides a comprehensive picture of sequence-structure-function relationships in this superfamily of nucleases, which will help to design experiments to study the mechanism of action of known members (especially the uncharacterized ones) and will facilitate the prediction of function for the newly discovered ones.
PMCID: PMC1564403  PMID: 16646971
4.  Structural model for the multisubunit Type IC restriction–modification DNA methyltransferase M.EcoR124I in complex with DNA 
Nucleic Acids Research  2006;34(7):1992-2005.
Recent publication of crystal structures for the putative DNA-binding subunits (HsdS) of the functionally uncharacterized Type I restriction–modification (R-M) enzymes MjaXIP and MgeORF438 has provided a convenient structural template for analysis of the more extensively characterized members of this interesting family of multisubunit molecular motors. Here, we present a structural model of the Type IC M.EcoR124I DNA methyltransferase (MTase), comprising the HsdS subunit, two HsdM subunits, the cofactor AdoMet and the substrate DNA molecule. The structure was obtained by docking models of individual subunits generated by fold-recognition and comparative modelling, followed by optimization of inter-subunit contacts by energy minimization. The model of M.EcoR124I has allowed identification of a number of functionally important residues that appear to be involved in DNA-binding. In addition, we have mapped onto the model the location of several new mutations of the hsdS gene of M.EcoR124I that were produced by misincorporation mutagenesis within the central conserved region of hsdS; we have also mapped all previously identified DNA-binding mutants of TRD2 and produced a detailed analysis of the location of surface-modifiable lysines. The model structure, together with location of the mutant residues, provides a better background on which to study protein–protein and protein–DNA interactions in Type I R-M systems.
PMCID: PMC1435980  PMID: 16614449
5.  MODOMICS: a database of RNA modification pathways 
Nucleic Acids Research  2005;34(Database issue):D145-D149.
MODOMICS is the first comprehensive database resource for systems biology of RNA modification. It integrates information about the chemical structure of modified nucleosides, their localization in RNA sequences, pathways of their biosynthesis and enzymes that carry out the respective reactions. MODOMICS also provides literature information, and links to other databases, including the available protein sequence and structure data. The current list of modifications and pathways is comprehensive, while the dataset of enzymes is limited to Escherichia coli and Saccharomyces cerevisiae and sequence alignments are presented only for tRNAs from these organisms. RNAs and enzymes from other organisms will be included in the near future. MODOMICS can be queried by the type of nucleoside (e.g. A, G, C, U, I, m1A, nm5s2U, etc.), type of RNA, position of a particular nucleoside, type of reaction (e.g. methylation, thiolation, deamination, etc.) and name or sequence of an enzyme of interest. Options for data presentation include graphs of pathways involving the query nucleoside, multiple sequence alignments of RNA sequences and tabular forms with enzyme and literature data. The contents of MODOMICS can be accessed through the World Wide Web at .
PMCID: PMC1347447  PMID: 16381833
6.  The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function 
BMC Bioinformatics  2005;6:172.
The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families, there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches.
The low degree of amino acid conservation hampers the possibility of identification of new members of the PD-(D/E)XK superfamily based solely on sequence comparisons to known members. Therefore, we used a recently developed method HHsearch for sensitive detection of remote similarities between protein families represented as profile Hidden Markov Models enhanced by secondary structure. We carried out a comparison of known families of PD-(D/E)XK nucleases to the database comprising the COG and PFAM profiles corresponding to both functionally characterized as well as uncharacterized protein families to detect significant similarities. The initial candidates for new nucleases were subsequently verified by sequence-structure threading, comparative modeling, and identification of potential active site residues.
In this article, we report identification of the PD-(D/E)XK nuclease domain in numerous proteins implicated in interactions with DNA but with unknown structure and mechanism of action (such as putative recombinase RmuC, DNA competence factor CoiA, a DNA-binding protein SfsA, a large human protein predicted to be a DNA repair enzyme, predicted archaeal transcription regulators, and the head completion protein of phage T4) and in proteins for which no function was assigned to date (such as YhcG, various phage proteins, novel candidates for restriction enzymes). Our results contribute to the reduction of "white spaces" on the sequence-structure-function map of the protein universe and will help to jump-start the experimental characterization of new nucleases, of which many may be of importance for the complete understanding of mechanisms that govern the evolution and stability of the genome.
PMCID: PMC1189080  PMID: 16011798
7.  Identification of a new family of putative PD-(D/E)XK nucleases with unusual phylogenomic distribution and a new type of the active site 
BMC Genomics  2005;6:21.
Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics. Theoretical assignment of three-dimensional folds and prediction of protein function even at a very general level can facilitate the experimental determination of the molecular mechanism of action and the role that members of a given protein family fulfill in the cell. Here, we predict the three-dimensional fold and study the phylogenomic distribution of members of a large family of uncharacterized proteins classified in the Clusters of Orthologous Groups database as COG4636.
Using protein fold-recognition we found that members of COG4636 are remotely related to Holliday junction resolvases and other nucleases from the PD-(D/E)XK superfamily. Structure modeling and sequence analyses suggest that most members of COG4636 exhibit a new, unusual variant of the putative active site, in which the catalytic Lys residue migrated in the sequence, but retained similar spatial position with respect to other functionally important residues. Sequence analyses revealed that members of COG4636 and their homologs are found mainly in Cyanobacteria, but also in other bacterial phyla. They undergo horizontal transfer and extensive proliferation in the colonized genomes; for instance in Gloeobacter violaceus PCC 7421 they comprise over 2% of all protein-encoding genes. Thus, members of COG4636 appear to be a new type of selfish genetic elements, which may fulfill an important role in the genome dynamics of Cyanobacteria and other species they invaded. Our analyses provide a platform for experimental determination of the molecular and cellular function of members of this large protein family.
After submission of this manuscript, a crystal structure of one of the COG4636 members was released in the Protein Data Bank (code 1wdj; Idaka, M., Wada, T., Murayama, K., Terada, T., Kuramitsu, S., Shirouzu, M., Yokoyama, S.: Crystal structure of Tt1808 from Thermus thermophilus Hb8, to be published). Our analysis of the Tt1808 structure reveals that we correctly predicted all functionally important features of the COG4636 family, including the membership in the PD-(D/E)XK superfamily of nucleases, the three-dimensional fold, the putative catalytic residues, and the unusual configuration of the active site.
PMCID: PMC551604  PMID: 15720711
8.  Sequence–structure–function studies of tRNA:m5C methyltransferase Trm4p and its relationship to DNA:m5C and RNA:m5U methyltransferases 
Nucleic Acids Research  2004;32(8):2453-2463.
Three types of methyltransferases (MTases) generate 5-methylpyrimidine in nucleic acids, forming m5U in RNA, m5C in RNA and m5C in DNA. The DNA:m5C MTases have been extensively studied by crystallographic, biophysical, biochemical and computational methods. On the other hand, the sequence–structure–function relationships of RNA:m5C MTases remain obscure, as do the potential evolutionary relationships between the three types of 5-methylpyrimidine-generating enzymes. Sequence analyses and homology modeling of the yeast tRNA:m5C MTase Trm4p (also called Ncl1p) provided a structural and evolutionary platform for identification of catalytic residues and modeling of the architecture of the RNA:m5C MTase active site. The analysis led to the identification of two invariant residues that are important for Trm4p activity in addition to the conserved Cys residues in motif IV and motif VI that were previously found to be critical. The newly identified residues include a Lys residue in motif I and an Asp in motif IV. A conserved Gln found in motif X was found to be dispensable for MTase activity. Locations of essential residues in the model of Trm4p are in very good agreement with the X-ray structure of an RNA:m5C MTase homolog PH1374. Theoretical and experimental analyses revealed that RNA:m5C MTases share a number of features with either RNA:m5U MTases or DNA:m5C MTases, which suggested a tentative phylogenetic model of relationships between these three classes of 5-methylpyrimidine MTases. We infer that RNA:m5C MTases evolved from RNA:m5U MTases by acquiring an additional Cys residue in motif IV, which was adapted to function as the nucleophilic catalyst only later in DNA:m5C MTases, accompanied by loss of the original Cys from motif VI, transfer of a conserved carboxylate from motif IV to motif VI and sequence permutation.
PMCID: PMC419452  PMID: 15121902
9.  Alanine-scanning mutagenesis of the predicted rRNA-binding domain of ErmC′ redefines the substrate-binding site and suggests a model for protein–RNA interactions 
Nucleic Acids Research  2003;31(16):4941-4949.
The Erm family of adenine-N6 methyltransferases (MTases) is responsible for the development of resistance to macrolide–lincosamide–streptogramin B antibiotics through the methylation of 23S ribosomal RNA. Hence, these proteins are important potential drug targets. Despite the availability of the NMR and crystal structures of two members of the family (ErmAM and ErmC′, respectively) and extensive studies on the RNA substrate, the substrate-binding site and the amino acids involved in RNA recognition by the Erm MTases remain unknown. It has been proposed that the small C-terminal domain functions as a target-binding module, but this prediction has not been tested experimentally. We have undertaken structure-based mutational analysis of 13 charged or polar residues located on the predicted rRNA-binding surface of ErmC′ with the aim to identify the area of protein–RNA interactions. The results of in vivo and in vitro analyses of mutant proteins suggest that the key RNA-binding residues are located not in the small domain, but in the large catalytic domain, facing the cleft between the two domains. Based on the mutagenesis data, a preliminary three-dimensional model of ErmC′ complexed with the minimal substrate was constructed. The identification of the RNA-binding site of ErmC′ may be useful for structure-based design of novel drugs that do not necessarily bind to the cofactor-binding site common to many S-adenosyl-L-methionine-dependent MTases, but specifically block the substrate-binding site of MTases from the Erm family.
PMCID: PMC169915  PMID: 12907737
10.  Characterization of the cofactor-binding site in the SPOUT-fold methyltransferases by computational docking of S-adenosylmethionine to three crystal structures 
BMC Bioinformatics  2003;4:9.
There are several evolutionarily unrelated and structurally dissimilar superfamilies of S-adenosylmethionine (AdoMet)-dependent methyltransferases (MTases). A new superfamily (SPOUT) has been recently characterized on a sequence level and three structures of its members (1gz0, 1ipa, and 1k3r) have been solved. However, none of these structures include the cofactor or the substrate. Due to the strong evolutionary divergence and the paucity of experimental information, no confident predictions of protein-ligand and protein-substrate interactions could be made, which hampered the study of sequence-structure-function relationships in the SPOUT superfamily.
We used the computational docking program AutoDock to identify the AdoMet-binding site on the surface of three MTase structures. We analyzed the sequence divergence in two distinct lineages of the SPOUT superfamily in the context of surface features and preferred cofactor binding mode to propose specific function for the conserved residues.
Our docking analysis has confidently predicted the common AdoMet-binding site in three remotely related protein structures. In the vicinity of the cofactor-binding site, subfamily-conserved grooves were identified on the protein surface, suggesting location of the target-binding/catalytic site. Functionally important residues were inferred and a general reaction mechanism, involving conformational change of a glycine-rich loop, was proposed.
PMCID: PMC153507  PMID: 12689347
11.  mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships 
BMC Bioinformatics  2001;2:2.
The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine-N7 (cap 0) methylation, including results of extensive mutagenesis. However, a crystal structure is not available and cap 0 MTases are too diverged from other MTases of known structure to allow straightforward homology-based interpretation of these data.
We report a 3D model of cap 0 MTase, developed using sequence-to-structure threading and comparative modeling based on coordinates of the glycine N-methyltransferase. Analysis of the predicted structural features in the phylogenetic context of the cap 0 MTase family allows us to rationalize most of the experimental data available and to propose potential binding sites. We identified a case of correlated mutations in the cofactor-binding site of viral MTases that may be important for rational drug design. Furthermore, database searches and phylogenetic analysis revealed a novel subfamily of hypothetical MTases from plants, distinct from "orthodox" cap 0 MTases.
Computational methods were used to infer the evolutionary relationships and predict the structure of eukaryotic cap MTase. Identification of novel cap MTase homologs suggests candidates for cloning and biochemical characterization, while the structural model will be useful in designing new experiments to better understand the molecular function of cap MTases.
PMCID: PMC35267  PMID: 11472630
