Results 1-8 (8)

1.  The RNase H-like superfamily: new members, comparative structural analysis and evolutionary classification 
Nucleic Acids Research  2014;42(7):4160-4179.
The ribonuclease H-like (RNHL) superfamily, also called the retroviral integrase superfamily, groups together numerous enzymes involved in nucleic acid metabolism and implicated in many biological processes, including replication, homologous recombination, DNA repair, transposition and RNA interference. The RNHL superfamily proteins show extensive divergence of sequences and structures. We conducted database searches to identify members of the RNHL superfamily (including those previously unknown), yielding >60 000 unique domain sequences. Our analysis led to the identification of new RNHL superfamily members, such as RRXRR (PF14239), DUF460 (PF04312, COG2433), DUF3010 (PF11215), DUF429 (PF04250 and COG2410, COG4328, COG4923), DUF1092 (PF06485), COG5558, OrfB_IS605 (PF01385, COG0675) and Peptidase_A17 (PF05380). Based on the clustering analysis we grouped all identified RNHL domain sequences into 152 families. Phylogenetic studies revealed relationships between these families, and suggested a possible history of the evolution of the RNHL fold and its active site. Our results revealed a clear division of the RNHL superfamily into exonucleases and endonucleases. Structural analyses of features characteristic of particular groups revealed a correlation of the orientation of the C-terminal helix with the exonuclease/endonuclease function and the architecture of the active site. Our analysis provides a comprehensive picture of sequence-structure-function relationships in the RNHL superfamily that may guide functional studies of the previously uncharacterized protein families.
PMCID: PMC3985635  PMID: 24464998
2.  MODOMICS: a database of RNA modification pathways—2013 update 
Nucleic Acids Research  2012;41(D1):D262-D267.
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, RNA-modifying enzymes and location of modified residues in RNA sequences. In the current database version, we included new features: a census of human and yeast snoRNAs involved in RNA-guided RNA modification, a new section covering the 5′-end capping process, and a catalogue of ‘building blocks’ for chemical synthesis of a large variety of modified nucleosides. The MODOMICS collections of RNA modifications, RNA-modifying enzymes and modified RNAs have also been updated. A number of newly identified modified ribonucleosides and more than one hundred functionally and structurally characterized proteins from various organisms have been added. In the RNA sequences section, snRNAs and snoRNAs with experimentally mapped modified nucleosides have been added and the current collection of rRNA and tRNA sequences has been substantially enlarged. To facilitate literature searches, each record in MODOMICS has been cross-referenced to other databases and to selected key publications. New options for database searching and querying have been implemented, including a BLAST search of protein sequences and a PARALIGN search of the collected nucleic acid sequences.
PMCID: PMC3531130  PMID: 23118484
3.  Delineation of structural domains and identification of functionally important residues in DNA repair enzyme exonuclease VII 
Nucleic Acids Research  2012;40(16):8163-8174.
Exonuclease VII (ExoVII) is a bacterial nuclease involved in DNA repair and recombination that hydrolyses single-stranded DNA. ExoVII is composed of two subunits: large XseA and small XseB. Thus far, little has been known about the molecular structure of ExoVII, the interactions between XseA and XseB, the architecture of the nuclease active site or its mechanism of action. We used bioinformatics methods to predict the structure of XseA, which revealed four domains: an N-terminal OB-fold domain, a middle putatively catalytic domain, a coiled-coil domain and a short C-terminal segment. Through a series of deletion and site-directed mutagenesis experiments on XseA from Escherichia coli, we determined that the OB-fold domain is responsible for DNA binding, the coiled-coil domain is involved in binding multiple copies of the XseB subunit and residues D155, R205, H238 and D241 of the middle domain are important for the catalytic activity but not for DNA binding. Altogether, we propose a model of sequence–structure–function relationships in ExoVII.
PMCID: PMC3439923  PMID: 22718974
4.  MODOMICS: a database of RNA modification pathways. 2008 update 
Nucleic Acids Research  2008;37(Database issue):D118-D121.
MODOMICS, a database devoted to the systems biology of RNA modification, has been subjected to substantial improvements. It provides comprehensive information on the chemical structure of modified nucleosides, pathways of their biosynthesis, sequences of RNAs containing these modifications and RNA-modifying enzymes. MODOMICS also provides cross-references to other databases and to literature. In addition to the previously available manually curated tRNA sequences from a few model organisms, we have now included additional tRNAs and rRNAs, and all RNAs with 3D structures in the Nucleic Acid Database, in which modified nucleosides are present. In total, 3460 modified bases in RNA sequences of different organisms have been annotated. New RNA-modifying enzymes have also been added. The current collection of enzymes includes mainly proteins for the model organisms Escherichia coli and Saccharomyces cerevisiae, and is currently being expanded to include proteins from other organisms, in particular Archaea and Homo sapiens. For enzymes with known structures, links are provided to the corresponding Protein Data Bank entries, while for many others homology models have been created. Many new options for database searching and querying have been included. MODOMICS is accessible online.
PMCID: PMC2686465  PMID: 18854352
5.  Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases 
BMC Bioinformatics  2007;8:73.
SPOUT methyltransferases (MTases) are a large class of S-adenosyl-L-methionine-dependent enzymes that exhibit an unusual alpha/beta fold with a very deep topological knot. In 2001, when no crystal structures were available for any of these proteins, Anantharaman, Koonin, and Aravind identified homology between SpoU and TrmD MTases and defined the SPOUT superfamily. Since then, multiple crystal structures of knotted MTases have been solved and numerous new homologous sequences appeared in the databases. However, no comprehensive comparative analysis of these proteins has been carried out to classify them based on structural and evolutionary criteria and to guide functional predictions.
We carried out extensive searches of databases of protein structures and sequences to collect all members of previously identified SPOUT MTases, and to identify previously unknown homologs. Based on sequence clustering, characterization of domain architecture, structure predictions and sequence/structure comparisons, we re-defined families within the SPOUT superfamily and predicted putative active sites and biochemical functions for the so far uncharacterized members. We have also delineated the common core of SPOUT MTases and inferred a multiple sequence alignment for the conserved knot region, from which we calculated the phylogenetic tree of the superfamily. We have also studied phylogenetic distribution of different families, and used this information to infer the evolutionary history of the SPOUT superfamily.
We present the first phylogenetic tree of the SPOUT superfamily since it was defined, together with a new scheme for its classification, and discussion about conservation of sequence and structure in different families, and their functional implications. We identified four protein families as new members of the SPOUT superfamily. Three of these families are functionally uncharacterized (COG1772, COG1901, and COG4080), and one (COG1756 represented by Nep1p) has been already implicated in RNA metabolism, but its biochemical function has been unknown. Based on the inference of orthologous and paralogous relationships between all SPOUT families we propose that the Last Universal Common Ancestor (LUCA) of all extant organisms contained at least three SPOUT members, ancestors of contemporary RNA MTases that carry out m1G, m3U, and 2'O-ribose methylation, respectively. In this work we also speculate on the origin of the knot and propose possible 'unknotted' ancestors. The results of our analysis provide a comprehensive 'roadmap' for experimental characterization of SPOUT MTases and interpretation of functional studies in the light of sequence-structure relationships.
PMCID: PMC1829167  PMID: 17338813
6.  The yfhQ gene of Escherichia coli encodes a tRNA:Cm32/Um32 methyltransferase 
Naturally occurring tRNAs contain numerous modified nucleosides. They are formed by enzymatic modification of the primary transcripts during the complex RNA maturation process. In model organisms Escherichia coli and Saccharomyces cerevisiae most enzymes involved in this process have been identified. Interestingly, it was found that tRNA methylation, one of the most common modifications, can be introduced by S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases) that belong to two structurally and phylogenetically unrelated protein superfamilies: RFM and SPOUT.
As part of a large-scale project aiming at characterization of a complete set of RNA modification enzymes of model organisms, we have studied the Escherichia coli proteins YibK, LasT, YfhQ, and YbeA for their ability to introduce the last unassigned methylations of ribose at positions 32 and 34 of the tRNA anticodon loop. We found that YfhQ catalyzes the AdoMet-dependent formation of Cm32 or Um32 in tRNA(Ser1) and tRNA(Gln2) and that an E. coli strain with a disrupted yfhQ gene lacks the tRNA:Cm32/Um32 methyltransferase activity. Thus, we propose to rename YfhQ as TrMet(Xm32) according to the recently proposed, uniform nomenclature for all RNA modification enzymes, or TrmJ, according to the traditional nomenclature for bacterial tRNA MTases.
Our results reveal that methylation at position 32 is carried out by completely unrelated TrMet(Xm32) enzymes in eukaryota and prokaryota (RFM superfamily member Trm7 and SPOUT superfamily member TrmJ, respectively), mirroring the scenario observed in the case of the m1G37 modification (introduced by the RFM member Trm5 in eukaryota and archaea, and by the SPOUT member TrmD in bacteria).
PMCID: PMC1569432  PMID: 16848900
7.  Phylogenomic analysis of the GIY-YIG nuclease superfamily 
BMC Genomics  2006;7:98.
The GIY-YIG domain was initially identified in homing endonucleases and later in other selfish mobile genetic elements (including restriction enzymes and non-LTR retrotransposons) and in enzymes involved in DNA repair and recombination. However, to date no systematic search for novel members of the GIY-YIG superfamily or comparative analysis of these enzymes has been reported.
We carried out database searches to identify all members of known GIY-YIG nuclease families. Multiple sequence alignments together with predicted secondary structures of identified families were represented as Hidden Markov Models (HMM) and compared by the HHsearch method to the uncharacterized protein families gathered in the COG, KOG, and PFAM databases. This analysis allowed for extending the GIY-YIG superfamily to include members of COG3680 and a number of proteins not classified in COGs and to predict that these proteins may function as nucleases, potentially involved in DNA recombination and/or repair. Finally, all old and new members of the GIY-YIG superfamily were compared and analyzed to infer the phylogenetic tree.
An evolutionary classification of the GIY-YIG superfamily is presented for the very first time, along with the structural annotation of all (sub)families. It provides a comprehensive picture of sequence-structure-function relationships in this superfamily of nucleases, which will help to design experiments to study the mechanism of action of known members (especially the uncharacterized ones) and will facilitate the prediction of function for the newly discovered ones.
PMCID: PMC1564403  PMID: 16646971
8.  MODOMICS: a database of RNA modification pathways 
Nucleic Acids Research  2005;34(Database issue):D145-D149.
MODOMICS is the first comprehensive database resource for systems biology of RNA modification. It integrates information about the chemical structure of modified nucleosides, their localization in RNA sequences, pathways of their biosynthesis and enzymes that carry out the respective reactions. MODOMICS also provides literature information, and links to other databases, including the available protein sequence and structure data. The current list of modifications and pathways is comprehensive, while the dataset of enzymes is limited to Escherichia coli and Saccharomyces cerevisiae and sequence alignments are presented only for tRNAs from these organisms. RNAs and enzymes from other organisms will be included in the near future. MODOMICS can be queried by the type of nucleoside (e.g. A, G, C, U, I, m1A, nm5s2U, etc.), type of RNA, position of a particular nucleoside, type of reaction (e.g. methylation, thiolation, deamination, etc.) and name or sequence of an enzyme of interest. Options for data presentation include graphs of pathways involving the query nucleoside, multiple sequence alignments of RNA sequences and tabular forms with enzyme and literature data. The contents of MODOMICS can be accessed through the World Wide Web.
PMCID: PMC1347447  PMID: 16381833