2.  Best practices in bioinformatics training for life scientists 
Briefings in Bioinformatics  2013;14(5):528-537.
The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.
doi:10.1093/bib/bbt043
PMCID: PMC3771230  PMID: 23803301
bioinformatics; training; bioinformatics courses; training life scientists; train the trainers
3.  iAnn: an event sharing platform for the life sciences 
Bioinformatics  2013;29(15):1919-1921.
Summary: We present iAnn, an open source community-driven platform for dissemination of life science events, such as courses, conferences and workshops. iAnn allows automatic visualisation and integration of customised event reports. A central repository lies at the core of the platform: curators add submitted events, and these are subsequently accessed via web services. Thus, once an iAnn widget is incorporated into a website, it permanently shows timely relevant information as if it were native to the remote site. At the same time, announcements submitted to the repository are automatically disseminated to all portals that query the system. To facilitate the visualization of announcements, iAnn provides powerful filtering options and views, integrated in Google Maps and Google Calendar. All iAnn widgets are freely available.
Availability: http://iann.pro/iannviewer
Contact: manuel.corpas@tgac.ac.uk
doi:10.1093/bioinformatics/btt306
PMCID: PMC3712218  PMID: 23742982
4.  A toolbox for developing bioinformatics software 
Briefings in Bioinformatics  2011;13(2):244-257.
Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers.
doi:10.1093/bib/bbr035
PMCID: PMC3294241  PMID: 21803787
software development; programming; project management; software quality
5.  CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction 
Nucleic Acids Research  2013;41(7):4307-4323.
We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks.
doi:10.1093/nar/gkt101
PMCID: PMC3627593  PMID: 23435231
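The abstract above refers to measures of accuracy for predicted secondary structures without spelling them out. As a minimal sketch, assuming structures are given in dot-bracket notation without pseudoknots, base-pair sensitivity and positive predictive value (PPV) can be computed as follows. The function names are hypothetical, and the geometric-mean approximation of the Matthews correlation coefficient is a commonly used convention assumed here, not a formula taken from the abstract.

```python
from math import sqrt


def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def pair_metrics(reference, predicted):
    """Compare predicted base pairs with a reference structure."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    tp = len(ref & pred)   # pairs present in both structures
    fn = len(ref - pred)   # reference pairs that were missed
    fp = len(pred - ref)   # predicted pairs absent from the reference
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": sensitivity,
        "ppv": ppv,
        # Assumption: MCC approximated as the geometric mean of the two rates.
        "mcc_approx": sqrt(sensitivity * ppv),
    }


# Toy example: the prediction recovers one of the two reference pairs.
metrics = pair_metrics("((..))", "(....)")
# metrics["sensitivity"] is 0.5 and metrics["ppv"] is 1.0
```

Treating each structure as a set of index pairs keeps the comparison independent of sequence length and makes the counts of true positives, false positives and false negatives explicit.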
6.  RNAmap2D – calculation, visualization and analysis of contact and distance maps for RNA and protein-RNA complex structures 
BMC Bioinformatics  2012;13:333.
Background
The structures of biological macromolecules provide a framework for studying their biological functions. Three-dimensional structures of proteins, nucleic acids, or their complexes, are difficult to visualize in detail on flat surfaces, and algorithms for their spatial superposition and comparison are computationally costly. Molecular structures, however, can be represented as 2D maps of interactions between the individual residues, which are easier to visualize and compare, and which can be reconverted to 3D structures with reasonable precision. There are many visualization tools for maps of protein structures, but few for nucleic acids.
Results
We developed RNAmap2D, a platform-independent software tool for calculation, visualization and analysis of contact and distance maps for nucleic acid molecules and their complexes with proteins or ligands. The program addresses the problem of paucity of bioinformatics tools dedicated to analyzing RNA 2D maps, given the growing number of experimentally solved RNA structures in the Protein Data Bank (PDB) repository, as well as the growing number of tools for RNA 2D and 3D structure prediction. RNAmap2D allows for calculation and analysis of contacts and distances between various classes of atoms in nucleic acid, protein, and small ligand molecules. It also discriminates between different types of base pairing and stacking.
Conclusions
RNAmap2D is an easy to use method to visualize, analyze and compare structures of nucleic acid molecules and their complexes with other molecules, such as proteins or ligands and metal ions. Its special features make it a very useful tool for analysis of tertiary structures of RNAs. RNAmap2D for Windows/Linux/MacOSX is freely available for academic users at http://iimcb.genesilico.pl/rnamap2d.html
doi:10.1186/1471-2105-13-333
PMCID: PMC3556492  PMID: 23259794
Contact maps; Distance maps; RNA secondary structure; RNA base pairing; RNA stacking; Protein-RNA complex; Docking
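To illustrate what a distance map and a contact map are, the following is a minimal sketch, assuming one representative atom per residue given as (x, y, z) coordinates in ångströms. The function names and the 8.0 Å cutoff are hypothetical choices for illustration and are not taken from RNAmap2D.

```python
from math import dist


def distance_map(coords):
    """Symmetric matrix of pairwise Euclidean distances between residues."""
    return [[dist(a, b) for b in coords] for a in coords]


def contact_map(coords, cutoff=8.0):
    """Binary matrix: 1 where two residues lie within `cutoff`, else 0."""
    return [[1 if d <= cutoff else 0 for d in row]
            for row in distance_map(coords)]


# Toy coordinates (hypothetical): residues 0 and 1 are 5 A apart,
# residue 2 is about 20 A away from both.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 20.0)]
contacts = contact_map(coords)
# contacts is [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
```

The distance map keeps the full geometric information, whereas the contact map reduces it to a yes/no matrix that is easier to compare between structures.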
7.  Voronoia4RNA—a database of atomic packing densities of RNA structures and their complexes 
Nucleic Acids Research  2012;41(D1):D280-D284.
Voronoia4RNA (http://proteinformatics.charite.de/voronoia4rna/) is a structural database storing precalculated atomic volumes, atomic packing densities (PDs) and coordinates of internal cavities for currently 1869 RNAs and RNA–protein complexes. Atomic PDs are a measure for van der Waals interactions. Regions of low PD, containing water-sized internal cavities, refer to local structure flexibility or compressibility. RNA molecules build up the skeleton of large molecular machineries such as ribosomes or form smaller flexible structures such as riboswitches. The wealth of structural data on RNAs and their complexes allows setting up representative data sets and analysis of their structural features. We calculated atomic PDs from atomic volumes determined by the Voronoi cell method and internal cavities analytically by Delaunay triangulation. Reference internal PD values were derived from a non-redundant sub-data set of buried atoms. Comparison of internal PD values shows that RNA is more tightly packed than proteins. Finally, the relation between structure size, resolution and internal PD of the Voronoia4RNA entries is discussed. RNA, protein structures and their complexes can be visualized by the Jmol-based viewer Provi. Variations in PD are depicted by a color code. Internal cavities are represented by their molecular boundaries or schematically as balls.
doi:10.1093/nar/gks1061
PMCID: PMC3531177  PMID: 23161674
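The abstract above derives packing densities from atomic volumes. A minimal sketch of that ratio follows, assuming the common definition in which an atom's packing density is its van der Waals volume divided by the sum of its van der Waals volume and the solvent-excluded volume assigned to it. This definition, the function names and the example volumes are assumptions for illustration, not values from the database.

```python
def packing_density(vdw_volume, solvent_excluded_volume):
    """Fraction of an atom's allotted space occupied by its van der Waals volume."""
    total = vdw_volume + solvent_excluded_volume
    if total <= 0:
        raise ValueError("total atomic volume must be positive")
    return vdw_volume / total


def mean_packing_density(atoms):
    """Average packing density over (vdw_volume, solvent_excluded_volume) pairs."""
    values = [packing_density(vdw, se) for vdw, se in atoms]
    return sum(values) / len(values)


# Hypothetical volumes in cubic angstroms for two buried atoms.
pd_single = packing_density(15.0, 5.0)                      # 0.75
pd_mean = mean_packing_density([(15.0, 5.0), (10.0, 10.0)])  # 0.625
```

Under this definition a value close to 1 indicates tight packing, and low values flag regions where cavities may be present.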
8.  RNApathwaysDB—a database of RNA maturation and decay pathways 
Nucleic Acids Research  2012;41(D1):D268-D272.
Many RNA molecules undergo complex maturation, involving e.g. excision from primary transcripts, removal of introns, post-transcriptional modification and polyadenylation. The level of mature, functional RNAs in the cell is controlled not only by the synthesis and maturation but also by degradation, which proceeds via many different routes. The systematization of data about RNA metabolic pathways and enzymes taking part in RNA maturation and degradation is essential for the full understanding of these processes. RNApathwaysDB, available online at http://iimcb.genesilico.pl/rnapathwaysdb, is an online resource about maturation and decay pathways involving RNA as the substrate. The current release presents information about reactions and enzymes that take part in the maturation and degradation of tRNA, rRNA and mRNA, and describes pathways in three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. RNApathwaysDB can be queried with keywords, and sequences of protein enzymes involved in RNA processing can be searched with BLAST. Options for data presentation include pathway graphs and tables with enzymes and literature data. Structures of macromolecular complexes involving RNA and proteins that act on it are presented as ‘potato models’ using DrawBioPath—a new javascript tool.
doi:10.1093/nar/gks1052
PMCID: PMC3531052  PMID: 23155061
9.  MODOMICS: a database of RNA modification pathways—2013 update 
Nucleic Acids Research  2012;41(D1):D262-D267.
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, RNA-modifying enzymes and location of modified residues in RNA sequences. In the current database version, accessible at http://modomics.genesilico.pl, we included new features: a census of human and yeast snoRNAs involved in RNA-guided RNA modification, a new section covering the 5′-end capping process, and a catalogue of ‘building blocks’ for chemical synthesis of a large variety of modified nucleosides. The MODOMICS collections of RNA modifications, RNA-modifying enzymes and modified RNAs have been also updated. A number of newly identified modified ribonucleosides and more than one hundred functionally and structurally characterized proteins from various organisms have been added. In the RNA sequences section, snRNAs and snoRNAs with experimentally mapped modified nucleosides have been added and the current collection of rRNA and tRNA sequences has been substantially enlarged. To facilitate literature searches, each record in MODOMICS has been cross-referenced to other databases and to selected key publications. New options for database searching and querying have been implemented, including a BLAST search of protein sequences and a PARALIGN search of the collected nucleic acid sequences.
doi:10.1093/nar/gks1007
PMCID: PMC3531130  PMID: 23118484
10.  Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers 
Briefings in Bioinformatics  2011;13(3):383-389.
Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community, which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it in relation to similar initiatives and collections.
doi:10.1093/bib/bbr064
PMCID: PMC3357490  PMID: 22110242
Bioinformatics; training; end users; bioinformatics courses; learning bioinformatics
11.  MetalionRNA: computational predictor of metal-binding sites in RNA structures 
Bioinformatics  2011;28(2):198-205.
Motivation: Metal ions are essential for the folding of RNA molecules into stable tertiary structures and are often involved in the catalytic activity of ribozymes. However, the positions of metal ions in RNA 3D structures are difficult to determine experimentally. This motivated us to develop a computational predictor of metal ion sites for RNA structures.
Results: We developed a statistical potential for predicting positions of metal ions (magnesium, sodium and potassium), based on the analysis of binding sites in experimentally solved RNA structures. The MetalionRNA program is available as a web server that predicts metal ions for RNA structures submitted by the user.
Availability: The MetalionRNA web server is accessible at http://metalionrna.genesilico.pl/.
Contact: iamb@genesilico.pl
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr636
PMCID: PMC3259437  PMID: 22110243
12.  Databases and Bioinformatics Tools for the Study of DNA Repair 
DNA is continuously exposed to many different damaging agents such as environmental chemicals, UV light, ionizing radiation, and reactive cellular metabolites. DNA lesions can result in different phenotypical consequences ranging from a number of diseases, including cancer, to cellular malfunction, cell death, or aging. To counteract the deleterious effects of DNA damage, cells have developed various repair systems, including biochemical pathways responsible for the removal of single-strand lesions such as base excision repair (BER) and nucleotide excision repair (NER) or specialized polymerases temporarily taking over lesion-arrested DNA polymerases during the S phase in translesion synthesis (TLS). There are also other mechanisms of DNA repair such as homologous recombination repair (HRR), nonhomologous end-joining repair (NHEJ), or DNA damage response system (DDR). This paper reviews bioinformatics resources specialized in disseminating information about DNA repair pathways, proteins involved in repair mechanisms, damaging agents, and DNA lesions.
doi:10.4061/2011/475718
PMCID: PMC3200286  PMID: 22091405
13.  ModeRNA: a tool for comparative modeling of RNA 3D structure 
Nucleic Acids Research  2011;39(10):4007-4022.
RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn is encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As an input, ModeRNA requires a 3D structure of a template RNA molecule, and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise of the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available.
doi:10.1093/nar/gkq1320
PMCID: PMC3105415  PMID: 21300639
14.  RNA and protein 3D structure modeling: similarities and differences 
Journal of Molecular Modeling  2011;17(9):2325-2336.
In analogy to proteins, the function of RNA depends on its structure and dynamics, which are encoded in the linear sequence. While there are numerous methods for computational prediction of protein 3D structure from sequence, there have been very few such methods for RNA. This review discusses template-based and template-free approaches for macromolecular structure prediction, with special emphasis on comparison between the already tried-and-tested methods for protein structure modeling and the very recently developed “protein-like” modeling methods for RNA. We highlight analogies between many successful methods for modeling of these two types of biological macromolecules and argue that RNA 3D structure can be modeled using “protein-like” methodology. We also highlight the areas where the differences between RNA and proteins require the development of RNA-specific solutions.
Figure: Approaches for predicting RNA structure. Top: template-free modeling. Bottom: template-based modeling.
doi:10.1007/s00894-010-0951-x
PMCID: PMC3168752  PMID: 21258831
Assessment; Prediction; RNA; Structure; Tertiary
15.  REPAIRtoire—a database of DNA repair pathways 
Nucleic Acids Research  2010;39(Database issue):D788-D792.
REPAIRtoire is the first comprehensive database resource for systems biology of DNA damage and repair. The database collects and organizes the following types of information: (i) DNA damage linked to environmental mutagenic and cytotoxic agents, (ii) pathways comprising individual processes and enzymatic reactions involved in the removal of damage, (iii) proteins participating in DNA repair and (iv) diseases correlated with mutations in genes encoding DNA repair proteins. REPAIRtoire also provides links to publications and external databases. REPAIRtoire contains information about eight main DNA damage checkpoint, repair and tolerance pathways: DNA damage signaling, direct reversal repair, base excision repair, nucleotide excision repair, mismatch repair, homologous recombination repair, nonhomologous end-joining and translesion synthesis. The pathway/protein dataset is currently limited to three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The DNA repair and tolerance pathways are represented as graphs and in tabular form with descriptions of each repair step and corresponding proteins, and individual entries are cross-referenced to supporting literature and primary databases. REPAIRtoire can be queried by the name of pathway, protein, enzymatic complex, damage and disease. In addition, a tool for drawing custom DNA–protein complexes is available online. REPAIRtoire is freely available and can be accessed at http://repairtoire.genesilico.pl/.
doi:10.1093/nar/gkq1087
PMCID: PMC3013684  PMID: 21051355
16.  HepatoNet1: a comprehensive metabolic reconstruction of the human hepatocyte for the analysis of liver physiology 
We present HepatoNet1, a manually curated large-scale metabolic network of the human hepatocyte that encompasses >2500 reactions in six intracellular and two extracellular compartments. Using constraint-based modeling techniques, the network has been validated to replicate numerous metabolic functions of hepatocytes corresponding to a reference set of diverse physiological liver functions. Taking the detoxification of ammonia and the formation of bile acids as examples, we show how these liver-specific metabolic objectives can be achieved by the variable interplay of various metabolic pathways under varying conditions of nutrients and oxygen availability.
The liver has a pivotal function in metabolic homeostasis of the human body. Hepatocytes are the principal site of the metabolic conversions that underlie diverse physiological functions of the liver. These functions include provision and homeostasis of carbohydrates, amino acids, lipids and lipoproteins in the systemic blood circulation, biotransformation, plasma protein synthesis and bile formation, to name a few. Accordingly, hepatocyte metabolism integrates a vast array of differentially regulated biochemical activities and is highly responsive to environmental perturbations such as changes in portal blood composition (Dardevet et al, 2006). The complexity of this metabolic network and the numerous physiological functions to be achieved within a highly variable physiological environment necessitate an integrated approach with the aim of understanding liver metabolism at a systems level. To this end, we present HepatoNet1, a stoichiometric network of human hepatocyte metabolism characterized by (i) comprehensive coverage of known biochemical activities of hepatocytes and (ii) due representation of the biochemical and physiological functions of hepatocytes as functional network states. The network comprises 777 metabolites in six intracellular (cytosol, endoplasmic reticulum and Golgi apparatus, lysosome, mitochondria, nucleus, and peroxisome) and two extracellular compartments (bile canaliculus and sinusoidal space) and 2539 reactions, including 1466 transport reactions. It is based on the manual evaluation of >1500 original scientific research publications to warrant a high-quality evidence-based model. The final network is the result of an iterative process of data compilation and rigorous computational testing of network functionality by means of constraint-based modeling techniques. We performed flux-balance analyses to validate whether for >300 different metabolic objectives a non-zero stationary flux distribution could be established in the network. 
Figure 1 shows one such functional flux mode associated with the synthesis of the bile acid glycochenodeoxycholate, one important hepatocyte-specific physiological liver function. Besides those pathways directly linked to the synthesis of the bile acid, the mevalonate pathway and the de novo synthesis of cholesterol, the flux mode comprises additional pathways such as gluconeogenesis, the pentose phosphate pathway or the ornithine cycle because the calculations were routinely performed on a minimal set of exchangeable metabolites, that is all reactants were forced to be balanced and all exportable intermediates had to be catabolized into non-degradable end products. This example shows how HepatoNet1 under the challenges of limited exchange across the network boundary can reveal numerous cross-links between metabolic pathways traditionally perceived as separate entities. For example, alanine is used as gluconeogenic substrate to form glucose-6-phosphate, which is used in the pentose phosphate pathway to generate NADPH. The glycine moiety for bile acid conjugation is derived from serine. Conversion of ammonia into non-toxic nitrogen compounds is one central homeostatic function of hepatocytes. Using the HepatoNet1 model, we investigated, as another example of a complex metabolic objective dependent on systemic physiological parameters, how the consumption of oxygen, glucose and palmitate is affected when an external nitrogen load is converted in varying proportions to the non-toxic nitrogen compounds: urea, glutamine and alanine. The results reveal strong dependencies between the available level of oxygen and the substrate demand of hepatocytes required for effective ammonia detoxification by the liver.
Oxygen demand is highest if nitrogen is exclusively transformed into urea. At lower fluxes into urea, an intriguing pattern for oxygen demand is predicted: oxygen demand attains a minimum if the nitrogen load is directed to urea, glutamine and alanine with relative fluxes of 0.17, 0.43 and 0.40, respectively (Figure 2A). Oxygen demand in this flux distribution is four times lower than for the maximum (100% urea) and still 77 and 33% lower than using alanine and glutamine as exclusive nitrogen compounds, respectively. This computationally predicted tendency is consistent with the notion that the zonation of ammonia detoxification, that is the preferential conversion of ammonia to urea in periportal hepatocytes and to glutamine in perivenous hepatocytes, is dictated by the availability of oxygen (Gebhardt, 1992; Jungermann and Kietzmann, 2000). The decreased oxygen demand in flux distributions using higher proportions of glutamine or alanine is accompanied by increased uptake of the substrates glucose and palmitate (Figure 2B). This is due to an increased demand of energy and carbon for the amidation and transamination of glutamate and pyruvate to discharge nitrogen in the form of glutamine and alanine, respectively. In terms of both scope and specificity, our model bridges the scale between models constructed specifically to examine distinct metabolic processes of the liver and modeling based on a global representation of human metabolism. The former include models for the interdependence of gluconeogenesis and fatty-acid catabolism (Chalhoub et al, 2007), impairment of glucose production in von Gierke's and Hers' diseases (Beard and Qian, 2005) and other processes (Calik and Akbay, 2000; Stucki and Urbanczik, 2005; Ohno et al, 2008). The hallmark of these models is that each of them focuses on a small number of reactions pertinent to the metabolic function of interest embedded in a customized representation of the principal pathways of central metabolism. 
HepatoNet1 currently outperforms liver-specific models computationally predicted (Shlomi et al, 2008) on the basis of global reconstructions of human metabolism (Duarte et al, 2007; Ma and Goryanin, 2008). In contrast to either of the aforementioned modeling scales, HepatoNet1 provides the combination of a system-scale representation of metabolic activities and representation of the cell type-specific physical boundaries and their specific transport capacities. This allows for a highly versatile use of the model for the analysis of various liver-specific physiological functions. Conceptually, from a biological system perspective, this type of model offers a large degree of comprehensiveness while retaining tissue specificity, a fundamental design principle of mammalian metabolism. HepatoNet1 is expected to provide a structural platform for computational studies on liver function. The results presented herein highlight how internal fluxes of hepatocyte metabolism and the interplay with systemic physiological parameters can be analyzed with constraint-based modeling techniques. At the same time, the framework may serve as a scaffold for complementation of kinetic and regulatory properties of enzymes and transporters for analysis of sub-networks with topological or kinetic modeling methods.
We present HepatoNet1, the first reconstruction of a comprehensive metabolic network of the human hepatocyte that is shown to accomplish a large canon of known metabolic liver functions. The network comprises 777 metabolites in six intracellular and two extracellular compartments and 2539 reactions, including 1466 transport reactions. It is based on the manual evaluation of >1500 original scientific research publications to warrant a high-quality evidence-based model. The final network is the result of an iterative process of data compilation and rigorous computational testing of network functionality by means of constraint-based modeling techniques. Taking the hepatic detoxification of ammonia as an example, we show how the availability of nutrients and oxygen may modulate the interplay of various metabolic pathways to allow an efficient response of the liver to perturbations of the homeostasis of blood compounds.
doi:10.1038/msb.2010.62
PMCID: PMC2964118  PMID: 20823849
computational biology; flux balance; liver; minimal flux
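The flux-balance analyses described for HepatoNet1 rest on the stationarity condition of a stoichiometric network: for every balanced metabolite, production and consumption must cancel. A minimal sketch of that check follows, assuming a stoichiometric matrix with one row per metabolite and one column per reaction. The three-reaction toy network and the function names are hypothetical and unrelated to the actual HepatoNet1 reactions, and the sketch only tests a given flux distribution rather than optimizing one.

```python
def net_production(stoichiometry, fluxes):
    """Net production rate of each metabolite: the product S * v, row by row."""
    return [sum(coeff * v for coeff, v in zip(row, fluxes))
            for row in stoichiometry]


def is_stationary(stoichiometry, fluxes, tol=1e-9):
    """True if every metabolite is balanced, i.e. S * v is zero within `tol`."""
    return all(abs(rate) <= tol
               for rate in net_production(stoichiometry, fluxes))


# Toy network (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are metabolites A and B; columns are the three reactions.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by A -> B
    [0, 1, -1],   # B: produced by A -> B, consumed by export
]

balanced = is_stationary(S, [2.0, 2.0, 2.0])    # True: all fluxes match
unbalanced = is_stationary(S, [2.0, 1.0, 1.0])  # False: A accumulates
```

A non-zero flux vector that passes this check corresponds to what the text calls a non-zero stationary flux distribution. Finding such a vector for a given metabolic objective additionally requires an optimization step, which is omitted here.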
17.  Voronoia: analyzing packing in protein structures 
Nucleic Acids Research  2008;37(Database issue):D393-D395.
The packing of protein atoms is an indicator for their stability and functionality, and applied in determining thermostability, in protein design, ligand binding and to identify flexible regions in proteins. Here, we present Voronoia, a database of atomic-scale packing data for protein 3D structures. It is based on an improved Voronoi Cell algorithm using hyperboloid interfaces to construct atomic volumes, and to resolve solvent-accessible and -inaccessible regions of atoms. The database contains atomic volumes, local packing densities and interior cavities calculated for 61 318 biological units from the PDB. A report for each structure summarizes the packing by residue and atom types, and lists the environment of interior cavities. The packing data are compared to a nonredundant set of structures from SCOP superfamilies. Both packing densities and cavities can be visualized in the 3D structures by the Jmol plugin. Additionally, PDB files can be submitted to the Voronoia server for calculation. This service performs calculations for most full-atomic protein structures within a few minutes. For batch jobs, a standalone version of the program with an optional PyMOL plugin is available for download. The database can be freely accessed at: http://bioinformatics.charite.de/voronoia.
doi:10.1093/nar/gkn769
PMCID: PMC2686436  PMID: 18948293
18.  MODOMICS: a database of RNA modification pathways. 2008 update 
Nucleic Acids Research  2008;37(Database issue):D118-D121.
MODOMICS, a database devoted to the systems biology of RNA modification, has been subjected to substantial improvements. It provides comprehensive information on the chemical structure of modified nucleosides, pathways of their biosynthesis, sequences of RNAs containing these modifications and RNA-modifying enzymes. MODOMICS also provides cross-references to other databases and to literature. In addition to the previously available manually curated tRNA sequences from a few model organisms, we have now included additional tRNAs and rRNAs, and all RNAs with 3D structures in the Nucleic Acid Database, in which modified nucleosides are present. In total, 3460 modified bases in RNA sequences of different organisms have been annotated. New RNA-modifying enzymes have been also added. The current collection of enzymes includes mainly proteins for the model organisms Escherichia coli and Saccharomyces cerevisiae, and is currently being expanded to include proteins from other organisms, in particular Archaea and Homo sapiens. For enzymes with known structures, links are provided to the corresponding Protein Data Bank entries, while for many others homology models have been created. Many new options for database searching and querying have been included. MODOMICS can be accessed at http://genesilico.pl/modomics.
doi:10.1093/nar/gkn710
PMCID: PMC2686465  PMID: 18854352
19.  SuperHapten: a comprehensive database for small immunogenic compounds 
Nucleic Acids Research  2006;35(Database issue):D906-D910.
The immune system protects organisms from foreign proteins, peptide epitopes and a multitude of chemical compounds. Among these, haptens are small molecules, eliciting an immune response when conjugated with carrier molecules. Known haptens are xenobiotics or natural compounds, which can induce a number of autoimmune diseases like contact dermatitis or asthma. Furthermore, haptens are utilized in the development of biosensors, immunomodulators and new vaccines. Although hapten-induced allergies account for 6–10% of all adverse drug effects, the understanding of the correlation between structural and haptenic properties is rather fragmentary. We have developed a manually curated hapten database, SuperHapten, integrating information from literature and web resources. The current version of the database compiles 2D/3D structures, physicochemical properties and references for about 7500 haptens and 25,000 synonyms. The commercial availability is documented for about 6300 haptens and 450 related antibodies, enabling experimental approaches on cross-reactivity. The haptens are classified regarding their origin: pesticides, herbicides, insecticides, drugs, natural compounds, etc. Queries allow identification of haptens and associated antibodies according to functional class, carrier protein, chemical scaffold, composition or structural similarity. SuperHapten is available online at .
doi:10.1093/nar/gkl849
PMCID: PMC1669746  PMID: 17090587
20.  Columba: an integrated database of proteins, structures, and annotations 
BMC Bioinformatics  2005;6:81.
Background
Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures.
Description
COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web.
Conclusion
The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at .
doi:10.1186/1471-2105-6-81
PMCID: PMC1087474  PMID: 15801979
