Results 1-14 (14)
 

1.  The pathway ontology – updates and applications 
Background
The Pathway Ontology (PW), developed at the Rat Genome Database (RGD), covers all types of biological pathways, including altered and disease pathways, and captures the relationships between them within the hierarchical structure of a directed acyclic graph. The ontology allows for the standardized annotation of rat, human and mouse genes to pathway terms. It also constitutes a vehicle for easy navigation between gene and ontology report pages, between reports and interactive pathway diagrams, between pathways directly connected within a diagram and between those that are globally related in pathway suites and suite networks. Surveys of the literature and the development of the Pathway and Disease Portals are important sources for the ongoing development of the ontology. User requests and mapping of pathways in other databases to terms in the ontology further contribute to increasing its content. Recently built automated pipelines use the mapped terms to make available the annotations generated by other groups.
Results
The two released pipelines, the Pathway Interaction Database (PID) Annotation Import Pipeline and the Kyoto Encyclopedia of Genes and Genomes (KEGG) Annotation Import Pipeline, make available over 7,400 and 31,000 pathway gene annotations, respectively. Building the PID pipeline led to the addition of new terms within the signaling node, also augmented by the release of the RGD “Immune and Inflammatory Disease Portal” at that time. Building the KEGG pipeline led to a substantial increase in the number of disease pathway terms, such as those within the ‘infectious disease pathway’ parent term category. The ‘drug pathway’ node has also seen increases in the number of terms as well as a restructuring of the node. Literature surveys, disease portal deployments and user requests have contributed and continue to contribute additional new terms across the ontology. Since PW was first presented, its content has increased by over 75%.
Conclusions
Ongoing development of the Pathway Ontology and the implementation of pipelines promote an enriched provision of pathway data. The ontology is freely available for download and use from the RGD ftp site at ftp://rgd.mcw.edu/pub/ontology/pathway/ or from the National Center for Biomedical Ontology (NCBO) BioPortal website at http://bioportal.bioontology.org/ontologies/PW.
doi:10.1186/2041-1480-5-7
PMCID: PMC3922094  PMID: 24499703
Biological pathway; Ontology; Pipeline; Pathway annotations; Pathway diagrams
3.  The clinical measurement, measurement method and experimental condition ontologies: expansion, improvements and new applications 
Background
The Clinical Measurement Ontology (CMO), Measurement Method Ontology (MMO), and Experimental Condition Ontology (XCO) were originally developed at the Rat Genome Database (RGD) to standardize quantitative rat phenotype data in order to integrate results from multiple studies into the PhenoMiner database and data mining tool. These ontologies provide the framework for presenting what was measured, how it was measured, and under what conditions it was measured.
Results
There has been a continuing expansion of subdomains in each ontology with a parallel 2- to 3-fold increase in the total number of terms, substantially increasing the size and improving the scope of the ontologies. The proportion of terms with textual definitions has increased from ~60% to over 80% with greater synchronization of format and content throughout the three ontologies. Representation of definition source Uniform Resource Identifiers (URIs) has been standardized, including the removal of all non-URI characters, and systematic versioning of all ontology files has been implemented. The continued expansion and success of these ontologies has facilitated the integration of more than 60,000 records into the RGD PhenoMiner database. In addition, new applications of these ontologies, such as annotation of Quantitative Trait Loci (QTL), have been added at the sites actively using them, including RGD and the Animal QTL Database.
Conclusions
The improvements to these three ontologies have been substantial, and development is ongoing. New terms and expansions to the ontologies continue to be added as a result of active curation efforts at RGD and the Animal QTL Database. Use of these vocabularies to standardize data representation for quantitative phenotypes and quantitative trait loci across databases for multiple species has demonstrated their utility for integrating diverse data types from multiple sources. These ontologies are freely available for download and use from the NCBO BioPortal website at http://bioportal.bioontology.org/ontologies/1583 (CMO), http://bioportal.bioontology.org/ontologies/1584 (MMO), and http://bioportal.bioontology.org/ontologies/1585 (XCO), or from the RGD ftp site at ftp://rgd.mcw.edu/pub/ontology/.
doi:10.1186/2041-1480-4-26
PMCID: PMC3882879  PMID: 24103152
4.  Analysis of disease-associated objects at the Rat Genome Database 
The Rat Genome Database (RGD) is the premier resource for genetic, genomic and phenotype data for the laboratory rat, Rattus norvegicus. In addition to organizing biological data from rats, the RGD team focuses on manual curation of gene–disease associations for rat, human and mouse. In this work, we have analyzed disease-associated strains, quantitative trait loci (QTL) and genes from rats. These disease objects form the basis for seven disease portals. Among disease portals, the cardiovascular disease and obesity/metabolic syndrome portals have the highest number of rat strains and QTL. These two portals share 398 rat QTL, and these shared QTL are highly concentrated on rat chromosomes 1 and 2. For disease-associated genes, we performed gene ontology (GO) enrichment analysis across portals using RatMine enrichment widgets. Fifteen GO terms, five from each GO aspect, were selected to profile enrichment patterns of each portal. Of the selected biological process (BP) terms, ‘regulation of programmed cell death’ was the top enriched term across all disease portals except in the obesity/metabolic syndrome portal where ‘lipid metabolic process’ was the most enriched term. ‘Cytosol’ and ‘nucleus’ were common cellular component (CC) annotations for disease genes, but only the cancer portal genes were highly enriched with ‘nucleus’ annotations. Similar enrichment patterns were observed in a parallel analysis using the DAVID functional annotation tool. The relationship between the preselected 15 GO terms and disease terms was examined reciprocally by retrieving rat genes annotated with these preselected terms. The individual GO term–annotated gene list showed enrichment in physiologically related diseases. For example, the ‘regulation of blood pressure’ genes were enriched with cardiovascular disease annotations, and the ‘lipid metabolic process’ genes with obesity annotations. 
Furthermore, we were able to enhance enrichment of neurological diseases by combining ‘G-protein coupled receptor binding’ annotated genes with ‘protein kinase binding’ annotated genes.
Database URL: http://rgd.mcw.edu
doi:10.1093/database/bat046
PMCID: PMC3689439  PMID: 23794737
5.  InterMOD: integrated data and tools for the unification of model organism research 
Scientific Reports  2013;3:1802.
Model organisms are widely used for understanding basic biology, and have significantly contributed to the study of human disease. In recent years, genomic analysis has provided extensive evidence of widespread conservation of gene sequence and function amongst eukaryotes, allowing insights from model organisms to help decipher gene function in a wider range of species. The InterMOD consortium is developing an infrastructure based around the InterMine data warehouse system to integrate genomic and functional data from a number of key model organisms, leading the way to improved cross-species research. So far including budding yeast, nematode worm, fruit fly, zebrafish, rat and mouse, the project has set up data warehouses, synchronized data models, and created analysis tools and links between data from different species. The project unites a number of major model organism databases, improving both the consistency and accessibility of comparative research, to the benefit of the wider scientific community.
doi:10.1038/srep01802
PMCID: PMC3647165  PMID: 23652793
6.  The Rat Genome Database 2013—data, tools and users 
Briefings in Bioinformatics  2013;14(4):520-526.
The Rat Genome Database (RGD) was started >10 years ago to provide a core genomic resource for rat researchers. Currently, RGD combines genetic, genomic, pathway, phenotype and strain information with a focus on disease. RGD users are provided with access to structured and curated data from the molecular level through the organismal level. Those users access RGD from all over the world. End users are not only rat researchers but also researchers working with mouse and human data. Translational research is supported by RGD’s comparative genetics/genomics data in disease portals, in GBrowse, in VCMap and on gene report pages. The impact of RGD also goes beyond the traditional biomedical researcher, as the influence of RGD reaches bioinformaticians, tool developers and curators. Import of RGD data into other publicly available databases expands the influence of RGD to a larger set of end users than those who avail themselves of the RGD website. The value of RGD continues to grow as more types of data and more tools are added, while reaching more types of end users.
doi:10.1093/bib/bbt007
PMCID: PMC3713714  PMID: 23434633
database; genome; rat; disease; human
7.  Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech 
Background
Childhood apraxia of speech (CAS) is a rare, severe, persistent pediatric motor speech disorder with associated deficits in sensorimotor, cognitive, language, learning and affective processes. Among other neurogenetic origins, CAS is the disorder segregating with a mutation in FOXP2 in a widely studied, multigenerational London family. We report the first whole-exome sequencing (WES) findings from a cohort of 10 unrelated participants, ages 3 to 19 years, with well-characterized CAS.
Methods
As part of a larger study of children and youth with motor speech sound disorders, 32 participants were classified as positive for CAS on the basis of a behavioral classification marker using auditory-perceptual and acoustic methods that quantify the competence, precision and stability of a speaker’s speech, prosody and voice. WES of 10 randomly selected participants was completed using the Illumina Genome Analyzer IIx Sequencing System. Image analysis, base calling, demultiplexing, read mapping, and variant calling were performed using Illumina software. Software developed in-house was used for variant annotation, prioritization and interpretation to identify those variants likely to be deleterious to neurodevelopmental substrates of speech-language development.
Results
Among potentially deleterious variants, clinically reportable findings of interest occurred on a total of five chromosomes (Chr3, Chr6, Chr7, Chr9 and Chr17), which included six genes either strongly associated with CAS (FOXP1 and CNTNAP2) or associated with disorders with phenotypes overlapping CAS (ATP13A4, CNTNAP1, KIAA0319 and SETX). A total of 8 (80%) of the 10 participants had clinically reportable variants in one or two of the six genes, with variants in ATP13A4, KIAA0319 and CNTNAP2 being the most prevalent.
Conclusions
Similar to the results reported in emerging WES studies of other complex neurodevelopmental disorders, our findings from this first WES study of CAS are interpreted as support for heterogeneous genetic origins of this pediatric motor speech disorder with multiple genes, pathways and complex interactions. We also submit that our findings illustrate the potential use of WES for both gene identification and case-by-case clinical diagnostics in pediatric motor speech disorders.
doi:10.1186/1866-1955-5-29
PMCID: PMC3851280  PMID: 24083349
Apraxia of speech; Developmental verbal dyspraxia; Next-generation sequencing; Speech disorder; Whole-exome sequencing
9.  The Rat Genome Database 2009: variation, ontologies and pathways 
Nucleic Acids Research  2008;37(Database issue):D744-D749.
The Rat Genome Database (RGD, http://rgd.mcw.edu) was developed to provide a core resource for rat researchers combining genetic, genomic, pathway, phenotype and strain information with a focus on disease. RGD users are provided with access to structured and curated data from the molecular level through to the level of the whole organism, including the variations associated with disease phenotypes. To fully support use of the rat as a translational model for biological systems and human disease, RGD continues to curate these datasets while enhancing and developing tools to allow efficient and effective access to the data in a variety of formats including linear genome viewers, pathway diagrams and biological ontologies. To support pathophysiological analysis of data, RGD Disease Portals provide an entryway to integrated gene, QTL and strain data specific to a particular disease. In addition to tool and content development and maintenance, RGD promotes rat research and provides user education by creating and disseminating tutorials on the curated datasets, submission processes, and tools available at RGD. By curating, storing, integrating, visualizing and promoting rat data, RGD ensures that the investment made into rat genomics and genetics can be leveraged by all interested investigators.
doi:10.1093/nar/gkn842
PMCID: PMC2686558  PMID: 18996890
10.  Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major  
The crystal structure of a conserved hypothetical protein from L. major, Pfam sequence family PF04543, structural genomics target ID Lmaj006129AAA, has been determined at a resolution of 1.6 Å.
The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. 11 truncations were selected for expression, purification and crystallization based upon secondary-structure elements and disorder. The structure of one of these variants, Lmaj006129AAH, was solved by multiple-wavelength anomalous diffraction (MAD) using ELVES, an automatic protein crystal structure-determination system. This model was then successfully used as a molecular-replacement probe for the parent full-length target, Lmaj006129AAA. The final structure of Lmaj006129AAA was refined to an R value of 0.185 (R free = 0.229) at 1.60 Å resolution. Structure and sequence comparisons based on Lmaj006129AAA suggest that proteins belonging to Pfam sequence families PF04543 and PF01878 may share a common ligand-binding motif.
doi:10.1107/S1744309106005902
PMCID: PMC2197200  PMID: 16511295
Lmaj006129AAA
11.  The Genome of the Kinetoplastid Parasite, Leishmania major 
Ivens, Alasdair C. | Peacock, Christopher S. | Worthey, Elizabeth A. | Murphy, Lee | Aggarwal, Gautam | Berriman, Matthew | Sisk, Ellen | Rajandream, Marie-Adele | Adlem, Ellen | Aert, Rita | Anupama, Atashi | Apostolou, Zina | Attipoe, Philip | Bason, Nathalie | Bauser, Christopher | Beck, Alfred | Beverley, Stephen M. | Bianchettin, Gabriella | Borzym, Katja | Bothe, Gordana | Bruschi, Carlo V. | Collins, Matt | Cadag, Eithon | Ciarloni, Laura | Clayton, Christine | Coulson, Richard M. R. | Cronin, Ann | Cruz, Angela K. | Davies, Robert M. | Gaudenzi, Javier De | Dobson, Deborah E. | Duesterhoeft, Andreas | Fazelina, Gholam | Fosker, Nigel | Frasch, Alberto Carlos | Fraser, Audrey | Fuchs, Monika | Gabel, Claudia | Goble, Arlette | Goffeau, André | Harris, David | Hertz-Fowler, Christiane | Hilbert, Helmut | Horn, David | Huang, Yiting | Klages, Sven | Knights, Andrew | Kube, Michael | Larke, Natasha | Litvin, Lyudmila | Lord, Angela | Louie, Tin | Marra, Marco | Masuy, David | Matthews, Keith | Michaeli, Shulamit | Mottram, Jeremy C. | Müller-Auer, Silke | Munden, Heather | Nelson, Siri | Norbertczak, Halina | Oliver, Karen | O'Neil, Susan | Pentony, Martin | Pohl, Thomas M. | Price, Claire | Purnelle, Bénédicte | Quail, Michael A. | Rabbinowitsch, Ester | Reinhardt, Richard | Rieger, Michael | Rinta, Joel | Robben, Johan | Robertson, Laura | Ruiz, Jeronimo C. | Rutter, Simon | Saunders, David | Schäfer, Melanie | Schein, Jacquie | Schwartz, David C. | Seeger, Kathy | Seyler, Amber | Sharp, Sarah | Shin, Heesun | Sivam, Dhileep | Squares, Rob | Squares, Steve | Tosato, Valentina | Vogt, Christy | Volckaert, Guido | Wambutt, Rolf | Warren, Tim | Wedler, Holger | Woodward, John | Zhou, Shiguo | Zimmermann, Wolfgang | Smith, Deborah F. | Blackwell, Jenefer M. | Stuart, Kenneth D. | Barrell, Bart | Myler, Peter J.
Science (New York, N.Y.)  2005;309(5733):436-442.
doi:10.1126/science.1112680
PMCID: PMC1470643  PMID: 16020728
12.  Taking U out, with two nucleases? 
BMC Bioinformatics  2006;7:305.
Background
REX1 and REX2 are protein components of the RNA editing complex (the editosome) and function as exouridylylases. The exact roles of REX1 and REX2 in the editosome are unclear and the consequences of the presence of two related proteins are not fully understood. Here, a variety of computational studies were performed to enhance understanding of the structure and function of REX proteins in Trypanosoma and Leishmania species.
Results
Sequence analysis and homology modeling of the Endonuclease/Exonuclease/Phosphatase (EEP) domain at the C-terminus of REX1 and REX2 highlights a common active site shared by all EEP domains. Phylogenetic analysis indicates that REX proteins contain a distinct subfamily of EEP domains. Inspection of three-dimensional models of the EEP domain in Trypanosoma brucei REX1 and REX2, and Leishmania major REX1 suggests variations of previously characterized key residues likely to be important in catalysis and determining substrate specificity.
Conclusion
We have identified features of the REX EEP domain that distinguish it from other family members and hence subfamily specific determinants of catalysis and substrate binding. The results provide specific guidance for experimental investigations about the role(s) of REX proteins in RNA editing.
doi:10.1186/1471-2105-7-305
PMCID: PMC1525001  PMID: 16780580
13.  Comparative analysis of the kinomes of three pathogenic trypanosomatids: Leishmania major, Trypanosoma brucei and Trypanosoma cruzi 
BMC Genomics  2005;6:127.
Background
The trypanosomatids Leishmania major, Trypanosoma brucei and Trypanosoma cruzi cause some of the most debilitating diseases of humankind: cutaneous leishmaniasis, African sleeping sickness, and Chagas disease. These protozoa possess complex life cycles that involve development in mammalian and insect hosts, and a tightly coordinated cell cycle ensures propagation of the highly polarized cells. However, the ways in which the parasites respond to their environment and coordinate intracellular processes are poorly understood. As a part of an effort to understand parasite signaling functions, we report the results of a genome-wide analysis of protein kinases (PKs) of these three trypanosomatids.
Results
Bioinformatic searches of the trypanosomatid genomes for eukaryotic PKs (ePKs) and atypical PKs (aPKs) revealed a total of 176 PKs in T. brucei, 190 in T. cruzi and 199 in L. major, most of which are orthologous across the three species. This is approximately 30% of the number in the human host and double that of the malaria parasite, Plasmodium falciparum. The representation of various groups of ePKs differs significantly as compared to humans: trypanosomatids lack receptor-linked tyrosine and tyrosine kinase-like kinases, although they do possess dual-specificity kinases. A relative expansion of the CMGC, STE and NEK groups has occurred. A large number of unique ePKs show no strong affinity to any known group. The trypanosomatids possess few ePKs with predicted transmembrane domains, suggesting that receptor ePKs are rare. Accessory Pfam domains, which are frequently present in human ePKs, are uncommon in trypanosomatid ePKs.
Conclusion
Trypanosomatids possess a large set of PKs, comprising approximately 2% of each genome, suggesting a key role for phosphorylation in parasite biology. Whilst it was possible to place most of the trypanosomatid ePKs into the seven established groups using bioinformatic analyses, it has not been possible to ascribe function based solely on sequence similarity. Hence the connection of stimuli to protein phosphorylation networks remains enigmatic. The presence of numerous PKs with significant sequence similarity to known drug targets, as well as a large number of unusual kinases that might represent novel targets, strongly argues for functional analysis of these molecules.
doi:10.1186/1471-2164-6-127
PMCID: PMC1266030  PMID: 16164760
14.  SURVEY AND SUMMARY: Comparative analysis of editosome proteins in trypanosomatids 
Nucleic Acids Research  2003;31(22):6392-6408.
Detailed comparisons of 16 editosome proteins from Trypanosoma brucei, Trypanosoma cruzi and Leishmania major identified protein motifs associated with catalysis and protein or nucleic acid interactions that suggest their functions in RNA editing. Five related proteins with RNase III-like motifs also contain a U1-like zinc finger and either dsRBM or Pumilio motifs. These proteins may provide the endoribonuclease function in editing. Two other related proteins, at least one of which is associated with U-specific 3′ exonuclease activity, contain two putative nuclease motifs. Thus, editosomes contain a plethora of nucleases or proteins presumably derived from nucleases. Five additional related proteins, three of which have zinc fingers, each contain a motif associated with an OB fold; the TUTases have C-terminal folds reminiscent of RNA binding motifs, thus indicating the presence of numerous nucleic acid and/or protein binding domains, as do the two RNA ligases and an RNA helicase, which provide for additional catalytic steps in editing. These data indicate that trypanosomatid RNA editing is orchestrated by a variety of domains for catalysis, molecular interaction and structure. These domains are generally conserved within other protein families, but some are found in novel combinations in the editosome proteins.
doi:10.1093/nar/gkg870
PMCID: PMC275564  PMID: 14602897
