The major new addition to the RGD site in the past year was guided by three user-centric goals—(i) providing easy access to data related to diseases, (ii) allowing multiple perspectives of RGD data according to the needs of the user, and (iii) presenting a broader overview of data and allowing users to zoom in, filter the data and then drill down to the details as required.
Analysis of rat publications and trends in rat research demonstrates that much rat research is done in the context of disease-related studies. This is borne out by the types of grants that are funded and the types of searches that are undertaken on RGD. The accompanying figure shows a tag cloud view of a typical month's top searches on RGD. The search keywords of interest to our users are primarily disease terms, centered on cardiovascular, autoimmune and neurological disease areas. Based on this demonstrated need, we have introduced a variety of disease-centric resources to the database.
Top search terms from the RGD web logs (June 1–July 4, 2006). The individual search terms are shown with the number of times that term was searched shown in parentheses. Font sizes are proportional to search frequency.
A general disease portal page was released to enable ‘one-click’ access to some of the most popular disease areas (http://rgd.mcw.edu/tools/diseases/disease_search.cgi). For each disease, this page provides preconfigured links into our ontology annotation tools and the genome browser so that users can quickly find gene lists and see genomic locations for genes and QTLs.
Building from this, we identified a subset of diseases that were clearly of high interest to the research community and developed a two-pronged approach to providing enhanced support for research in these areas. For each disease area, two methods were used to identify data for inclusion. First, disease-related genes were identified from existing sources such as Online Mendelian Inheritance in Man (OMIM) (9), the Genetic Association Database (10), GeneCards (11) and NCBI's Genes and Disease database. In addition, genes at the Mouse Genome Database (MGD) (12) annotated with related phenotypes were included. These genes were prioritized, and targeted searches of rat and human literature were undertaken to provide comprehensive annotations for functional, disease, phenotype and pathway information. Second, in a complementary approach, focused literature searches were conducted to identify additional genes, QTLs and strains related to the disease area.
To facilitate translational studies, human and mouse data were also included. As limited data on human QTL are available electronically, a similar literature-based strategy was followed to identify relevant human QTL papers for inclusion in the portal. Mouse and human gene orthologs are already curated as part of the normal RGD gene curation process. By following this targeted approach, all rat genes, strains and QTL related to a disease area could be added to the database, along with their functional annotations (GO, pathway, phenotype, disease). Similarly, the human and mouse gene orthologs and human QTL were identified to provide comparative resources for the database.
To complement the dedicated curation effort, an online portal was created to provide access to this information. To date, one disease portal has been released, for neurological diseases; a second, for cardiovascular diseases, will be released in the autumn of 2006. The portal combines text data with visual elements to allow the user to quickly get an overview of knowledge in a disease area while providing hyperlinks to more details as desired. A screenshot of the Neurological portal is shown in the accompanying figure, and the main elements are described below:
- Disease Category Selection: Based on the disease ontology structure, three levels of disease specificity are provided. When the page is first opened, data is presented for All Neurological Diseases. Using the two dropdown menus, a disease category and, optionally, a disease from this category can be selected to further narrow down the data displayed. A summary table lists the number of genes, QTL and strains for each of the three species (rat, mouse and human) for the selected disease.
- Genome view using Flash GViewer: This provides a graphical overview of the chromosomal locations of all genes and QTL annotated to the selected disease category. Maps for rat, mouse and human genomes are provided and syntenic maps are also available. An enlarged map can also be selected showing the chromosomes spread over two rows. The GViewer package is written using Adobe Flash and allows zooming to view individual chromosomes and their features. Hyperlinks are available to jump to gene and QTL reports. The zoomed view also allows chromosomal regions to be selected, providing a dynamic link out to visualize the selected region in a genome browser. GViewer is freely available via the GMOD project (http://www.gmod.org/flashgviewer/).
- Gene, QTL and Strain lists: The symbols for the genes, QTL and strains associated with the selected disease are shown in tabular form in the center of the page. These allow easy browsing of the data related to the disease, and each symbol is a hyperlink to the appropriate object report in RGD. The Strain table provides quick access to the rat models used to study the selected disease.
- Gene Ontology Overview: Three bar charts provide an overview of the prominent gene ontology annotations available for the rat genes annotated to the selected disease. The individual GO annotations for each rat gene are converted to the corresponding GO Slim annotation and graphed to provide a visual indication of popular GO categories relevant to the selected disease.
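The GO Slim roll-up behind the Gene Ontology Overview can be illustrated with a minimal Python sketch. It is not the RGD implementation: the term-to-slim mapping, the gene symbols (`GeneA`, `GeneB`, `GeneC`) and the function names are hypothetical, and the choice to count each gene once per GO Slim term is an assumption. A production pipeline would presumably derive the mapping by traversing the ontology graph rather than from a hand-written table.

```python
from collections import Counter

# Hypothetical, hand-written mapping from specific GO terms to GO Slim terms.
# Illustrative only; a real pipeline would derive this from the ontology graph.
GO_TO_SLIM = {
    "apoptotic process": "cell death",
    "neuron apoptotic process": "cell death",
    "chemical synaptic transmission": "cell communication",
    "ion transport": "transport",
}

# Hypothetical gene-level annotations for one selected disease.
ANNOTATIONS = {
    "GeneA": ["apoptotic process", "ion transport"],
    "GeneB": ["neuron apoptotic process", "apoptotic process"],
    "GeneC": ["chemical synaptic transmission"],
}


def slim_counts(annotations, go_to_slim):
    """Count genes per GO Slim term (assumption: one count per gene per slim term)."""
    counts = Counter()
    for terms in annotations.values():
        # A set removes duplicates when several specific terms of the same
        # gene map to the same slim term.
        slims = {go_to_slim[t] for t in terms if t in go_to_slim}
        counts.update(slims)
    return counts


def text_bar_chart(counts):
    """Render the counts as text bars, most frequent slim term first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{term:<20} {'#' * n} ({n})" for term, n in ordered]


counts = slim_counts(ANNOTATIONS, GO_TO_SLIM)
for line in text_bar_chart(counts):
    print(line)
```

Counting genes rather than raw annotations keeps a heavily annotated gene from dominating a bar; whether the portal counts genes or annotations is not stated in the text.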
Screenshot of the RGD Neurological Disease portal showing the combined data for all neurological diseases. The various subsections of the page (A–D) are described in more detail in the text.
Complementing the Disease section of the portal are similar views for Phenotype, Biological Process and Pathways. These list selected phenotype ontology terms, biological process terms or pathway ontology terms for genes that have been linked to Neurological or Cardiovascular disease. These allow the scientist to view the disease data from alternative perspectives, to quickly ask questions like ‘what pathways are involved in neurological disease, where are these genes on the genome, what cellular location do they typically occupy?’ The strain models section provides a comprehensive background on specific disease models and the strains used to study these diseases. It includes information on the experimental model, how it can be induced, the disease course, phenotypic indices and strains that are susceptible or resistant to the induction of the disease phenotypes.
The disease portal approach utilizes in-depth curation and dedicated web tools to provide detailed coverage for specific disease areas. Upcoming portals will cover Autoimmune Diseases, Cancer, Metabolic and Nutritional Diseases, Renal Diseases, Respiratory Diseases and other high-priority research areas for the rat model. Until these become available, the general disease portal page and search tool (http://rgd.mcw.edu/tools/diseases/disease_search.cgi) provides a convenient way to find the genes, orthologs and QTL associated with any disease that have been curated by the regular RGD literature curation effort.