Summary: The R/Bioconductor package RamiGO is an R interface to AmiGO that enables visualization of Gene Ontology (GO) trees. Given a list of GO terms, RamiGO uses the AmiGO visualize API to import Graphviz-DOT format files into R, and export these either as images (SVG, PNG) or into Cytoscape for extended network analyses. RamiGO provides easy customization of annotation, highlighting of specific GO terms, colouring of terms by P-value or export of a simplified summary GO tree. We illustrate RamiGO functionalities in a genome-wide gene set analysis of prognostic genes in breast cancer.
Availability and implementation: RamiGO is provided in R/Bioconductor, is open source under the Artistic-2.0 License and is available with a user manual containing installation, operating instructions and tutorials. It requires R version 2.15.0 or higher. URL: http://bioconductor.org/packages/release/bioc/html/RamiGO.html
Supplementary information: Supplementary data are available at Bioinformatics online.
The Gene Ontology (GO) is a controlled vocabulary of gene annotation that was developed to provide consistent functional classification for genes across species; GO is also widely used in gene set enrichment analyses (Gene Ontology Consortium, 2012). It is organized as a directed acyclic graph with top-level ontologies molecular function, biological process and cellular component. Although several web-based or standalone tools are available for visualization of lists of GO terms as GO tree structures, these are not easily accessible through R/Bioconductor, where one has to either rebuild the GO tree using R packages such as GO.db or GOstats, or copy and paste the GO terms of interest into an external web service such as AmiGO Visualize (Carbon et al., 2009, http://amigo.geneontology.org) to display the GO tree.
The free open source web application AmiGO provides users with access to ontology and gene annotation data. Visualization of the queried ontology data is provided by AmiGO Visualize, which uses Graphviz libraries to create GO trees. Graphviz is a collection of software for viewing and manipulating abstract graphs (Gansner and North, 2000). We developed RamiGO to provide R functions that connect directly with the AmiGO Visualize API and retrieve GO trees in various formats. The most common formats are PNG and SVG image files, but the GO tree can also be retrieved as a file in the Graphviz-DOT format. RamiGO provides a parser for the Graphviz-DOT format that returns an R S4 object, called AmigoDot, that includes (i) the tree as a graph object, (ii) the adjacency matrix of the tree, (iii) the annotation of the nodes, (iv) the relation between the nodes and (v) a list of the leaves of the tree. The GO tree is displayed with the set of input GO IDs and all parents of those GO IDs up to the root of each GO category. Relations between the nodes are represented using the same colour palette as AmiGO Visualize: green, red, black, blue and light blue represent ‘positively regulates’, ‘negatively regulates’, ‘regulates’, ‘is a’ and ‘part of’, respectively. In addition, RamiGO allows one to display and interactively modify GO tree colours, annotations and relationships between nodes in Cytoscape directly from R.
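To make the contents of such a parsed object concrete, the following is a minimal sketch in Python, not RamiGO's R parser. It assumes a heavily simplified DOT text in which every edge runs from parent to child and carries a single colour attribute; the term identifiers, the sample text and the function name parse_dot are hypothetical, and real AmiGO Visualize output carries more attributes than this.

```python
import re

# Relation colours as listed in the text for AmiGO Visualize.
COLOUR_TO_RELATION = {
    "green": "positively regulates",
    "red": "negatively regulates",
    "black": "regulates",
    "blue": "is a",
    "lightblue": "part of",
}

# Hypothetical, simplified DOT text with placeholder term IDs.
# Assumption: each edge points from parent to child.
SAMPLE_DOT = """
digraph go {
  "GO:root" -> "GO:a" [color=blue];
  "GO:root" -> "GO:b" [color=blue];
  "GO:a" -> "GO:c" [color=lightblue];
  "GO:b" -> "GO:c" [color=green];
}
"""

# Matches: "parent" -> "child" [color=name]
EDGE_RE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"\s*\[color=(\w+)\]')


def parse_dot(dot_text):
    """Return nodes, adjacency matrix, edge relations and leaves of the tree."""
    edges = EDGE_RE.findall(dot_text)  # list of (parent, child, colour)
    nodes = sorted({n for parent, child, _ in edges for n in (parent, child)})
    index = {name: i for i, name in enumerate(nodes)}

    # adjacency[i][j] == 1 means there is an edge from nodes[i] to nodes[j].
    adjacency = [[0] * len(nodes) for _ in nodes]
    relations = {}
    for parent, child, colour in edges:
        adjacency[index[parent]][index[child]] = 1
        relations[(parent, child)] = COLOUR_TO_RELATION.get(colour, "unknown")

    # Leaves are nodes without outgoing edges (no children).
    leaves = [name for name in nodes if not any(adjacency[index[name]])]
    return {
        "nodes": nodes,
        "adjacency": adjacency,
        "relations": relations,
        "leaves": leaves,
    }


tree = parse_dot(SAMPLE_DOT)
print(tree["leaves"])  # prints ['GO:c']
```

The adjacency matrix and the leaf list are both derived from the same edge list, which is why a single pass over the DOT edges is enough to fill every component in this sketch.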
The main functions within the RamiGO package are getAmigoTree(), readAmigoDot() and AmigoDot.to.Cyto(), which retrieve a GO tree from AmiGO Visualize in the preferred format type, read Graphviz-DOT format files and communicate with Cytoscape, respectively. The AmigoDot.to.Cyto() function displays the tree from an AmigoDot object in Cytoscape using the RCytoscape package. RCytoscape requires the Cytoscape plugin CytoscapeRPC; its installation is described in more detail in the RCytoscape manual. Additional functions to convert the AmigoDot S4 objects to graphAM and graphNEL formats are also provided (AmigoDot.to.graphAM, AmigoDot.to.graphNEL).
To showcase an analysis pipeline (Supplementary Material) that integrates the RamiGO package, we used the Bioconductor data package breastCancerVDX, which includes Affymetrix Human U133A microarray profiles from 344 breast cancer patients (Wang et al., 2005; Minn et al., 2007). We ranked the genes based on their prognostic value by computing the concordance index to estimate the association of each gene with distant metastasis-free survival using the package survcomp (Schröder et al., 2011). This ranked list was then used to run a pre-ranked Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005), using the C5 (GO) subset of the Molecular Signatures Database (v3.0) gene sets (http://broadinstitute.org/gsea/msigdb/collections.jsp). Figure 1 shows a subset of the GO cellular component gene sets that were reported with a false discovery rate <0.15. Using the customized colouring option in RamiGO, the nodes for gene sets with a positive Normalized Enrichment Score (NES) are red and nodes for gene sets with a negative NES are blue. The clustering of red and blue nodes is clearly visible in the tree and would have been missed had we looked only at the list of GO gene set terms.
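The ranking and colouring steps described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the survcomp or RamiGO code: it uses a plain Harrell-style concordance index (survcomp offers its own estimators), skips tied survival times for simplicity, and runs on invented toy data. The gene names, the function names and the neutral colour returned for an NES of exactly zero are hypothetical.

```python
from itertools import combinations


def concordance_index(scores, times, events):
    """Harrell-style C-index for right-censored data.

    A pair is usable when the patient with the shorter follow-up time had an
    event. It is concordant when that patient also has the higher score.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(scores)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the order is unknown
        usable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5  # tied scores count as half
    return concordant / usable if usable else float("nan")


def nes_colour(nes):
    """Red for positive NES, blue for negative NES, as in the text.

    Returning white for an NES of exactly zero is an assumption.
    """
    if nes > 0:
        return "red"
    if nes < 0:
        return "blue"
    return "white"


# Invented toy data: four patients, one censored observation (event == 0).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
expression = {
    "geneHigh": [4, 3, 2, 1],  # high expression goes with early events
    "geneFlat": [1, 1, 1, 1],  # uninformative
    "geneLow": [1, 2, 3, 4],   # high expression goes with late events
}

cindex = {g: concordance_index(x, times, events) for g, x in expression.items()}
ranked = sorted(cindex, key=cindex.get, reverse=True)
print(ranked)  # prints ['geneHigh', 'geneFlat', 'geneLow']
```

A ranked list of this kind is what a pre-ranked GSEA takes as input; the sign of each resulting NES then decides the node colour.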
Figure 1 shows the PNG image file of the tree from AmiGO Visualize. By changing the picType parameter of the getAmigoTree() function, we can instead request SVG or DOT as the file type. Alternatively, the tree can be viewed in Cytoscape by exporting the AmigoDot object with AmigoDot.to.Cyto() (Fig. 2).
The R/Bioconductor package RamiGO provides an easy-to-use R interface to the AmiGO Visualize web server. It provides a simple and elegant way to retrieve Graphviz trees that display hundreds of GO IDs at once and efficiently study clusters or subcomponents of the GO tree in graph form. RamiGO provides functions to convert a GO tree into different formats and display it in Cytoscape without leaving the R environment. RamiGO is therefore a perfect companion to GSEA and GO analyses in R, as it helps one better analyse and interpret the long, and sometimes complex, lists of GO identifiers that these analyses produce.
Funding: This work was supported by the National Human Genome Research Institute (1P50 HG004233 to M.S.S.); Fulbright Commission for Educational Exchange (to B.H.-K.); US National Institutes of Health (#1U01CA151118-01A1 to J.Q., R01 LM010129-01 to B.H.-K. and J.Q.); Claudia Adams Barr Program in Innovative Basic Cancer Research (to A.C.C. and J.Q.); Career Development grant from DFCI Breast Cancer SPORE: CA089393, Dana-Farber Cancer Institute Women’s Cancer Program (to A.C.C.).
Conflict of Interest: none declared.