The Candida Biochemical Pathways in CGD were created using Pathway Tools, a software suite developed by the Bioinformatics Research Group at SRI International (2). Each pathway, reaction, enzyme and compound in CGD has its own web page report. The Pathway page contains a diagram of the pathway with a user-selectable level of detail: at the most detailed level, the structures of each intermediate and all cofactors are displayed on the pathway diagram, along with the gene and enzyme names. The page also contains a summary, written by CGD curators, that describes what is known in Candida species (with a focus on C. albicans) about the pathway and the enzymes that participate in it, together with a list of published references for the pathway information.
The Pathway Tools software suite contains modules for generating organism-specific databases of compounds, enzymes, reactions and metabolic pathways as well as tools for data visualization and analysis. The initial set of predicted biochemical pathways was generated with the PathoLogic module, which used C. albicans enzymatic activities identified by Gene Ontology curation in CGD in conjunction with two reference datasets: SRI's reference database of biochemical reaction and pathway information, MetaCyc (3), and the set of pathways curated at the Saccharomyces Genome Database (SGD) (4). The software predicted that a pathway existed in C. albicans if at least one enzyme from that pathway in MetaCyc or SGD was identifiable among the C. albicans gene products. Because many of the predicted pathways lacked one or more enzymes, another module, the Pathway Hole Filler, was used to identify genes encoding candidate enzymes for the missing reactions (the ‘pathway holes’). The Pathway Hole Filler was configured to compare GenBank sequences associated with each of the enzymes known to carry out the reaction in other organisms to the ORF sequences from CGD; where significant similarity was found, it assigned candidate C. albicans genes to these activities, thus predicting which gene might fill the ‘pathway hole’ (5).
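The prediction rule described above can be illustrated with a minimal sketch. This is not the PathoLogic implementation; it only restates the low-stringency rule (a pathway is predicted when at least one of its enzymes is identified) and lists the remaining reactions as pathway holes. The pathway names, reaction identifiers, the function predict_pathways and the min_enzymes parameter are all hypothetical placeholders introduced for illustration.

```python
# Hypothetical reference data: pathway name -> set of reaction identifiers.
# In the text this role is played by MetaCyc and the SGD pathway set.
REFERENCE_PATHWAYS = {
    "toy pathway A": {"R1", "R2", "R3"},
    "toy pathway B": {"R4", "R5"},
    "toy pathway C": {"R6"},
}

# Hypothetical set of reactions for which an enzyme has been annotated
# in the target organism (placeholder identifiers, not real data).
ANNOTATED_REACTIONS = {"R1", "R3", "R5"}


def predict_pathways(reference, annotated, min_enzymes=1):
    """Return {pathway: sorted pathway holes} for every predicted pathway.

    A pathway is predicted when at least `min_enzymes` of its reactions
    have an annotated enzyme; the unannotated reactions are its holes.
    """
    predicted = {}
    for name, reactions in reference.items():
        found = reactions & annotated
        if len(found) >= min_enzymes:
            predicted[name] = sorted(reactions - annotated)
    return predicted


predictions = predict_pathways(REFERENCE_PATHWAYS, ANNOTATED_REACTIONS)
for name, holes in predictions.items():
    print(name, "holes:", holes)
```

In this toy example, pathways A and B are predicted with one hole each, and pathway C is not predicted because none of its enzymes is annotated. Raising min_enzymes would make the rule more stringent, at the cost of discarding borderline pathways before a curator can review them.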
The parameters for the automatic pathway generation were intentionally set at a relatively low stringency so that borderline predictions could be subjected to curatorial review rather than being automatically rejected. Consequently, the initial pathway set also contained a number of spurious and redundant pathways. CGD curators reviewed the pathway list, removed spurious predictions and identified the relevant literature for each pathway; the resulting lists of citations are displayed on each pathway page in the database. A number of new Candida pathways were added, such as those for farnesol, oxylipin, selenocysteine, xylose/xylitol and glucosylceramide metabolism. In an ongoing effort, CGD curators are reviewing each pathway in detail, making updates to the pathway structure or reactions where necessary and linking the CGD Pathway page to the corresponding pathway(s) in SGD. The literature relevant to the pathway in C. albicans and other related species is reviewed and summarized on the Pathway page. In many cases, information about a pathway is synthesized from a broad-based survey of the literature that includes characterization performed in C. albicans and Candida-related species, as indicated in the text of the summary on the Pathway page. In total, 181 pathways were added to CGD from the initial predicted set of 408 pathways and an additional 15 pathways were added de novo; subsequent curation has refined the list to the 159 pathways represented in CGD as of September 2009.
The Biochemical Pathways in CGD can be accessed via the Pathways link under the Search Options section on the CGD home page. This link opens the main Pathway Tools Query Page (http://pathway.candidagenome.org/), which provides tools for searching and browsing pathway data. The Query box allows searching for a pathway, a protein name, a reaction or a compound; reactions and proteins can be searched by name or E.C. number. The Browse Ontology box allows browsing of the pathways, E.C. numbered reactions and compounds in Pathway Tools, and of the hierarchical structures in which they are organized. For example, the pathways are classified into categories including Biosynthesis, Degradation/Utilization/Assimilation and Generation of Precursor Metabolites and Energy, with each of these classes being broken down further into more specific subclasses. The query page also provides an option to choose from an alphabetical list of all the pathways, proteins or compounds present in the database.
Any specific pathway page can easily be found by a name-based query using CGD's Quick Search box, which is present at the top of most pages in CGD. This tool performs keyword searches through major categories of information, including pathway names. Individual pathways are also accessible via hyperlinks from the Locus Summary pages of the participating genes.
The ability to analyze gene functions in the context of biochemical pathways is particularly important in Candida research because of the major focus in this field on finding drug targets and investigating mechanisms of drug resistance. The Pathway Tools suite provides a module for such analyses, the Pathway Tools Omics Viewer, which is accessible from the main Pathway Tools Query Page. This tool allows the results of large-scale experiments (e.g. microarray expression, proteomics) to be superimposed on the diagram of biochemical pathways, thus presenting a graphical overview of the response of the entire Candida metabolic profile to a particular condition or treatment. In addition, a collection of datasets can be used and the diagram can be animated to show, for instance, changes in gene expression over time.
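The idea of superimposing large-scale data on pathways can be sketched as follows. This is not the Omics Viewer's implementation or input format; it only shows, under stated assumptions, how per-gene measurements across several time points might be summarized per pathway, which is the kind of value an animated display would step through. The gene names, pathway names, expression values and the function pathway_profile are hypothetical placeholders.

```python
from statistics import mean

# Hypothetical mapping of pathways to participating genes (placeholder names).
PATHWAY_GENES = {
    "toy pathway A": ["GENE1", "GENE2"],
    "toy pathway B": ["GENE2", "GENE3", "GENE4"],
}

# Hypothetical log2 expression ratios: one value per time point for each gene.
# GENE4 is deliberately absent to show how unmeasured genes are skipped.
EXPRESSION = {
    "GENE1": [0.0, 1.0, 2.0],
    "GENE2": [0.0, -1.0, -2.0],
    "GENE3": [0.0, 0.5, 1.0],
}


def pathway_profile(pathway_genes, expression):
    """Average the measured genes of each pathway at every time point."""
    profiles = {}
    for pathway, genes in pathway_genes.items():
        measured = [expression[g] for g in genes if g in expression]
        if measured:
            # zip(*measured) groups the values of all genes by time point.
            profiles[pathway] = [mean(point) for point in zip(*measured)]
    return profiles


profiles = pathway_profile(PATHWAY_GENES, EXPRESSION)
for pathway, series in profiles.items():
    print(pathway, series)
```

A per-pathway mean is only one possible summary, and it can hide opposing changes: in toy pathway A the two genes move in opposite directions and average to zero at every time point. Displaying values gene by gene on the pathway diagram, as the text describes, avoids this loss of detail.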