A central goal of integrative systems biology is the accurate representation of molecular interaction networks. Ultimately, such networks can underpin mathematical models, consisting of stochastic or ordinary differential equations, that permit the simulation of biological behaviour. The first step in generating such models is to construct a network of biochemical reactions and interactions between the molecular components of the system, forming a qualitative (unparameterised) model. Several groups have reconstructed the metabolic network of baker's yeast from genomic and literature data [1]. Variation in the approaches used, and contradictory interpretations of the available literature, mean that most reconstructions differ considerably. To resolve these problems, a cohort of the yeast systems biology community collaborated to create a consensus reconstruction. In April 2007, a large focused meeting brought together experts from various groups and disciplines to resolve discrepancies between the reactions and metabolites described by the available reconstructions. The resultant reconstruction [4], subsequently referred to as "Yeast 1.0", removed the ambiguities inherent in its predecessors through the use of principled and computer-readable annotations. Whereas previous reconstructions had defined entities using subjective names, which lacked precision, Yeast 1.0 directly referenced chemical and protein descriptions in persistent databases or used standardised, database-independent, computer-readable representations. This allowed the new reconstruction to be used effectively as the basis for automated analyses.
A limitation of Yeast 1.0 came about through the very generation of the consensus; the network became considerably fragmented as reactions that could not be readily annotated (due to the presence of structural ambiguities) were removed. This led to underrepresentation of a number of pathways, particularly those involved in lipid biosynthesis. Since Yeast 1.0, many improvements have been made to the reconstruction. The latest release, described here, is considerably larger (in terms of numbers of metabolites and reactions), of higher quality (by reference to literature evidence), exhibits greater coverage of known metabolic enzymes, and is better connected than all previous efforts.
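The notion of fragmentation can be made concrete by counting the connected components of the bipartite metabolite-reaction graph. The following is a minimal sketch of that idea, not necessarily the measure used for the reconstruction; the function name `connected_components` and the toy reaction and metabolite identifiers are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def connected_components(reactions):
    """Return the connected components of a bipartite metabolite-reaction graph.

    `reactions` maps a reaction id to the metabolite ids it involves.
    Each component is a set of ("reaction", id) / ("metabolite", id) nodes.
    """
    adjacency = defaultdict(set)
    for reaction_id, metabolites in reactions.items():
        reaction_node = ("reaction", reaction_id)
        adjacency[reaction_node]  # register reactions even if they list no metabolites
        for metabolite_id in metabolites:
            metabolite_node = ("metabolite", metabolite_id)
            adjacency[reaction_node].add(metabolite_node)
            adjacency[metabolite_node].add(reaction_node)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node collects one component.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy network (hypothetical identifiers): a linear chain A-B-C-D.
full_network = {"r1": ["A", "B"], "r2": ["B", "C"], "r3": ["C", "D"]}

# Dropping r2, as if it could not be annotated, splits the chain in two.
pruned_network = {k: v for k, v in full_network.items() if k != "r2"}

print(len(connected_components(full_network)))
print(len(connected_components(pruned_network)))
```

In this toy case, removing a single linking reaction turns one connected network into two disconnected fragments, which is the kind of effect described above for reactions that could not be readily annotated.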
The reconstruction is described and made available in Systems Biology Markup Language (SBML) [5], an established community XML format for the mark-up of biochemical models. Since the introduction of SBML Level 2, specific model entities, such as species or reactions, can be annotated using ontological terms. These annotations, encoded using the resource description framework (RDF) [6], provide the facility to assign definitive terms to individual components, allowing software to identify such components unambiguously and thus link model components to existing data resources [7]. Annotations compliant with Minimum Information Requested in the Annotation of Models (MIRIAM) [8] have been used to identify components unambiguously by associating them with one or more terms from publicly available databases registered in MIRIAM Resources [9]. An example of such an annotation is presented in the figure (SBML example), where an enzyme is identified by MIRIAM-compliant references to the UniProt [10], SGD [11], and PubMed [12] databases. Metabolites are annotated with reference to the ChEBI (Chemical Entities of Biological Interest) database [13]. Whilst SBML is the primary format for dissemination of the reconstruction, we also make the reconstruction available in an online database [14], B-Net, that enables easy searching of the content. B-Net [15] is able to represent all of the SBML features utilised in the current reconstruction. Searches can be performed using synonyms, and the user can navigate through the network from any point (e.g. a metabolite, reaction or enzyme) to its connected neighbours. Query results can also be exported in SBML, which provides an effective mechanism for extracting subsets of the entire model in this exchange format.
SBML example. Simplified example of MIRIAM-compliant SBML, whereby an enzyme is annotated with reference to the databases UniProt, SGD and PubMed, respectively.
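To illustrate how software might read such annotations, the sketch below parses a simplified SBML Level 2 fragment with Python's standard library and collects the MIRIAM resources attached to each species. It is an illustration under stated assumptions rather than the reconstruction's actual content: the species, the `urn:miriam:` identifiers (placeholder accessions, not real records), the choice of `bqbiol` qualifiers and the helper name `miriam_annotations` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by SBML Level 2 Version 4 and its RDF annotations.
NS = {
    "sbml": "http://www.sbml.org/sbml/level2/version4",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bqbiol": "http://biomodels.net/biology-qualifiers/",
}

# Hypothetical fragment: every identifier below is a placeholder.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="example_model">
    <listOfSpecies>
      <species id="enzyme_1" metaid="meta_enzyme_1" compartment="cytoplasm">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#meta_enzyme_1">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="urn:miriam:uniprot:P00000"/>
                  <rdf:li rdf:resource="urn:miriam:sgd:S000000000"/>
                </rdf:Bag>
              </bqbiol:is>
              <bqbiol:isDescribedBy>
                <rdf:Bag>
                  <rdf:li rdf:resource="urn:miriam:pubmed:0000000"/>
                </rdf:Bag>
              </bqbiol:isDescribedBy>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
      </species>
    </listOfSpecies>
  </model>
</sbml>
"""


def miriam_annotations(sbml_text):
    """Map each species id to {qualifier name: [resource URIs]}."""
    root = ET.fromstring(sbml_text)
    resource_attr = "{%s}resource" % NS["rdf"]
    result = {}
    for species in root.iterfind(".//sbml:species", NS):
        qualifiers = {}
        for description in species.iterfind(".//rdf:Description", NS):
            for qualifier in description:
                # Strip the "{namespace}" prefix to keep only the local name.
                name = qualifier.tag.split("}")[1]
                uris = [li.get(resource_attr)
                        for li in qualifier.iterfind(".//rdf:li", NS)]
                qualifiers.setdefault(name, []).extend(uris)
        result[species.get("id")] = qualifiers
    return result


annotations = miriam_annotations(SBML_TEXT)
for species_id, qualifiers in annotations.items():
    for name, uris in qualifiers.items():
        print(species_id, name, uris)
```

Because each resource is a structured URI rather than a free-text name, a program can resolve the enzyme against the referenced databases without any name matching.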