EuroPhenome (http://www.europhenome.org) and EMPReSS (http://empress.har.mrc.ac.uk/) form an integrated resource to provide access to data and procedures for mouse phenotyping. EMPReSS describes 96 Standard Operating Procedures for mouse phenotyping. EuroPhenome contains data resulting from carrying out EMPReSS protocols on four inbred laboratory mouse strains. As well as web interfaces, both resources support web services to enable integration with other mouse phenotyping and functional genetics resources, and are committed to initiatives to improve integration of mouse phenotype databases. EuroPhenome will be the repository for a recently initiated effort to carry out large-scale phenotyping on a large number of knockout mouse lines (EUMODIC).
With an extensive knowledge of the gene content of mammalian genomes becoming a reality through the completion of a number of genome sequences including mouse (1), focus has shifted to the study of gene function in these organisms. The aim in this postgenomic era is to link genomic and phenotype information systematically to allow a deeper understanding of the processes leading from genomic changes to altered phenotype and disease. Mouse mutants represent one of the most powerful tools in this endeavour (2). To facilitate these aims, a number of projects are underway ranging from the production of large collections of mouse point and knockout mutants (3,4) through to the establishment of large-scale phenotype characterization projects that aim to provide a phenotype assessment of these lines (5) (EUMODIC—http://www.eumodic.eu). The ultimate challenge will be to interpret this data using computational methods.
A number of critical steps must be achieved for the comprehensive analysis and interpretation of this data to be possible. First, phenotype data on both normal inbred strains and mutant strains must be collected in community databases with open access. Second, the phenotype data must be generated using comprehensive phenotyping platforms that provide standardized methods, so that the results can be compared between laboratories and across time (5). Third, structured descriptions of the phenotypes must be used to allow the data to be interpreted in a consistent manner (6–8).
EMPReSS (European Mouse Phenotyping Resource for Standardized Screens) (5,9) is the product of Eumorphia (http://www.eumorphia.org), the largest programme to date to develop standardized phenotyping protocols. EMPReSS is a comprehensive database of validated Standard Operating Procedures (SOPs) for screens to determine the phenotype of a mouse. It incorporates 96 SOPs that cover all of the main body systems including: clinical chemistry, hormonal and metabolic systems, cardiovascular, allergy and infection, sensory function, neurological and behavioural function, cancer, and bone and cartilage systems. In addition, there are generic SOPs for histology, necropsy, pathology and gene expression. EMPReSS is a platform of individual tests, but these can also be grouped together into phenotyping pipelines.
EuroPhenome (http://www.europhenome.org) was instigated as an online mouse phenotyping resource to store baseline data generated from the application of EMPReSS SOPs. Data was collected by individual work packages in the Eumorphia project for purposes of SOP validation and to provide baseline information against which phenotypes of experimental animals could be compared. It currently includes data from 24 EMPReSS SOPs, representing the measurement of 132 parameters across four inbred mouse strains in up to four different laboratories.
EuroPhenome is a MySQL relational database (http://www.mysql.com/) providing baseline data on four inbred mouse strains (C57BL/6J, C3H/HeBFeJ, BALB/cByJ and 129/SvPas). Data were obtained by researchers working in the Eumorphia project. The data are subdivided into five phenotyping domains corresponding to body systems (see Table 1).
The EuroPhenome data browser utilizes PHP and AJAX and gives the user a variety of ways to visualize and search the data, allowing assessment of inter-strain, -gender and -laboratory variation for the various SOPs. The web-accessible data browser (http://www.europhenome.org/browser) allows users to browse through the data by SOP. SOPs are presented in a tree structure on the left-hand side of the browser and are clustered under five phenotyping domains, corresponding to the classification in EMPReSS. Opening a folder corresponding to one of these domains reveals a list of SOP names. Clicking on an SOP name opens a summary page in the main frame of the browser that lists the parameters measured under the selected SOP, together with mean, median and variation measures over the entire dataset for each parameter (Figure 1). Users can also link directly to the description of the SOP in EMPReSS and to similar measurements in the Mouse Phenome Database (10), and can download the data in Excel format, allowing them to carry out their own analyses and to combine the data with their own in-house data. Clicking on an individual parameter reveals the parameter means and SDs broken down by age and sex. These values are also represented graphically for easy visual comparison (Figure 2). The left-hand panel also provides a link to Phenostat, which can be used for statistical analysis of datasets (11).
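The per-parameter summaries described above (mean, median and SD, broken down by strain and sex) can be reproduced from downloaded data along the following lines. This is a minimal sketch and not the browser's own implementation: the rows, their layout and the values are made up for illustration.

```python
import statistics
from collections import defaultdict

# Hypothetical rows, as might be read from a downloaded spreadsheet:
# (strain, sex, value). The values are invented for illustration only.
rows = [
    ("C57BL/6J", "female", 20.0),
    ("C57BL/6J", "female", 22.0),
    ("C57BL/6J", "male", 25.0),
    ("C57BL/6J", "male", 27.0),
    ("BALB/cByJ", "female", 19.0),
    ("BALB/cByJ", "female", 21.0),
]


def summarize(rows):
    """Group values by (strain, sex) and return n, mean, median and sample SD."""
    groups = defaultdict(list)
    for strain, sex, value in rows:
        groups[(strain, sex)].append(value)

    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # The sample SD is undefined for a single observation.
            "sd": statistics.stdev(values) if len(values) > 1 else None,
        }
    return summary


summary = summarize(rows)
for (strain, sex), stats in sorted(summary.items()):
    print(strain, sex, stats)
```

Grouping on a composite key keeps the sketch easy to extend: adding age or laboratory to the key would give the further breakdowns mentioned in the text.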
In addition to this browsing interface, we have also implemented a MartView/BioMart (http://www.biomart.org/) interface to the database, which can be accessed from a link on the left-hand panel of the browser. This allows export of selected fields to HTML pages, and in comma or tab separated formats.
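A comma- or tab-separated export of this kind can be loaded with a few lines of standard-library Python. The sketch below assumes a header row and uses hypothetical column names (`strain`, `sex`, `parameter`, `value`); the actual columns depend on the fields selected in the export.

```python
import csv
import io

# Hypothetical tab-separated export; the column names and values are
# assumptions made for illustration, not the real export layout.
export_text = (
    "strain\tsex\tparameter\tvalue\n"
    "C57BL/6J\tfemale\tbody_weight\t20.5\n"
    "C3H/HeBFeJ\tmale\tbody_weight\t28.1\n"
)


def read_export(text, delimiter="\t"):
    """Parse a delimited export with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    records = []
    for row in reader:
        # Convert the measurement from text to a number.
        row["value"] = float(row["value"])
        records.append(row)
    return records


records = read_export(export_text)
for record in records:
    print(record["strain"], record["sex"], record["parameter"], record["value"])
```

For a comma-separated export, the same function can be called with `delimiter=","`.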
Open standard web services technologies such as Simple Object Access Protocol (SOAP, http://www.w3.org/TR/soap) and Web Services Description Language (WSDL, http://www.w3.org/TR/wsdl) enable programmers to build complex applications without the need to install and maintain the database, and facilitate integration and interoperability between bioinformatics applications and the data they require.
Web services have been implemented in EuroPhenome to expose its data in a programmatically accessible manner. We currently use RPC/encoded-style WSDL and will soon provide the document/literal style, as recommended by the Web Services Interoperability Organization (WS-I, http://www.ws-i.org). The URL for accessing the EuroPhenome WSDL file is http://www.europhenome.org/europhenome.wsdl.
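To illustrate what an RPC-style request looks like on the wire, the sketch below builds a SOAP 1.1 envelope with the Python standard library. The operation name `getParameterSummary`, the parameter `sopId`, its value and the service namespace are all hypothetical; the real operations and namespace are those declared in the WSDL file. A complete RPC/encoded message would often also carry `xsi:type` attributes on each parameter, which are omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 namespaces.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
# Hypothetical target namespace; the real one is declared in the WSDL.
SERVICE_NS = "urn:example-europhenome"


def build_rpc_request(operation, params):
    """Return an RPC-style SOAP 1.1 request envelope as a string."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # In RPC style, the first child of Body is named after the operation.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    call.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)
    # Each parameter becomes an unqualified child element of the call.
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter, for illustration only.
request = build_rpc_request("getParameterSummary", {"sopId": "EXAMPLE_SOP"})
print(request)
```

In practice a SOAP client library would generate such messages from the WSDL file automatically; building one by hand is shown only to make the message structure visible.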
The original implementation of the EMPReSS SOP site is described in Green et al. (9) and its scientific genesis in (5). EMPReSS has subsequently been redeveloped and the browser is now a Web 2.0 resource combining Java, XML, AJAX and PHP for the visualization and searching of the SOPs. The redeveloped resource also enables bioinformaticians to access the data in the eXist XML database (http://exist.sourceforge.net/) via web services. The EMPReSS browser has been extended to allow users to select data generated from SOPs in the two EMPReSSslim phenotyping pipelines (see below). The pipelines are comprehensive sequences of SOPs that are being utilized for high-throughput phenotyping in the EUMODIC project. The interface also allows direct access to available data generated using a particular SOP in EuroPhenome (Figure 3).
The EMPReSS SOPs have also been annotated with high level Mammalian Phenotype (MP) ontology terms (8) and an ontology tree browser interface developed to facilitate the identification of the annotated SOPs. This annotation currently uses top-level MP terms, but we aim to develop it further in depth and diversity.
The major future objective for EuroPhenome is to move from the collection of phenotyping data on inbred strains to mutant strains. EuroPhenome will be the primary repository and query tool for data generated from the EUMODIC project, which will phenotype up to 650 knockout lines generated by the EUCOMM project (http://www.eucomm.org). EUMODIC has developed a phenotyping screen called EMPReSSslim, which is structured for comprehensive, primary, high-throughput phenotyping of large numbers of mice. The primary phenotype assessment using EMPReSSslim will be undertaken in four large-scale phenotyping centres at the GSF, Germany; ICS, France; MRC Harwell, UK and the Sanger Institute, UK. Data generated in the centres will be submitted to EuroPhenome as XML files. It is envisaged that the EuroPhenome XML Schema could be used more generally for the exchange of phenotyping data.
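As a rough illustration of XML-based data exchange of this kind, the sketch below writes a small submission document and reads it back. The element and attribute names (`submission`, `line`, `measurement`, `centre`, `parameter`, `unit`) and all values are invented for this example; they are not taken from the EuroPhenome XML Schema.

```python
import xml.etree.ElementTree as ET


def build_submission(centre, line, measurements):
    """Serialize measurements for one mouse line as a small XML document.

    All element and attribute names are hypothetical placeholders.
    """
    root = ET.Element("submission", centre=centre)
    line_element = ET.SubElement(root, "line", name=line)
    for parameter, value, unit in measurements:
        measurement = ET.SubElement(
            line_element, "measurement", parameter=parameter, unit=unit
        )
        measurement.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up centre, line and measurement, for illustration only.
xml_text = build_submission(
    "ExampleCentre", "example-knockout-line", [("body_weight", 24.3, "g")]
)

# The receiving side parses the document back into parameter values.
parsed = ET.fromstring(xml_text)
values = {m.get("parameter"): float(m.text) for m in parsed.iter("measurement")}
print(values)
```

A real exchange would additionally validate each file against the published schema before loading it, a step that the standard library alone does not provide.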
EuroPhenome is a core member of the Mouse Phenotype Database Integration Consortium (MPDIC) (13) and CASIMIR (Coordination and Sustainability of International Mouse Informatics Resources; http://www.casimir.org.uk/). As such, one of its primary aims over the next 1–2 years is to integrate with many other data sources including Ensembl (http://www.ensembl.org), MGI (http://www.informatics.jax.org/), RIKEN (http://www.gsc.riken.go.jp/Mouse/), EUCOMM (http://www.eucomm.org), EURExpress (http://www.eurexpress.org) and ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) to enable intelligent querying of the data. We already support some integration with the Mouse Phenome Database (see above) and have shared our SOPML with RIKEN. To develop this integration, we will in part utilize existing technologies such as BioMart and DAS, but will also investigate alternative methods. Of particular importance in this effort will be the development of new standards, as described elsewhere (13). Particular areas for development include:
A longer-term aim is to open up EuroPhenome-EMPReSS to the broad community as a repository for phenotype data via a linked submission procedure that would acquire information about phenotyping procedure, data, and ontological markup.
We thank Simon Greenaway for initial work on the EuroPhenome database schema and Aadya Shukla for coordinating the collection of raw data and for a preliminary draft of the database. We also thank the members of the EUMORPHIA consortium for their work in drafting the SOPs and generating the baseline data. Finally, we thank Hilary Gates and Mandy Studley for outstanding support during the development of these projects. This work was supported by the European Commission (QLG2-CT-2002-00930, LSHG-CT-2006-037188) and the UK Medical Research Council. Funding to pay the Open Access publication charges for this article was provided by the UK Medical Research Council.
Conflict of interest statement. None declared.