Support for gene expression data is the largest expansion of Xenbase. Gene expression data can be accessed in two ways: via the expression tab of the gene catalog and a new expression search. The expression tab lists the anatomy terms and expression stages where evidence of expression exists for each gene. Any expression thumbnail images are organized into categories by their source and are ordered from the latest to the earliest development stage. Clicking on an expression thumbnail will display a modal window with the standard-sized version of the image. This window also contains a chart summarizing any associated annotations such as the development stages (5) when expression occurs and XAO terms for tissues where expression occurs. Additional information may include a caption, copyright information, a link to a large image and a permanently linkable page. Indeed, clicking on an image anywhere in Xenbase will pull up a modal image box with this information. Users can navigate directly to the previous or next image in the series from the expression tab by clicking on the right or left side of the modal dialog box.
Modal dialog of gene expression information. When a thumbnail image is selected by the user, a modal dialog box is launched showing an enlarged view and additional data.
An expression search feature complements the expression tab by allowing users to search for expression patterns using broad criteria and to find expression patterns for clones not yet mapped to a gene. Users can search expression by a combination of gene name, development stages, XAO anatomy terms, data source type, experimenter and assay type or source type.
Expression search interface. Over 500 000 gene expression objects can be searched using a variety of criteria.
To enhance usability, users can select XAO search terms by clicking checkboxes of commonly searched terms or by using a suggestion box that allows users to search for tissues by their XAO term or common synonyms. The suggestion box lists possible matching terms as the user types. When the user selects an anatomy term, it appears in a list to the right of the search box. Users can drill down to more specific terms by clicking the plus sign to the left of a tissue and examining the parts of that tissue. In this manner, the user can find specific search terms without strong knowledge of the anatomy ontology.
After executing the search, the user is presented with a list of expression pattern results organized by gene (or by clone, if unmapped). The results show which stages and XAO terms were matched for each gene or clone. Searching for a particular tissue will also return items annotated to parts of that tissue. For example, a search for pronephric kidney will also return results annotated to the glomerulus. From the initial search results, the user can drill down to experiments matching their search criteria and then to detailed information on each experiment.
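The part-of expansion described above can be illustrated with a short sketch. This is not the Xenbase implementation; the `PARTS` table, the "hypothetical sub-part" term, the gene labels and the function names are all assumptions made for illustration, and only the pronephric kidney and glomerulus relationship comes from the text.

```python
from collections import deque

# Hypothetical part-of table (tissue -> direct parts). Only the
# "pronephric kidney" -> "glomerulus" link is taken from the text;
# the other entries are placeholders.
PARTS = {
    "pronephric kidney": ["glomerulus", "hypothetical sub-part"],
    "glomerulus": [],
}

def expand_term(term, parts=PARTS):
    """Return the term together with all of its transitive parts."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in parts.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def search_by_tissue(annotations, term, parts=PARTS):
    """Keep annotations whose tissue is the term or one of its parts."""
    wanted = expand_term(term, parts)
    return [a for a in annotations if a["tissue"] in wanted]

# Illustrative annotations; gene labels are placeholders.
annotations = [
    {"gene": "geneA", "tissue": "glomerulus"},
    {"gene": "geneB", "tissue": "pronephric kidney"},
    {"gene": "geneC", "tissue": "eye"},
]
hits = search_by_tissue(annotations, "pronephric kidney")
```

Under these assumptions, a search for pronephric kidney returns the annotations for both the kidney itself and the glomerulus, but not the unrelated tissue.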
Xenbase gene expression evidence is drawn from expressed sequence tags (ESTs), in situ hybridization and immunohistochemistry assays. For EST evidence, genes are aligned to ESTs from particular tissues at specific development stages. Gene expression images come from three sources: literature, large-scale screens and community submissions. Images extracted from papers with the permission of publishers are manually curated and associated with genes, tissues and development stages based on information in the image caption and article content.
For older literature from which we have scraped images, we have performed an automated first pass of curation. Using the link-matching system, we identify gene names and synonyms in the captions. If a single gene is mentioned in the caption, we also search for the use of XAO tissue terms or their synonyms. We then infer that the single gene mentioned must be expressed in all of the identified tissues. Development stage descriptions vary too much to extract them by automated means. Therefore, stages for these expression patterns are set to ‘unspecified’. While imperfect, this process allows initial associations between uncurated literature gene expression images and genes (and possibly XAO tissues). These literature images will be manually curated as time permits.
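The logic of this first pass can be sketched as follows. The sketch is an assumption-laden illustration, not the link-matching system itself: the whole-word, case-insensitive matching, the function names and the small synonym tables are all hypothetical. It also assumes that captions mentioning several genes still receive gene associations but no tissue inference, which is one reading of the description above.

```python
import re

def find_terms(text, synonyms):
    """Return canonical names whose name or any synonym occurs in text
    as a whole word (case-insensitive). Matching rule is an assumption."""
    found = set()
    for canonical, names in synonyms.items():
        for name in [canonical, *names]:
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, text, re.IGNORECASE):
                found.add(canonical)
                break
    return found

def first_pass(caption, gene_synonyms, tissue_synonyms):
    """Automated first-pass annotation of one image caption."""
    genes = find_terms(caption, gene_synonyms)
    # Tissues are inferred only when exactly one gene is mentioned.
    tissues = find_terms(caption, tissue_synonyms) if len(genes) == 1 else set()
    return {
        "genes": sorted(genes),
        "tissues": sorted(tissues),
        "stage": "unspecified",  # stages are not extracted automatically
    }

# Illustrative lookup tables, not actual Xenbase content.
gene_synonyms = {"pax8": [], "lhx1": ["lim1"]}
tissue_synonyms = {"pronephric kidney": ["pronephros"], "otic vesicle": []}

single = first_pass("Expression of pax8 in the pronephros and otic vesicle.",
                    gene_synonyms, tissue_synonyms)
double = first_pass("lim1 and pax8 in the pronephros.",
                    gene_synonyms, tissue_synonyms)
```

Here `single` associates one gene with both tissues, whereas `double` records two genes and no tissues, leaving the tissue assignment to manual curation.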
The largest block of expression images comes from two large in situ screens, AxelDB (6) and XDB3 (Naoto Ueno, NIBB, Okazaki, Japan), consisting of 2600 and 18 600 images, respectively. The Axeldb images are annotated by the development stage and often contain tissue annotation. The XDB3 images were only annotated with development stages. For both of these screens, the gene involved has been determined by aligning the sequences for the probes used with mRNA for Xenbase gene catalog entries.
The last source of images is community submissions, of which there are currently over 2000. At a minimum, these images are submitted with stage annotation and accessions for the sequences used to generate probes. These clone sequences are aligned with mRNAs in our gene catalog to create gene associations. We would like to encourage users to submit their gene expression image data. To this end, there is a submit data button on each Xenbase page where users can upload data files, including an optional template spreadsheet for entering image annotation.
The alignment threshold used by Xenbase to align clone sequences from expression experiments (e.g. ESTs) to gene catalog mRNAs is a BLAST hit with a maximum E-value of 1e−20, a minimum 90% identity and 65% alignment. This methodology leaves open the possibility that a single probe may be incorrectly associated with two genes with similar sequences. In these rare cases, a curator chooses the correct assignment.
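As a rough illustration of these thresholds, the filter below accepts or rejects BLAST hits and flags probes that match more than one gene. The hit fields (`evalue`, `pct_identity`, `pct_aligned`), the function names and the gene labels are assumptions for the sketch; in particular, `pct_aligned` assumes that '65% alignment' means the percentage of the clone sequence covered by the hit.

```python
MAX_EVALUE = 1e-20     # maximum E-value
MIN_IDENTITY = 90.0    # minimum percent identity
MIN_ALIGNED = 65.0     # minimum percent aligned (assumed: clone coverage)

def passes(hit):
    """True if a single BLAST hit meets all three thresholds."""
    return (hit["evalue"] <= MAX_EVALUE
            and hit["pct_identity"] >= MIN_IDENTITY
            and hit["pct_aligned"] >= MIN_ALIGNED)

def assign_gene(hits):
    """Assign a probe to a gene, or flag it when several genes qualify."""
    genes = sorted({h["gene"] for h in hits if passes(h)})
    if len(genes) == 1:
        return {"gene": genes[0], "candidates": genes, "needs_curation": False}
    # No qualifying gene: nothing to assign. Several: a curator decides.
    return {"gene": None, "candidates": genes, "needs_curation": len(genes) > 1}

# Illustrative hits for one probe; gene labels are placeholders.
hits = [
    {"gene": "geneA", "evalue": 1e-50, "pct_identity": 98.0, "pct_aligned": 80.0},
    {"gene": "geneB", "evalue": 1e-30, "pct_identity": 92.0, "pct_aligned": 70.0},
    {"gene": "geneC", "evalue": 1e-10, "pct_identity": 99.0, "pct_aligned": 90.0},
]
result = assign_gene(hits)
```

In this example two genes pass the thresholds, so the probe is not assigned automatically and is marked for curator review; the third hit is rejected on E-value alone.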