Annotation pages and Search
The front page of the database lists the available experiments and provides the search tools and links to the data mining tools. Selecting an experiment from the list displays its basic experiment page.
At the top of the experiment page is an abstract for the experiment, along with contact details for the experimenter. Beneath this is a list of the GeneChips (referred to as ‘slides’) used in the experiment. Each slide carries information about how the user prepared the sample that was hybridized to it, and how the slide was handled when it was processed at NASC.
There are currently two searches available from the front page. The experiment search is a keyword search over experiments, much like a web search engine. The slide search selects slides that exactly match the criteria entered, for instance all slides that were produced from samples treated in a certain way.
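The difference between the two searches can be illustrated with a minimal sketch. It assumes experiments are held as a mapping from name to abstract and slides as dictionaries of attributes; the function names, field names and sample records are hypothetical and are not taken from NASCArrays itself.

```python
def experiment_search(experiments, query):
    """Keyword search: keep experiments whose abstract contains every query word."""
    words = query.lower().split()
    return sorted(
        name
        for name, abstract in experiments.items()
        if all(word in abstract.lower() for word in words)
    )


def slide_search(slides, **criteria):
    """Exact search: keep slides whose attributes equal every criterion given."""
    return [
        slide["id"]
        for slide in slides
        if all(slide.get(field) == value for field, value in criteria.items())
    ]


# Invented records, for illustration only.
experiments = {
    "exp_1": "Response of seedlings to cold treatment",
    "exp_2": "Root development under high light",
}
slides = [
    {"id": "slide_A", "treatment": "cold", "tissue": "leaf"},
    {"id": "slide_B", "treatment": "cold", "tissue": "root"},
    {"id": "slide_C", "treatment": "light", "tissue": "root"},
]

cold_experiments = experiment_search(experiments, "cold seedlings")
cold_root_slides = slide_search(slides, treatment="cold", tissue="root")
print(cold_experiments, cold_root_slides)
```

The keyword search tolerates partial, free-text matches, whereas the slide search returns a slide only when every supplied attribute matches exactly.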
At many points in the database, the output data from the microarrays can be downloaded. NASC uses Affymetrix MAS 5.0 software for scanning and analysis of Affymetrix microarrays. NASCArrays stores four data points for each gene per GeneChip from this software: Signal, StatPairsUsed, PresentCall and Detection P-value. These are reproduced with some rudimentary annotation for the probes on the GeneChip when users download data from the database.
Data can be downloaded for one or many slides at once. Data are supplied as a comma separated values (CSV) file that can be read by many spreadsheet programs. Data can be downloaded over the web, or emailed to the user.
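As a rough illustration of working with such a download, the sketch below parses a small CSV fragment with Python's standard `csv` module. The column headers and the probe set identifiers are assumptions made for this example; the header row of a real NASCArrays download may be named and ordered differently.

```python
import csv
import io

# Hypothetical header names and invented rows; a real download may differ.
SAMPLE_CSV = """ProbeSet,Signal,StatPairsUsed,PresentCall,DetectionPValue
259000_at,152.3,11,P,0.0012
259001_at,8.7,11,A,0.4300
259002_at,45.1,11,M,0.0560
"""


def read_slide(handle):
    """Parse one slide's rows into a dict keyed by probe set identifier."""
    rows = {}
    for row in csv.DictReader(handle):
        rows[row["ProbeSet"]] = {
            "signal": float(row["Signal"]),
            "pairs_used": int(row["StatPairsUsed"]),
            "call": row["PresentCall"],
            "p_value": float(row["DetectionPValue"]),
        }
    return rows


def present_probes(rows):
    """Return the probe sets whose MAS 5.0 detection call is 'P' (present)."""
    return sorted(probe for probe, values in rows.items() if values["call"] == "P")


slide = read_slide(io.StringIO(SAMPLE_CSV))
print(present_probes(slide))
```

In practice the `io.StringIO` object would be replaced by an open file handle for the downloaded file.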
As an option, users can download data ‘for clustering’. These are data specially formatted for EPCLUST (7). Using this feature allows users to easily perform clustering analysis using data from NASCArrays.
Data mining tools
NASCArrays provides a series of ‘Data mining tools’. These allow researchers to take a ‘gene-centric’ rather than an ‘array-centric’ approach to finding data of interest. Many researchers have ‘genes of interest’, and these tools let them find experiments that are related to those genes. A gene of interest can be picked using the probe set reference number as given by Affymetrix, a gene symbol, an Arabidopsis Genome Initiative (AGI) identifier for a gene or a Complete Arabidopsis Transcriptome MicroArray (CATMA) code.
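One way to picture accepting four kinds of identifier in a single input is to classify the text by its shape. The sketch below is an assumption-laden illustration, not the NASCArrays implementation: the regular expressions reflect the typical appearance of AGI locus codes (such as At3g08580), Affymetrix probe set names and CATMA codes, and anything else is treated as a gene symbol.

```python
import re

# These patterns are assumptions about typical identifier shapes,
# not rules taken from NASCArrays.
PATTERNS = [
    ("agi", re.compile(r"^At[1-5CM]g\d{5}$", re.IGNORECASE)),
    ("affymetrix_probe_set", re.compile(r"^\d+(_[a-z])?_at$")),
    ("catma", re.compile(r"^CATMA\d[a-z]\d{5}$", re.IGNORECASE)),
]


def classify_identifier(text):
    """Guess which kind of identifier the user typed; default to a gene symbol."""
    text = text.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(text):
            return kind
    return "gene_symbol"


# 'GENE1' is a made-up symbol; the probe set and CATMA codes are illustrative.
for example in ["At3g08580", "259000_at", "CATMA3a07710", "GENE1"]:
    print(example, "->", classify_identifier(example))
```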
Spot history
The spot history is a tool that is available in other microarray databases (8). It shows the distribution of expression of a gene of interest over all experiments in the database. Results are displayed in the form of a histogram. Each bar in the histogram can be selected, and the slides that made up that range in the histogram will be shown. Researchers can thus easily locate slides that have unusual values of expression for a given gene (Fig. ).
A histogram from the spot history tool. This histogram shows the distribution of gene expression for At3g08580.
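The idea behind the spot history, a histogram whose bars can be traced back to slides, can be sketched as follows. This is a minimal illustration assuming fixed-width bins on the raw expression value; the source does not state how the tool chooses its bins, and the slide names and values are invented.

```python
from collections import defaultdict


def spot_history(signals, bin_width):
    """Group slides into fixed-width bins of expression value for one gene.

    signals: dict mapping slide id -> expression value.
    Returns a dict mapping each bin's lower bound -> sorted list of slide ids.
    """
    bins = defaultdict(list)
    for slide_id, value in signals.items():
        lower = int(value // bin_width) * bin_width
        bins[lower].append(slide_id)
    return {lower: sorted(ids) for lower, ids in sorted(bins.items())}


# Invented expression values for a single gene across five slides.
signals = {
    "slide_A": 120.0,
    "slide_B": 135.5,
    "slide_C": 410.2,
    "slide_D": 98.4,
    "slide_E": 1250.0,
}

hist = spot_history(signals, bin_width=100)
for lower, ids in hist.items():
    # One '#' per slide gives a crude text rendering of each bar.
    print(f"{lower:>5}-{lower + 100:<5} {'#' * len(ids)} {ids}")
```

Because each bar keeps its list of slide identifiers, selecting an outlying bar (here the one holding `slide_E`) leads directly to the slides with unusual expression.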
Two-gene scatter plot
The two-gene scatter plot is a tool for rudimentary comparison of the expression profiles of two genes. It takes the form of a scatter plot, with the expression values of one gene along each axis. This lets researchers quickly see whether the expression of the two genes follows a trend. Any given point on the plot can be selected, and the slide corresponding to that point will be shown (Fig. ).
A sample plot from the two-gene scatter plot tool.
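The data behind such a plot are simply pairs of values, one pair per slide on which both genes were measured. The sketch below builds those pairs and, as an addition that is not described as part of the NASCArrays tool, computes a Pearson correlation coefficient as one possible numerical summary of a trend. All names and values are illustrative.

```python
import math


def paired_points(gene_x, gene_y):
    """Return (slide, x, y) tuples for slides that have a value for both genes."""
    shared = sorted(set(gene_x) & set(gene_y))
    return [(slide, gene_x[slide], gene_y[slide]) for slide in shared]


def pearson(points):
    """Pearson correlation of the x and y values in a list of (slide, x, y)."""
    xs = [p[1] for p in points]
    ys = [p[2] for p in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented expression values keyed by slide; s4 and s5 lack a partner value.
gene_x = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
gene_y = {"s1": 2.0, "s2": 4.0, "s3": 6.0, "s5": 9.0}

points = paired_points(gene_x, gene_y)
r = pearson(points)
print(points, round(r, 3))
```

Keeping the slide identifier in each tuple mirrors the behaviour described above, where selecting a point reveals the slide it came from.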
Gene swinger
The ‘gene swinger’ is a tool that ranks all experiments by how variable the expression of a gene of interest is within each of them. Upon choosing a gene of interest, the tool calculates the standard error for that gene in each experiment, and displays the experiments in order of decreasing variability. Experiments in which a gene is highly variable should be of interest to researchers working on that gene.
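The ranking can be sketched as below. The source states only that a standard error is calculated per experiment; this example assumes the usual definition (sample standard deviation divided by the square root of the number of slides) and skips experiments with fewer than two slides, neither of which is confirmed for NASCArrays. Experiment names and values are invented.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def gene_swinger(experiments):
    """Rank experiments by the standard error of one gene's expression.

    experiments: dict mapping experiment name -> list of the gene's values,
    one per slide. Experiments with fewer than two slides are skipped
    because a sample standard deviation is undefined for them.
    """
    scored = [
        (name, standard_error(values))
        for name, values in experiments.items()
        if len(values) >= 2
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented values for one gene of interest.
experiments = {
    "exp_light": [200, 210, 190],
    "exp_cold": [100, 400, 900],
    "exp_single": [500],
}

ranking = gene_swinger(experiments)
for name, se in ranking:
    print(f"{name}: {se:.2f}")
```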
Bulk gene download
Researchers who are interested in a series of genes can download the expression of these genes over all experiments using Bulk gene download.
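A bulk download of this kind amounts to a table with one row per gene and one column per slide. The sketch below assembles such a table as CSV text; the layout, function name and values are assumptions for illustration, and the actual file produced by NASCArrays may be organised differently.

```python
import csv
import io


def bulk_gene_table(expression, genes):
    """Build CSV text with one row per gene and one column per slide.

    expression: dict mapping slide id -> dict of gene id -> value.
    Missing measurements are written as empty cells.
    """
    slides = sorted(expression)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["gene"] + slides)
    for gene in genes:
        writer.writerow([gene] + [expression[s].get(gene, "") for s in slides])
    return out.getvalue()


# Invented values; At1g01010 has no measurement on slide_B.
expression = {
    "slide_A": {"At3g08580": 812.4, "At1g01010": 35.0},
    "slide_B": {"At3g08580": 790.1},
}

table = bulk_gene_table(expression, ["At3g08580", "At1g01010"])
print(table)
```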
Selection
The selection is a feature analogous to the shopping cart on a web shopping site. It allows users to choose an arbitrary set of slides and perform actions on them. For instance, both the spot history and the two-gene scatter plot can operate on just the slides held in the selection. Data can also be downloaded for just the slides in the selection: a user could, for example, fill the selection with slides based on a certain ecotype, and then download all of these data as one file.
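The shopping-cart analogy can be made concrete with a small sketch: a set of slide identifiers that other operations use as a filter. The class, its method names and the sample values are hypothetical and are not drawn from the NASCArrays code.

```python
class Selection:
    """A shopping-cart style collection of slide identifiers."""

    def __init__(self):
        self._slides = set()

    def add(self, *slide_ids):
        """Put one or more slides into the selection."""
        self._slides.update(slide_ids)

    def remove(self, slide_id):
        """Take a slide out of the selection; absent slides are ignored."""
        self._slides.discard(slide_id)

    def restrict(self, signals):
        """Keep only the entries of a slide -> value mapping that are selected."""
        return {s: v for s, v in signals.items() if s in self._slides}


# Invented expression values for one gene across four slides.
signals = {"slide_A": 120.0, "slide_B": 135.5, "slide_C": 410.2, "slide_D": 98.4}

selection = Selection()
selection.add("slide_A", "slide_C", "slide_D")
selection.remove("slide_D")

subset = selection.restrict(signals)
print(subset)
```

Any tool that accepts a slide-to-value mapping can then be run on `subset` instead of the full data, which is the sense in which tools operate on just the slides held in the selection.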