DNA microarrays are commonly used to identify genes that are differentially expressed between two or more tissue types or experimental conditions. The class comparison tool of BRB-ArrayTools provides powerful methods for finding differentially expressed genes while controlling either the number or the proportion of false discoveries. The multivariate permutation tests used, described in detail elsewhere (Korn et al. 2004; Simon et al. 2003a), enable the user to specify, for example, that there should be 90% confidence that the resulting gene list contains no more than 10% false discoveries. This method is similar to the popular Significance Analysis of Microarrays (SAM) method (Tusher et al. 2001) but provides greater probabilistic control of the false discovery rate. The individual gene test statistics are based on a hierarchical model that enables within-class variance information to be shared among genes without assuming that all genes have the same variance (Wright and Simon, 2003). The multivariate permutation test is fully non-parametric and is much more powerful than standard univariate permutation tests, particularly when the number of biological replicates within classes is small. The method is more robust than the parametric t- or F-tests used in the analysis of variance. The multivariate permutation test also takes advantage of the correlation structure of the genes. The SAM method (Tusher et al. 2001) is also available within BRB-ArrayTools, implemented as a compiled FORTRAN program that runs several times faster than other versions of SAM.
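To make the idea of confidence-specific control concrete, the following is a minimal sketch of a multivariate permutation test that bounds the number of false discoveries. It is not the BRB-ArrayTools implementation: it uses a simplified single-step threshold rather than the procedures of Korn et al., and a plain Welch t-statistic rather than the hierarchical variance model. The function names and the parameters `k`, `alpha`, and `n_perm` are assumptions made for illustration. Whole-sample class labels are permuted, so every gene is relabelled together and the correlation structure among genes is preserved in the null distribution.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic for two lists of expression values."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def abs_t_per_gene(expr, labels):
    """|t| for every gene; expr is a list of rows (genes) over samples."""
    stats = []
    for row in expr:
        a = [x for x, lab in zip(row, labels) if lab == 0]
        b = [x for x, lab in zip(row, labels) if lab == 1]
        stats.append(abs(welch_t(a, b)))
    return stats


def select_genes_bounding_false_discoveries(expr, labels, k=1, alpha=0.1,
                                            n_perm=500, seed=0):
    """Return gene indices whose |t| exceeds a permutation threshold.

    For each permutation of the sample labels, the (k+1)-th largest |t|
    across all genes is recorded. The threshold is the (1 - alpha) quantile
    of those values, so that under the permutation null more than k genes
    exceed it in at most a fraction alpha of permutations (approximately).
    """
    rng = random.Random(seed)
    observed = abs_t_per_gene(expr, labels)
    permuted_labels = list(labels)
    null_values = []
    for _ in range(n_perm):
        rng.shuffle(permuted_labels)  # relabel whole samples, not single genes
        null = sorted(abs_t_per_gene(expr, permuted_labels), reverse=True)
        null_values.append(null[k])   # (k+1)-th largest statistic
    null_values.sort()
    idx = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    threshold = null_values[idx]
    return [g for g, t in enumerate(observed) if t > threshold]
```

Called with `k=1` and `alpha=0.1`, the sketch returns a gene list for which, under its simplifying assumptions, there is roughly 90% confidence of at most one false discovery. Controlling the proportion rather than the number of false discoveries requires a more involved procedure that is not shown here.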
The basic class comparison tool has several options, including the ability to stratify the analysis by a potentially confounding variable and the ability to perform paired analyses. BRB-ArrayTools contains several other tools for finding differentially expressed or prognostic genes in more complex settings. For example, the quantitative trait tool finds genes significantly correlated with a quantitative phenotype such as age. The survival tool finds genes significantly associated with right-censored survival data. In all of these tools, the multivariate permutation approach is used to provide confidence-specific control of the number or proportion of false discoveries. BRB-ArrayTools also provides analysis of variance tools for time-course analysis and for settings with numerous phenotypic factors of interest, including fixed-effect models, mixed models with random effects, and models for the analysis of complex dual-label hybridization designs that do not use a common reference sample.
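The text does not say how stratified and paired analyses are carried out internally. One common way to make a permutation test respect such designs, sketched here as an assumption rather than as the BRB-ArrayTools method, is to restrict the permutations: class labels are shuffled only within each stratum of the confounding variable, and for paired data the sign of each within-pair difference is flipped at random. The function names are hypothetical.

```python
import random
from collections import defaultdict


def permute_within_strata(labels, strata, rng):
    """Shuffle class labels separately inside each stratum.

    The number of samples from each class within every stratum is left
    unchanged, so the confounder cannot drive the permutation null.
    """
    by_stratum = defaultdict(list)
    for i, s in enumerate(strata):
        by_stratum[s].append(i)
    permuted = list(labels)
    for indices in by_stratum.values():
        shuffled = [labels[i] for i in indices]
        rng.shuffle(shuffled)
        for i, lab in zip(indices, shuffled):
            permuted[i] = lab
    return permuted


def flip_pair_signs(differences, rng):
    """Paired design: randomly negate each within-pair expression difference."""
    return [d if rng.random() < 0.5 else -d for d in differences]
```

Either function can stand in for the plain label shuffle of an unstratified permutation test. The test statistic itself is computed in the same way on the relabelled data.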
The output of each tool is a list of significant genes, with numerous annotations for the genes and links to websites containing additional information. Included in the annotations are Gene Ontology categories and an analysis of which categories are over-represented in the gene list relative to the prevalence of Gene Ontology categories on the array. Chromosome location and pathway analyses are also provided.
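As an illustration of what over-representation relative to the array means, the sketch below compares the observed number of list genes in a category with the number expected from the category's prevalence on the array, and computes a hypergeometric upper-tail probability. The hypergeometric test is a widely used choice for this kind of question. Whether BRB-ArrayTools uses exactly this statistic is not stated here, so the functions and their names should be read as assumptions.

```python
from math import comb


def observed_over_expected(n_array, n_category, n_list, n_overlap):
    """Ratio of observed category genes in the list to the expected count."""
    expected = n_list * n_category / n_array
    return n_overlap / expected


def overrepresentation_pvalue(n_array, n_category, n_list, n_overlap):
    """P(X >= n_overlap) when n_list genes are drawn without replacement
    from an array of n_array genes, n_category of which are in the category."""
    upper = min(n_category, n_list)
    tail = sum(
        comb(n_category, x) * comb(n_array - n_category, n_list - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / comb(n_array, n_list)
```

For example, if a category covers 10 of 100 genes on the array and 4 of the 20 genes in the list belong to it, the expected count is 2 and `observed_over_expected(100, 10, 20, 4)` is 2.0.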
In addition to using Gene Ontology descriptors to annotate a gene list, BRB-ArrayTools provides a tool for directly evaluating differential expression of Gene Ontology categories, KEGG or BioCarta pathways, Broad Institute signatures, or user-specified gene lists. This gene set comparison analysis is similar to the gene set enrichment analysis described by Subramanian et al. (2005). It reduces the number of comparisons, and thereby the multiple testing penalty, since inference is not made for individual genes. It also enables the discovery of differentially regulated pathways even where individual genes do not have large enough fold differences to be individually identified (Mootha et al. 2003