Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2994029

Formats

Article sections

- Abstract
- 1 Introduction
- 2 Bayesian hierarchical cross-study model
- 3 Posterior simulation
- 4 Estimation of differential expression
- 5 Datasets and Software
- 6 Validation
- 7 Experimental data example
- 8 Closing remarks
- Supplementary Material
- References

Authors

Related links

J Am Stat Assoc. Author manuscript; available in PMC 2010 November 30.

Published in final edited form as:

J Am Stat Assoc. 2009; 104(488): 1295–1310.

doi: 10.1198/jasa.2009.ap07611PMCID: PMC2994029

NIHMSID: NIHMS174344

See other articles in PMC that cite the published article.

In this paper we define a hierarchical Bayesian model for microarray expression data collected from several studies and use it to identify genes that show differential expression between two conditions. Key features include shrinkage across both genes and studies, and flexible modeling that allows for interactions between platforms and the estimated effect, as well as concordant and discordant differential expression across studies. We evaluated the performance of our model in a comprehensive fashion, using both artificial data, and a “split-study” validation approach that provides an agnostic assessment of the model's behavior not only under the null hypothesis, but also under a realistic alternative. The simulation results from the artificial data demonstrate the advantages of the Bayesian model. The 1 – AUC values for the Bayesian model are roughly half of the corresponding values for a direct combination of *t*- and SAM-statistics. Furthermore, the simulations provide guidelines for when the Bayesian model is most likely to be useful. Most noticeably, in small studies the Bayesian model generally outperforms other methods when evaluated by AUC, FDR, and MDR across a range of simulation parameters, and this difference diminishes for larger sample sizes in the individual studies. The split-study validation illustrates appropriate shrinkage of the Bayesian model in the absence of platform-, sample-, and annotation-differences that otherwise complicate experimental data analyses. Finally, we fit our model to four breast cancer studies employing different technologies (cDNA and Affymetrix) to estimate differential expression in estrogen receptor positive tumors versus negative ones. Software and data for reproducing our analysis are publicly available.

Microarray technologies that simultaneously measure transcriptional activity in a very large number of genes have been widely used in biology and medicine in the last decade, and the resulting data is often publicly available. To increase the reliability and efficiency of biological investigations, it can be critical to combine data from several studies. However, when considering multiple studies, variation in the measured gene expression levels is caused not only by the biological differences of interest and natural variation in gene expression within a phenotype, but also by technological and laboratory-based differences between studies (Irizarry et al., 2005; Consortium et al., 2006; Kerr, 2007). Two of the most important difficulties are the presence of both absolute and relative expression measurements, depending on the technology, and the challenges associated with cross referencing measurements made by different technologies to the genome and to each other (Zhong et al., 2007). Despite these difficulties, the results of combined analysis clearly demonstrate the potential for increased statistical power and novel discovery by combining data from several studies (Wu et al., 2002; Rhodes et al., 2002; Tomlins et al., 2005).

Most statistical work to date on combining microarray studies has focused on identifying genes that exhibit differential expression across two experimental conditions or phenotypes. We consider this problem here as well. There is now a substantial literature on Bayesian approaches to assessing differential expression across two or more experimental conditions within a single study. Both empirical and fully Bayesian models have been proposed, including parametric (Baldi and Long, 2001; Newton et al., 2001; Lönnstedt and Speed, 2002; Pan, 2002; Tseng et al., 2001; Bröet et al., 2002; Ibrahim et al., 2002; Townsend and Hartl, 2002; Gottardo et al., 2003; Ishwaran and Rao, 2003; Kendziorski et al., 2003; Ishwaran and Rao, 2005), semi-parametric (Newton et al., 2004) and non-parametric (Efron et al., 2001; Do et al., 2005) models. In each of these papers, a critical issue is shrinkage, and in particular borrowing strength across genes when estimating the gene-specific variance across samples. It is well established that shrinkage of the variance estimates provides worthwhile enhancements to single study analysis of differential expression (Liu et al., 2004).

There are several natural approaches for combining information from multiple microarray studies. One is to compute, separately for each study, statistics that summarize the relationship between each gene and the phenotype of interest. These may then be combined using methodologies such as those originally devised to integrate published results in meta-analysis (Hedges and Olkin, 1985). While initial efforts in this direction have considered combination of p-values (Rhodes et al., 2002), subsequent papers have focused on the more efficient strategy of combining effect sizes (Ghosh et al., 2003; Wang et al., 2004; Garrett-Mayer et al., 2007). At the opposite extreme of study combination are cross-study normalization methods (Wu et al., 2002; Parmigiani, 2002; Shen et al., 2004; Rhodes et al., 2004; Hayes et al., 2006; Johnson et al., 2007; Choi et al., 2007; Shabalin et al., 2008) that consider directly the sample-level measurements within each study, and merge these into a single data set, to which standard single-study analysis can be applied. A third approach, intermediate between the two above, is to integrate information about differential expression from the available studies using a joint stochastic model for all the available data (Choi et al., 2003; Conlon et al., 2006; Jung et al., 2006; Conlon, 2007; Conlon et al., 2007), in which only selected features of each study, such as parameters that capture the relationship between genes and phenotypes, are assumed to be related across studies. This perspective has the potential to offer additional efficiency over integration of summary statistics, and to allow for a more comprehensive treatment of uncertainty. At the same time it models the cross-study integration in a way that is tailored to the problem of interest, and potentially relies on fewer assumptions than direct data integration.

In this article we adopt this latter, intermediate approach, and propose a fully Bayesian hierarchical model to identify genes that exhibit differential expression between two experimental conditions, and across multiple studies. In this context, use of a fully Bayesian model has several desirable features. The model borrows strength across both genes and studies and can thereby provide better estimates of the gene-specific means, variances and effects. The model yields, through simulation, posterior probability distributions for all unobserved quantities. These distributions can be used to quantify the uncertainty of any parameter in the model, or to make joint inferences about multiple genes. Lastly, for each gene, the model yields the posterior probability that the gene is differentially expressed.

While the work of Conlon and colleagues considers several of these issues, its primary strength is in the combination of multiple studies from the same technology. We expand this and related work to address multi-platform analysis via several technical generalizations that are described in detail in the methods, and reviewed in the discussion. These include: modeling of study-specific indicators of differential expression; modeling of overall cross-platform correlations to allow shrinkage to be stronger across pairs of studies that are generally more concordant; modeling of both the mean of a gene and its phenotypic interactions with sufficient flexibility to avoid distributional modeling of the main effects; allowing for interactions between the technology and the effect; modeling of an adaptive, smooth dependence between effects and the variance terms.

Perhaps the most radical difference between our approach and its predecessors is the attention given to discordant differential expression. This occurs when a gene is more highly expressed in one phenotype than the other in some studies, while the opposite is observed in other studies. Earlier approaches would discount the gene: the high cross-study variance and cancellation of overall effects would likely position it with the uninteresting genes. However, across many meta-analyses, we have observed an excess of these discordant genes compared to what would have been predicted by chance alone, as captured by permutation of phenotype labels. When implementing shrinkage strategies, reliable assessments of concordant differential expression, which is typically of primary interest, must therefore account for the possibility of discordant differential expression across studies. We implement this by introducing a gene-specific indicator of whether a gene is different across conditions in all of the studies, but then we allow these differences to be gene and study specific.

While concordant differentially expressed genes remain the primary focus of the analysis, discordant genes can reveal important biological or technological information, and it is useful to identify them and report them. This is for at least two reasons: first, given the heterogeneous experimental designs that are encountered in microarrays, a discordant effect for a set of important genes may be the result of genetic heterogeneity of the samples across studies. For a simple example, consider the comparison of administering or not administering a certain drug in two studies which, unbeknown to the investigators, use strains of animals where the sets of biochemical pathways activated by the drug are not the same. Then certain genes' expression may be increased by the drug in one strain and decreased in the other. Another reason discordance is important is that the cross-referencing of transcripts across studies is typically gene-centric. However, as many as 40 to 60% of genes are able to produce multiple alternative transcripts (Modrek and Lee, 2002), whose expression may be positively or, as is common, negatively correlated. For example one transcript may be made primarily under normal conditions, while the other may be made mainly in response to stress. When two technologies measure a gene's expression by targeting portions of that gene that are associated with different transcripts that are negatively correlated with each other, discordant effects will be observed. In either case, important insight about technology, study designs and potentially the genetics of alternative splicing can be gained by following up on discordant genes.

The paper is organized as follows. In Section 2 we describe our Bayesian model and in Section 3 we define a Markov chain Monte Carlo algorithm for simulation from the resulting posterior distribution. Section 4 describes statistics from the Bayesian model that can be used to quantify differential expression, as well as alternative approaches for quantifying differential expression in the context of multiple studies. The datasets used in the simulation and experimental data example are described in Section 5. Sections 6 and 7 present results when applying our model to simulated and real data with comparisons to alternative methods. Concluding remarks are in Section 8. The software for fitting our Bayesian model is open source and freely available from Bioconductor (Gentleman et al., 2004).

In this section we introduce some basic notation and our Bayesian model. We refer to the resulting method for cross-study assessment of differential expression by the R package name, *XDE*.

In what follows we use *p* and *q* to index studies (i.e. data sets), *g* to index genes, and *s* to index samples (arrays) within each study. Let *x _{gsp}* denote the observed expression value for gene

$$\{{x}_{\mathit{gsp}}:g=1,\dots ,G;s=1,\dots ,{S}_{p};p=1,\dots ,P\}.$$

We assume that each study has been suitably normalized (and if necessary log-transformed) so that the mean expression value for each study is zero and the expression values for a given gene are approximately Gaussian under each condition. We restrict our analysis to the set of common genes in the available studies, though our model formulation can easily be extended to a situation in which there is substantial overlap, but not complete agreement, between the gene sets in different studies.

In the model described below, each sample is assumed to belong to one of two possible conditions or phenotypes. Let *ψ _{sp}* {0, 1} denote the phenotype of sample

We define a hierarchical Bayesian model for the expression values *x _{gsp}*. In the following discussion, the graphical model representation in Figure 1 can be used as a reference.

A graphical model representation of the hierarchical Bayesian model defined for the microarray data sets.

At the lowest level we assume the expression values *x _{gsp}*, conditional on some unobserved parameters, are independent and have a Gaussian distribution. For genes that are not differentially expressed,

$${x}_{\mathit{gsp}}\mid {\nu}_{\mathit{gp}},{\delta}_{\mathit{gp}},{\Delta}_{\mathit{gp}},{\sigma}_{g0p}^{2},{\sigma}_{g1p}^{2}\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{N}\left({\nu}_{\mathit{gp}}+{\delta}_{\mathit{gp}}(2{\psi}_{\mathit{sp}}-1){\Delta}_{\mathit{gp}},{\sigma}_{g{\psi}_{\mathit{sp}}p}^{2}\right).$$

(1)

At the next level in the model specification, prior distributions are selected for the model parameters. We begin by discussing the priors for

$${\nu}_{g}={\left({\nu}_{g1},\dots ,{\nu}_{\mathit{gP}}\right)}^{T}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\Delta}_{g}={\left({\Delta}_{g1},\dots ,{\Delta}_{\mathit{gP}}\right)}^{T}$$

which represent, respectively, the means and offsets for gene *g*. The vectors * ν_{g}* and

$${\nu}_{g}\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{N}(0,{\Sigma}_{g})\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\Delta}_{g}\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{N}(0,{R}_{g}).$$

(2)

To specify the covariance matrices of Σ* _{g}* and

When modeling normal location and scale parameters in a hierarchical way, as we do here, two modeling choices are common. One is independence between scale and location, and the other is conjugacy. The latter is computationally more convenient, as it allows analytical expressions for the full conditional distribution. However, which of these two specifications fits better can vary from experiment to experiment, and in microarray analysis the fit is sensitive to the technology and normalization method used, see Liu et al. (2004). Recently, Caffo et al. (2004) proposed a more general family of models, one that encompasses both independence and conjugacy, by including a single parameter that indexes the distribution of location given scale. Here we extend this idea by introducing separate parameters for each individual study, and setting a common relative scale for Σ* _{g}* and

$${\left({\Sigma}_{g}\right)}_{\mathit{pp}}={\gamma}^{2}{\tau}_{p}^{2}{\sigma}_{\mathit{gp}}^{2{a}_{p}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\left({R}_{g}\right)}_{\mathit{pp}}={c}^{2}{\tau}_{p}^{2}{\sigma}_{\mathit{gp}}^{2{b}_{p}}\phantom{\rule{1em}{0ex}}p=1,\dots ,P.$$

(3)

Here ${\sigma}_{gp}^{2}=\sqrt{{\sigma}_{g0p}^{2}{\sigma}_{g1p}^{2},}$, the parameters *a _{p}*,

The correlation structure of Σ* _{g}* (and

At the next level in the hierarchical model specification, priors are placed on the hyperparameters *γ*^{2}, *c*^{2}, ${\tau}_{p}^{2}$, *a _{p}* and

$$\mathrm{P}({a}_{p}=0)={p}_{a}^{0},\phantom{\rule{thickmathspace}{0ex}}\mathrm{P}({a}_{p}=1)={p}_{a}^{1},\phantom{\rule{thickmathspace}{0ex}}{a}_{p}\mid {a}_{p}\in (0,1)\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{Beta}({\alpha}_{a},{\beta}_{a})$$

(4)

and

$$\mathrm{P}({b}_{p}=0)={p}_{b}^{0},\phantom{\rule{thickmathspace}{0ex}}\mathrm{P}({b}_{p}=1)={p}_{b}^{1},\phantom{\rule{thickmathspace}{0ex}}{b}_{p}\mid {b}_{p}\in (0,1)\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{Beta}({\alpha}_{b},{\beta}_{b}).$$

(5)

Independent vague priors are assigned to the remaining hyper-parameters. For *γ*^{2} we use an (improper) uniform distribution on (0, ∞), and for *c*^{2} a uniform distribution on (0, ${c}_{\mathrm{max}}^{2}$). Note that an improper prior can not be used for *c*^{2} as this may result in an improper posterior distribution. For ${\tau}_{1}^{2},\dots ,{\tau}_{P}^{2}$ we assign a joint (improper) uniform distribution under the natural restrictions ${\tau}_{p}^{2}>0$, *p* = 1, … , *P* and ${\Pi}_{p=1}^{P}{\tau}_{p}^{2}=1$.

In order to have a fully defined Bayesian model, it remains to specify prior distributions for the differential expression indicators *δ _{gp}*, and for the variances ${\sigma}_{g\psi p}^{2}$ used to define ${\sigma}_{gp}^{2}$. For the indicators

$$\mathrm{P}({\delta}_{\mathit{gp}}=1)={\xi}_{p}\phantom{\rule{thickmathspace}{0ex}}\text{and}\phantom{\rule{thickmathspace}{0ex}}{\xi}_{p}\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{Beta}({\alpha}_{{\xi}_{p}},{\beta}_{{\xi}_{p}}).$$

(6)

independently. In prior model B, we set the restriction *δ*_{g1} = … = *δ _{gP}* =

$$\mathrm{P}({\delta}_{g}=1)=\xi \phantom{\rule{thickmathspace}{0ex}}\text{and}\phantom{\rule{thickmathspace}{0ex}}\xi \phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{Beta}({\alpha}_{\xi},{\beta}_{\xi}).$$

The variances ${\sigma}_{g\psi p}^{2}$ are assumed to be independent for different genes *g* and studies *p*, given the other hyperparameters. However, ${\sigma}_{g0p}^{2}$ and ${\sigma}_{g1p}^{2}$ should be correlated for the same gene *g* and study *p*. To obtain this, we set

$${\sigma}_{g0p}^{2}={\sigma}_{\mathit{gp}}^{2}{\phi}_{\mathit{gp}\phantom{\rule{1em}{0ex}}\text{and}}\phantom{\rule{1em}{0ex}}{\sigma}_{g1p}^{2}=\frac{{\sigma}_{\mathit{gp}}^{2}}{{\phi}_{\mathit{gp}}},$$

(7)

where ${\sigma}_{gp}^{2}$ and * _{gp}* have independent gamma prior distributions with $E\left[{\sigma}_{gp}^{2}\right]={l}_{p}$, $Var\left[{\sigma}_{gp}^{2}\right]={t}_{p}$, E[

The above prescriptions fully define the hierarchical Bayesian model visualized in Figure 1. The observed quantities are the expression values *x*_{gsp} and the conditions *ψ _{sp}*. Conditioning on the observed values we get a posterior distribution for the unobserved parameters

Each row in the table displays parameters used to simulate an artificial dataset of three studies with S samples in each. From Equation 11, *k** and *c** control the location and scale of the simulated Gaussian offsets, respectively. Together, **r***, *k**, and **...**

In order to evaluate the properties of the resulting posterior distribution, we adopt the Metropolis–Hastings algorithm (Metropolis et al., 1953; Hastings, 1970) to generate realizations from it. Nice introductions to the Metropolis–Hastings algorithm can be found in Smith and Roberts (1993) and Dellaportas and Roberts (2003). The algorithm is iterative, and each iteration consists of two parts. First, potential new values for one or a number of parameters are proposed according to a proposal distribution. Second, the proposed values are accepted with a specified probability. Depending on the mathematical form of the distribution of interest, different proposal mechanisms can be employed. In our posterior distribution we have seventeen types of parameters and to update these we combine seven types of proposal mechanisms. In the following we specify each of the proposal strategies used. In the description we use tilde to denote potential new values, e.g. *δ _{gp}* and

- The full conditionals for
,**ν**_{g}**Δ**_{g},*γ*^{2}and*ξ*have standard forms and we therefore use Gibbs steps (see the references given above) to update each of these separately. The full conditionals for_{p}and**ν**_{g}**Δ**_{g}both are multiGaussians, the full conditional for*γ*^{2}is an inverse gamma distribution, and for*ξ*it is a beta distribution._{p} - Separately, for each of the parameters
*a*and_{p}*b*, we use a “truncated” random walk proposal. In particular, for_{p}*a*we do the following: if_{p}*a*= 0 we draw_{p}*ã*from a uniform distribution on [0, ε]; if_{p}*a*= 1 we draw_{p}*ã*uniformly on [1 − ε, 1]; and if_{p}*a*(0, 1) we draw_{p}*U*from a uniform distribution on [*a*− ε,_{p}*a*+ ε] and set_{p}*ã*= min(1, max(0,_{p}*U*)). We note that this is a reversible jump type of proposal, and to e get the correct acceptance probability, one needs to use the theory introduced in Green (1995). - Separately, for each of the parameters ${\sigma}_{gp}^{2}$,
,_{gp}*l*,_{p}*t*,_{p}*λ*and_{p}*θ*, we propose a multiplicative change. In particular, for ${\sigma}_{gp}^{2}$ we set ${\stackrel{~}{\sigma}}_{gp}^{2}=U{\sigma}_{gp}^{2}$, where_{p}*U*is sampled from a uniform distribution on the interval [1/(1 + ε), 1 + ε]. - When updating $({\tau}_{1}^{2},\dots ,{\tau}_{P}^{2})$ we must ensure that the product of the proposed new values equals unity. We do this by randomly selecting two of the components,
*p*and*q*say, drawing*U*from a uniform distribution on [1/(1 + ε), 1 + ε], and setting ${\stackrel{~}{\tau}}_{p}^{2}=U{\tau}_{p}^{2}$ and ${\stackrel{~}{\tau}}_{q}^{2}={\tau}_{q}^{2}\u2215U$ - A block Gibbs update is used for
*c*^{2}and all the**Δ**_{g}'s for genes that have*δ*= 0._{gp} - Separately for each
*g*= 1, …,*G*, a block update is used for (*δ*_{g1}, …,*δ*_{gP}) and*Δ*. First, potential new values for_{g}*δ*_{g1}, …,*δ*are set. For prior model A we do this by inverting the current value of_{gP}*δ*for a randomly chosen study_{gp}*p*, i.e.= 1 −_{gp}*δ*, and keeping the other indicators unchanged. For prior model B (_{gp}*δ*_{g1}= … =*δ*=_{gP}*δ*) we invert all the indicators. Second, a potential new value for_{g}**Δ**_{g}is sampled from the associated full conditional (given the potential new values_{g1}, …,). The proposed values are then accepted or rejected jointly._{gP} - A block update is used for [
*ρ*] and_{pq}*γ*^{2}.

A similar block update is used for [*r _{pq}*] and

$${\stackrel{~}{\rho}}_{\mathit{pq}}=(1-\epsilon ){\rho}_{\mathit{pq}}+\epsilon {T}_{\mathit{pq}}.$$

Here [*T _{pq}*] is a correlation matrix which with probability one half is generated from the prior for [

In assessing the differential expression of genes across multiple studies, one naturally encounters a difficulty that is not present in single study analyses. This difficulty arises from the fact that a single differentially expressed gene *g* may be up-regulated in one or more studies, and down-regulated in others. When this occurs, we say that *g* is discordantly differentially expressed. If *g* is up-regulated in every study, or down-regulated in every study, we say that *g* is concordantly differentially expressed. Although concordant differential expression is the norm, discordance can arise from biological differences in the sample populations of each study, or from technological effects related to the design and implementation of specific array technologies. Discordance appears to be an unavoidable (and inconvenient) feature of multi-study analyses, one that comprehensive multi-study analyses should take into account.

We have developed two implementations of our Bayesian model: a single indicator model that assumes differential expression occurs in all of the studies or in none of the studies, and a multiple indicator model that allows study specific indicators, *δ _{gp}*, of differential expression.

In the following discussion we discuss the single indicator implementation of the Bayesian model for differential expression. Note that the indicator *δ _{g}* summarizes information across studies. The basis for our cross-platform analysis of differential expression is the posterior mean of

Let PM_{ε}(*g*) denote the posterior mean of *δ _{g}*. We view PM

$${\mathcal{C}}_{g}=\{\begin{array}{cc}1\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}{\delta}_{g}=1\phantom{\rule{thickmathspace}{0ex}}\text{and all}\phantom{\rule{thickmathspace}{0ex}}{\Delta}_{\mathit{gp}},p=1,\dots ,P\phantom{\rule{thickmathspace}{0ex}}\text{have the same sign},\hfill \\ 0\hfill & \text{otherwise}\hfill \end{array}\phantom{\}}$$

(8)

and

$${\mathcal{D}}_{g}=\{\begin{array}{cc}1\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}{\delta}_{g}=1\phantom{\rule{thickmathspace}{0ex}}\text{and the}\phantom{\rule{thickmathspace}{0ex}}{\Delta}_{\mathit{gp}},p=1,\dots ,P\phantom{\rule{thickmathspace}{0ex}}\text{do not all have the same sign},\hfill \\ 0\hfill & \text{otherwise},\hfill \end{array}\phantom{\}}$$

(9)

respectively.

For each gene, we compute the number of positive (*N*^{+}) and negative (*N*^{−}) signed offsets. Specifically, ${N}_{g}^{+}\equiv {\Sigma}_{p}^{P}{I}_{\{{\delta}_{\mathit{gp}}{\Delta}_{\mathit{gp}}>0\}}$ and ${N}_{g}^{-}\equiv {\Sigma}_{p}^{P}{I}_{\{{\delta}_{\mathit{gp}}{\Delta}_{\mathit{gp}}<0\}}$, where *I*_{{·}} takes the value 1 when {·} is true and 0 otherwise. Then concordant and discordant indicators for differential expression are given by the relationships

$${\mathcal{C}}_{g}=\{\begin{array}{cc}1\hfill & \phantom{\rule{thickmathspace}{0ex}}\text{if}\phantom{\rule{thickmathspace}{0ex}}{N}_{g}^{+}\times {N}_{g}^{-}=0\phantom{\rule{thickmathspace}{0ex}}\text{and}\phantom{\rule{thickmathspace}{0ex}}{N}_{g}^{+}+{N}_{g}^{-}\ge m\hfill \\ 0\hfill & \text{otherwise}\hfill \end{array}\phantom{\}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\mathcal{D}}_{g}=\{\begin{array}{cc}1\hfill & \phantom{\rule{thickmathspace}{0ex}}\text{if}\phantom{\rule{thickmathspace}{0ex}}{N}_{g}^{+}\times {N}_{g}^{-}\ne 0\hfill \\ 0\hfill & \text{otherwise},\hfill \end{array}\phantom{\}}$$

where *m* is the minimum number of studies for which the gene is differentially expressed. The posterior mean of each indicator can again be estimated by the empirical mean of the corresponding simulated quantities. Let PM_{}(*g*) and PM_{}(*g*) denote the corresponding posterior mean values. Then a gene *g* may be classified as concordantly or discordantly differentially expressed whenever PM_{}(*g*) > *a* or PM_{}(*g*) > *a* for some threshold *a* > 0.

We consider three alternatives to *XDE* for estimating differential expression: the implementation of the Choi et al. (2003) random effects model in the **R** package *GeneMeta* (Gentleman et 2005), and cross-study summaries of *t*- and *SAM*-statistics. While a comparison with the Conlon et al. (2006) paper would be interesting, software for fitting this model to expression data is not readily available.

The study-specific statistics from which we derive cross-study summaries of differential expression are the Welch *t*-statistic (*t _{gp}*),

For evaluating overall differential expression, we follow the discussion of Garrett-Mayer et al. (2007), and combine the elements of U_{g} in a linear fashion to obtain a statistic suitable for assessing differential expression:

$$\begin{array}{c}\hfill {\mathrm{u}}_{\U0001d4d4}\left(g\right)\equiv {\alpha}_{1}\left|{U}_{g1}\right|+\dots +{\alpha}_{P}\left|{U}_{\mathit{gP}}\right|,\phantom{\rule{thickmathspace}{0ex}}\text{where}\hfill \\ \hfill {\alpha}_{p}\equiv \frac{{\mathrm{L}}_{p}\sqrt{{S}_{p}}}{\underset{i=1}{\overset{P}{\Sigma}}{\mathrm{L}}_{i}\sqrt{{S}_{i}}}\phantom{\rule{thickmathspace}{0ex}}\text{for}\phantom{\rule{thickmathspace}{0ex}}p\in \{1,\dots ,P\},\hfill \end{array}$$

(10)

Here *L* is the covariance loading from the first principal component of the vectors U_{g}, and *S _{p}* is the number of samples in Study

$${\mathrm{u}}_{\mathcal{C}}\left(g\right)\equiv |{\alpha}_{1}{U}_{g1}+\dots +{\alpha}_{P}{U}_{\mathit{gP}}|.$$

As an alternative, we also used the combined (across studies) estimate of effect size from the random effects model proposed by Choi et al. (2003) directly. We denote this statistic by z_{}(*g*). The ‘borrowing of strength’ in the estimation of z_{}(*g*) is strictly across studies (as opposed to across genes and studies), as the study-specific effect sizes for a given gene are assumed to be a draw from a Gaussian distribution in the second level of the random effects model. Assessments of discordant differential expression using the *t _{gp}*,

$${\mathrm{u}}_{\mathcal{D}}\left(g\right)\equiv \{\begin{array}{cc}\phantom{\rule{1em}{0ex}}\phantom{\rule{thickmathspace}{0ex}}{\mathrm{u}}_{\U0001d4d4}\left(g\right)\hfill & \text{sign}\left({U}_{g1}\right)=\cdots =\text{sign}\left({U}_{\mathit{gP}}\right)\hfill \\ -1\times {\mathrm{u}}_{\U0001d4d4}\left(g\right)\hfill & \text{otherwise}.\hfill \end{array}\phantom{\}}$$

The **R** package *XDE* contains the software used to fit the Bayesian hierarchical model, as well as convenient methods to compute the alternative statistics described in this paper.

The following lung cancer studies are referred to by institution: Harvard (Bhattacharjee et al., 2001) (203 samples on the Affymetrix Human Genome 95A platform containing 12,453 probesets), Michigan (Beer et al., 2002) (108 samples on the Affymetrix HuGeneFL Genome Array platform containing 6663 probesets), and Stanford (Garber et al., 2001) (68 samples on the cDNA platform containing 23,100 probes). The simulation described in Section 6 uses the normalized and merged platform annotations made publicly available from the authors of a previous cross-study analysis (Parmigiani et al., 2004). Briefly, Parmigiani et al. (2004) applied a robust multichip average (Irizarry et al. (2003)) separately to the two Affymetrix platforms (Harvard and Michigan). Intensity ratios from the Cy5 and Cy3 channels for the Stanford (Garber et al., 2001) dataset (23,100 features on the cDNA platform) were log-transformed. Following normalization, probesets (Affymetrix) and image clone identifiers (cDNA) in each platform were mapped to UniGene identifiers. Many-to-one mappings (multiple probes map to one UniGene identifier) were averaged and one-to-many mappings were excluded. The studies were then merged by Uni-Gene identifiers, resulting in a common set of 3,171 features. The normalized and merged datasets were obtained from the **R** package *lungExpression* available on the Bioconductor website (http://www.bioconductor.org).

Four breast cancer studies containing phenotypic data on estrogen receptor (ER) status (Sorlie et al. (2001), Huang et al. (2003), Hedenfalk et al. (2001), and Farmer et al. (2005)) were normalized according to platform type. In particular, Affymetrix platforms (the Farmer and Huang datasets) were normalized by RMA, whereas cDNA platforms (Sorlie and Hedenfalk) were normalized using the methods described in Smyth and Speed (2003) and implemented in the **R** package LIMMA. Following normalization, platform-specific annotations were mapped to Entrez-gene identifiers and the resulting lists merged to obtain a set of 2064 genes.

This section, comprised of two parts, extensively evaluates two implementations of the Bayesian model. In the first part of this section, we assess the single indicator and multiple indicator implementations using two simulation scenarios: one that simulates differential expression in all of the studies or none of the studies through a single indicator ${\delta}_{g}^{\star}$, and a second that simulates differential expression in a subset of the studies through study-specific indicators of differential expression, ${\delta}_{\mathit{gp}}^{\star}$. As the set of genes that are differentially expressed is known through simulation, we assess the performance using diagnostics such as the area under the ROC curve (AUC). In the second part of this section, we evaluate the shrinkage properties of the Bayesian model by applying *XDE* to multiple splits of a single study. Comparisons of *XDE* to alternative methods for cross-platform analysis are discussed throughout.

Our simulations are based on three publicly available lung cancer datasets that we refer to by institution: Harvard (Bhattacharjee et al., 2001) , Michigan (Beer et al., 2002), and Stanford (Garber et al., 2001). See Section 5 for a brief description of these datasets. We begin by describing an approach for generating artificial datasets for which the true set of differentially expressed genes is known. We append the superscript ‘^{}’ to parameters used in the simulation to distinguish the *true* values from the corresponding variables in the Bayesian model. The simulation uses only stage I or II adenocarcinomas in the Harvard (n=83), Stanford (n=11), and Michigan (n=61) studies. Late stage adenocarcinomas were excluded, as the heterogeneity of these tumors is typically much greater. From each available study we randomly select S samples, and then randomly assign the clinical variable ^{} = 0 to half of the samples in the study, and ^{} = 1 to the remaining half. Although there is non-trivial heterogeneity within the adenocarcinomas, these differences become small (on average) after random assignment into classes, and provide a background noise that would be difficult to simulate *de novo*.

Independently for each gene, we simulate ${\delta}_{g}^{\star}\in \{0,1\}$ from a Bernoulli distribution with parameter ξ^{} that is common to all genes. For genes with ${\delta}_{g}^{\star}=1$ we thereafter generate “true” offsets $({\Delta}_{g1}^{\star},{\Delta}_{g2}^{\star},{\Delta}_{g3}^{\star})$ from a multivariate normal distribution

$$\left[\begin{array}{c}\hfill {\Delta}_{g1}^{\ast}\hfill \\ \hfill {\Delta}_{g2}^{\ast}\hfill \\ \hfill {\Delta}_{g3}^{\ast}\hfill \end{array}\right]\phantom{\rule{thickmathspace}{0ex}}~\phantom{\rule{thickmathspace}{0ex}}\mathrm{N}\left({k}^{\ast}\left[\begin{array}{c}\hfill {s}_{g1}\hfill \\ \hfill {s}_{g2}\hfill \\ \hfill {s}_{g3}\hfill \end{array}\right],\frac{1}{{c}^{\ast}}\left[\begin{array}{ccc}\hfill {s}_{g1}^{2}\hfill & \hfill {r}_{1}^{\ast}{s}_{g1}{s}_{g2}\hfill & \hfill {r}_{2}^{\ast}{s}_{g1}{s}_{g3}\hfill \\ \hfill {r}_{1}^{\ast}{s}_{g2}{s}_{g1}\hfill & \hfill {s}_{g2}^{2}\hfill & \hfill {r}_{3}^{\ast}{s}_{g2}{s}_{g3}\hfill \\ \hfill {r}_{2}^{\ast}{s}_{g3}{s}_{g1}\hfill & \hfill {r}_{3}^{\ast}{s}_{g3}{s}_{g2}+\hfill & \hfill {s}_{g3}^{2}\hfill \end{array}\right]\right),$$

(11)

where *s*_{g1}, *s*_{g2} and *s*_{g3} are the empirical standard deviations for the adenocarcinoma samples in Harvard, Michigan, and Stanford, respectively, and *k*^{}, *c*^{} and **r**^{} are parameters in the simulation procedure. Letting *x _{gsp}* denote the original adenocarcinoma expression values, we generate the corresponding artificial data as

$${x}_{\mathit{gsp}}^{\ast}=\{\begin{array}{cc}{x}_{\mathit{gsp}}+(2{\psi}_{\mathit{sp}}^{\ast}-1){\Delta}_{\mathit{gp}}^{\ast}\hfill & \text{if}\phantom{\rule{thickmathspace}{0ex}}{\delta}_{g}^{\ast}=1,\hfill \\ {x}_{\mathit{gsp}}\hfill & \text{otherwise}.\hfill \end{array}\phantom{\}}$$

(12)

We consider a gene *g* as differentially expressed if ${\delta}_{g}^{\star}=1$. Differential expression is concordant if ${\Delta}_{g}^{\star}$ have the same sign in all studies and discordant if ${\Delta}_{g}^{\star}$ have opposing signs. Concordant and discordant differential expression are special cases of differential expression that we consider separately. Note that the simulation parameters **r**^{}, *c*^{}, and *k*^{} control the proportion of differentially expressed genes that are concordant in the simulation. For instance, increasing **r**^{} and *c*^{} has the effect of increasing the percentage of concordantly differentially expressed genes. Table 1 provides a complete listing of the simulation settings evaluated. Table 2 illustrates the possible patterns of differential expression for *P* = 2 studies and *G* = 4 genes.

We modified the algorithm to simulate study-specific differential expression as follows. First, we simulated **Δ**^{} for *all* of the genes. Secondly, for genes with $\left|{\Delta}_{\mathit{gp}}^{\star}\right|$ greater than the 0.9 quantile of the $\left|{\Delta}_{\cdot p}^{\star}\right|$ distribution, we set ${\delta}_{\mathit{gp}}^{\star}=1$. Notice that correlation between the elements of ${\Delta}_{g\cdot}^{\star}$ induces a correlation between the elements of ${\delta}_{g\cdot}^{\star}$. Finally, we simulated **Δ**^{} a second time to obtain **Δ**^{} independent of **δ**^{}. The simulated expression data was generated as in Equation 12, replacing ${\delta}_{g}^{\star}$ with study-specific indicators, ${\delta}_{\mathit{gp}}^{\star}$. Following the above algorithm, we generated four datasets using the settings of Simulations A, F, J, and O in Table 1.

For each of the Simulations A-R in Table 1, we develop summary measures, referred to as scores, to quantify concordant (), discordant (), or the union of (differentially) expressed () genes. Section 4 discusses the summary statistics proposed for the Bayesian model, as well as alternative methods for summarizing differential expression. We emphasize that for a gene *g*, _{g}, _{g}, and _{g} are defined on a set of studies, as opposed to differential expression in a single study.

Let score_{*}(*g*) denote any of the scores defined in Sections 4.1 (PM_{*}) and 4.2 (u_{*}) for a gene *g*. If, for a fixed threshold *a* > 0, we classify each *g* as being (overall, concordantly or discordantly) differentially expressed if score_{*}(*g*) > *a*, then we obtain a standard two-by-two table containing the number of false negatives FN_{*}(*a*), false positives FP_{*}(*a*), true negatives TN_{*}(*a*), and true positives TP_{*}(*a*) for that particular value of *a*. For example, in the case of differential expression () the number of true negatives is given by

$${\mathrm{TN}}_{\U0001d4d4}\left(a\right)=\underset{g=1}{\overset{G}{\Sigma}}I({\text{score}}_{\U0001d4d4}\left(g\right)\le a\phantom{\rule{thickmathspace}{0ex}}\text{and}\phantom{\rule{thickmathspace}{0ex}}{\delta}_{g}^{\ast}=0),$$

(13)

and the remaining entries of the table for differential expression are defined in a similar fashion. The false positive and true positive rates associated with the statistics score_{*}(*g*) and threshold *a* are given by

$${\mathrm{FPR}}_{\ast}\left(a\right)=\frac{{\mathrm{FP}}_{\ast}\left(a\right)}{{\mathrm{TN}}_{\ast}\left(a\right)+{\mathrm{FP}}_{\ast}\left(a\right)}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\mathrm{TPR}}_{\ast}\left(a\right)=\frac{{\mathrm{TP}}_{\ast}\left(a\right)}{{\mathrm{FN}}_{\ast}\left(a\right)+{\mathrm{TP}}_{\ast}\left(a\right)},$$

(14)

respectively. Plotting FPR_{*}(*a*) against TPR_{*}(*a*) as *a* varies produces the standard receiver operating characteristic (ROC) curve associated with the statistics score_{*}(*g*). The area under the ROC curve, AUC, is a nonparametric measure of the quality of the statistic, with values close to unity (*i.e*., a statistic that simultaneously achieves FPR close to zero and TPR close to one) being the best.

As an alternative to ROC curves, which are based on false and true positive rates, we also considered the false discovery rate (FDR) of the statistics score_{*}(*g*) as a function of the number of genes determined to be differentially expressed. Specifically, for each threshold a, we plotted the number of discoveries, ${\Sigma}_{g=1}^{G}I({\text{score}}_{\ast}\left(g\right)>a)$, against

$${\mathrm{FDR}}_{\ast}\left(a\right)=\frac{{\mathrm{FP}}_{\ast}\left(a\right)}{{\mathrm{FP}}_{\ast}\left(a\right)+{\mathrm{TP}}_{\ast}\left(a\right)}.$$

As expected, the FDR increases as the number of overall discoveries increases. Curves close to the horizontal axis are preferable to those having a more rapid increase of FDR with the number of discoveries. Similarly, we plotted the number of non-differentially expressed genes, ${\Sigma}_{g=1}^{G}I({\text{score}}_{\ast}\left(g\right)<a)$, against the missed discovery rate

$${\mathrm{MDR}}_{\ast}=\frac{{\mathrm{TN}}_{\ast}\left(a\right)}{{\mathrm{FN}}_{\ast}\left(a\right)+{\mathrm{TN}}_{\ast}\left(a\right)}.$$

Again, curves close to the horizontal axis are preferable to those having a more rapid increase of MDR with the number of negative discoveries.

Following the algorithm for simulating ${\delta}_{g}^{\star}$, we generated artificial datasets for Simulations A - R in Table 1. Initial model parameter values for single indicator implementation of *XDE* were chosen to specify little prior knowledge: *α _{a}* =

For each simulated dataset, performance of the cross-study scores were assessed by the AUC, FDR, and MDR criteria. A graphical display of the results for Simulation A is shown in Figure 2. The Bayesian model has a higher AUC (panel 1), as well as a lower FDR and MDR than the alternative scores over a range of cut-offs for evaluating (panels 2 and 3, respectively). Because of the extensive nature of the simulations, we visually assess the relative performance of the Bayesian method to the alternative methods via scatterplots of the AUC (e.g., Figure 3). Points beneath the identity line are simulations in which the Bayesian score had a higher AUC than an alternative method evaluated on the same dataset. Figure 3 plots the AUCs for in Simulations A-R. The corresponding AUC statistics for and are provided in Supplementary Figures 3(a) and 3(b). To assess the sensitivity of the AUC to the random number seed used for simulating ${\delta}_{g}^{\star}$ and ${\Delta}_{g}^{\star}$, each panel of Supplementary Figure 2 displays a scatterplot of the AUCs for multiple datasets generated from the same simulation parameters.

Performance diagnostics for scores quantifying in Simulation A. The letter *d* in the legend corresponds to the Bayesian score. Although the SAM-score does markedly better than the t-score when the the individual studies are small (S = 4), considerable **...**

The AUC for concordant differential expression in Simulations A - D (top left), E - I (top right), J - M (bottom left), and O - R (bottom right) was calculated for each of the alternative methods (t, SAM, and z) and plotted against the AUC obtained from **...**

To evaluate the relative performance of the single and multiple indicator implementations of the Bayesian model, we generated 8 datasets using the settings of Simulation A, F, J, and O in Table 1 and the ${\delta}_{g}^{\star}$ and ${\delta}_{\mathit{gp}}^{\star}$ simulation algorithms. The single indicator implementation generally had higher sensitivity and specificity for assessing concordant differential expression than the multiple indicator implementation when differential expression was simulated for all of the studies or in none of the studies (Row 1, Figure 4). Row 2 of Figure 4 displays a scatterplot of the AUCs when differential expression was simulated in a subset of the studies (${\delta}_{\mathit{gp}}^{\star}$). Note that both the single and multiple indicator implementations have higher AUCs than the alternative methods, irrespective of sample size, for concordant-, discordant-, and differential-expression when differential expression was simulated in a subset of the studies.

We assessed the relative performance of the single and multiple indicator implementations of the Bayesian model through simulation using AUC as a measure of performance. Row 1:. Differential expression was simulated using a single indicator of differential **...**

In general, the Bayesian model outperforms the three alternative methods for cross-study analysis of differential gene expression across a range of simulated parameters (Figure 3). Our overall assessment does not appear to be sensitive to the random quantities simulated in these datasets (Figure 2). As the sample sizes of the individual studies increase, the relative benefit of borrowing strength across genes and studies in the hierarchical model diminishes. Instances in which the z-score has a better AUC than the corresponding Bayesian statistic (e.g., panel (2,1) in Figure 3) occurred only when differential expression was simulated in all of the studies through a single indicator, ${\delta}_{g}^{\star}$, and most often occurred when the simulated data was particularly noisy and the AUC from all methods were at the low-end of the range. In such instances, scatterplots of the study-specific effect sizes were largely uncorrelated (data not shown). Scatterplots of a study-specific statistic for effect-size, such as t, may be a useful indicator of whether the Bayesian model is likely to improve on simpler alternatives. In instances where the data is negatively correlated across studies, this may induce the ‘wrong’ borrowing of strength. When simulating study-specific differential expression, ${\delta}_{\mathit{gp}}^{\star}$, both implementations of the Bayesian model performed better than the alternative methods over the range of simulations evaluated. In simulated datasets with no signal (data not shown), gene-specific posterior probabilities in the Bayesian model were approximately zero.

To assess the baseline behavior of *XDE*, we split the Huang study into four disjoint parts, treating each part as an independent study. We randomly assigned 5 estrogen receptor (ER) negative and 16 ER positive samples to each split. In this simplified setting, we avoid the potential difficulties of cross-platform analyses that can arise from technological and/or biological differences between studies. For instance, differences in the annotation of the probes or ethnic composition of the study populations may each contribute to discrepant results in a meta-analysis, but such concerns are reduced when splitting a single study. Split study validation has been used by others to assess meta-analytic methodologies for gene expression analysis. In particular, Gentleman et al. (2005) use split study validation to illustrate their implementation of the cross-platform statistic introduced by Choi et al. (2003).

After fitting the single indicator implementation of the Bayesian model to the four splits, traceplots for the parameters *a*, *b*, *l*, *t*, *γ*_{2}, *c*_{2}, *τ*_{2}, *ξ*, *ρ*, and *r* (each of which are updated by Metropolis-Hastings proposals) were used to evaluate convergence (see Supplementary Figure 4). We define the Bayesian effect size BES for gene *g* and platform *p*, by $\frac{{\delta}_{g}{\Delta}_{\mathit{gp}}}{c{\tau}_{p}{\sigma}_{\mathit{gp}}^{{b}_{\mathrm{p}}}}$, and use this as a study-specific Bayesian estimate of differential expression, contrasting it with the z, t, and SAM statistics. Scatterplots of the study-specific t-, z-, and BES statistics are shown in Supplementary Figure 5. If we consider the t, SAM (not shown), and z statistics as evidence of differential expression in a single study, we observe that the evidence is study-dependent with only moderate correlation of these statistics across the splits (Supplementary Figures 5(c) and 5(d)). Hence, scatterplots of the study-specific statistics provide two important pieces of information: first, even in a scenario that minimizes inter-study discordance, the variation across studies of the effect size statistics underscore the difficulty of identifying genes that show consistent evidence of differential expression; secondly, while the scatterplots do not lend themselves directly to identifying a list of genes for follow-up, the moderate correlation among the study-specific statistics does motivate an approach that uses the information from all of the studies.

A set of concordant differentially expressed genes emerges from the visualization of the BES scatterplots in Supplementary Figure 5(b). Through modeling the inter-relationships of genes and studies at higher levels of the model, the Bayesian model shrinks noisy genes to zero without requiring extensive filtering prior to the analysis. The cigar-shaped pattern in Supplementary Figure 5(b) is typical when fitting the Bayesian model, though the correlation is higher than what one may expect to observe when the studies are independent and use different platforms (see Section 7). In choosing a list of genes to follow for subsequent laboratory investigation, the PM_{}(*g*), displayed in Supplementary Figure 5(a), can be used to rank the evidence of concordant differential expression.

Validation of microarray experiments typically involves assaying the RNA transcript abundance of selected genes by using low-throughput platforms, such as qRT-PCR. As the PM_{}(*g*) identifies genes whose differential expression is relatively study- and/or platform-independent, validation of the gene list selected by PM_{}(*g*) may be less likely to result in false discoveries, as suggested by the simulations in the previous section. Thus, meta-analysis has the potential to reduce the cost of downstream analyses. Moreover, meta-analysis enables an impartial investigator not directly associated with the primary studies to synthesize the information that the primary studies contain. The Bayesian model enables one to identify genes and pathways that are concordantly effected across studies, as well as the genes and pathways that appear discordantly regulated. Whether the goal is to produce a gene list that is likely to be validated by other platforms, or to explore the biology underlying concordance and discordance, the Bayesian model provides a useful means of achieving these goals.

Estrogen receptor is an important risk factor for breast cancer tumori-genesis. Several gene expression studies have collected phenotypic information on estrogen receptor (ER) status (positive or negative). In this section, we fit the Bayesian model to four publicly available datasets described in Huang et al. (2003) (Huang), Hedenfalk et al. (2001) (Hedenfalk), Farmer et al. (2005)(Farmer), and Sorlie et al. (2001) (Sorlie), using ER status as the clinical variable. Table 3 shows the distribution of ER status in the four breast cancer studies. See Section 5 for a brief description of the data and pre-processing steps. Integration of the breast cancer studies provides an opportunity to define a ranked list of differentially expressed genes that is potentially more complete, and less likely to be platform-dependent, than lists derived from a single study. Because the studies involve different gene expression platforms, we cross-reference the study-specific gene annotations by Entrez-gene identifiers and focus our discussion on the set of 2064 Entrez genes that were present in each of the four studies.

When fitting the Bayesian hierarchical model to the breast cancer datasets, we found it unnecessary to change the hyperparameters and tuning parameters for the Metropolis–Hastings algorithm from their default values (see Table 1). To monitor the convergence and mixing properties of the Markov chain, we used visual inspection of the trace plots of the various simulated variables. The slowest convergence and mixing properties occurred for the four hyper-parameters *θ _{p}*,

We calculated posterior statistics using every 20th iteration after 2000 iterations of burn-in. The scatterplot in Figure 5 displays the distribution of the posterior means for the indicators of concordant differential expression (x-axis) and discordant differential expression (y-axis). We use a grey-scale to display the gradient of posterior means for differential expression. See Supplemental Figure 5 for a color gradient. In the discussion that follows, we discuss the gradient in terms of evidence for differential expression, ranging from uncertainty (near 0.5) in light grey to strong evidence (near 1) in black.

A scatterplot of the posterior means (PM) for the indicators of concordant (x-axis) and discordant (y-axis) differential expression. Plotting symbols are color coded by the gradient of the PM for the differential expression indicators. In purple, are **...**

Our model finds strong evidence of differential expression in a moderate number of genes: 30.9% of the genes have PM_{ε}(*g*) > 0.95, and among these genes we observe more concordance than discordance, as reflected by the relative density of genes at (x, y) coordinates (1,0) versus (0, 1) in the scatterplot. The remaining genes show moderate to weak evidence (uncertainty) of differential expression. We note that our model has difficulty distinguishing between no differential expression and low levels of differential expression in the ER dataset. Recall that the prior for **Δ** is a multivariate normal distribution with mean zero and overall variance parametrized by *c*^{2} (Section 2). The posterior mean for *c*^{2} in the ER dataset is approximately 0.028 (Supplementary Figure 6), and this value has several related implications. First, when the variance of **Δ** is very small, values of *δ _{g}* = 1 or

In the ER dataset, the posterior mean of *ξ* is 0.77, although a much smaller proportion of genes show strong evidence of differential expression. The software implementation of the Bayesian model flags instances of small *c*^{2}, alerting the user of the potential for inflated *ξ*. Evaluating different priors for **Δ** is a future direction of this research.

The main conclusions of the analysis are only affected by the ranking of the PM_{ε}(*g*). Estimates of concordant differential expression, the most common goal of most integration efforts, appear to be unaffected by large values of *ξ*. For example, the posterior expected proportion of false positives for the experimental data (as estimated using the methods in Efron and Tibshirani (2002)), was low for a range of PM_{}(*g*) cutoffs. In particular, the posterior expected proportion of false positives using thresholds of 0.5 and 0.9 for PM_{}(*g*) were 0.22 and 0.04, respectively.

We separately explored concordant and discordant differential expression among the four ER data sets under study, combining visualizations that are effective for summarizing overall reproducibility (pairwise scatterplots of effect size) with statistics from the Bayesian model that can be used to target a specific subgroup of genes that appear to be concordantly (Figure 6) or discordantly (Figure 7) regulated in the different studies. Genes in the 95th percentile of PM_{}(*g*) (to the right of the vertical dashed line in Figure 6(a)) are plotted with a different symbol (circles) and color (black) in the pairwise scatterplots of BES, t-, and z-statistics. The Bayesian model shrinks noisy estimates of the effect size towards zero (panel b, Figure 6(b)), whereas genes with stronger evidence of concordant differential expression are shrunk less and appear in the upper right and lower left quadrants of the pairwise scatterplots of BES in panel b.

Ranking genes by the PM_{}(*g*) is useful for exploring inter-study agreement of differential expression. Here, we threshold genes by the 95th percentile of the distribution for the posterior average of concordant differential expression, PM_{} **...**

The posterior average of the probability of discordant differential expression, PM_{}(*g*) (panel a), can be used to explore discordance. Here we threshold pairwise scatterplots of the study-specific statistics from the four breast cancer studies **...**

Figure 7 explores discordance. Panels b - d are the same as in Figure 6, but with an emphasis on inter-study discordance identified by thresholding the upper 5% of the PM_{}(*g*) distribution (genes to the right of the vertical dashed line in Figure 7(b)). Again, emphasis is placed on a subset of genes through different plotting symbols (x) and color (black). Note that almost all of the discordance in the scatterplots shown in Figure 7b - d arise from pairwise comparisons of cDNA platforms (Sorlie and Hedenfalk) to the Affymetrix platforms (Farmer and Huang). Discordance between Affymetrix and cDNA platforms may arise, for instance, as a result of probes hybridizing to different transcripts from the same gene. Note that in the scatterplots comparing like platforms, Sorlie versus Hedenfalk (both cDNA) and Farmer versus Huang (both Affymetrix), the effect size estimates of the highlighted genes are positively correlated.

In this paper, we define a hierarchical Bayesian model for microarray expression data collected from several studies, and use it to identify genes that show differential expression between two phenotypic conditions. Two implementations of this model are available in the R package *XDE* available from the Bioconductor website (http://www.bioconductor.org). The first implementation uses a single indicator for differential expression that summarizes information across studies. The second implementation allows multiple indicators for differential expression, permitting differential expression of a gene in some of the studies and not in other studies.

We evaluated the performance of the single and multiple indicator models using artificial and experimental data. The simulation results from the artificial data demonstrate the advantages of a Bayesian model. Compared to a more direct combination of *t*- or SAM-statistics, the 1 − AUC values for the Bayesian model are roughly half of the corresponding values for the *t*- and SAM-statistics. Furthermore, the simulations provide guidelines for when the Bayesian model is most likely to be useful. In small studies the Bayesian model generally outperforms other methods when evaluated by AUC, FDR, and MDR across a range of simulation parameters, and these differences diminish for larger sample sizes in the individual studies. When differential expression was simulated in a subset of the studies, the Bayesian model outperformed the alternative methods irrespective of sample size. In addition, we carried out a ”split-study” validation, which provides a model-free assessment of the method's behavior in the absence of platform differences. The split-study validation illustrates appropriate shrinkage of the Bayesian model in the absence of confounding platform and annotation based differences. Split-study validation may also provide a useful context for exploring how the ranks resulting from integration efforts differ across a set of genes known to be involved in a particular pathway. For example, using the multiple indicator implementation of the Bayesian model and a split of the Farmer study into three artificial studies, we can compare the ranks of genes known to be modulated by estrogen in an analysis of estrogen receptor (ER) status (positive or negative) as the clinical covariate. Supplementary Figure 8 explores this idea, with estrogen-related genes stratified into three categories according to the direction of regulation by estrogen (up or down) and whether the gene has a known transcription factor binding sites (TFBS) in the promoter.

Using experimental data from four high-throughput gene expression studies for breast cancer and ER status as the clinical covariate, posterior averages from the Bayesian model may be used to identify subsets of genes to explore in-silico. Figure 7 identified a subset of genes in the breast studies that were discordant across platforms (cDNA versus Affymetrix) but remain positively correlated within a platform (cDNA versus cDNA and Affymetrix versus Affymetrix). Such discordance can provide information about the differential expression of alternatively transcribed genes. For example, consider the genes in the discordant set that have two or more alternative transcripts. If for any of these genes the target sequence for the cDNA platform lies on transcript A and the target sequence(s) of the Affymetrix platform lies on transcript B, discordance could indicate, for example, that transcript A is up-regulated and transcript B is down-regulated across the two biological conditions. Because such in-silico hypotheses can only be validated by laboratory based methods such as qRT-PCR, we leave this as an open thread for future investigation.

Of the models previously proposed in the literature, the model of Conlon et al. (2006) is conceptually closest to ours. The Conlon model is designed for cross-study, within-platform analyses, and is not directly applicable to the case studies in our article. However, it is useful to contrast the technical features of the two approaches. Both are hierarchical Bayesian models, and both have a differential expression indicator for each gene. Differences emerge in how each model handles the increased variation in expression values for differentially expressed genes. We assign separate distributions to the expression values of samples in each condition. Conlon et al. (2006) assume that the expression values for differentially expressed genes are independent, but with an increased variance; they do not make use of the condition information for each sample. In addition, we adopt a more refined and flexible model for the covariance structure of the expression values and an implementation that allows study-specific indicators of differential expression. A practical consequence of these differences is in the application of the models to gene expression data from different platforms. In particular, our model can be fit to multiple studies regardless of platform, whereas the model of Conlon et al. (2006) is most applicable for combining biological replicates from a single platform; their model could be viewed as a special case of our single indicator implementation in which the technological differences in scale and variation are close to zero and there is conjugacy between location and scale.

Our hierarchical model does not require that studies be measured on the same platform. This generality has advantages and disadvantages. One advantage is that we model differences in scale and variation of expression intensities across platforms directly, removing some of the need for extensive normalization and rank-based approaches to assessing differential expression. However, in any multi-study analysis, discordance can arise from biological differences in the sample populations of each study, as well as technological effects related to the design and implementation of specific array technologies. Through multi-level modeling of gene expression, we borrow strength across studies and genes, by shrinking noisy estimates to zero and capturing correlated signals from the different studies. Simple scatterplots of study-specific measures of effect size, such as the *SAM* statistic, are a simple diagnostic for whether there is information in the joint distribution of the signals and can be evaluated before fitting the Bayesian model. In the typical situation, one will see a cloud of effect size statistics near zero, and positive correlation evidenced by having most scatter points in the positive (+, +) and negative (−, −) quadrants. In this case, the Bayesian model will tend to shrink the cloud of noise to zero, and (correctly) provide less shrinkage of the concordant differentially expressed genes.

It is common in the analysis of high-throughput gene expression data to apply a gene-selection procedure prior to the formal analysis of differential expression. For instance, when estimating differential expression by a statistic that has in its denominator an estimate of the across-sample variation, one may wish to remove genes of low abundance that show very low across-sample variation. In our Bayesian model, each gene has a parameter representing the numerical value of its differential expression. The priors for these parameters have a point mass at zero, corresponding to no differential expression. This feature of the model reduces the need for initial gene filtering, and facilitates the direct consideration, via ranked posterior means, of the complete set of available genes. See also the discussion in Ishwaran and Rao (2003, 2005).

When fitting the Bayesian model to pure noise, the model behaves appropriately and the estimated proportion of differentially expressed genes (the union of concordant and discordant genes) is approximately zero (data not shown). Additionally, the simulated data examples illustrate that the proportion of differentially expressed genes, as estimated by the posterior mean of *ξ*, is typically calibrated. Nevertheless, in the experimental data example the posterior average of *ξ* was 0.77. Closer inspection reveals that the Bayesian model has difficulty distinguishing between genes with low levels of differential expression and no differential expression (Figure 5). This difficulty can be diagnosed by high *ξ* values and a small posterior mean for the variance of the offsets as discussed in Section 7. The main conclusions of our analysis are affected only by the overall ranking of the posterior means for differential expression.

Our Bayesian model can be modified and generalized in several respects. First, the possibility of missing gene expression observations can easily be included. The missing *x _{gsp}* can simply be integrated out from the posterior distribution. Second, in the current model we have used the (common) genes appearing in all the studies. Partly overlapping gene sets can be handled in the model by considering expression values corresponding to genes that are not present in a study as missing. However, in order to design a more efficient computational algorithm one should integrate out both these

Our results provide a strong indication that borrowing strength across both genes and studies can be effective in the analysis of multi-platform studies. As is the case for most complex multilevel models, this comes at the price of added computational effort, and an increased burden of proof that the modeling assumptions are tenable.

RBS was supported by the training grant 5T32ES012871 from the U. S. National Institute of Environmental Health Sciences (P. I. Thomas Louis), training grant 5T32HL007024 from the National Heart, Lung, and Blood Institute, and grant DMS034211 from the National Science Foundation (P. I. Giovanni Parmigiani). Andrew Nobel's research was supported in part by NSF Grant DMS 0406361 and EPA Grant RD-83272001.

- Baldi P, Long AD. A Bayesian framework for the analysis of microarray expression data: Regularized t–test and statistical inferences of gene changes. Bioinformatics. 2001;17(6):509–519. [PubMed]
- Barnard J, McCulloch RR, Meng X-L. Modelling Covariance Matrices in Terms of Standard Deviations and Correlations, with Application To Shrinkage. Statistica Sinica. 2000;10:1281–1311.
- Beer DG, Kardia SL, Huang C-C, Giordano TJ, David E, Misek AML, Lin L, Chen G, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nature Medicine. 2002;8(8):816–824. [PubMed]
- Berger JO. Statistical Decision Theory and Bayesian Analysis. Springer; 1993.
- Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, Loda M, Weber G, Mark EJ, Lander ES, Wong W, Johnson BE, Golub TR, Sugarbaker DJ, Meyerson M. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences USA. 2001;98:13790–13795. [PubMed]
- Bröet P, Richardson S, Radvanyi F. Bayesian hierarchical model for identifying changes in gene expression from microarray experiments. Journal of Computational Biology. 2002;9:671–683. [PubMed]
- Caffo B, Dongmei L, Parmigiani G. Technical report. Vol. 62. Johns Hopkins University, Dept. of Biostatistics; 2004. 2004. Power conjugate multilevel models with applications to genomics.
- Choi H, Shen R, Chinnaiyan A, Ghosh D. A Latent Variable Approach for Meta-Analysis of Gene Expression Data from Multiple Microarray Experiments. BMC Bioinformatics. 2007;8(1):364. [PMC free article] [PubMed]
- Choi JK, Yu U, Kim S, Yoo OJ. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003;19-1:I84–I90. [PubMed]
- Conlon E. A Bayesian mixture model for metaanalysis of microarray studies. Funct Integr Genomics. 2007 [PubMed]
- Conlon E, Song JJ, Liu J. Bayesian models for pooling microarray studies with multiple sources of replications. BMC Bioinformatics. 2006;7(1):247. [PMC free article] [PubMed]
- Conlon EM, Song JJ, Liu A. Bayesian meta-analysis models for microarray data: a comparative study. BMC Bioinformatics. 2007;8(80) [PMC free article] [PubMed]
- Consortium MAQC, Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Schrf U, Thierry-Mieg J, Wang C, Wilson M, Wolber PK, Zhang L, Amur S, Bao W, Barbacioru CC, Lucas AB, Bertholet V, Boysen C, Bromley B, Brown D, Brunner A, Canales R, Cao XM, Cebula TA, Chen JJ, Cheng J, Chu T-M, Chudin E, Corson J, Corton JC, Croner LJ, Davies C, Davison TS, Delenstarr G, Deng X, Dorris D, Eklund AC, hui Fan X, Fang H, Fulmer-Smentek S, Fuscoe JC, Gallagher K, Ge W, Guo L, Guo X, Hager J, Haje PK, Han J, Han T, Harbottle HC, Harris SC, Hatchwell E, Hauser CA, Hester S, Hong H, Hurban P, Jackson SA, Ji H, Knight CR, Kuo WP, LeClerc JE, Levy S, Li Q-Z, Liu C, Liu Y, Lombardi MJ, Ma Y, Magnuson SR, Maqsodi B, McDaniel T, Mei N, Myklebost O, Ning B, Novoradovskaya N, Orr MS, Osborn TW, Papallo A, Patterson TA, Perkins RG, Peters EH, Peterson R, Philips KL, Pine PS, Pusztai L, Qian F, Ren H, Rosen M, Rosenzweig BA, Samaha RR, Schena M, Schroth GP, Shchegrova S, Smith DD, Staedtler F, Su Z, Sun H, Szallasi Z, Tezak Z, Thierry-Mieg D, Thompson KL, Tikhonova I, Turpaz Y, Vallanat B, Van C, Walker SJ, Wang SJ, Wang Y, Wolfinger R, Wong A, Wu J, Xiao C, Xie Q, Xu J, Yang W, Zhang L, Zhong S, Zong Y, Slikker W. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006;24(9):1151–1161. [PMC free article] [PubMed]
- Dellaportas P, Roberts GO. An introduction to MCMC. In: J. Møller., editor. Spatial Statistics and Computational Methods. Springer; Berlin: 2003. pp. 1–41. number 173 in Lecture Notes in Statistics.
- Do K-A, Müller P, Tang F. A Bayesian mixture model for differential gene expression. Journal of the Royal Statistical Society: Series C (Applied Statistics) 2005;54(3):627–644.
- Efron B, Tibshirani R. Empirical Bayes methods and false discovery rates for microarrays. Genetic Epidemiology. 2002;23(1):70–86. [PubMed]
- Efron B, Tibshirani R, Storey JD, Tusher V. Empirical Bayes Analysis of a Microarray Experiment. Journal of the American Statistical Association. 2001;96:1151–1160.
- Farmer P, Bonnefoi H, Becette V, Tubiana-Hulin M, Fumoleau P, Larsimont D, Macgrogan G, Bergh J, Cameron D, Goldstein D, Duss S, Nicoulaz A-L, Brisken C, Fiche M, Delorenzi M, Iggo R. Identification of molecular apocrine breast tumours by microarray analysis. Oncogene. 2005;24(29):4660–4671. [PubMed]
- Garber ME, Troyanskaya OG, Schluens K, Petersen S, Thaesler Z, Pacyna-Gengelbach M, van de Rijn M, Rosen GD, Perou CM, Whyte RI, Altman RB, Brown PO, Botstein D, Petersen I. Diversity of gene expression in adenocarcinoma of the lung. Proceedings of the National Academy of Sciences USA. 2001;98:13784–13789. [PubMed]
- Garrett-Mayer E, Parmigiani G, Zhong X, Cope L, Gabrielson E. Cross-study validation and combined analysis of gene expression microarray data. Biostatistics. 2007 [PubMed]
- Gentleman R, Ruschhaupt M, Huber W. Technical Report 8. The Berkeley Electronic Press; 2005. On the synthesis of microarray experiments. http://www.bepress.com/bioconductor/paper8.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5(10):R80. [PMC free article] [PubMed]
- Ghosh D, Barette TR, Rhodes D, Chinnaiyan AM. Funct Integr Genomics. 4. Vol. 3. Journal Article Meta-Analysis; 2003. Statistical issues and methods for meta-analysis of microarray data: a case study in prostate cancer; pp. 180–8.pp. 1438–793X. (Print) [PubMed]
- Gottardo R, Pannucci J, Kuske C, Brettin T. Statistical analysis of microarray data: a Bayesian approach. Biostatistics. 2003;4:597–620. [PubMed]
- Green P. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;82:711–732.
- Hastings W. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57:97–109.
- Hayes DN, Monti S, Parmigiani G, Gilks CB, Naoki K, Bhattacharjee A, Socinski MA, Perou C, Meyerson M. Gene expression profiling reveals reproducible human lung adenocarcinoma subtypes in multiple independent patient cohorts. J Clin Oncol. 2006;24(31):5079–5090. [PubMed]
- Hedenfalk I, Duggan D, Chen Y, Radmacher M, Bittner M, Simon R, Meltzer P, Gusterson B, Esteller M, Kallioniemi O, Wilfond B, Borg A, Trent J. Gene-expression profiles in hereditary breast cancer. N Engl J Med. 2001;344(8):539–48. [PubMed]
- Hedges LV, Olkin I. Statistical Methods for Meta-analysis. Academic Press; 1985.
- Huang E, Cheng SH, Dressman H, Pittman J, Tsou MH, Horng CF, Bild A, Iversen ES, Liao M, Chen CM, West M, Nevins JR, Huang AT. Gene expression predictors of breast cancer outcomes. Lancet. 2003;361(9369):1590–6. [PubMed]
- Ibrahim JG, Chen MH, Gray RJ. Bayesian Models for Gene Expression with DNA Microarray Data. Journal of the American Statistical Association. 2002;97:88–99.
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4(2):249–264. [PubMed]
- Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JGN, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martnez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, Yu W. Multiple-laboratory comparison of microarray platforms. Nat Methods. 2005;2(5):345–350. [PubMed]
- Ishwaran H, Rao J. Detecting differentially expressed genes in microarrays using Bayesian model selection. Journal of the American Statistical Association. 2003;98(462):438–455.
- Ishwaran H, Rao J. Spike and slab gene selection for multigroup microarray data. Journal of the American Statistical Association. 2005;100(471):764–780.
- Johnson W, Li C, Rabinovic A. Adjusting batch effects in microarray data using empirical Bayes methods. Biostatistics. 2007;8(1):118–127. [PubMed]
- Jung Y-Y, Oh M-S, Shin DW, Kang S-H, Oh HS. Identifying differentially expressed genes in meta-analysis via Bayesian model-based clustering. Biom J. 2006;48(3):435–450. [PubMed]
- Kendziorski C, Newton M, Lan H, Gould M. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine. 2003;22:3899–3914. [PubMed]
- Kerr K. Extended analysis of benchmark datasets for Agilent two-color microarrays. BMC Bioinformatics. 2007;8(1):371. [PMC free article] [PubMed]
- Liu D, Parmigiani G, Caffo B. Technical report. Vol. 34. Johns Hopkins University, Department of Biostatistics; 2004. 2004. Screening for Differentially Expressed Genes: Are Multilevel Models Helpful? p. 34.
- Lönnstedt I, Speed T. Replicated Microarray Data. Statistica Sinica. 2002;12(1):31–46.
- Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equations of state calculations by fast computing machine. J. Chem. Phys. 1953;21:1087–1091.
- Modrek B, Lee C. A genomic view of alternative splicing. Nat Genet. 2002;30(1):13–19. [PubMed]
- Newton M, Noueiry A, Sarkar D, Ahlwuist P. Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics. 2004;5:155–176. [PubMed]
- Newton MA, Kendziorski CM, Richmond CS, Blattner FR, Tsui KW. On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology. 2001;8:37–52. [PubMed]
- Pan W. A Comparative Review of Statistical Methods for Discovering Differentially Expressed Genes in Replicated Microarray Experiments. Bioinformatics. 2002;18:546–554. [PubMed]
- Parmigiani G. Measuring uncertainty in complex decision analysis models. Stat Methods Med Res. 2002;11(6):513–537. [PubMed]
- Parmigiani G, Garrett-Mayer ES, Anbazhagan R, Gabrielson E. A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clin Cancer Res. 2004;10(9):2922–2927. [PubMed]
- Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Cancer Res. 15. Vol. 62. Journal Article Meta-Analysis; 2002. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer; pp. 4427–33. 0008-5472 (Print) [PubMed]
- Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette T, Pandey A, Chinnaiyan AM. Proc Natl Acad Sci U S A. 25. Vol. 101. Journal Article Meta-Analysis; 2004. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression; pp. 9309–14. 0027-8424 (Print) [PubMed]
- Shabalin AA, Tjelmeland H, Fan C, Perou CM, Nobel AB. Merging two gene-expression studies via cross-platform normalization. Bioinformatics. 2008;24(9):1154–1160. [PubMed]
- Shen R, Ghosh D, Chinnaiyan A. Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data. BMC Genomics. 2004;5(1):94. [PMC free article] [PubMed]
- Smith AFM, Roberts GO. Bayesian Computation Via the Gibbs Sampler and Related Markov Chain Monte Carlo Methods (Disc: P53-102) Journal of the Royal Statistical Society, Series B, Methodological. 1993;55:3–23.
- Smyth GK, Speed T. Normalization of cDNA microarray data. Methods. 2003;31(4):265–273. [PubMed]
- Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98(19):10869–74. [PubMed]
- Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun X-W, Varambally S, Cao X, Tchinda J, Kuefer R, Lee C, Montie JE, Shah RB, Pienta KJ, Rubin MA, Chinnaiyan AM. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science. 2005;310(5748):644–648. [PubMed]
- Townsend J, Hartl D. Bayesian analysis of gene expression levels: statistical quantification of relative mRNA level across multiple treatements or samples. Genome Biology. 2002;3:research0071.1–71.16. [PMC free article] [PubMed]
- Tseng G, Oh M, Rohlin L, Liao J, Wong W. Issues in cDNA Microarray Analysis: Quality Filtering, Channel Normalization, Models of Variations and Assessment of Gene Effects. 2001 Submitted to Nucleic Acids Research. [PMC free article] [PubMed]
- Tusher V, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences. 2001;98:5116–5121. [PubMed]
- Wang J, Coombes KR, Highsmith WE, Keating MJ, Abruzzo LV. Differences in gene expression between B-cell chronic lymphocytic leukemia and normal B cells: a meta-analysis of three microarray studies. Bioinformatics. 2004;20(17):3166–3178. [PubMed]
- Wu LF, Hughes TR, Davierwala AP, Robinson MD, Stoughton R, Altschuler SJ. Large-scale prediction of Saccharomyces cerevisiae gene function using overlapping transcriptional clusters. Nat Genet. 2002;31(3):255–265. [PubMed]
- Zhong X, Marchionni L, Cope L, Iversen ES, Garrett-Mayer ES, Gabrielson E, Parmigiani G. Technical Report 129. johns Hopkins University Department of Biostatistics; 2007. Optimized Cross-study analysis of microarray based predictors.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |