Let

*Y*_{ij} be the gene expression (log ratio of test sample versus control sample) of gene

*i* on microarray

*j*,

*i* = 1, …,

*G* and

*j* = 1, …,

*n*, where

*G* is the total number of genes and

*n* is the number of microarrays. Let

*μ*_{ij} be the mean expression level of gene

*i* in microarray

*j*, and

be the variance for gene

*i*. We assume that for

*i* = 1, …,

*G* and

*j* = 1, …,

*n*,

where

*μ*_{ij} is the true expression level of gene

*i* on array

*j*; and

*ε*_{ij}’s are measurement errors that are assumed to be independent across all genes and arrays. Because of the coexpression of the spatially proximate genes, the mean vector for array

*j*,

, is modeled by N(

**0**,

**Σ**), where the covariance matrix

**Σ** is specified by

From (

3), we can see that Var(

*μ*_{ij}) =

*τ*^{2}, and Corr(

*μ*_{ij},

*μ*_{i′ j}) =

*e*^{−βDi,i′(R,L)} that is modeled by an exponential function of the distance

*D*_{i}_{,}_{i′}. This is consistent with the empirical evidence that the correlation decays exponentially as the distance of genes becomes larger. Also,

**Σ** is guaranteed to be a symmetric and positive definite matrix, since the exponential of a symmetric matrix is a symmetric positive-definite matrix (

Moakher and Batchelor 2006).

Define

, the vector of observed expression levels from array

*j*, and

**Y** = (

**Y**_{1}, …,

**Y**_{n}), the

*G* ×

*n* matrix of all data. Let

*μ* = (

*μ*_{1}, …,

*μ*_{n}) be the

*G* ×

*n* matrix of all mean levels, and the diagonal matrix

. Then the proposed model can be rewritten as

The full probability model is given by

where Θ is the collection of all involved (hyper)parameters in the model,

*π*’s are prior distributions, and

’s,

*τ*^{2},

*β*,

*R*, and

*L* are all assumed to be a priori independent. For all the variance components, we specify conjugate inverse gamma priors, that is,

for

*i* = 1, …,

*G*, and

*τ*^{2} ~ IG(

*α*_{τ},

*γ*_{τ}), where the hyperparameters are chosen to make the prior very vague, for example, IG(0.01, 0.01). We specify uniform priors for

*R* and

*L*, that is

*π* (

*R*)

*π* (

*L*)

1 with 0 <

*R* <

*R*_{max} and 0 <

*L* <

*L*_{max}. The upper bounds

*R*_{max} and

*L*_{max} can be both chosen as the diameter of the relevant cell, which the chromosome under consideration must be fit in. We consider a truncated normal prior for

*β* (say mean

*β*_{0}, variance

) truncated at zero. The variance

is chosen to be sufficiently large so that the prior is weak (i.e., nearly flat) and the location of its mean is irrelevant. Then the posterior distribution is

where

*I*(·) is an indicator function.

Now we can derive the full conditional posterior distributions. We use Θ\*θ* to denote the collection of all the parameters in Θ except for *θ*.

For *j* = 1, …, *n*,

For the variance of measurement errors

,

For the variance *τ*^{2} of *μ*_{ij}’s,

For the parameters *β*, *R*, and *L*,

We are primarily interested in inferring the structural parameters

*R* and

*L*, and

*β* which quantifies the association between expression correlation of genes and their 3D distance. We use a Gibbs sampler to obtain the posterior draws of the parameters. According to the full conditionals, the parameters

*μ*_{j},

,

*τ*^{2} can be sampled directly, while

*β*,

*R*, and

*L* can be sampled using a built-in M-H algorithm. For each dataset, we run 20,000 iterations and use 50% for burn-in. Numerical experience indicates that the posterior distribution for

*R* and

*L* may be complex and have multiple modes. To facilitate the convergence of the MCMC chains, we use the parameters estimated from the exploratory analysis to set the initial values for the algorithm. We get rough estimates of

*R* and

*L* from where the − log

_{10}(

*p*-value) is maximized, and that of

*β* through the relationship

*β* =

*α*_{1}. Standard techniques (

Brooks and Roberts 1998) are used to assess the sensitivity and validity of the model.

Alternatively, we may consider the maximum likelihood method for parameter estimation. To simplify the computation, we treat *σ*_{i}’s as nuisance parameters, and estimate *σ*_{i} using the sample standard deviation of expression levels of gene *i*. We can integrate out each *μ*_{j} to obtain the marginal distribution of **Y**|*R*, *L*, *β*, *τ*^{2},

The MLE (maximum likelihood estimator) of (*R*, *L*, *β*, *τ*^{2}) is

which can be estimated using a quasi-Newton method (i.e., the variable metric algorithm; see

Byrd et al. 1995). Again, we use the parameters estimated from the exploratory analysis to set the initial values for the algorithm. Plugging in the MLE and the sample standard deviation for

*σ*_{i}’s, we can get spatially smoothed estimates of

*μ*_{ij}’s based on (

4).

In our numerical experiments, we find that the results from the fully Bayesian approach are very similar to those from the MLE approach. In this paper, we present the results based on the fully Bayesian approach for consistency, while the MLE approach could be used as an efficient method for estimation.