Neuroimage. Author manuscript; available in PMC 2010 October 1.

Published in final edited form as:

Published online 2009 May 20. doi: 10.1016/j.neuroimage.2009.05.034

PMCID: PMC2719771

NIHMSID: NIHMS119725

Correspondence: Jeanette A Mumford, Ph.D., Department of Psychology, University of California, Los Angeles, 1285 Franz Hall, Box 156304, Los Angeles, CA, USA, 90095, phone: 213-291-0903, fax: 310-206-5895, Email: mumford@ucla.edu

The publisher's final edited version of this article is available at Neuroimage

While many advanced mixed-effects models have been proposed and are used in fMRI, the simplest, ordinary least squares (OLS), is still the one that is most widely used. A survey of 90 papers found that 92% of group fMRI analyses used OLS. Despite its widespread use, this simple approach has never been thoroughly justified and evaluated; for example, the typical reference for the method is a conference abstract (Holmes and Friston, 1998), which has been referenced over 400 times.

In this work we fully derive the simplified method in a general setting and carefully identify the homogeneity assumptions it is based on. We examine the specificity (Type I error rate) of the OLS method under heterogeneity in the one-sample case and find that the OLS method is valid, with only slight conservativeness. Surprisingly, a Satterthwaite approximation for effective degrees of freedom only makes the method more conservative, instead of more accurate. While other authors have highlighted the inferior power of the OLS method relative to optimal mixed effects methods under heterogeneity, we revisit these results and find the power differences very modest.

While statistical methods that make the best use of the data are always to be preferred, software or other practical concerns may require the use of the simple OLS group modeling. In such cases, we find that group mean inferences will be valid under the null hypothesis and will have nearly optimal sensitivity under the alternative.

The analysis of multisubject fMRI data presents a number of challenges, in particular the need to account for two sources of variance: “measurement error” variability in the estimated response in each subject, and the “individual differences” variability in the true response between subjects. Appropriately modeling these within- and between-subject variances has motivated a number of papers on the best way to perform group modeling of fMRI data (Friston et al., 2002, 2005; Woolrich et al., 2004; Worsley et al., 2002; Beckmann et al., 2003; Mumford and Nichols, 2006; Penny and Holmes, 2006). Most of these methods individually weight data from each subject, down-weighting subjects with relatively high intrasubject variability, possibly even shrinking intrasubject estimates towards a population estimate. We refer to such weighted methods generically as generalized least squares (GLS) (Searle, 1971). Although the GLS methods make fewer assumptions about the distribution of the data, they are computationally more complicated to employ.

While several software packages have implemented such voxel-wise GLS methods^{1}, the simpler method of modeling 1st level contrast data with an un-weighted, ordinary least squares (OLS) analysis is still in widespread use. For example, a small survey of 90 papers matching the keyword “group fmri” anywhere in the text in NeuroImage, Human Brain Mapping and Cerebral Cortex^{2} found that 92% of group fMRI analyses used OLS. Further, OLS also forms the core of other methods. For example, permutation tests are generally based on OLS (though see Mériaux et al. (2006)) and region of interest analyses typically are based on OLS.

The original reference for OLS analysis for group fMRI is a conference abstract, (Holmes and Friston, 1998), which has been referenced 448 times according to Google Scholar. While there are publications on GLS methods that compare to OLS, these comparisons have focused on the sensitivity but not the *specificity* (the ability to control false positives accurately) of the OLS model. Friston et al. (2005) compared thresholded test statistics from both the GLS and OLS models to determine how robust the OLS model was to violations of the assumption of homoscedasticity; their study focused on a single real dataset and did not consider the specificity of OLS when the assumptions were violated. Likewise Beckmann et al. (2003) compared the power of OLS and GLS under heteroscedasticity and they showed a moderate increase in power when using GLS. Instead of formally estimating power, their work looked at the percent change in the *Z* statistic, which is an approximation to the *t*-statistic that is used in standard analyses.

In this paper we provide a detailed description of OLS as it is typically used, focusing on the assumptions of this model and how well they hold for fMRI data. In particular, we highlight that the OLS approach always provides unbiased estimates of effect magnitude and, for the frequently-used one-sample model, unbiased variance estimates. The other possible problem caused by heterogeneity is disturbance of the distributional accuracy of *t*- or F-statistics, which can affect p-value accuracy. The traditional solution is to alter the degrees-of-freedom (DF) as part of a Satterthwaite adjustment (Satterthwaite, 1946). Satterthwaite has been found to be useful with the OLS model to protect against false positives in single subject fMRI analysis when data are temporally autocorrelated (Worsley and Friston, 1995; Kiebel et al., 2003), and we consider the performance of the Satterthwaite approximation in group fMRI analysis.

Group fMRI data are typically analyzed in a two-stage process. In the 1st level, intrasubject models are fit independently to each subject, and in the 2nd level, summary measures from each subject are modeled.

For a given voxel, there is a first stage model for each subject *k*:

$${Y}_{k}={X}_{k}{\beta}_{k}+{\epsilon}_{k},$$

(1)

where *Y _{k}* is the

While many regressors are needed to fit complex experimental designs and nuisance effects, an individual research question is usually addressed with 1-dimensional contrast^{3} *c*, forming a linear combination of parameter estimates *cβ _{k}*. The effect magnitude

$$c{\widehat{\beta}}_{k}=c{({X}_{k}^{\prime}{V}_{k}^{-1}{X}_{k})}^{-1}{X}_{k}^{\prime}{V}_{k}^{-1}{Y}_{k}$$

(2)

and its sample variance, for fixed true *β _{k}*, is

$$\widehat{\text{Var}}(c{\widehat{\beta}}_{k})=c{({X}_{k}^{\prime}{V}_{k}^{-1}{X}_{k})}^{-1}{c}^{\prime}\phantom{\rule{0.16667em}{0ex}}{\widehat{\sigma}}_{k}^{2},$$

(3)

$${\widehat{\sigma}}_{k}^{2}=({Y}_{k}-{X}_{k}{\widehat{\beta}}_{k}{)}^{\prime}{V}_{k}^{-1}({Y}_{k}-{X}_{k}{\widehat{\beta}}_{k})/(T-p).$$

(4)

Though not recommended, ordinary least squares (OLS) can be used instead of GLS at the first level by (incorrectly) assuming *V _{k}* =

At the “2nd level” we would ideally regress the true subject responses, *γ* = {*cβ _{k}*}

$$\gamma ={X}_{G}{\beta}_{G}+{\epsilon}_{G}$$

(5)

where *X _{G}* is

Unfortunately, we only have the estimated contrasts, *Y _{G}* = {

$${Y}_{G}={X}_{G}{\beta}_{G}+{\epsilon}_{G}^{\ast}$$

(6)

where
${\epsilon}_{G}^{\ast}$ is the mixed-effects error,
${\epsilon}_{G}^{\ast}=({Y}_{G}-\gamma )+{\epsilon}_{G}$, containing variation from both imperfect intrasubject fit (*Y _{G}* −

Write the OLS estimate of *β _{G}*,
${\widehat{\beta}}_{\text{OLS}}={X}_{G}^{-}{Y}_{G}$, where

For the OLS approach, it is assumed that the first level variance is homogeneous,
${\sigma}_{i}^{2}c{({X}_{i}^{\prime}{V}_{i}^{-1}{X}_{i})}^{-1}{c}^{\prime}={\sigma}_{j}^{2}c{({X}_{j}^{\prime}{V}_{j}^{-1}{X}_{j})}^{-1}{c}^{\prime}$ for all *i, j* and therefore the second level error variance can be expressed as
$\text{Var}({\epsilon}_{G}^{\ast})=({\sigma}_{\mathit{win}}^{2}+{\sigma}_{G}^{2}){I}_{N}$, where
${\sigma}_{\mathit{win}}^{2}$ is the common within-subject variance. Therefore the variance can be simplified to
$\text{Var}({\epsilon}_{G}^{\ast})={\sigma}_{\text{OLS}}^{2}{I}_{N}$, where
${\sigma}_{\text{OLS}}^{2}$ is the combined within- and between-subject variance term. Since there is just a single variance term, this model is much easier to estimate and does not require iterative maximization techniques.

Without the homogeneity assumption the OLS estimates may not have optimal precision, though they are unbiased (E(${\widehat{\beta}}_{\text{OLS}}$) = *β _{G}*). While the standard errors are not unbiased in general, in the widely-used one-sample

In the GLS approach to the multistage mixed model, the assumption at the second stage is that
$\text{Var}({\epsilon}_{G}^{\ast})={\sigma}_{G}^{2}{I}_{N}+{\text{Var}}_{\beta}({Y}_{G})$. For fMRI software packages that use a GLS approach, such as FSL or fmristat, the estimates of the variance
${\sigma}_{k}^{2}$ and correlation *V _{k}* from the first level analysis are used for
${\widehat{\text{Var}}}_{\beta}({Y}_{G})=\text{diag}\{{\widehat{\sigma}}_{1}^{2}c{({X}_{1}^{\prime}{\widehat{V}}_{1}^{-1}{X}_{1})}^{-1}{c}^{\prime},\dots ,{\widehat{\sigma}}_{N}^{2}c{({X}_{N}^{\prime}{\widehat{V}}_{N}^{-1}{X}_{N})}^{-1}{c}^{\prime}\}$, where the diagonal elements correspond to the individual estimated variances from equation 3, and
${\sigma}_{G}^{2}$ is estimated as part of an iterative model estimation algorithm, such as restricted maximum likelihood (Harville, 1974).
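To make the contrast between the two estimators concrete, the following minimal sketch compares the unweighted OLS group mean with an inverse-variance weighted mean of the kind a GLS analysis produces for a one-sample design. It is an illustration only, not the FSL or fmristat implementation: the total (within- plus between-subject) variances are treated as known rather than estimated iteratively, and the data values and function names are hypothetical.

```python
def ols_mean(x):
    """Unweighted group mean: every subject counts equally."""
    return sum(x) / len(x)


def gls_mean(x, total_var):
    """Inverse-variance weighted group mean and its variance.

    total_var[i] is subject i's within- plus between-subject variance,
    assumed known here (an assumption made for illustration).
    """
    weights = [1.0 / v for v in total_var]
    weight_sum = sum(weights)
    mean = sum(w * xi for w, xi in zip(weights, x)) / weight_sum
    return mean, 1.0 / weight_sum


# Hypothetical contrast estimates; the last subject is noisy and extreme.
x = [12.0, 9.0, 15.0, 40.0]
total_var = [100.0, 100.0, 100.0, 900.0]

mu_ols = ols_mean(x)                      # 19.0
mu_gls, var_gls = gls_mean(x, total_var)  # mean 13.0: the noisy subject is down-weighted
```

In this toy example the weighted mean moves towards the three precise subjects, which is the down-weighting behavior described above; with equal variances the two estimates coincide.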

This note focuses on the one-sample *t*-test under the OLS approach, so from this point on we can assume that *x*_{1}*,*…, *x _{N}* are the

$${T}_{\mathit{OLS}}=\frac{\overline{x}}{\sqrt{{S}^{2}/N}},$$

(7)

where
$\overline{x}={\sum}_{i=1}^{N}{x}_{i}/N$ corresponds to ${\widehat{\beta}}_{\text{OLS}}$ and

$${S}^{2}=\sum _{i=1}^{N}{({x}_{i}-\overline{x})}^{2}/(N-1)$$

(8)

corresponds to
${\widehat{\sigma}}_{\text{OLS}}^{2}$. Inference is carried out by comparing *T* to a *t*-distribution with *N* − 1 degrees of freedom (*T _{N}*

Although the OLS variance estimate is unbiased in the case of the one-sample *t*-test (See Appendix A), the distributional assumptions change under heteroscedasticity. Under homoscedasticity the sample variance *S*^{2} (equation 8) is proportional to a
${\chi}_{N-1}^{2}$ random variable, where the degrees of freedom also define the *t*-distribution used to test the null hypothesis of the *t*-statistic (equation 7). Under heteroscedasticity the sample variance is only *approximately* proportional to a *χ*^{2} random variable. The motivation of the Satterthwaite approach is to estimate the effective degrees of freedom (eDF) such that the sample variance is proportional to
${\chi}_{\mathit{eDF}}^{2}$.

The Satterthwaite degree of freedom approximation is based on matching the first and second moments of *S*^{2} and a scaled *χ*^{2} distribution, solving for the *χ*^{2} degrees-of-freedom *ν _{SAT}* (Satterthwaite, 1946),

$${\nu}_{\mathit{SAT}}=\frac{2{E}^{2}({S}^{2})}{\text{Var}({S}^{2})}.$$

Using the values in Table 1 and after some algebra, one finds that

$${\nu}_{\mathit{SAT}}=\frac{{(N-1)}^{2}{\sum}_{i}{\sum}_{j}{\sigma}_{i}^{2}{\sigma}_{j}^{2}}{N(N-2){\sum}_{i}{\sigma}_{i}^{4}+{\sum}_{i}{\sum}_{j}{\sigma}_{i}^{2}{\sigma}_{j}^{2}}.$$

(9)
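Equation 9 is straightforward to evaluate numerically. The sketch below is an illustration (the function name and example variances are hypothetical); it uses the identity that the double sum over $\sigma_i^2\sigma_j^2$ equals the square of the sum of the variances. With equal variances the formula returns exactly *N* − 1, and any heterogeneity lowers it.

```python
def satterthwaite_df(variances):
    """Effective degrees of freedom from equation 9.

    variances: per-subject mixed-effects variances sigma_i^2.
    """
    n = len(variances)
    double_sum = sum(variances) ** 2            # sum_i sum_j sigma_i^2 sigma_j^2
    sum_fourth = sum(v * v for v in variances)  # sum_i sigma_i^4
    return (n - 1) ** 2 * double_sum / (n * (n - 2) * sum_fourth + double_sum)


# Hypothetical example with 12 subjects.
df_equal = satterthwaite_df([400.0] * 12)               # 11.0, i.e. N - 1
df_outlier = satterthwaite_df([400.0] * 11 + [3600.0])  # about 4.23
```

A single subject with nine times the variance of the others reduces the effective degrees of freedom from 11 to roughly 4, which illustrates how strongly the adjustment reacts to one outlying variance.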

In the real data analyses below, we use the FSL estimates for
${\sigma}_{i}^{2}={\widehat{\sigma}}_{i}^{2}+{\widehat{\sigma}}_{B}^{2}$ to compute *ν _{SAT}*. (FSL’s FEAT analysis software uses GLS to find these quantities and saves these estimates as

Our data are from a finger tapping experiment of the right hand involving 12 normal subjects (Johansen-Berg et al., 2002). This was a block design study consisting of blocks of rest and 3 pseudorandomly cued tasks: tapping of the index finger, sequentially tapping fingers, and randomly tapping fingers.

The data were analyzed using the FMRIB Software Library (FSL), using a two-level “FLAME” model. This model produced 12 contrast estimates, and 12 subject specific mixed-effects variance estimates for each of the 226,000 voxels. The contrast of interest for our study tested the difference between the response when randomly tapping the fingers and sequentially tapping the fingers. For each voxel the *p*-value for the OLS statistic was calculated using the Satterthwaite correction and two other methods that are described in the following sections.

The permutation *p*-values were obtained by comparing the test statistic, *T*, to an empirical *t*-distribution based on permutations. Since we are making inference on contrasts of parameter estimates involving differences, the order of differencing does not matter under the null hypothesis of no activation; for example *β*_{1} − *β*_{2} is equivalent to *β*_{2} − *β*_{1}. Therefore, by permuting the signs on the contrasts we can construct the null distribution of the contrast (Nichols and Holmes, 2002). Specifically, all possible 2* ^{N}* permutations are created by the different +/− combinations of the

Since the true null distribution is unknown under the heteroscedastic case, we used Monte Carlo to calculate “correct” *p*-values at each voxel, assuming the variances of the contrast for each subject,
${\sigma}_{i}^{2}$, to be known. For each realization, we generated 10,000 sets of *N*(0,
${\sigma}_{i}^{2}$) data for each of the *N* subjects and used them to calculate 10,000 test statistics. *P*_{MC} is the proportion of the 10,000 test statistics that are as large as or larger than the observed test statistic, *T*, for that voxel.
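The Monte Carlo procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analyses: the variances are hypothetical stand-ins for the per-subject estimates, the fixed random seed is our own choice, and the function names are invented for the example.

```python
import math
import random


def t_ols(x):
    """One-sample t-statistic of equation 7, using S^2 from equation 8."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    return mean / math.sqrt(s2 / n)


def monte_carlo_p(t_obs, variances, n_sim=10_000, seed=0):
    """Proportion of simulated null t-statistics at least as large as t_obs."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    sds = [math.sqrt(v) for v in variances]
    exceed = 0
    for _ in range(n_sim):
        # One null data set: subject i is drawn from N(0, sigma_i^2).
        null_sample = [rng.gauss(0.0, sd) for sd in sds]
        if t_ols(null_sample) >= t_obs:
            exceed += 1
    return exceed / n_sim


# Hypothetical variances for 12 subjects, one of them outlying.
example_variances = [1000.0] * 11 + [4200.0]
p_mc = monte_carlo_p(2.0, example_variances)
```

Because the variances are treated as known, the resulting *p*-value is only as good as those variance values, a limitation the text returns to later.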

Simulations were used to study different sample sizes with differing numbers of outliers and varying degrees of outlying variances. To simulate outlying within-subject variances we used a mixture of *χ*^{2} random variables where, with probability 0.9, the variance was chosen from a
${\chi}_{{\sigma}_{in}^{2}}^{2}$ distribution and, with a probability of 0.1 it was chosen from a
${\chi}_{{\sigma}_{\mathit{out}}^{2}}^{2}$ distribution, where
${\sigma}_{in}^{2}$ and
${\sigma}_{\mathit{out}}^{2}$ are the within-subject variances for the non-outlying and outlying subjects, respectively. Between-subject variances were chosen such that the overall variance for each simulation was kept constant. Therefore across different variance settings a given effect size would correspond to equivalent statistical power.
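A simplified version of the null simulation is sketched below. It departs from the design described above in ways that should be read as assumptions: each subject's within-subject variance is set directly to $\sigma_{in}^{2}$ or $\sigma_{\mathit{out}}^{2}$ (with probabilities 0.9 and 0.1) rather than drawn from a *χ*^{2} distribution, the between-subject variance of 607 is an illustrative value, and the test is taken to be one-sided at the 0.05 level using the tabulated critical value for 9 DF.

```python
import math
import random

T_CRIT_9DF = 1.833  # upper 5% point of a t-distribution with 9 DF (table value)


def simulate_type1(n=10, var_in=400.0, var_out=3600.0, var_b=607.0,
                   p_out=0.1, n_sim=20_000, seed=2):
    """Estimate the rejection rate of the OLS one-sample t-test under the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = []
        for _subject in range(n):
            # Each subject is an outlier with probability p_out.
            var_within = var_out if rng.random() < p_out else var_in
            x.append(rng.gauss(0.0, math.sqrt(var_within + var_b)))
        mean = sum(x) / n
        s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
        if mean / math.sqrt(s2 / n) > T_CRIT_9DF:
            rejections += 1
    return rejections / n_sim


type1_rate = simulate_type1()
```

Under these assumptions the estimated rate can be compared directly with the nominal 0.05 level.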

Variances that have been found in real data are shown in Table 2 and the range of values for ${\sigma}_{in}^{2}$, ${\sigma}_{\mathit{out}}^{2}$ and ${\sigma}_{B}^{2}$ were chosen to include these values. The details are described in Appendix B. While the overall standard deviation was fixed, varying sample sizes required different effect magnitudes Δ to maintain 80% power across simulations. Specifically, Δ was set to 28.14, 19.04, and 15.34, for 10, 20, and 30 subjects, respectively.
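The stated effect magnitudes can be checked with a short simulation. The sketch below assumes, since the text does not say so explicitly, a one-sided test at the 0.05 level, homoscedastic data with the overall standard deviation of 33 reported in Appendix B, and tabulated *t* critical values; the function and variable names are hypothetical.

```python
import math
import random

SD = 33.0                                   # overall standard deviation (Appendix B)
DELTA = {10: 28.14, 20: 19.04, 30: 15.34}   # effect magnitudes from the text
T_CRIT = {10: 1.833, 20: 1.729, 30: 1.699}  # upper 5% points of t with N - 1 DF


def estimate_power(n, delta, sd=SD, n_sim=20_000, seed=1):
    """Fraction of simulated samples in which the one-sample t-test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(delta, sd) for _subject in range(n)]
        mean = sum(x) / n
        s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
        if mean / math.sqrt(s2 / n) > T_CRIT[n]:
            rejections += 1
    return rejections / n_sim


estimated_power = {n: estimate_power(n, DELTA[n]) for n in DELTA}
```

Under these assumptions each entry of `estimated_power` should land close to 0.80, up to Monte Carlo error.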

Using the subject-specific first level contrast estimates we calculated *T _{OLS}* (eq 7) and obtained four different P-values:

Comparisons of *P*_{11}, *P*_{SAT}, and *P*_{perm} with *P*_{MC} over values of *P*_{MC} < 0.05. The x-axis shows the lower bound of the interval for *P*_{MC} and the red stars indicate the means of the distributions.

In order to understand the poor performance of *P _{SAT}*, we examined the implied distribution of

The distribution of *S*^{2} (left) and *T* (right). Although the mean and variance of the distributions of *S*^{2} for the MC simulation and *ν*_{SAT} are similar, the lower tails do not match as well as *ν*_{11}. The larger lower tail of the distribution of **...**

The distribution of *S*^{2} based on 11 DF does not look as similar to that based on the Monte Carlo simulation overall, but for the values of *S*^{2} that we are interested in, the lower values, the distributions are much closer and hence *p*-values are more similar. While nonparametric inferences are known to be exact, we confirmed this by comparing *P*_{perm} to *P*_{MC}. In the right panel of Figure 1, the mean of the ratio *P*_{MC}/*P*_{perm} is nearly 1 (means marked with asterisks), except for the smallest P-values which likely are exhibiting discreteness-induced conservativeness. One possible limitation of our Monte Carlo simulations is that they assume that the FLAME-derived variance estimates are the true values of the variances in the real data. To assess this assumption, we repeated the Monte Carlo simulation using *t*-statistic values based on samples from a Normal distribution with known variances and obtained similar results (see Supplementary Material), suggesting the FLAME variance estimates are accurate.

Simulations were used to study type I error rate and power over a range of degrees of outlying variance when using *T _{OLS}* versus

Type I error rate (left) and power (right) as a function of the % difference in the mixed effects variances of outlying and non-outlying variances,
$100({\sigma}_{\mathit{out}}^{2}-{\sigma}_{in}^{2})/({\sigma}_{\mathit{out}}^{2}+{\sigma}_{B}^{2})$, for sample sizes of 10, 20 and 30. **...**

The right panel of Figure 3 shows the power under the OLS model, where the true power (under GLS) was 80% in each case. As found in Beckmann et al. (2003) and Friston et al. (2005) power can be lost when using OLS under heteroscedasticity. The decrease was as high as 9% and was worst for the smaller sample size and larger outlying variances.

The x-axis range of Figure 3 can be compared to the values found from real data shown in the last column of Table 2. We searched over 12 data analyses that were analyzed using the FEAT analysis tool of FSL, including event related and blocked designs, and found that 5 of these studies had subjects with outlying variances. This was determined by subsetting voxels with nonzero between-subject variances and plotting boxplots of the voxel-averaged within-subject variances. In cases where outlying variances were found, the within-subject variance distributions were similar for voxels where the between-subject variance was 0. Table 2 lists the average within-subject variances for outlying and non-outlying subjects within the interquartile range of the nonzero between-subject variances, scaled such that ${\sigma}_{in}^{2}=400$ for all studies.

We hypothesized that when using a one-sample *t*-test for a group contrast mean of fMRI data, simply using *N* − 1 degrees of freedom would lead to invalid *p*-values due to the heterogeneity of variances. Surprisingly, in our data analysis, we found *N* − 1 degrees of freedom to be slightly conservative and hence valid when compared to *p*-values calculated with a Monte Carlo simulation or permutation test. The two-moment-matching Satterthwaite approximation was even more conservative than using *N* − 1 degrees of freedom. Although the Satterthwaite approximation has been shown to give valid hypothesis tests in single subject fMRI analysis (Kiebel et al., 2003), it did not perform well in the given situation where observations are uncorrelated, but have heterogeneous variance. This is even more surprising since the Satterthwaite approximation was originally developed to handle cases of heteroscedasticity, but suggests that the approach may not perform well with low degrees-of-freedom. As shown in our comparison of distributions of *S*^{2} and *T* in Figure 2, the Satterthwaite approximation only matches the first two moments of the distribution, which does not ensure the left tails of the distributions of *S*^{2} for Satterthwaite and the true distribution match; hence the tails of the *T* distributions will not match. Therefore the Satterthwaite approximation tends to be too conservative in this application.

A three-moment-matching eDF approach of Scariano and Davenport (1986) was also considered, but, as with the Satterthwaite approximation, its effective degrees of freedom are always less than *N* − 1, so this method was not pursued further. The permutation test is the only test that does not assume the *T* test statistics follow a specific distribution, and its only limitation is discrete P-values for very small sample sizes (e.g. for 6 subjects all P-values are multiples of 1/2^{6} = 0.015625).
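The discreteness of permutation P-values is easy to see in a small sign-flipping example. The sketch below is illustrative only (the six contrast values are hypothetical and the function names are our own); it enumerates all 2^{N} sign assignments and counts how many yield a *t*-statistic at least as large as the observed one.

```python
import itertools
import math


def t_ols(x):
    """One-sample t-statistic (equations 7 and 8)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    return mean / math.sqrt(s2 / n)


def sign_flip_p(x):
    """Exact one-sided sign-flipping P-value over all 2^N sign assignments."""
    t_obs = t_ols(x)
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(x)):
        flipped = [s * xi for s, xi in zip(signs, x)]
        total += 1
        if t_ols(flipped) >= t_obs:
            count += 1
    return count / total


# Hypothetical contrast estimates for 6 subjects, all positive.
contrasts = [3.1, 2.4, 4.0, 1.7, 2.9, 3.6]
p_perm = sign_flip_p(contrasts)  # 1/64 = 0.015625, the smallest attainable value
```

Because every subject's contrast is positive, only the unflipped data reach the observed statistic, so the P-value equals the minimum of 1/2^{6}; no data set of this size can produce a smaller one.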

Our simulation study allowed us to study type I error and power under a range of sample sizes and outlying variances that are representative of real data findings. Our findings for type I error supported our real data analysis finding that the OLS-based hypothesis test on the sample mean was slightly conservative under heteroscedasticity. Although the true type I error rate was most conservative for small sample sizes and/or the presence of very large outliers, the smallest we found in our simulations was 0.0456 when the goal type I error rate was 0.05.

Although previous studies by Beckmann et al. (2003) and Friston et al. (2005) implied there was a loss in power under the OLS model, they did not formally quantify the loss. Our results show that although there is a loss in power, it was not found to be larger than 9% in our simulations and this was for a small sample size of 10 subjects with a very large outlying variance.

Although we have shown the OLS model is robust to violations of the homoscedasticity assumption for the 1-sample *t*-test, it is probably not the case that this result would also hold in the case of simple linear regression, as the slope of a line is easily influenced by outliers. It is likely that in the case of simple linear regression the GLS model should be used to ensure outliers are properly down-weighted.

Finally, we note that these results for fMRI also inform group analyses in other modalities. In particular, in PET or EEG, where it may be impractical or undesirable to estimate intrasubject variance, it is useful to know that the OLS model is performing well in the face of any potential heteroscedasticity.

In conclusion, while a weighted, GLS mixed effects model is the optimal modeling approach, we find “plain old” OLS surprisingly robust for the widely-used one-sample model. We have provided evidence that an OLS model used with varying designs or outlier-induced heteroscedasticity actually controls false positive risk and has near-optimal power.

In this appendix we find the bias of the OLS variance estimators under (unmodeled) heterogeneous variance. Writing the second level model originally shown in equation 6 using a simplified notation we have,

$$Y=X\beta +\epsilon $$

(10)

where *Y* is the *N*-vector of contrast data fed up from the first level, *X* is *N* × *p* second-level design matrix, and *β* are the group-level parameters, and *ε* are the second-level errors. We assume that Var(*ε*) = *V σ*^{2}, where *σ*^{2} is the average mixed effects variance, *V* = *diag*(*v _{i}*), Σ

$${\widehat{\beta}}_{\text{OLS}}={({X}^{\prime}X)}^{-1}{X}^{\prime}Y$$

(11)

$${\widehat{\sigma}}_{\text{OLS}}^{2}=(Y-X{\widehat{\beta}}_{\text{OLS}}{)}^{\prime}(Y-X{\widehat{\beta}}_{\text{OLS}})/(N-p)$$

(12)

$${\widehat{\text{Var}}}_{\text{OLS}}(\widehat{\beta})={({X}^{\prime}X)}^{-1}{\widehat{\sigma}}_{\text{OLS}}^{2}$$

(13)

The first essential result is that E(${\widehat{\beta}}_{\text{OLS}}$) = *β*, that is, the estimates of the regression coefficients are unbiased. However, the estimates of error and parameter variance are not necessarily unbiased. First

$$\text{E}({\widehat{\sigma}}_{\text{OLS}}^{2})=\text{trace}(({I}_{N}-H)V){\sigma}^{2}/(N-p)$$

(14)

$$=\frac{{\sum}_{i}(1-{h}_{i}){v}_{i}}{N-p}{\sigma}^{2}$$

(15)

where *H* = *X*(*X*′*X*)^{−1}*X*′ is the so-called hat matrix, and *h _{i}* is the

The estimator variance is

$$\text{Var}({\widehat{\beta}}_{\text{OLS}})={({X}^{\prime}X)}^{-1}{X}^{\prime}V\phantom{\rule{0.16667em}{0ex}}X{({X}^{\prime}X)}^{-1}{\sigma}^{2}$$

(16)

while

$$\text{E}({\widehat{\text{Var}}}_{\text{OLS}}(\widehat{\beta}))={({X}^{\prime}X)}^{-1}\text{E}({\widehat{\sigma}}_{\text{OLS}}^{2}).$$

(17)

If there is no bias in ${\widehat{\sigma}}_{\text{OLS}}^{2}$, then ${\widehat{\text{Var}}}_{\text{OLS}}(\widehat{\beta})$ is unbiased when

$${X}^{\prime}X={X}^{\prime}V\phantom{\rule{0.16667em}{0ex}}X.$$

(18)

This is true for a one-sample problem, but not even for a balanced ANOVA model. However, for balanced ANOVA models and typical contrasts of interest, there may be no bias. For example, for a balanced two sample t-test, with design matrix

$${X}^{\prime}=\left[\begin{array}{lllllll}1\hfill & 1\hfill & \cdots \hfill & 0\hfill & 0\hfill & \cdots \hfill & 0\hfill \\ 0\hfill & 0\hfill & \cdots \hfill & 1\hfill & 1\hfill & \cdots \hfill & 1\hfill \end{array}\right]$$

(19)

and contrast *c* = [−1 1],
$c{\widehat{\text{Var}}}_{\text{OLS}}(\widehat{\beta}){c}^{\prime}$ can be shown to be unbiased for $c\,\text{Var}({\widehat{\beta}}_{\text{OLS}}){c}^{\prime}$.
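These claims can be checked numerically from equations 15 to 17. The sketch below is a worked illustration with hypothetical variance weights; it assumes the weights *v _{i}* are normalized to sum to *N* (our reading of *σ*^{2} as the average mixed effects variance) and uses the fact that the leverages are 1/*N* in the one-sample model and 1/*n* in a balanced two-sample model with *n* subjects per group.

```python
def expected_sigma2_hat(leverages, v, sigma2, p):
    """Equation 15: E(sigma2_hat_OLS) when Var(eps) = diag(v) * sigma2."""
    n_obs = len(v)
    return sum((1.0 - h) * vi for h, vi in zip(leverages, v)) * sigma2 / (n_obs - p)


sigma2 = 1.0

# One-sample model: X is a column of ones, so every leverage is 1/N.
v_one = [0.5, 0.5, 0.5, 0.5, 3.0]  # hypothetical weights summing to N = 5
n_one = len(v_one)
e_sigma2_one = expected_sigma2_hat([1.0 / n_one] * n_one, v_one, sigma2, p=1)
true_var_mean = sigma2 * sum(v_one) / n_one ** 2  # equation 16
expected_est_var_mean = e_sigma2_one / n_one      # equation 17

# Balanced two-sample model, n subjects per group: every leverage is 1/n.
v_a = [0.5, 0.5, 0.5]  # hypothetical weights, group A
v_b = [1.0, 1.0, 2.5]  # hypothetical weights, group B
n = len(v_a)
e_sigma2_two = expected_sigma2_hat([1.0 / n] * (2 * n), v_a + v_b, sigma2, p=2)
true_var_a = sigma2 * sum(v_a) / n ** 2  # true variance of the group A mean
est_var_a = e_sigma2_two / n             # expected OLS estimate of that variance
true_var_diff = sigma2 * (sum(v_a) + sum(v_b)) / n ** 2  # contrast c = [-1 1]
est_var_diff = e_sigma2_two * (1.0 / n + 1.0 / n)
```

In this example the one-sample variance estimate matches the truth (both 0.2), the estimated variance of a single group mean does not (1/3 versus a true value of 1/6), yet the variance of the group difference is again matched (both 2/3), in line with the statements above.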

Using GLS, the group mean and estimated variance have the following form,

$${\widehat{\mu}}_{\mathit{GLS}}={\left(\sum _{i=1}^{N}\frac{1}{{\sigma}_{i}^{2}}\right)}^{-1}\sum _{i=1}^{N}\frac{{x}_{i}}{{\sigma}_{i}^{2}}$$

(20)

$$\text{Var}({\mu}_{\mathit{GLS}})={\left(\sum _{i=1}^{N}\frac{1}{{\sigma}_{i}^{2}}\right)}^{-1},$$

(21)

where
${\sigma}_{i}^{2}$ is the sum of the within- and between-subject variance for subject *i*. When the variance is known, this yields the *z* statistic,
${Z}_{\mathit{GLS}}={\widehat{\mu}}_{\mathit{GLS}}/\sqrt{\text{Var}({\mu}_{\mathit{GLS}})}$. Assuming 90% and 10% of the within-subject variances are
${\sigma}_{in}^{2}$ and
${\sigma}_{\mathit{out}}^{2}$, respectively, that
${\sigma}_{B}^{2}$ is the between-subject variance and that *x _{i}* = Δ for all subjects,

$${Z}_{\mathit{GLS}}=\sqrt{N}\mathrm{\Delta}{\left(0.9\frac{1}{{\sigma}_{in}^{2}+{\sigma}_{B}^{2}}+0.1\frac{1}{{\sigma}_{\mathit{out}}^{2}+{\sigma}_{B}^{2}}\right)}^{1/2},$$

(22)

which corresponds to a test statistic for a group mean whose value is Δ with *N* subjects and a standard deviation of

$${\left(0.9\frac{1}{{\sigma}_{in}^{2}+{\sigma}_{B}^{2}}+0.1\frac{1}{{\sigma}_{\mathit{out}}^{2}+{\sigma}_{B}^{2}}\right)}^{-1/2}.$$

(23)

For our simulations we chose a constant ${\sigma}_{in}^{2}=400$ across all simulations and varied the outlying variance ${\sigma}_{\mathit{out}}^{2}$ from 400 to 3600; the between-subject variance was chosen so that the overall standard deviation in equation 23 was held constant at 33. This range included variance combinations that were found in real data analyses with outlying variances shown in Table 2.
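The between-subject variance implied by this constraint has no convenient closed form, but it can be found numerically. The sketch below is one way to do so and is not necessarily how the values were obtained for the paper: it solves equation 23 for ${\sigma}_{B}^{2}$ by bisection, using the fact that the standard deviation in equation 23 increases with ${\sigma}_{B}^{2}$. The function names and tolerance are our own choices.

```python
def gls_sd(var_in, var_out, var_b, p_out=0.1):
    """Overall standard deviation from equation 23."""
    precision = (1.0 - p_out) / (var_in + var_b) + p_out / (var_out + var_b)
    return precision ** -0.5


def solve_var_b(var_out, var_in=400.0, target_sd=33.0, tol=1e-9):
    """Between-subject variance giving gls_sd == target_sd, by bisection."""
    # gls_sd is below the target at var_b = 0 and above it at target_sd**2.
    lo, hi = 0.0, target_sd ** 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gls_sd(var_in, var_out, mid) < target_sd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


var_b_no_outlier = solve_var_b(400.0)   # 689: no outliers, so 400 + 689 = 33^2
var_b_max_outlier = solve_var_b(3600.0)  # smaller, to offset the outlying variance
```

When ${\sigma}_{\mathit{out}}^{2}={\sigma}_{in}^{2}=400$ the solution is simply 33^{2} − 400 = 689; larger outlying variances require a smaller between-subject variance to keep the overall standard deviation at 33.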

^{1}FSL (http://www.fmrib.ox.ac.uk/fsl/), fmristat (www.math.mcgill.ca/keith/fmristat/); SPM (www.fil.ion.ucl.ac.uk/spm/), but only through the hidden `spm_mfx` function.

^{2}The 30 most recent papers from each journal were used including early views of in press articles with dates ranging between December 2007- March 2009.

^{3}In fullest generality, the contrast *C* and even the number of parameters *P* may vary between subjects. The only requirement is that the contrast of parameter estimates has the same units and interpretation across all subjects.

^{4}In standard mixed model estimation both the within- and between-subject variances are estimated iteratively, but this approach is computationally too intensive for fMRI data and so the within-subject variance is simply set to the value of the first stage variance estimate.

**Publisher's Disclaimer: **This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

- Beckmann CF, Jenkinson M, Smith SM. General multilevel linear modeling for group analysis in FMRI. NeuroImage. 2003;20:1052–1063.
- Cazalis F, Tom S, Reger M, Stover E, Turner K, Poldrack R. Event-related fmri study of mirror reading skill acquisition. Oral presentation at the October, 2004 meeting of the Society for Neuroscience. 2004.
- Foerde K, Knowlton BJ, Poldrack RA. Modulation of competing memory systems by distraction. Proc Natl Acad Sci USA. 2006;103:11778–11783.
- Friston K, Stephan K, Lund T, Morcom A, Kiebel S. Mixed-effects and fmri studies. NeuroImage. 2005;24:244–252.
- Friston KJ, Penny W, Phillips C, Kiebel S, Hinton G, Ashburner J. Classical and Bayesian inference in neuroimaging: theory. NeuroImage. 2002;16:465–483.
- Harley E, Pope W, Villablanca J, Mumford J, Suh R, Mazziotta JDE, Engel S. Engagement of fusiform cortex and disengagement of lateral occipital cortex in the acquisition of radiological expertise. Cerebral Cortex. 2009 In Press.
- Harville D. Bayesian inference for variance components using only error contrasts. Biometrika. 1974;61:383–385.
- Harville D. Matrix Algebra From a Statistician’s Perspective. Springer; 2008.
- Holmes A, Friston K. Generalisability, random effects & population inference. NeuroImage 7 (4 (2/3)), S754, proceedings of Fourth International Conference on Functional Mapping of the Human Brain; June 7–12, 1998; Montreal, Canada. 1998.
- Johansen-Berg H, Rushworth MFS, Bogdanovic MD, Kischka U, Wimalaratna S, Matthews PM. The role of ipsilateral premotor cortex in hand movement after stroke. Proc Natl Acad Sci USA. 2002;99:14518–23.
- Kiebel SJ, Glaser DE, Friston KJ. A heuristic for the degrees of freedom of statistics based on multiple variance parameters. NeuroImage. 2003;20:591–600.
- Mumford JA, Nichols T. Modeling and inference of multisubject fMRI data. IEEE Eng Med Biol Mag. 2006;25:42–51.
- Mériaux S, Roche A, Dehaene-Lambertz G, Thirion B, Poline JB. Combined permutation test and mixed-effect model for group average analysis in fMRI. Hum Brain Mapp. 2006;27:402–410.
- Nichols TE, Holmes AP. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp. 2002;15:1–25.
- Penny W, Holmes A. Random effects analysis. In: Friston K, Ashburner J, Kiebel S, Nichols T, Penny W, editors. Statistical Parametric Mapping: The analysis of functional brain images. Elsevier; London: 2006.
- Satterthwaite F. An approximate distribution of estimates of variance components. Biometrics Bulletin. 1946;2:110–114.
- Scariano S, Davenport J. A four moment approach and other practical solutions to the Behrens-Fisher problem. Communications in Statistics-Theory and Methods. 1986;15:1467–1505.
- Searle SR. Linear Models. John Wiley & Sons; 1971.
- Stover E, Trepel C, Fox C, Poldrack R. The neural correlates of decision making under risk: an fmri study. NeuroImage 31 (Supp 1), S157, the 12th Annual Meeting of the Organization of Human Brain Mapping; June 11–15, 2006; Florence, Italy. 2006.
- Woolrich MW, Behrens TECF, Jenkinson M, Smith SM. Multilevel linear modelling for FMRI group analysis using Bayesian inference. NeuroImage. 2004;21:1732–1747.
- Worsley KJ, Friston KJ. Analysis of fMRI time-series revisited–again. NeuroImage. 1995;2:173–181.
- Worsley KJ, Liao CH, Aston J, Petre V, Duncan GH, Morales F, Evans AC. A general statistical analysis for fMRI data. NeuroImage. 2002;15:1–15.
