The choice of optimizing for independence between spatial maps could equally well be replaced by optimizing for independence between time-courses. Different authors have argued in favour of one or the other technique, where the main objection appears to revolve around the question of whether orthogonality (i.e. uncorrelatedness) between estimated sources should be enforced in the temporal or spatial domain (Friston 1998; Petersen *et al*. 2000). At a conceptual level, the notion of orthogonality is overly restrictive in either domain: for temporal modes, the existence of stimulus-correlated effects (e.g. motion artefacts or higher-order brain function) means that enforced orthogonality necessarily results in a misrepresentation of underlying temporal signals. Similarly, for spatial modes, Friston (1998) has argued that even though different brain functions might be spatially localized, the principle of ‘functional integration’ might imply that neuronal processes share a large proportion of cortical anatomy. These arguments suggest that independence and implied orthogonality are always suboptimal for the analysis of data which is as complicated as that obtained from functional MRI experiments.

From a signal detection point of view, however, it is important to consider the extent to which signal ‘appears’ in space or time. Within the temporal domain, signal often spans the entire length of an experiment. If the ‘true’ temporal characteristics of different signals are partially correlated (e.g. stimulus-correlated motion), a decomposition which enforces orthogonality in the temporal domain will necessarily misrepresent at least one of the time-series in order to satisfy the constraint. In the spatial domain, however, ‘signals’ in fMRI are sparse and are typically contained in a small proportion of all voxels. Even for what in fMRI are considered ‘large’ activation clusters or for artefactual sources with large spatial extent (e.g. image ghosts), only a fraction of intracranial voxels are involved.^{5} In the presence of noise, the majority of voxels in any spatial map have random ‘background noise’ values and will reduce the observed spatial correlation, such that even when ‘true’ spatial maps are significantly overlapping, a decomposition which enforces orthogonality between estimated spatial maps can still give a relatively accurate representation of the signal.

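Formally, consider the case of two source signals *s*_{1} and *s*_{2}, represented as column vectors of length *N*, and (zero-mean) Gaussian noise *η*_{1} and *η*_{2} with variance *σ*_{1}^{2} and *σ*_{2}^{2}. In the presence of noise, the correlation changes from

$$\rho(s_1, s_2) = \frac{\mathrm{Cov}(s_1, s_2)}{\sqrt{\mathrm{Var}(s_1)\,\mathrm{Var}(s_2)}}$$

to

$$\rho(s_1 + \eta_1, s_2 + \eta_2) = \frac{\mathrm{Cov}(s_1, s_2)}{\sqrt{\left(\mathrm{Var}(s_1) + \sigma_1^2\right)\left(\mathrm{Var}(s_2) + \sigma_2^2\right)}},$$

where the second expression holds in expectation for noise that is independent of the sources and of each other. When signals are sparse, Var(*s*_{1}) and Var(*s*_{2}) are small and the denominator of *ρ*(*s*_{1} + *η*_{1}, *s*_{2} + *η*_{2}) is dominated by the noise variance. The reduction in correlation between the noise-free and noisy case is a function of the signal amplitude modulation, the sparseness and the relative noise level.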
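The dependence on sparseness and noise level can be made concrete with a small Python sketch. It assumes that the noise is independent of the sources and of each other, so that the covariance is unchanged in expectation while each variance in the denominator is inflated by the corresponding noise variance. The function names, the binary (on/off) source model and the unit noise variance are illustrative assumptions, not part of the original analysis.

```python
import math


def attenuated_correlation(rho, var_s1, var_s2, noise_var1, noise_var2):
    """Expected correlation after adding independent zero-mean noise.

    Assumes the noise is independent of both sources and of each other,
    so only the variances in the denominator grow.
    """
    shrink1 = math.sqrt(var_s1 / (var_s1 + noise_var1))
    shrink2 = math.sqrt(var_s2 / (var_s2 + noise_var2))
    return rho * shrink1 * shrink2


def binary_map_variance(fraction, amplitude=1.0):
    """Variance of a hypothetical binary map: `amplitude` in a fraction
    `fraction` of the voxels and zero elsewhere."""
    return amplitude ** 2 * fraction * (1.0 - fraction)


# Illustrative values: noise-free correlation 0.5, unit noise variance.
var_sparse = binary_map_variance(0.17)  # sparse source, 17% of voxels
var_dense = binary_map_variance(0.50)   # dense source, 50% of voxels

rho_sparse = attenuated_correlation(0.5, var_sparse, var_sparse, 1.0, 1.0)
rho_dense = attenuated_correlation(0.5, var_dense, var_dense, 1.0, 1.0)

print(f"sparse maps: expected noisy correlation = {rho_sparse:.3f}")
print(f"dense maps:  expected noisy correlation = {rho_dense:.3f}")
```

Under these assumed values the sparse pair is attenuated more strongly than the dense pair (roughly 0.06 versus 0.10), which illustrates why sparse spatial sources leave more room for an orthogonality constraint to be satisfied by the background noise.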

As an example, panel *a* shows two partially overlapping spatial ‘signals’, each occupying approximately 17% of the total image area, together with two artificial time-courses. Owing to their partial overlap, these source signals are spatially correlated with a correlation of *ρ*~0.5. In the absence of noise, these maps cannot be estimated accurately by any technique enforcing orthogonality between estimated spatial maps. In the presence of noise,^{6} however, the spatial correlation between linear estimates reduces significantly: panel *b* shows the spatial maps obtained from performing linear regression of the data against the ‘true’ time-series. The spatial maps obtained from a PCA decomposition (*c*) have ~0 spatial correlation, and fail to identify the ‘true’ spatial maps. Also, the temporal characteristics of the signal are not well represented. By comparison, the estimated spatial maps from an ICA decomposition (*d*) represent the signal well in both space and time. Although the spatial sources are clearly visible, the spatial correlation between the estimated spatial maps is still ~0. This is a consequence of the optimization for maximally non-Gaussian source projections. Final thresholded ICA maps derived from a Gaussian/Gamma mixture model on the noisy maps give a reasonably good spatial representation of the original sources: the estimated thresholded maps (*e*) have large spatial correlation (*ρ*~0.47).
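The effect of noise on the spatial correlation in a setting like this one can be reproduced with a short simulation. The following Python sketch is a simplified stand-in for the example, not the original experiment: it uses a one-dimensional ‘image’ of 10 000 voxels, binary maps that each cover 17% of the voxels with an overlap chosen to give *ρ*~0.5, and unit-variance Gaussian noise added directly to the maps. All of these values, the variable names and the random seed are assumptions made for illustration, and no regression, PCA or ICA step is performed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Assumed toy geometry: each map covers 17% of the voxels, and the
# overlap of 995 voxels is chosen so that the clean correlation is ~0.5.
N_VOXELS = 10_000
SIZE = 1_700
OVERLAP = 995
NOISE_SD = 1.0  # assumed noise level, not taken from the source

start2 = SIZE - OVERLAP
map1 = [1.0 if i < SIZE else 0.0 for i in range(N_VOXELS)]
map2 = [1.0 if start2 <= i < start2 + SIZE else 0.0 for i in range(N_VOXELS)]

rng = random.Random(0)  # fixed seed for reproducibility
noisy1 = [v + rng.gauss(0.0, NOISE_SD) for v in map1]
noisy2 = [v + rng.gauss(0.0, NOISE_SD) for v in map2]

rho_clean = pearson(map1, map2)
rho_noisy = pearson(noisy1, noisy2)

print(f"noise-free spatial correlation: {rho_clean:.3f}")
print(f"noisy spatial correlation:      {rho_noisy:.3f}")
```

With these assumed settings the noise-free correlation is close to 0.5, whereas the correlation between the noisy maps is much closer to zero, so an orthogonality constraint on the noisy maps requires only a small adjustment.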

This example demonstrates that the mathematical constraint of orthogonality within the set of spatial maps does not necessarily imply that large areas of ‘activation’ which overlap significantly between maps can no longer be extracted. Instead, the extent to which this mathematical constraint restricts the estimation of partly overlapping sources is a function of (i) the overall sparseness of signals and (ii) the signal-to-noise ratio. This suggests that, in practice, the constraints induced by optimizing for independence are less restrictive in the spatial domain than in the temporal domain. Although compensating for partial correlation of ‘signal’ by anticorrelating ‘noise’ is conceptually also possible in the temporal domain, the significantly lower number of time-points does not typically provide a sufficient number of ‘background’ time-points that could be utilized to ensure orthogonality while not altering ‘interesting’ portions of the estimated source signals. This property is particularly important for investigating RSNs, because it means that functionally distinct systems can overlap anatomically as long as they have sufficiently distinct time-courses.