We have developed a novel dynamical systems method to model intrinsic and modulatory interactions in fMRI data. MDS uses a vector autoregressive state-space model incorporating both intrinsic and modulatory causal interactions. Intrinsic interactions reflect causal influences independent of external stimuli and task conditions, while modulatory interactions reflect context-dependent influences. Our proposed MDS method overcomes key limitations of commonly used methods for estimating causal relations from fMRI data.
Critically, causal interactions in MDS are modeled at the level of latent signals rather than at the level of the observed BOLD-fMRI signals. Our simulations clearly demonstrate that this eliminates the confounding effects of regional variability in the HRF. The parameters and latent variables of the state-space model were estimated using two different methods. In the MDS-MLE method, the statistical significance of the parameters of the state equation, which represent the causal interactions between multiple brain nodes, was tested using a bootstrap method. In the MDS-VB method, we used non-informative priors to facilitate automatic relevance detection. We first discuss findings from our simulations and show that MDS-VB provides robust and accurate solutions even at low SNRs and with fewer observed samples (time points). We then contrast the performance of MDS with the widely used GCA method. In this context, we highlight instances where GCA works reasonably well and where it fails. Finally, we discuss several important conceptual issues concerning the investigation of dynamic causal interactions in fMRI, contrasting MDS with other recently developed methods.
Performance of MDS on simulated datasets – contrasting MLE and VB approaches
In the following sections, we evaluate and discuss the performance of MDS under various scenarios. Importantly, we demonstrate, for the first time, that VB approaches provide better estimates of model parameters than MLE-based approaches. We investigated the performance of MDS-MLE and MDS-VB on simulated datasets generated at SNRs of 0, 5 and 10 dB for network structures of sizes 2, 3 and 5 and time samples of 200, 300 and 500. We simulated regional HRF variations such that the hemodynamic response delays were in the opposite direction to the delays in the latent signals. HRF delays could therefore influence the estimation of causal interactions when methods are applied directly to the observed BOLD-fMRI signals. This makes the problem of estimating causal interactions particularly challenging and provides novel insights into the strengths and weaknesses of the approaches used here.
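To make this simulation setup concrete, the following sketch generates a two-node dataset of this kind: a latent VAR(1) process with intrinsic and modulatory interactions, a block-design modulatory input, region-specific HRFs whose delays oppose the latent delay, and observation noise at a target SNR. All parameter values (connection weights, HRF peaks, block timing) are illustrative, not the exact settings used in our simulations.

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_hrf(length, dt, peak):
    """Simple gamma-shaped HRF; `peak` (s) controls the hemodynamic delay."""
    t = np.arange(length) * dt
    h = (t / peak) ** 2 * np.exp(-(t - peak) / (peak / 2))
    return h / h.sum()

T, dt = 300, 1.0                      # time points, sampling interval (s)
A = np.array([[0.5, 0.0],             # intrinsic interactions: node 1 drives node 2
              [0.4, 0.5]])
C = np.array([[0.0, 0.0],             # modulatory input opposes the 1 -> 2 influence
              [-0.4, 0.0]])
v = (np.arange(T) % 60 < 30).astype(float)   # block-design modulatory input

# Latent VAR(1) dynamics: s(t) = (A + v(t) C) s(t-1) + w(t)
s = np.zeros((T, 2))
for t in range(1, T):
    s[t] = (A + v[t] * C) @ s[t - 1] + rng.normal(0, 1, 2)

# Region-specific HRFs with delays opposing the latent delay (node 1 slower)
hrfs = [gamma_hrf(30, dt, peak) for peak in (8.0, 5.0)]
bold = np.column_stack([np.convolve(s[:, i], hrfs[i])[:T] for i in range(2)])

# Add observation noise at a target SNR (dB)
snr_db = 5.0
sig_pow = bold.var(axis=0)
noise = rng.normal(0, 1, bold.shape) * np.sqrt(sig_pow / 10 ** (snr_db / 10))
y = bold + noise
print(y.shape)
```

Note that the modulatory weight is chosen with a sign opposite to the intrinsic weight, mirroring the scenario discussed below in which conventional methods miss the connection.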
The performance of MDS was found to be robust when tested under various simulated conditions. Specifically, MDS was able to reliably recover both intrinsic and modulatory causal interactions from the simulated datasets, and its performance was superior to conventional approaches such as GCA. Among the MDS methods, MDS-VB outperformed MDS-MLE on performance metrics such as sensitivity, false positive rate and accuracy in identifying intrinsic and modulatory causal interactions. MDS-VB showed significantly improved performance over MDS-MLE, especially under adverse conditions such as low SNRs, large network sizes and fewer observed samples.
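For reference, these three metrics can be computed from the true and estimated connectivity patterns as follows; the scoring convention used here (off-diagonal links only, self-connections excluded) is an assumption of this sketch.

```python
import numpy as np

def network_metrics(true_adj, est_adj):
    """Sensitivity, false positive rate, and accuracy of recovered links.

    Off-diagonal entries of boolean adjacency matrices are compared;
    self-connections are excluded from scoring.
    """
    mask = ~np.eye(true_adj.shape[0], dtype=bool)   # ignore self-links
    t, e = true_adj[mask], est_adj[mask]
    tp = np.sum(t & e)        # true positives: links found correctly
    fp = np.sum(~t & e)       # false positives: spurious links
    fn = np.sum(t & ~e)       # false negatives: missed links
    tn = np.sum(~t & ~e)      # true negatives
    sensitivity = tp / (tp + fn) if tp + fn else np.nan
    fpr = fp / (fp + tn) if fp + tn else np.nan
    accuracy = (tp + tn) / mask.sum()
    return sensitivity, fpr, accuracy

true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]], dtype=bool)
est_adj = np.array([[0, 1, 0],
                    [0, 0, 0],
                    [1, 0, 0]], dtype=bool)
print(network_metrics(true_adj, est_adj))   # sensitivity 0.5, FPR 0.25, accuracy ~0.67
```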
The superior performance of MDS-VB can be attributed to the regularization imposed by the priors in this method. Our priors not only regularized the solution but also helped in achieving sparse solutions. By using sparsity-promoting priors, the weights corresponding to insignificant links are driven towards zero, enabling automatic relevance detection (Tipping, 2001). This approach is useful not only for regularizing solutions when the number of unknown parameters is high, but also for providing sparse and interpretable solutions. This feature of VB can be especially important in analyzing networks with a large number of nodes, an aspect often overlooked in most analyses of causality in complex networks.
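The mechanism behind automatic relevance detection can be illustrated with a toy sparse regression. Each weight receives its own precision hyperparameter; iterating the standard fixed-point updates (Tipping, 2001) drives the precisions of irrelevant weights to large values, shrinking those weights towards zero. The sketch below is a generic ARD example with a known noise precision, not our MDS-VB implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression: 10 candidate connections, only 2 truly nonzero
n, p = 200, 10
X = rng.normal(size=(n, p))
w_true = np.zeros(p)
w_true[[1, 4]] = [1.0, -0.8]
y = X @ w_true + 0.1 * rng.normal(size=n)

# ARD: per-weight precision alpha_i; a large alpha_i prunes weight i
alpha = np.ones(p)
beta = 100.0                        # noise precision (assumed known here)
for _ in range(50):
    S = np.linalg.inv(np.diag(alpha) + beta * X.T @ X)   # posterior covariance
    mu = beta * S @ X.T @ y                              # posterior mean
    gamma = 1 - alpha * np.diag(S)                       # "well-determined" measure
    alpha = gamma / (mu ** 2 + 1e-12)                    # MacKay fixed-point update

print(np.round(mu, 2))   # weights of irrelevant links shrink towards zero
```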
Another advantage of Bayesian analysis lies in computing the statistical significance of the network connections estimated by MDS methods. In the MLE approach, we need to resort to a bootstrap procedure, which can be computationally expensive. MDS-VB, on the other hand, provides posterior probabilities of each model parameter, as opposed to the point estimates of MLE, and these can be used to compute statistical significance directly. From a computational perspective, MDS-VB is several orders of magnitude faster than MDS-MLE because it does not require nonparametric tests for significance testing. Taken together, these findings suggest that MDS-VB is a superior and more powerful method than MDS-MLE.
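For a Gaussian posterior over a connection weight, significance can be read off directly from the posterior tail probability, with no resampling at all; a minimal sketch (means, standard deviations and thresholds are illustrative):

```python
import math

def posterior_exceedance(mu, sigma):
    """P(weight > 0) under a Gaussian posterior N(mu, sigma^2).

    A connection can be declared significant when this probability (or its
    complement) exceeds a chosen threshold, e.g. 0.95, without resampling.
    """
    return 0.5 * (1 + math.erf(mu / (sigma * math.sqrt(2))))

print(posterior_exceedance(0.4, 0.15))    # strongly positive link
print(posterior_exceedance(0.02, 0.15))   # weak link, near chance
```

A bootstrap, by contrast, would require refitting the full model hundreds or thousands of times to obtain the same kind of confidence statement.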
Comparison with GCA
We demonstrated the importance of modeling the influence of both external and modulatory stimuli for estimating causal networks by applying GCA to a five-node network. On this dataset, GCA failed to detect both the modulatory and the intrinsic connection between nodes 1 and 2. As mentioned earlier, GCA missed this connection because the network has both intrinsic and modulatory connections between these nodes but with weights of opposite signs. In GCA, therefore, the net strength of this connection is very small, and it did not survive a conservative test of statistical significance. Our MDS methods, on the other hand, uncovered both these connections. This phenomenon is most obvious in our simulations of the 2-node network, in which GCA could not find causal interactions between the nodes; in this network, too, the intrinsic and modulatory connections have weights with opposite signs. These results demonstrate the importance of explicitly modeling the influence of external and modulatory stimuli. Overall, the performance of GCA on the datasets generated at various SNRs, network sizes and numbers of observations was inferior to that of both MDS methods, with respect to both sensitivity and accuracy in identifying causal interactions between multiple brain nodes. When compared to MDS, the performance of GCA drops significantly at lower SNRs. These results suggest that MDS is more robust against observation noise than GCA. Therefore, our simulations suggest that MDS-VB outperforms GCA for networks consisting of fewer than 6 nodes. More extensive simulations are, however, needed to compare the performance of MDS with GCA for larger networks.
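The cancellation effect is easy to see numerically. A time-invariant model, such as conventional GCA, effectively estimates the time-averaged coupling, which can be near zero even when both underlying connections are strong (the weights below are hypothetical):

```python
import numpy as np

# Intrinsic 1 -> 2 influence and an opposite-signed modulatory influence
a12, c12 = 0.5, -0.9
v = (np.arange(300) % 60 < 30).astype(float)   # modulatory input on half the time

# A time-invariant model effectively sees the time-averaged coupling,
# which is close to zero and easy to miss in a significance test:
effective = a12 + v.mean() * c12
print(effective)   # 0.5 + 0.5 * (-0.9) = 0.05

# MDS estimates a12 and c12 separately, so both connections survive.
```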
Conventional GCA methods do not take into account dynamic changes in modulatory inputs and their effect on context-dependent causal interactions. In order to compare GCA more directly with MDS, we examined causal interactions in the absence of modulatory influences. As expected, in this case, the performance of GCA was comparable to that of MDS. Together, these findings suggest that GCA can accurately recover causal interactions in the absence of modulatory effects. Although newer dynamic GCA methods have been proposed, they appear to be designed more for improving the estimation of causal interactions than for examining context-dependent dynamic changes in causal interactions (Havlicek et al., 2010; Hemmelmann et al., 2009; Hesse et al., 2003; Sato et al., 2006). Further simulation studies are needed to assess how well dynamic GCA can estimate context-specific modulatory effects.
We next contrast our findings using GCA and MDS in the context of the equivalence of ARIMA and structural time series models. GCA is based on the autoregressive integrated moving average (ARIMA) models proposed by Box and Jenkins, whereas MDS is a structural time series model (Box et al., 1994). In the econometrics literature, it is well known that linear structural time series models have equivalent ARIMA model representations (Box et al., 1994). This equivalence has been under-appreciated in the neuroimaging literature, as demonstrated by the recent discussion regarding the relative merits of GCA and DCM (Friston, 2009a; Roebroeck et al., 2009). Our detailed simulations suggest that, under certain conditions, GCA is able to recover much of the causal network structure in spite of the presence of HRF delay confounds. This is most clearly illustrated by the simulations in which GCA recovered the network structure except for the intrinsic/modulatory connection from node 1 to node 2. Our simulations also suggest that GCA may not be able to uncover causal connections when there is a conflict between intrinsic and modulatory connections, but in other cases it is able to recover the underlying networks. In estimating the causal interactions, the model order estimated for GCA using the Akaike information criterion (AIC) was greater than 3. Note that in our simulations, the causal interactions at the latent signal level were generated using a VAR model of order 1 (Equation 1). It is plausible that in GCA the higher model order compensates for variations in HRF delay and for experimental effects such as context-specific modulatory connections (Deshpande et al., 2009). Our simulations suggest that this is indeed the case and that optimal model order selection in GCA results in improved estimation of causal interactions between nodes. Nevertheless, structural time series models like MDS and DCM can provide a better interpretation of network structure, since they can distinguish between intrinsic and context-specific modulatory causal interactions in latent signals.
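The point about model order can be illustrated with a toy experiment: a latent AR(1) signal convolved with a smooth HRF-like kernel is no longer well described by a first-order model, so AIC-based selection typically picks a higher order. The kernel shape and parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Latent signal: AR(1), i.e. true model order 1 at the neuronal level
T = 2000
s = np.zeros(T)
for t in range(1, T):
    s[t] = 0.6 * s[t - 1] + rng.normal()

# Convolve with a smooth HRF-like kernel (shape is illustrative)
k = np.arange(20.0)
h = (k / 5.0) ** 2 * np.exp(-k / 3.0)
y = np.convolve(s, h / h.sum())[:T]

def ar_aic(x, p):
    """AIC of an AR(p) model fitted by ordinary least squares."""
    X = np.column_stack([x[p - j - 1:-j - 1] for j in range(p)])  # lagged copies
    target = x[p:]
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coeffs) ** 2)
    n = len(target)
    return n * np.log(rss / n) + 2 * p

best = min(range(1, 9), key=lambda p: ar_aic(y, p))
print(best)   # AIC selects an order greater than 1 for the convolved signal
```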
Effects of down-sampling on MDS performance
In most fMRI studies, data are typically acquired with a sampling interval of about 2 seconds (TR = 2 s). However, dynamical interactions between brain regions occur at faster time scales of 10–100 milliseconds. To examine the effects of downsampling fMRI data on the performance of MDS, we first simulated interactions between nodes at a sampling rate of 1 kHz and then re-sampled the time series to 0.5 Hz after convolving with region-specific HRFs. MDS-VB was then applied to these datasets to estimate causal interactions between nodes. We also examined the influence of HRF delays on the estimation of causal interactions under four scenarios (Figures S1 and S2), similar to the strategy used by Deshpande and colleagues (Deshpande et al., 2009) to study the effect of HRF variability on GCA. In the first scenario, there were no causal interactions between nodes but the HRFs were delayed between the nodes. In this case, MDS-VB performed accurately and did not infer any false causal interactions. This shows that MDS-VB can model and remove the effects of HRF variation while estimating causal interactions at the latent signal level. In the second scenario, we introduced causal interactions between nodes, but without HRF variations. MDS reliably estimated causal interactions for various delays in the latent signals (Figure S1A). In the third scenario, we introduced causal interactions between nodes and also varied the HRFs such that delays in the latent signals and the HRFs were in the same direction. In this case, MDS was able to recover both intrinsic and modulatory causal interactions accurately (Figures S1B, S1C). In the fourth scenario, in which delays in the latent signal and the HRF opposed each other, performance dropped significantly, just as with GCA (Deshpande et al., 2009). Further research is needed to examine whether causal interactions under this scenario are inherently unresolvable by MDS and other techniques such as DCM.
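A minimal sketch of this downsampling experiment is shown below: a fast latent simulation, HRF convolution with delays opposing the latent delay (the fourth scenario), and resampling to TR = 2 s. For brevity the fast grid here is 100 Hz rather than the 1 kHz used in our simulations, and all other values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

fs, tr = 100.0, 2.0                 # fast simulation rate (Hz) and fMRI TR (s)
n = int(300 * fs)                   # 300 s of data
lag = int(0.05 * fs)                # 50 ms latent delay from node 1 to node 2

# Fast latent dynamics: node 1 drives node 2 with a short neuronal delay
x1, x2 = np.zeros(n), np.zeros(n)
for t in range(lag, n):
    x1[t] = 0.98 * x1[t - 1] + rng.normal()
    x2[t] = 0.98 * x2[t - 1] + 0.5 * x1[t - lag] + rng.normal()

# Region-specific HRFs whose delays OPPOSE the latent delay:
# the driving node gets the slower hemodynamic response.
def hrf(peak_s):
    t = np.arange(0, 25, 1 / fs)
    h = (t / peak_s) ** 3 * np.exp(-3 * (t / peak_s - 1))
    return h / h.sum()

b1 = np.convolve(x1, hrf(7.0))[:n]
b2 = np.convolve(x2, hrf(5.0))[:n]

# Downsample to the fMRI sampling rate (TR = 2 s, i.e. 0.5 Hz)
step = int(tr * fs)
bold = np.column_stack([b1[::step], b2[::step]])
print(bold.shape)   # (150, 2)
```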
Comparison of MDS with other approaches
As noted above, like GCA, MDS can be used to estimate causal interactions between a large number of brain nodes. Unlike GCA, however, causal interactions are estimated on the underlying latent signals while simultaneously accounting for regional variations in the HRF. Furthermore, unlike GCA, MDS can differentiate between intrinsic and stimulus-induced modulatory interactions. Like DCM, MDS takes into account regional variations in the HRF while estimating causal interactions between brain regions. And like DCM, MDS explicitly models external and modulatory inputs, allowing us to simultaneously estimate intrinsic and modulatory causal interactions between brain regions. Unlike DCM, however, MDS does not require the investigator to test multiple models and choose the one with the highest model evidence. This overcomes an important limitation of DCM: as the number of brain regions of interest increases, an exponentially large number of models needs to be examined, and the computational burden of evaluating these models and identifying the appropriate one can become prohibitively high. As our study illustrates, MDS incorporates the relative merits of both GCA and DCM while attempting to overcome their limitations.
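The combinatorial burden is easy to quantify: with n regions there are n(n-1) possible directed connections, so an exhaustive comparison over intrinsic structures alone involves up to 2^(n(n-1)) candidate models, before even counting alternative modulatory configurations:

```python
# Each directed connection among n regions can be present or absent, so an
# exhaustive DCM-style comparison faces up to 2**(n*(n-1)) candidate
# intrinsic structures:
model_counts = {n: 2 ** (n * (n - 1)) for n in (3, 4, 5, 6)}
print(model_counts)   # {3: 64, 4: 4096, 5: 1048576, 6: 1073741824}
```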
Both DCM and MDS are state-space models, but DCM uses a deterministic state model (although a stochastic version has recently been developed (Daunizeau et al., 2009)), whereas MDS employs a stochastic model. Modeling latent interactions as a stochastic process is important for taking into account intrinsic variations in latent signals that are not induced by experimental stimuli. Another important difference is that MDS uses empirical basis functions to model variations in the HRF, whereas DCM uses a biophysical Balloon model (Friston et al., 2003). Since the Balloon model is nonlinear, several approximations are required to solve it. In contrast, empirical HRF basis functions allow MDS to use a linear dynamical systems framework. The relative accuracy of these approaches is currently not known.
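The computational advantage of empirical basis functions can be illustrated as follows: a regionally delayed HRF is approximated as a linear combination of a canonical shape and its temporal derivative, so fitting reduces to least squares rather than a nonlinear inversion. The gamma-shaped HRFs below are illustrative, not the specific basis set used in MDS:

```python
import numpy as np

dt = 0.5
t = np.arange(0, 30, dt)

def gamma_hrf(peak):
    """Gamma-shaped HRF peaking at `peak` seconds (illustrative shape)."""
    h = (t / peak) ** 4 * np.exp(-4 * (t / peak - 1))
    return h / h.sum()

canonical = gamma_hrf(6.0)
derivative = np.gradient(canonical, dt)      # temporal-derivative basis
basis = np.column_stack([canonical, derivative])

# A region whose response peaks ~1 s later than the canonical shape
delayed = gamma_hrf(7.0)

# Because the model is LINEAR in the basis weights, fitting reduces to
# least squares; no nonlinear (Balloon-model style) inversion is needed.
weights, *_ = np.linalg.lstsq(basis, delayed, rcond=None)
rel_err = np.linalg.norm(delayed - basis @ weights) / np.linalg.norm(delayed)
print(rel_err)   # small residual: two basis functions capture the delay shift
```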
One important advantage of MDS is that, unlike methods based on vector autoregressive modeling such as GCA, it does not assume that the fMRI time series is stationary. This is important because the dynamics of the latent signals can be altered significantly by experimental stimuli, leading to highly non-stationary signals. In GCA, the time series is tested for stationarity either by examining its autocorrelation or by investigating the presence of unit roots. If the time series is found to be nonstationary, one commonly used remedy is to replace the original time series with the difference of the current and previous time points (Seth, 2010). A problem with this manipulation is that it acts as a high-pass filter, which can significantly distort the estimated causal interactions (Bressler and Seth, 2010).

Two methods based on a dynamical systems approach for modeling fMRI data have been proposed recently (Ge et al., 2009
; Smith et al., 2009). Smith and colleagues used a switching linear dynamical systems model in which modulatory inputs were treated as random variables (Smith et al., 2009). In contrast, MDS models them as deterministic quantities that are known for a given fMRI experiment. Modeling modulatory inputs as unknown random variables is useful for fMRI experiments in which the occurrence of modulatory inputs is unknown. For most fMRI studies, however, the modulatory inputs are known, and modeling them as unknown quantities unnecessarily increases the number of parameters to be estimated. Also, the switching dynamical systems model makes additional assumptions in computing the probability distributions of the state variables (Murphy, 1998). Further, Smith and colleagues used an MLE approach to estimate latent signals and model parameters. As we show in this study, compared to MLE, a VB approach yields more robust, computationally efficient and accurate model estimation even when the SNR and the number of time points are low, as is generally the case with fMRI data. Another difference is that Smith and colleagues combine the intrinsic and modulatory matrices; that is, for every j-th modulatory input, the combined connection matrix (Aj = A + Cj) is estimated from the data. In MDS, we estimate the intrinsic matrix A and the modulatory matrices Cj separately, which explicitly dissociates intrinsic and modulatory effects on causal interactions between brain regions. Another difference lies in testing the statistical significance of the estimated causal connections: in MDS-MLE we use a non-parametric approach, whereas in MDS-VB the posterior probabilities of the model parameters are used for testing the significance of the causal interactions. Finally, it should be noted that the performance of this method under varying SNRs and sample sizes is not known, since no simulations were performed.
Ge and colleagues (2009) used a different state-space approach to estimate causal interactions in the presence of external stimuli. They used vector autoregressive modeling for the state equation to model causal interactions among brain regions, whereas the observation model was nonlinear. They used an extended Kalman filtering approach to estimate the state variables and model parameters. This method was applied to local field potential data, so its usefulness for fMRI data is unclear. There are several further differences between MDS and this approach. MDS has been developed explicitly for fMRI data to account for HRF variations in brain regions while simultaneously estimating causal interactions. In the work of Ge and colleagues, the unknown model parameters were augmented into the state vector and extended Kalman filtering was used to obtain maximum likelihood estimates of these variables (Ge et al., 2009). In MDS, we have taken a different approach: the state variables are separated from the model parameters. This allowed us to use sparsity-promoting priors in the MDS-VB approach. Our results on simulated data suggest that MDS-VB outperforms MDS-MLE, especially at low SNRs and with fewer temporal observations. In addition, Ge and colleagues used Kalman filtering to estimate the state variables, whereas in MDS we use Kalman smoothing to estimate the latent signals (Ge et al., 2009). In Kalman smoothing, both past and future data are used to estimate the latent signals, whereas filtering uses only past values to estimate the current values. In general, smoothing provides better estimates of the latent signals than filtering (Bishop, 2006). Finally, although Ge and colleagues validated their approach on two three-node toy models, the performance of this method is not known under varying conditions such as different SNRs, network sizes and numbers of data samples.
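The filtering/smoothing distinction can be demonstrated with a one-dimensional linear-Gaussian model: a forward Kalman filter followed by a Rauch-Tung-Striebel (RTS) backward pass. The parameter values are arbitrary; the point is that the smoothed estimates, which use future as well as past observations, track the latent state with lower error:

```python
import numpy as np

rng = np.random.default_rng(4)

# 1-D linear-Gaussian state space: x_t = a x_{t-1} + w,  y_t = x_t + v
a, q, r, T = 0.9, 1.0, 4.0, 400
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(0, np.sqrt(q))
y = x + rng.normal(0, np.sqrt(r), T)

# Forward pass: Kalman FILTER (uses only past and current observations)
xf, pf = np.zeros(T), np.zeros(T)      # filtered mean / variance
xf[0], pf[0] = y[0], r
for t in range(1, T):
    xp = a * xf[t - 1]                 # one-step predicted mean
    pp = a * a * pf[t - 1] + q         # one-step predicted variance
    k = pp / (pp + r)                  # Kalman gain
    xf[t] = xp + k * (y[t] - xp)
    pf[t] = (1 - k) * pp

# Backward pass: RTS SMOOTHER (also uses future observations)
xs = xf.copy()
for t in range(T - 2, -1, -1):
    g = pf[t] * a / (a * a * pf[t] + q)    # smoother gain
    xs[t] = xf[t] + g * (xs[t + 1] - a * xf[t])

mse_filter = np.mean((xf - x) ** 2)
mse_smoother = np.mean((xs - x) ** 2)
print(mse_filter, mse_smoother)            # smoothing yields the lower error
```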