PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
 
Neuroimage. Author manuscript; available in PMC 2013 November 1.
Published in final edited form as:
PMCID: PMC3432302
NIHMSID: NIHMS341320

A spatiotemporal dynamic distributed solution to the MEG inverse problem

Abstract

MEG/EEG are non-invasive imaging techniques that record brain activity with high temporal resolution. However, estimation of brain source currents from surface recordings requires solving an ill-conditioned inverse problem. Converging lines of evidence in neuroscience, from neuronal network models to resting-state imaging and neurophysiology, suggest that cortical activation is a distributed spatiotemporal dynamic process, supported by both local and long-distance neuroanatomic connections. Because spatiotemporal dynamics of this kind are central to brain physiology, inverse solutions could be improved by incorporating models of these dynamics. In this article, we present a model for cortical activity based on nearest-neighbor autoregression that incorporates local spatiotemporal interactions between distributed sources in a manner consistent with neurophysiology and neuroanatomy. We develop a dynamic Maximum a Posteriori Expectation-Maximization (dMAP-EM) source localization algorithm for estimation of cortical sources and model parameters based on the Kalman Filter, the Fixed Interval Smoother, and the EM algorithms. We apply the dMAP-EM algorithm to simulated experiments as well as to human experimental data. Furthermore, we derive expressions to relate our dynamic estimation formulas to those of standard static models, and show how dynamic methods optimally assimilate past and future data. Our results establish the feasibility of spatiotemporal dynamic estimation in large-scale distributed source spaces with several thousand source locations and hundreds of sensors, with resulting inverse solutions that provide substantial performance improvements over static methods.

Keywords: MEG/EEG, Source localization, Inverse problem, Dynamic spatiotemporal modeling

1. Introduction

Magnetoencephalography (MEG) and electroencephalography (EEG) are non-invasive brain imaging techniques that provide high temporal resolution measurements of magnetic and electric fields at the scalp generated by the synchronous activation of neuronal populations. It has been estimated that a detectable signal can be recorded if as few as one in a thousand synapses become simultaneously active in an area of about 40 square millimeters of cortex (Hämäläinen et al., 1993). The exceptional time resolution of these techniques provides a unique window into the dynamics of neuronal process that cannot be obtained with other functional neuroimaging modalities, such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), which measure brain activity indirectly through associated slow metabolic or cerebrovascular changes.

Localizing active regions in the brain from MEG and EEG data requires solving the neuromagnetic inverse problem, which consists of estimating the cerebral current distribution underlying a time series of measurements at the scalp. The ill-posed nature of the electromagnetic inverse problem and the relatively large distance between the sensors and the sources limit the spatial resolution of MEG and EEG. For the inverse problem, solution of the corresponding forward problem in MEG/EEG, i.e., determining the measured magnetic and electric field at the scalp generated by a given distribution of neuronal currents, is a prerequisite. This problem can be solved by adopting a quasistatic approximation to Maxwell’s equations, resulting in a linear map which relates the activity of arbitrarily located neuronal sources to the signals in a set of sensors (Hämäläinen et al., 1993; Mosher et al., 1999).

Two types of models have been proposed in the neuromagnetic inverse problem literature: equivalent current dipole models and distributed source models. Equivalent current dipole models are based on the assumption that a small set of current dipoles with unknown locations, amplitudes, and orientations can closely approximate the actual current distribution (Mosher et al., 1992; Uutela et al., 1998; Bertrand et al., 2001; Somersalo et al., 2003; Jun et al., 2005; Kiebel et al., 2008; Campi et al., 2008). Distributed source models, on the other hand, assume that the recorded activity results from the activation of a spatial distribution of current dipoles with known locations (see Baillet et al. (2001) for a review). Most source localization methods assume that scalp measurements and their underlying sources are independent across time, and convenient probabilistic or computational prior constraints are imposed to obtain unique solutions. For example, inverse methods have been proposed that penalize current sources with large amplitude or power (Hämäläinen and Ilmoniemi, 1994; Van Veen et al., 1997; Uutela et al., 1999; Auranen et al., 2005), impose spatial smoothness (Pascual-Marqui et al., 1994), or favor focal estimates (Gorodnitsky et al., 1995). Furthermore, Bayesian methods have been used to obtain spatially sparse estimates, where components of the current source covariance are estimated directly (Phillips et al., 2005; Mattout et al., 2006) or additional hierarchical priors are assigned in order to compute posterior distributions (Sato et al., 2004; Nummenmaa et al., 2007; Wipf and Nagarajan, 2009). The assumption of temporal independence in all of these methods allows the inverse solution at each point in time to be computed individually, without regard for dynamics, treating the probability distribution of the underlying sources as static in time. While this approach is computationally convenient, it ignores the temporal structure observed in neural recordings at many different levels (Buzsaki and Draguhn, 2004), which could be used to improve inverse solutions.

Recent methods for source localization have incorporated temporal smoothness constraints as part of a general Bayesian framework. The approach taken in these methods is to specify an arbitrary prior distribution for the dipole sources in space and time, sometimes in terms of basis functions, with limited or space-time separable interactions in order to obtain simplified estimation algorithms (Baillet and Garnero, 1997; Greensite, 2003; Daunizeau et al., 2006; Daunizeau and Friston, 2007; Friston et al., 2008; Trujillo-Barreto et al., 2008; Zumer et al., 2008; Limpiti et al., 2009; Ou et al., 2009; Bolstad et al., 2009). Linear state-space models have also been used to model source current dynamics. However, in order to reduce computational complexity, these methods either apply spatially-independent approximations to their respective estimation algorithms (Yamashita et al., 2004; Galka et al., 2004) or a priori fix model specific parameters to avoid the problem of parameter estimation (Long et al., 2011). Overall, while these methods incorporate temporal structure in their models, they specify highly constrained spatiotemporal interactions, such as space-time separability or spatial independence, that may not accurately reflect dynamic relationships between different brain areas.

Converging lines of evidence from neurophysiology, biophysics, and neuroimaging illustrate that dynamic spatiotemporal interactions are a central feature of brain activity. Intracranial recordings in different species, including humans, exhibit strong spatial correlations during rest and task periods that persist up to distances of 10 mm along the cortical surface (Bullock et al., 1995; Destexhe et al., 1999; Leopold et al., 2003). These local spatial interactions are supported neuroanatomically by axonal collateral projections from pyramidal cells that spread laterally at distances of up to 10 mm (Nunez, 1995). Biophysical spatiotemporal dynamic models of neuronal networks at various levels of abstraction have been effective in reproducing properties of electromagnetic scalp recordings seen during both normal and disease states (Jirsa et al., 2002; Wright et al., 2004; Robinson et al., 2005; David et al., 2005; Sotero et al., 2007; Kim and Robinson, 2007; Izhikevich and Edelman, 2008; Gross et al., 2001). Furthermore, fMRI and PET studies have shown temporally coherent fluctuation in activation within widely distributed cortical networks during resting-state and experimentally administered task periods (Raichle et al., 2001; Gusnard and Raichle, 2001; Fox et al., 2005; Fox and Raichle, 2007).

In this article we present a new dynamic source localization method that models local spatiotemporal interactions between distributed cortical sources in a manner consistent with neurophysiology and neuroanatomy, and then uses this model to estimate an inverse solution. Specifically, (1) we describe a model of the spatiotemporal dynamics based on nearest-neighbors multivariate autoregression along the cortical surface; (2) We develop an algorithm for dynamic estimation of cortical current sources and model parameters from MEG/EEG data based on the Kalman Filter, the Fixed Interval Smoother, and the Expectation-Maximization (EM) algorithms; (3) We derive expressions to relate our dynamic estimation formulas to those of standard static algorithms; (4) We apply our spatiotemporal dynamic method to simulated experiments of focal and distributed cortical activation as well as experimental data from a human subject. Portions of this work have been previously presented in Lamus et al. (2007).

2. Methods

2.1. Measurement model

In an MEG/EEG experiment, we obtain a temporal set of recordings from hundreds of sensors located above the scalp. The data are sampled by Ny sensors at times equation M4, where Δ is the sampling interval and T is the number of measurements in time. Let yn,t denote the measurement at time t in sensor n, and define yt [equivalent] [y1,t,y2,t, …, yNy,t]′ as the Ny × 1 vector of measurements at all sensors at time t. We assume that the measurements were generated by Nx current dipole sources distributed on the cortical surface and oriented perpendicular to it. Let xn,t denote the source amplitude of the nth dipole at time t, and define xt [equivalent] [x1,t, x2,t, …, xNxt]′ as the Nx × 1 vector of cortical source activity in all considered locations, or cortical state vector, at time t. Typically, Ny ~ a few hundred, and Nx ~ several thousand. The relationship between the measurement vector yt and the cortical state vector xt is given by the measurement equation,

equation M5
(1)

where G is the Ny × Nx lead field gain matrix computed using a quasistatic approximation of the Maxwells equations (Hämäläinen et al., 1993), i.e., the solution of the forward problem, and vt is a Ny × 1 Gaussian white noise vector with zero mean covariance matrix C representing background instrumental and environmental noise independent from xt for all time points. Since the orientation of the current generators of the electromagnetic field, i.e., the apical dendrites of pyramidal cells, is perpendicular to the cortical surface, the choice of fixing the current dipole orientation along this direction is justified (Hämäläinen et al., 1993). Nevertheless, our development can be easily extended to account for unconstrained source orientations.

2.2. Spatiotemporal dynamical source model

As we pointed in the Introduction Section 1, evidence from neurophysiology studies, neuroanatomy, biophysics, and neuroimaging suggests that cortical activation is a distributed spatiotemporal dynamic process (Bullock et al., 1995; Destexhe et al., 1999; Leopold et al., 2003; Nunez, 1995; Jirsa et al., 2002; Wright et al., 2004; Robinson et al., 2005; David et al., 2005; Sotero et al., 2007; Kim and Robinson, 2007; Izhikevich and Edelman, 2008; Gross et al., 2001; Raichle et al., 2001; Gusnard and Raichle, 2001; Fox et al., 2005; Fox and Raichle, 2007). Because spatiotemporal dynamics of this kind are fundamental to brain physiology, inverse solutions could be greatly improved by incorporating models that approximate these dynamics.

One way to model local spatiotemporal connections of this type is to use a nearest-neighbor autoregressive model. In this autoregressive model, neuronal currents at a given point in time and space xn,t are a function of past neuronal currents within a small local neighborhood along the cortical surface and a small perturbation wn,t that accounts for unknown factors affecting the evolution of cortical currents,

equation M6
(2)

where N(n) is the index set of nearest neighbors of the nth dipole source, an is a scalar between 0 and 1 that weighs the past neuronal currents between the nth dipole source and its set of nearest neighbors N(n), and λ is a positive scalar used to guarantee the stability of the cortical state dynamics. The state noise wt [equivalent] [w1,t,w2,t, …, wNxt]′ is a Nx × 1 Gaussian vector with zero mean and covariance matrix Q, independent from the measurement noise vt. The weights dn,i, which represent the individual influence of neighboring sources in the autoregression, are assumed to decay with the distance from the nth to the ith dipole source. A simple relation that reflects this modeling assumption is to make the weights inversely proportional to the distance between dipoles,

equation M7
(3)

where the proportionality constant is chosen to make the sum of the weights over the index i equal 1, allowing the power (prior variance) in every dipole xn,t to remain approximately constant over time.

Figure 1 illustrates this nearest-neighbors autoregressive model. The left panel shows the cortical surface reconstructed from high-resolution magnetic resonance images (MRI) using Freesurfer (Dale et al., 1999; Fischl et al., 1999), where the caudal portion is represented by its triangulated mesh of dipoles. The zoomed-in panel in the right isolates a dipole source (central red dot) and its corresponding nearest-neighbor dipoles (green dots). At a given point in time, the activity of the central dipole (red dot) is a function of its past activity and the weighted activity of a small neighborhood of dipole sources (green dots), where the weights (black dashed arrows) are inversely proportional to the distance from the central dipole to its neighbor. We employ a cortical surface representation with Nx = 5124 dipoles sources, with an average distance of 6.2 mm between nearest neighbors, yielding a model that is consistent with the local spatial properties of intracranial electrophysiology and neuroanatomy. In Appendix A, we analyze how modifications in our spatial model, such us those arising from choosing a different weighting scheme or a misspecification of the weights an and dn,i (Eq. 2), influence the smoothness encoded a priori in the dipole covariance. There we show that as long as the modification in the spatial model is not too large, the smoothness modeled in the dipole sources is not dramatically altered.

Figure 1
Illustration of the spatiotemporal dynamic source model. The left panel shows the reconstructed cortical surface where the caudal portion of cortex is depicted as a triangulated mesh of dipole sources. The zoomed-in panel in the right shows a dipole source ...

We can readily rewrite Equation 2 as the multivariate autoregressive model,

equation M8
(4)

where the state feedback matrix F encompasses the neighborhood interactions between the sources at time t in terms of the sources in the previous time step t − 1:

equation M9
(5)

and (F)n,i is the nth row of the ith column of F. The initial dipole source vector x0 is assumed to be Gaussian with zero mean and covariance matrix Σ0, and independent of equation M10.

2.3. Prior model for the parameters

The model defined in Sections 2.1 and 2.2 yields a probabilistic “forward model” for the dynamic generation of the electromagnetic scalp recording. The next step in the description of our model is to define prior densities for the unknown parameters {G, C, F, Q, Σ0}. We should note that while in theory all the parameters in our model have uncertainties, some of them can be fixed a priori based on knowledge of the system under investigation. For example, the lead field matrix G can be computed using the boundary-element model based on high-resolution magnetic resonance images (MRI) (Hämäläinen and Sarvas, 1989) and the measurement noise covariance C can be reliably estimated from empty room recordings or pre-stimulus data. The covariance Σ0 can be updated as described in Section 2.4. The state feedback matrix F is constructed as indicated in Section 2.2 to incorporate nearest-neighbor interaction in order to model basic local intracortical dynamic connections. The selection of the values of λ and an in F (Eq. 5) is discussed in Section 2.5.

We note that the process of fixing the lead field G and state feedback F matrices can be seen as the analog of selecting the matrix of covariates or regressors in a linear model, where the experimenter’s knowledge guides the selection of the regressor as means of testing a scientific hypothesis based on collected data. In our case, the regressors in F, which are fixed based on neurophysiological modeling considerations, determine how much the state xt can be explained by its past xt− 1, and the unexplained portion is left to the state noise term wt. In the same way, the regressors in G, obtained from geometrical and biophysical knowledge, separate the measured MEG yt between the signals coming from the brain xt, and the instrument and environmental noise vt.

The remaining unknown parameters, which must be estimated from data, are the elements of the state noise covariance matrix Q. We assume that state noise covariance matrix Q is diagonal,

equation M11
(6)

where the parameter vector is θ = [θ1, θ2, …, θNx]′, thus yielding a number of unknowns in the order of thousands (Nx ≈ 5000 in our case).

A common probability model for the parameter vector θ is the conjugate prior distribution. In this case the conjugate prior follows an inverse gamma density:

equation M12
(7)

where the hyperparameters α and β are set such that the mean of this prior is consistent with the order of magnitude in the model units, and the variance is maximized yielding a non-informative prior (see Appendix B).

2.4. Inference with a dynamic Maximum a Posteriori Expectation-Maximization algorithm (dMAP-EM)

In order to localize active regions of cortex from scalp recordings, we have to derive estimates for the sequence of source amplitudes

equation M13
(8)

and parameters

equation M14
(9)

in the model. Specifically, we have to find the Maximum a Posteriori (MAP) estimate of the parameters,

equation M15
(10)

where equation M16 is the posterior density of the parameters θ conditioned on the full set of measurements

equation M17
(11)

and the Empirical Bayes estimate of the source amplitudes, i.e., the conditional mean of the state vector given the full set of measurements equation M18 and the estimate of the parameters in the model [theta w/ hat]MAP,

equation M19
(12)

where the notation in the subscript of xt |T indicates that we are conditioning on the measurements form time 1 until time T.

Once we obtain the MAP estimate of the parameters [theta w/ hat]MAP, computing the expectation in the Empirical Bayes estimate of the source amplitudes (Eq. 12) can be readily obtained using the well-known Kalman Filter (Kalman, 1960) and Fixed Interval Smoother algorithms (Rauch et al., 1965). However, the direct optimization required to compute [theta w/ hat]MAP (Eq. 10) would be computationally intractable. Therefore, we developed a dynamic Maximum a Posteriori Expectation-Maximization (dMAP-EM) algorithm to estimate source amplitudes and parameters in our model (Eqs. 10 and 12) based on the work of Shumway and Stoffer (1982) and Green (1990). The Expectation-Maximization algorithm (Dempster et al., 1977) is a general iterative method to obtain Maximum Likelihood or Maximum a Posteriori estimates when the observed data can be regarded as incomplete. In our case we treat the sequence of measurement and dipole sources, equation M20 as the complete data, and iterate performing an E-step followed by an M-step until convergence is achieved. In each iteration of the algorithm, the expectations in the E-step (Section 2.4.1) can be computed with the Kalman Filter (Kalman, 1960), Fixed Interval Smoother (Rauch et al., 1965), and lag-covariance (de Jong and Mackinnon, 1988) recursions, and the maximization in the M-step (Section 2.4.2) can be obtained in closed form. In each iteration we can evaluate the posterior density of the parameters θ using the innovations form (Kitagawa and Gersch, 1996) (Section 2.4.3).

2.4.1. E-step

The dMAP-EM algorithm is initialized with the parameters Q(θ(0)) and equation M21. In the rth iterate of the E-step, the algorithm computes the conditional expectation of the complete data log-likelihood equation M22, given the observed data equation M23 and the previous estimates of the parameters θ(r− 1), with an added term for the log-prior density:

equation M24
(13)

We emphasize that the expectation in Equation 13 is computed with respect to equation M25, where we have conditioned on the full set of measurements and the parameter estimate of the previous iteration.

Before continuing with the algorithm we establish the following variables to simplify the notation. We define the conditional mean

equation M26
(14)

the conditional covariance

equation M27
(15)

and the conditional lag-covariance

equation M28
(16)

where the subscript in Equations 14, 15, and 16 indicate that we are conditioning on the measurements form time 1 until time s, and the superscript r refers to the iteration number. For example, as we used in Equation 12, by setting s equal the number of samples in time T we indicate that are conditioning on the full set of measurements.

As shown in Appendix C, the computation of the conditional expectations in the function U(θ|θ(r− 1)) (Eq. 13) can be obtained in closed form yielding

equation M29
(17)

where cNx and cNy are constants not depending on θ, and

equation M30
(18)

equation M31
(19)

where

equation M32
(20)

The conditional expectations and covariances in Equation 20 can be computed with the Kalman Filter (Kalman, 1960), Fixed Interval Smoother (Rauch et al., 1965), and lag-covariance (de Jong and Mackinnon, 1988) recursions.

The Kalman Filter, Fixed Interval Smoother, and lag-covariance recursions

We can use the forward and backward recursions (Kitagawa and Gersch, 1996) to compute the desired expectations and covariances. In the rth iteration we initialize the recursion with equation M33 and equation M34. For the forward pass, t = 1, 2, …, T, we compute the predictions:

equation M35
(21)

and filtered estimates:

equation M36
(22)

and for the backward pass, t = T − 1, …,0, we find the smoothed estimates:

equation M37
(23)

The lag-covariances can be obtained using the covariance smoothing algorithm (de Jong and Mackinnon, 1988):

equation M38
(24)

We should note that the conditional mean and covariance that we ultimately use for source localization (Eq. 12) correspond to the Fixed Interval Smoother estimate (FIS, Eq. 23) obtained on the final iteration of the MAP-EM algorithm. The FIS provides an estimate of the state based on the full set of measurements, and results in improved performance in terms of reduced error covariance compared to the Kalman Filter (KF), which uses a causal subset of the data (Anderson and Moore, 2005).

In Appendix D, we show that the Kalman Filter and Fixed Interval Smoother estimates can be interpreted as the solution to a penalized least squares problem, analogous to that of the well-known L2 minimum-norm estimate (MNE) (Hämäläinen and Ilmoniemi, 1994). Viewed from this perspective, we can readily see that the FIS, KF, and MNE source localization methods are structurally similar, but with an important difference in how the prior mean for the source activity is represented at a given time. The KF and FIS optimally update their prior means by assimilating data from the past, and, in addition, from the future measurements in the case of FIS. In contrast, the MNE assumes that the prior mean is zero at all times and favors source values close to zero.

2.4.2. M-step

The M-step in the rth iterate is achieved by maximizing with respect to the parameters θ the function U(θ|θ(r− 1)) computed in the E-step (Eq. 17), which equals the sum of two terms: (1) the conditional expectation of the complete data log-likelihood given the full set of measurements and the previous estimates of the parameters, and (2) the log-prior density,

equation M39
(25)

It is easy to see that the maxima is achieved at

equation M40
(26)

where (A(r))n,n is the nth row of the nth column of A(r) (Eq. 18), and α and β are the hyperparameters defined in Equation 7.

2.4.3. Evaluation of the log-posterior density

The dMAP-EM algorithm iterates for r = 1, 2, …, ro performing E-steps and M-steps until the logarithm of the posterior density of the parameters

equation M41
(27)

evaluated at θ = θ(r) reaches an asymptote at some iteration ro.

We can evaluate the logarithm of the likelihood, i.e., the first term in Equation 27, using the innovations form (Kitagawa and Gersch, 1996),

equation M42
(28)

and the logarithm of the prior is obtained is obtained from Equation 7:

equation M43
(29)

Since the evidence in the data, equation M44, is a constant not depending on θ, we do not need to compute it in any iteration.

2.4.4. Summary of the dMAP-EM algorithm

The algorithm is initialized with parameters θ(0) and equation M45. In the rth iteration, we set the state noise covariance Q(θ(r− 1)) = diag(θ(r− 1)) and initial state covariance equation M46. We then perform an E-step (Section 2.4.1) by running the Kalman Filter, Fixed Interval Smoother, and lag-covariance recursion (Section 2.4.1), and perform an M-step (Section 2.4.2) to update the parameters θ(r). At each iteration we can update Σ0 with the heuristic equation M47. The algorithm iterates for r = 1, 2, …, ro, performing an E-step followed by an M-step until the log-posterior density evaluated at θ(ro) converges (Section 2.4.3). The Maximum a Posteriori (MAP) estimate of the parameters is then [theta w/ hat]MAP = θ(ro), and the Empirical Bayes estimate of the sources amplitudes is equation M48. Figure 2 summarizes the computations performed in our dMAP-EM algorithm.

Figure 2
The dMAP-EM algorithm

2.5. Initialization of algorithms

The state feedback matrix F was constructed as indicated in Section 2.2 to incorporate nearest-neighbor interaction in order to model basic local intracortical connections. To evaluate the weights dn,i (Eqs. 3 and 5), the distance between dipoles was obtained from the triangular tessellation of the cortical surface. The value of an in Equation 5 was set to 0.51 to indicate that the source amplitude in the present time is approximately equally weighted between the past activation in the same dipole and its neighborhood, and the choice 0.5 < an ≤ 1 assures that F is nonsingular1. The value of λ in Equation 5 was set to 0.95: this constrains the modulus of the largest eigenvalue of F to be strictly less that 1, and guarantees stability in the source model (Eq. 4). We should note that although it would be interesting to estimate the parameters equation M49 and λ, this would add ~ 5000 degrees of freedom to the estimation task, making this option unfeasible.

The measurement noise covariance C was estimated as the sample covariance from empty room recordings. To initialize the algorithms we set the source covariance at time zero ( equation M50) as a multiple of the identity, equation M51, where

equation M52
(30)

thus approximating the power signal-to-noise ratio (SNR) of the measurements. The state noise covariance Q(θ(0)) was initialized as a multiple of the identity, equation M53 , where equation M54 was set to a tenth of equation M55 to establish a starting value in scale with the data SNR.

3. Data analysis

3.1. Design of simulation studies

We employed simulation studies to compare the source localization performance of four methods: (1) The L2 minimum-norm estimate (MNE, equation M56 in Equation D.6) (Hämäläinen and Ilmoniemi, 1994); (2) An extension of MNE, which we call static Maximum a Posteriori Expectation-Maximization (sMAP-EM), where the variance of each dipole source is estimated similarly to that shown in Friston et al. (2008) and Wipf and Nagarajan (2009); (3) The FIS without the estimation of the parameters θ ( equation M57 in Equation 23, i.e., the smoothed estimate in the first iteration of our algorithm); (4) And our dMAP-EM algorithm ( equation M58 in Equation 23, i.e., the smoothed estimate in the last iteration of our algorithm).

We should note that sMAP-EM uses a static source model and the inverse gamma prior (Eq. 7) for the source variances. Therefore, the sMAP-EM is similar to our method, but with the state feedback matrix F set to zero, i.e., F = 0. It provides a baseline to compare how the matrix F, as specified to model local intracortical connections, can be used as covariates or regressors to explain the source activity xt in terms of its past xt− 1.

We constructed two simulated data sets with active regions of different sizes to compare the performance of these methods in cases where the activity is distributed across a large area, and where it is highly focal:

  • Large Patch: We selected a large active region within primary somatosensory cortex (Figure 3 top panel);
    Figure 3
    Large cortical patch simulation results for MNE, sMAP-EM, FIS, and dMAP-EM methods. The intensity maps represent the amplitude of simulated and estimated sources. The colorbar’s maximum (bright yellow) and minimum (bright blue) corresponds to ...
  • Small Patch: We chose a small active region over primary auditory cortex (Figure 5 top panel).
    Figure 5
    Small cortical patch simulation results for MNE, sMAP-EM, FIS, and dMAP-EM methods. The intensity maps represent the amplitude of simulated and estimated sources. The colorbar’s maximum (bright yellow) and minimum (bright blue) corresponds to ...

In order to avoid committing an “inverse crime”, where simulated data come from the same source space and dynamic system used in estimation, we simulated cortical activation on a highly discretized mesh with ~ 150,000 dipole sources in each hemisphere and a temporal generating model differing from that of our autoregressive model. The time course of the sources on each active patch was simulated as a 10-Hz sinusoidal oscillation over a period of one second in order to emulate a realistic MEG experiment, sin(2π·10·Δt), where the sampling frequency 1/Δ was 200 Hz thus yielding 200 time samples.

The lead field matrix was computed with the MNE software package (Hämäläinen and Sarvas, 1989) using a single-compartment boundary-element model (BEM) based on high-resolution MRIs processed with Freesurfer (Dale et al., 1999; Fischl et al., 1999) from a human subject. The measurement equation (Eq. 1) was then used to obtain the simulated MEG recordings, where the measurement noise was set to achieve a power signal-to-noise ratio (SNR) of 5, a value typical for MEG measurements, with signal amplitudes scaled uniformly across the active regions to achieve this SNR.

3.2. Results of simulation studies

We compared dipole source estimates and their 95% Bayesian credibility (confidence) intervals (CIs) from the static MNE, sMAP-EM, FIS, and dMAP-EM algorithms, using the simulated measurements described in Section 3.1. The 95% CIs are obtained from the diagonal elements of the posterior covariance matrix of each method. We should note that these CIs are Empirical Bayes estimates since we are also conditioning on the model parameters θ. Specifically, the 95% CIs of the FIS dipole source estimates are equal to ±2 times the diagonal elements of posterior covariance in the first EM iteration equation M59 (Eq. 23), while the CIs of the dMAP-EM source estimates equal ±2 times the diagonal elements of equation M60, i.e. the posterior covariance of the last EM iteration. Similarly, the CIs for the MNE and sMAP-EM estimates can be obtained by using the well-known formula for the posterior covariance in a jointly Gaussian model, or equivalently by setting F = 0 in our formulation.

Figure 3 shows the spatial extent of the estimated activity obtained from the large patch simulation. In particular, these intensity maps represent the amplitude of dipole currents xn,t at a particular time, as opposed to (scaled or normalized) statistical maps. The MNE extends beyond the simulated area in primary sensory cortex, overlapping pre-central gyrus, parietal cortex, and a small active area in the temporal lobe. The sMAP-EM estimates were highly focal in comparison to the extent of the simulated active region. The FIS estimates were similar to the MNEs, but with better coverage of the simulated patch area. The dMAP-EM method yielded dipole source estimates whose spatial extent closely matched the simulated active region, as observed with the yellow and red hues over primary somatosensory cortex.

Figure 4 compares the temporal tracking performance for representative dipole sources located inside the active area for the large patch simulation. The MNE time series greatly underestimates the amplitude of the true simulated time series and present wide CIs. The sMAP-EM either underestimates (sub-panels B and C) or overestimates (sub-panel D) the source amplitude and presents larger CIs that the other methods. Similar to MNE, FIS time series estimates also fail to track the true underlying signal accurately, however, they exhibit smaller CIs. For most dipole sources (sub-panels B, C, and D) dMAP-EM closely tracks the true underlying time series while maintaining small CIs. We present in Supplementary Information 1 (SI1.mov) a video with the results of this simulation study that visualizes dynamically the spatial localization and temporal tracking performance.

Figure 4
Time course estimates for the large patch simulation. The upper panel shows a zoomed-in view of the simulated cortical area, where green dots represent the selected dipoles labeled A, B, C, and D. The black lines represent simulated sources, while the ...

The estimation results from MNE, sMAP-EM, FIS, and dMAP-EM for the small patch simulation are shown in Figure 5. Again, the intensity maps represent the amplitude of the source estimates, and not a statistical map. In MNE and FIS the estimates extended beyond the active area to cover regions in the inferior temporal lobe, parietal cortex, and the inferior frontal areas. In contrast, dipole source estimates obtained with sMAP-EM and dMAP-EM are focal and closely match the spatial extent of the active simulated area.

Figure 6 compares the estimated and simulated time courses for representative dipole sources in the small patch simulation. As shown in the upper left panel, the dipoles labeled A, C and D are outside the active region, while the dipole labeled B is inside the active region. Neither MNE nor FIS are able to accurately track the true underlying active time series (sub-panel B), and both methods produce spurious activation in locations outside the focal active patch (sub-panels A, C, and D). However, the FIS method shows smaller CIs. The sMAP-EM method tracks the active source (sub-panel B), although it slightly overestimates it. In the inactive locations (sub-panels A, C, and D) the sMAP-EM shows small but noisy estimates with very large CIs. The dMAP-EM method accurately tracks the time series for the active dipole source (sub-panel B), while correctly showing small, near-zero activity outside the focal active patch (sub-panels A, C, and D). Furthermore, the dMAP-EM shows small CIs in all cases. We present in the Supplementary Information 2 (SI2.mov) a video with the results of this simulation study.

Figure 6
Time course of estimation results for the small patch simulation. The upper shows a zoomed-in view of the simulated cortical area where green dots represent selected dipoles labeled A, B, C, and D. The black lines represent simulated sources, while the ...

3.3. Error analyses

In this section we evaluate the source localization accuracy of our dMAP-EM methodology in comparison to MNE, sMAP-EM, and FIS. Specifically, we are interested in evaluating the sensitivity and specificity of these source localization methods, as well as the correlation between the simulated sources and their estimates. To do this, we computed receiver operating characteristic (ROC) curves (See Moon and Stirling (2000), or Darvas et al. (2004) for an application in the MEG/EEG inverse problem) and root mean square errors (RMSE) of sources estimates in the simulation studies described in Sections 3.1 and 3.2.

3.3.1. Receiver operating characteristic (ROC) curves

The ROC technique is used to evaluate the performance of binary detection systems, where the null hypothesis H0 generally represents the absence of an underlying signal while the alternative hypothesis HA denotes its presence. This analysis is done by determining the relation between the detection probability,

equation M61
(31)

and the false alarm probability,

equation M62
(32)

as we vary a threshold for the hypothesis test, where in practice, the probabilities are estimated by proportions from simulation studies.

For this analysis we consider a source signal to be inactive (H0 is true) when the simulated nth dipole source at time t equals zero ( equation M63), while a source signal is considered active (HA is true) when equation M64. For a given estimation method, we define x̂n,t as the nth dipole source estimate at time t. Furthermore, we reject the null hypothesis H0 to indicate that the source estimate is active with respect to a test threshold c > 0 if |x̂n,t| > c. Consequently, for a specific threshold c, an estimate of the detection probability [p with hat]D is given by the fraction of events where we correctly detected an active source, i.e., when the dipole source estimate was considered active (|x̂n,t| > c) given that the underlying true source was active ( equation M65). Similarly, the estimate of false alarm probability [p with hat]FA is given by the proportion of events where we considered the source estimate to be active (|x̂n,t| > c) but the underlying true source was inactive ( equation M66) and made a “false alarm” (See Appendix E for details).

We computed ROC curves for the large and small patch simulation studies for MNE (red), sMAP-EM (magenta), FIS (green), and dMAP-EM (blue) estimates. The results from the large and small patch simulation study are shown in the sub-panels A and B of Figure 7, respectively. The ROC curves for dMAP-EM showed a superior source detection accuracy in both simulation studies. In the large patch study, the dMAP-EM achieved a ~ 90% detection of true active sources ([p with hat]D ≈ 0.9) with as few as ~ 2% false alarms ([p with hat]FA ≈ 0.02), while the other methods required at least ~ 40% false alarms to achieve the same detection accuracy. Similarly, in the small patch simulation, the dMAP-EM achieved a ~ 95% detection of true active sources with as few as ~ 2% false alarms, whereas the other methods required at least ~ 40% false alarms to detect ~ 95% of the true active sources. Furthermore, the area under the ROC curve (see tables in sub-panels A and B of Figure 7), which is a comprehensive measure of the detection accuracy, was greater in the dMAP-EM method than in the other methods, thus indicating a significant improvement in the source localization accuracy of our method.

Figure 7
ROC curves, RMSE, and convergence of algorithms. ROC curves and area under the ROC curves from the large patch simulation (sub-panel A) and the small patch simulation (sub-panel B) show that the dMAP-EM method outperforms MNE, FIS, and sMAP-EM methods ...

3.3.2. Root mean square errors (RMSE)

To further characterize the accuracy of each method, we evaluated the deviation of the dipole source estimates x̂n,t in relation to the true simulated sources equation M67 in absolute terms by computing root mean square errors (RMSE) in each simulation study. Specifically, for the MNE, FIS, sMAP-EM, and dMAP-EM methods, we computed the RMSE of each dipole source (n = 1, 2, …, Nx) over the length in time (T = 200) of the simulation,

equation M68
(33)

and calculated summary statistics of these errors. We separated the summary statistics for the RMSEs of dipole sources inside and outside the simulated active region to disentangle the origin of the estimation errors. Given that the number of dipoles inside the active patch was small, we computed scatter plots and the sample mean of the RMSE of sources in this region (Fig. 7, sub-panels C and D). However, because the number of dipole source located outside the active patch was relatively large, we computed box-plot summaries of the RMSE of sources from this region, thereby displaying 0.01, 0.25, 0.5, 0.75, and 0.99 approximate quantiles (Fig. 7, sub-panels E and F). We should note that in box-plots the approximate 0.75 and 0.25 quantiles are represented by the bottom (thin line) and top (thick colored line) of the box, while the 0.5 quantile (median) is denoted by the colored dashed line across the box. The approximate 0.99 and 0.01 quantiles are given by the top and bottom “whiskers”, i.e., the horizontal black lines, and the grey crosses outside whiskers are considered outliers.

Sub-panels C and D of Figure 7 show the scatter plots and the average (dash lines) RMSE of dipole sources inside the active region in the large and small patch simulation studies, respectively. In both simulation scenarios the dMAP-EM estimates showed a significant improvement of the average RMSE in relation to the other methods. In the large patch simulation the dMAP-EM method yielded an average RMSE of 7 nAm, which represents an reduction of ~ 5.4%, ~ 2.7%, and ~ 23% in the average of errors with respect to MNE, FIS, and sMAP-EM methods, respectively. Similarly, the dMAP-EM method resulted with an average RMSE of 26 nAm in the small patch simulation showing a reduction of ~ 42% with respect to MNE and FIS, and of ~ 33% in relation to the sMAP-EM errors. In summary, the average RMSE of sources in active regions is significantly reduced in the dMAP-EM method for both large and small patch simulation scenarios.

Sub-panels E and F of Figure 7 show the RMSE box-plots of dipole sources located outside the active region in large and small patch simulation, respectively. In both simulated scenarios, our dMAP-EM method yielded the smallest quantiles among the analyzed methods. The reduction in RMSE relative to MNE, FIS, and sMAP-EM are shown in Table 1. The dMAP-EM method improved the 0.99 and 0.75 RMSE quantiles by 9% to 51%, showing that the dMAP-EM estimates significantly and robustly improved source localization accuracy.

Table 1
Percentage reduction of the RMSE quantiles for dipole sources outside the active region under dMAP-EM, compared to MNE, FIS, and sMAP-EM

3.4. Convergence and computational requirements

The convergence of the dMAP-EM algorithm was assessed by tracking the logarithm of the posterior density (Eq. 27),

equation M69
(34)

omitting the log-evidence term, which does change during EM iteration. Sub-panels G and H of Figure 7 show the convergence, i.e., the cost evaluated at each iteration, of the dMAP-EM and sMAP-EM algorithms for the large patch and small patch simulation studies, respectively. In both simulation scenarios, the cost function reaches a plateau in less than 15 iterations.

The runtime of the dMAP-EM algorithm per EM iteration is effectively equation M70 (Mendel, 1971), where we assumed that the number of sensors Ny is fixed and much smaller than the number of dipole sources Nx (Ny [double less-than sign] Nx), and T is the number of measurements in time. The algorithm was implemented in Matlab (The MathWorks, Natick, MA) and run on a dual 6-core Linux workstation at 2.67 GHz with 24 GB RAM. In our analyses the number of dipole sources was Nx = 5124, the number of sensor was Ny = 204, and the number of measurements in time was T = 200. Since we did not attempt any kind of model reduction procedure, the algorithm yielded a computation time of ~ 2.5 hours per EM iteration.

3.5. Analysis of experimental data from human subjects

We also applied the MNE, FIS, sMAP-EM, and dMAP-EM algorithms to mu-rhythm MEG data from a human subject. The mu-rhythm originates from motor and somatosensory cortices, and consists of oscillations with 10 and 20 Hz components. Data were collected using a 306-channel Neuromag Vectorview MEG system at Massachusetts General Hospital. The subject was instructed to rest with eyes open during the recording. We recorded 12 minutes of data at a sampling frequency of 601 Hz with a bandwidth of 0.1 to 200 Hz, and later downsampled to 200.3 Hz. The data were visually inspected to select 1 second of strong mu-rhythm activity, as evidenced by its characteristic “μ” shape, for subsequent analyses. In Figure 8, we show the results of the three methods where the intensity maps represent the amplitude of source estimates. This figure also includes the topography of the magnetic field component normal to the sensor surface at the same time instant (Hämäläinen and Ilmoniemi, 1994; Numminen et al., 1995). Similar to Figures 3 and and55 in the simulated scenario, the MNE and FIS produced broad, spatially distributed estimates with activity covering primary somatosensory and motor regions as well as parietal, occipital, and temporal areas. The sMAP-EM method yielded highly focal estimates that appear spatially irregular. The dMAP-EM estimate presents a more compact spatial extent that covers primary somatosensory and parietal areas.

Figure 8
Analysis of human MEG mu-rhythm data. The top panel shows the topography of the magnetic field component normal to the sensor surface, with isocontour lines indicating a field change of 100 fT. In the remaining panels, the intensity maps represent the ...

The time course estimates and their 95% Bayesian credibility intervals (CIs) are shown in Figure 9. The 95% CIs (light colored areas) are obtained as described in Section 3.2. The upper panel shows a zoomed-in view of the cortical activity map obtained with the MNE method, where green dots represent the selected dipoles in primary motor (A), primary somatosensory (B), and parietal association areas (C and D). We note that these particular areas were selected as they have been reported to generate the mu-rhythm signals (Hari and Salmelin, 1997). Both MNE and FIS methods yielded estimates with smaller amplitudes, however, the CIs for the FIS method were smaller. In sMAP-EM estimates, the amplitude of the dipoles estimates A, B, and C was small with wider CIs in comparison to the dMAP-EM estimates. The dMAP-EM method yielded estimates with higher-amplitude oscillations, especially in dipoles B, C, and D, where the estimate follows the stereotypical mu-shape that characterizes these data, and presented smaller CIs than those of the MNE and sMAP-EM methods. In the Supplementary Information 3 (SI3.mov) we present a video of the results of this analysis.

Figure 9
Time course of estimation results for human MEG mu-rhythm data. The upper panel shows a zoomed-in view of the cortical activity map obtained with the MNE method, where the green dots represent dipoles labeled A, B, C, and D. The colored lines are estimated ...

4. Discussion

Intracranial electrophysiological recordings have shown that on a local scale, brain activity is spatially and temporally correlated (Bullock et al., 1995; Destexhe et al., 1999; Leopold et al., 2003; Nunez, 1995). Similarly, non-invasive fMRI and PET studies have shown that brain activation is temporally coherent in a spatially distributed network (Raichle et al., 2001; Gusnard and Raichle, 2001; Fox et al., 2005; Fox and Raichle, 2007). On the modeling side, biophysical spatiotemporal dynamic models of neuronal networks have been able to simulate electromagnetic scalp signals similar to those seen in recordings during normal and disease states (Jirsa et al., 2002; Wright et al., 2004; Robinson et al., 2005; David et al., 2005; Sotero et al., 2007; Kim and Robinson, 2007; Izhikevich and Edelman, 2008; Gross et al., 2001). We incorporated these insights to probabilistically model cortical activation as a distributed spatiotemporal dynamic process, and used this model as the basis for an inverse solution. This probabilistic model acts as a soft a priori constraint on the evolution of spatiotemporal cortical trajectories, in a way that allows the recorded data to update our belief on this trajectory at each moment in time. In this model of cortical activity, the nearest-neighbor interactions are intended primarily to model spatial dependencies observed with intracranial electrophysiology. However, as shown by the large and small patch simulation results, the flexibility of this soft constraint results in accurate estimates of both extended and focal brain activity.

Spatiotemporal MEG/EEG source localization algorithms (Baillet and Garnero, 1997; Greensite, 2003; Daunizeau et al., 2006; Daunizeau and Friston, 2007; Friston et al., 2008; Trujillo-Barreto et al., 2008; Zumer et al., 2008; Limpiti et al., 2009; Ou et al., 2009; Bolstad et al., 2009) have been previously developed with the aim of obtaining temporally smooth estimates by imposing computationally convenient prior constraints on dipole sources. While these methods provide a way of constraining the temporal evolution of inverse solutions, the relationship between prior constraints and underlying physiology is unclear. In contrast, our spatiotemporal dynamic model is structured to represent well-known local cortical biophysics, neuroanatomy, and electrophysiology, and can be developed further to account for more complex brain dynamics and spatial interactions. Other authors have proposed state-space models in the EEG inverse problem with volumetric Laplacian spatial interactions, using recursive least-squares (Yamashita et al., 2004) or an approximate version of the Kalman filter (Galka et al., 2004). In contrast, our approach establishes dynamic relationships along the cortical surface to represent local cortical interactions observed in physiological studies, and uses the full-dimensional Kalman filter and Fixed-Interval Smoother in conjunction with an EM algorithm to provide statistically optimal estimates of dipole source activity as well as dipole-specific model parameters. While these calculations are high dimensional and computationally demanding, we chose this approach in order to place the model in a spatial and temporal scale consistent with the biophysics of cortical interactions.

To understand the dynamic algorithms in relation to previously developed methods, we re-expressed the Kalman Filter and Fixed Interval Smoother estimates in a form analogous to that of the well-known static MNE (see Appendix D). This analysis showed that the dynamic methods have a structure that is very similar to MNE, with the Kalman Filter and Fixed Interval Smoother prediction covariance matrices playing the same role as the prior source covariance, or regularization matrix, in MNE. However, there is a critical difference in how the methods account for the prior mean source activity at a given moment in time. MNE assumes that this prior mean is zero at all times, while the Kalman filter and Fixed Interval Smoother optimally update their prior means by assimilating data from the past, as well as the future in the case of FIS. In this way, as pointed out by Desai (2005) and Long et al. (2011), MNE and other similarly structured static algorithms impose a tendency towards zero source activity at every moment in time by ignoring estimates computed for other time points. Any reasonable model approximating spatiotemporal brain dynamics, employed within this dynamic estimation framework, would avoid this tendency towards zero, and would improve performance by enabling information from past and future estimates to help infer the cortical state at a particular point in time.

We designed simulation studies to compare source localization performance of the static MNE, sMAP-EM, FIS, and dMAP-EM estimates. The simulations were constructed to assess algorithm performance on both distributed and focal cortical activity. We simulated MEG data with a highly discretized cortical mesh and a deterministic temporal signal for source activity outside our autoregressive model class to avoid committing an “inverse crime”. In both simulations, i.e., with either a large or small active patch, the dMAP-EM method outperformed the MNE, sMAP-EM, and FIS methods in terms of spatial localization accuracy, temporal tracking of the simulated time series, the posterior error covariance, as well as RMSE and ROC analyses. These results suggest that the joint estimation of model parameters and source localization in a spatiotemporal dynamic model, as performed by the dMAP-EM algorithm, can significantly improve inverse solutions. Furthermore, the fact that dMAP-EM algorithm provides more accurate dipole source estimates than FIS, which does not estimate model parameters, indicates that parameter estimation within the dynamic model is critically important in this inverse problem. We also applied these methods to human MEG mu-rhythm data, and obtained results similar to the simulated scenarios: The dMAP-EM method yielded distributed yet spatially compact estimates with pronounced time series amplitude in areas of cortex consistent with previous mu-rhythm studies, while the MNE and FIS produced spatially spread estimates with small amplitudes, and the sMAP-EM yielded highly focal and spatially irregular estimates.

Recent work by Wipf and Nagarajan (2009) has taken on the important task of characterizing the many seemingly dissimilar static methods described in the MEG inverse literature within a unified statistical framework. An important conclusion of that work is that most static MEG inverse methods can be viewed as solutions to a covariance selection problem, allowing fast covariance selection algorithms from statistics to be applied directly to the MEG inverse problem. Framing the inverse problem in terms of covariance selection, however, leaves open the question of how one should specify the form of that covariance, particularly in the presence of spatiotemporal phenomena. Dynamic modeling and estimation provides a framework to integrate mechanisms and empiricism from biophysics and neurophysiology into solutions for the MEG inverse problem. In this view, the solution to the inverse problem becomes one of specifying and identifying biophysical and dynamical models that closely approximate the underlying neurophysiological system. The spatial and temporal covariance structure then emerges from the second-order statistics inherent in the spatiotemporal dynamic model. If these models can be phrased in the appropriate statistical framework, fast and efficient algorithms with well-known properties can be applied to compute inverse solutions and parameter estimates.

Sophisticated dynamic models have been studied previously in the context of MEG/EEG data (David et al., 2006; Kiebel et al., 2009). These Dynamic Causal Models (DCMs) use simplified spatial representations, such as equivalent current dipoles, to place greater emphasis on temporal modeling while retaining computational tractability. In contrast, the focus of our work has been to explore how spatiotemporal dynamic models in the distributed source framework, and inspired by underlying neurophysiology, can be used to improve source localization. This approach necessitates a higher-dimensional spatial model that can represent local cortical dynamics on a spatial scale consistent with neurophysiological recordings, balanced by a relatively simple temporal model to make computations tractable. In the long run, we envision an approach where more complex spatiotemporal models will be used to solve the MEG/EEG inverse problem. These models could include long-distance connectivity information derived from diffusion tensor imaging (DTI) and nonlinear interactions, while preserving a realistic spatial scale to represent brain activity.

5. Conclusions and Future Work

In this work we (1) developed a distributed stochastic dynamic model based on a nearest-neighbors autoregression on the cortical surface to represent spatiotemporal cortical dynamics, (2) derived the dMAP-EM algorithm for optimal dynamic estimation of cortical current sources and model parameters from MEG/EEG data based on the Kalman Filter, Fixed Interval Smoother, and Expectation-Maximization (EM) algorithms, (3) developed expressions to relate our dynamic estimation method to standard static algorithms, and (4) applied the spatiotemporal dynamic method to simulated experiments of focal and distributed cortical activation as well as to human experimental data. The results showed that our dMAP-EM method outperforms MNE, sMAP-EM, and FIS methods in terms of spatial localization accuracy, temporal tracking, posterior error covariance, and RMSE and ROC measures.

Our results demonstrate the feasibility of spatiotemporal dynamic estimation in distributed source spaces with several thousand dipoles and hundreds of sensors, resulting in inverse solutions with substantial performance improvements over static methods. Our analysis of known cortical biophysics, models (static vs. dynamic), source estimates, and error in localization revealed clear reasons why one would expect the dynamic methods to perform better than the static MNE. In future work, we will develop new techniques to improve computational performance by means of model reduction or high-performance computing, and will incorporate more realistic neurophysiological models within this dynamic modeling framework.

Highlights

  • Brain activity is a distributed spatiotemporal dynamic process
  • A dynamic spatiotemporal source model consistent with neurophysiology and neuroanatomy is used for source localization from MEG/EEG data
  • The dMAP-EM algorithm is developed to provide estimates of current sources and model parameters
  • The dMAP-EM algorithm is applied in simulated experiments as well as in the analysis of human experimental data
  • The dMAP-EM algorithm outperforms other methods in terms of spatial localization accuracy and temporal tracking

Supplementary Material

Appendix A. Robustness of source model agains variations in the feedback matrix F

In this appendix we analyze how modifications in our spatial model (F) influence the prior source covariance, and show that the resulting smoothness encoded a priori in the dynamic source model is robust against misspecification of the F matrix. To do this, we first take a modified state model (Eq. 4) where the feedback matrix [p with tilde] [equivalent] F + ΔF has been perturbed by ΔF. Then, we consider the prior source covariance of the original model Σ and that of the modified model [Sigma], when they have reached equilibrium (steady-state), i.e., limt→∞ Cov(xt). At last, we derive an upper bound for the matrix 2-norm of the difference between the equilibrium covariances ΔΣ [equivalent] [Sigma] − Σ, as a function of the perturbation of the spatial model ΔF.

We assume that both F and [p with tilde] yield stable dynamical systems (i.e., the modulus of their largest eigenvalue is strictly less than 1). From the stability condition, the equilibrium prior state covariance of the original model (Σ) and that of the modified model ([Sigma]) correspond to the respective (unique) solutions of the discrete Lyapunov equations (Brockwell and Davis, 2006):

equation M71
(A.1)

We now express the difference in the equilibrium covariances ΔΣ by subtracting the equalities above to obtain:

equation M72
(A.2)

Now we take the matrix induced 2-norm in Equation A.2, and apply repeatedly the triangle inequality and the submultiplicative property of induced norms:

equation M73
(A.3)

We rearrange terms in Equation A.3 to obtain:

equation M74
(A.4)

The fact that both feedback matrices are stable is equivalent to ||F||2, ||[p with tilde]||2 < 1. Using the inverse triangle inequality and setting F = [p with tilde] − ΔF, we obtain |||[p with tilde]||2 − ||ΔF ||2 | ≤ ||F||2 < 1. This inequality implies that ||ΔF ||2 < 2. We plug the previous result and the fact that ||F||2 < 1 in the inequality above (Eq. A.4), and find that the matrix 2-norm of the difference between the equilibrium covariances ΔΣ is bounded above by a constant multiplied by the 2-norm of the perturbation in the feedback matrix ΔF:

equation M75
(A.5)

where the constant equation M76.

From Equation A.5 we can conclude that as long as the mis-specification of the feedback matrix ΔF is small, the smoothness modeled a priori in the dipole sources is not dramatically altered.

Appendix B. A non-informative prior for the state noise covariance Q = diag(θ)

In this section we present a derivation of the non-informative prior for the parameters θ. Our task is to set the hyperparameters α and β such that the prior probability density of θn (Eq. 7) is non-informative by making it approximately flat. This can be achieved by giving a large variance to the prior while fixing the mean to a value consistent with the order of magnitude in the model. We assume the values of the hyperparameter α is greater that 2 in order to obtain analytic expression for the mean (E[θn] = β/(α + 1)) and variance (Var(θn) = β2/[(α − 1)2(α − 2)]). First, we set the prior mean to the order of magnitude in our model units, i.e., E[θn] = 10−18, and since we want the prior variance Var(θn) to be large, we set it in a way such that it is much larger than the mean, Var(θn) [dbl greater-than sign] E[θn]. The relationship between the variance and the mean is Var(θn) = E(θn)2/(α − 2), which implies that:

equation M77
(B.1)

Plugging α into the equation for the mean yields:

equation M78
(B.2)

Therefore, setting the hyperparameters to α = 2 + δ and β = 10− 18, where δ [double less-than sign] β, provides a non-informative prior with a mean that is consistent with our model.

Appendix C. Derivation of U(θ|θ(r−1)) in the E-step

From Equation 13 we have that

equation M79
(C.1)

where the complete data log-likelihood is derived from the measurement (Eq. 1) and source (Eq. 4) models:

equation M80
(C.2)

with

equation M81
(C.3)

where cNx and cNy are constants not depending on θ.

We apply Equation C.3 to the complete data log-likelihood (Eq. C.2) and compute its expectation with respect to equation M82, where we have conditioned on the full set of measurements and the parameter estimate of the previous iteration (Shumway and Stoffer, 1982). We then add the logarithm of the prior density of θ and obtain Equation 17.

Appendix D. Relationships between the dynamic and static estimators

In this appendix we present an algebraic analysis of the Kalman Filter (KF) and Fixed Interval Smoother (FIS) estimates that illustrates their relationship to the L2 minimum-norm estimate (MNE) (Hämäläinen and Ilmoniemi, 1994). We emphasize that in this analysis the parameters θ in the model are assumed to be fixed. Specifically, we set aside the so-called problem of identification or learning of parameters and focus on comparing and contrasting the functional form of the source amplitude estimates given by these methods.

We note that while it has been well-established that, on one side, the MNE can be seen as a static Maximum a Posteriori estimate of the current dipole sources where the measurements are assumed to be independent in time (Hämäläinen and Ilmoniemi, 1994; Wipf and Nagarajan, 2009), and on the other, the KF and FIS algorithms are implementations of a Maximum a Posteriori estimation problem in a linear Gaussian state-space model (Kitagawa and Gersch, 1996; Roweis and Ghahramani, 1999), our contribution in this section is: (1) To show how the KF, and especially the FIS, are solutions to particular penalized least square problems structurally similar to the L2 minimum-norm cost function, where the penalty term reflects how the information of past equation M83 and future equation M84 measurements is optimally accounted by the estimate; and (2) to present the formulas for the KF and FIS estimates in a way that parallel those of the well-known MNE equation as an attempt to further introduce these dynamic estimation techniques in the broad neuroimaging community.

To facilitate the notation, in this section we will assume that all densities are conditioned by a set of parameters and use the notation p(·) [equivalent] p(·|θ). We begin by recalling that the MNE assumes the probability density of the source amplitude vector p(xt) to be Gaussian with mean μt = μ(MNE) [equivalent] 0 and covariance, or regularization matrix, Pt = Σ(MNE). Therefore, the MNE maximizes the posterior density (Hämäläinen and Ilmoniemi, 1994),

equation M85
(D.1)

where the likelihood p(yt | xt) is Gaussian with mean Gxt and covariance C (Eq. 1). We must emphasize that the MNE “prior” density p(xt) does not contain any information about the measurements.

A similar interpretation can be given to the Kalman Filter estimate of xt given data up to time t, as shown in Kitagawa and Gersch (1996); Roweis and Ghahramani (1999). In this case, the “prior” corresponds to the conditional density of xt given data up to time t − 1, equation M86, which is Gaussian with mean μt = xt |t− 1 and covariance Pt = Pt |t− 1 given by the prediction step of Kalman Filter recursions (Eq. 21). Then the Kalman Filter computes the Maximum a Posteriori estimate of xt given data up to time t by maximizing the posterior density:

equation M87
(D.2)

We highlight that the Kalman Filter “prior” density equation M88 contains information only from past measurements equation M89.

The Fixed Interval Smoother estimate can be interpreted similarly, where the “prior” density corresponds to the conditional density of xt given previous data up to time t − 1 and future data after time t + 1, equation M90. To the authors’ knowledge, this particular interpretation of the FIS has not been reported in the literature. Again, the “prior” density is Gaussian, and its mean μt = xt |T \t and covariance Pt = Pt |T \t can be obtained by standard methods with computationally costly algebraic manipulation. In this case, the notation of the subscript t |T \t reflects that we are conditioning on all data except the immediate data point yt. Therefore, the Fixed Interval Smoother estimate maximizes the posterior density of the state given all measurements:

equation M91
(D.3)

We emphasize that the Fixed Interval Smoother “prior” density equation M92 includes information from both past equation M93 and future measurements equation M94.

We can see now that once we have available the “prior” densities’ mean and covariance for each method, obtaining the Maximum a Posteriori estimates for the MNE, KF, and FIS by maximizing Equations D.1, D.2, and D.3, respectively, corresponds to apply a penalized least squares technique where the data fit term corresponds to the likelihood term and the penalty term is related to the “prior” density. Specifically, for each method, the penalized least squares function to minimize is

equation M95
(D.4)

where μt and Pt are the “prior” mean and covariance as defined above for each method. We should note at this point that Equation D.4 establishes a parallel between L2 minimum-norm cost function, and the Kalman Filter and Fixed Interval Smoother estimates, where the data fit term is identical in these methods, but the penalty varies to indicate how and whether past and future measurements should influence the estimate.

The minimizer of Equation D.4 is given by

equation M96
(D.5)

Now we can simply replace the corresponding “prior” mean and covariance for each method to obtain the equation M97, the Kalman Filter estimate (xt |t) and the Fixed Interval Smoother estimate (xt |T ):

equation M98
(D.6)

Equation D.6 allows us to compare and contrast the algebraic forms of these estimates. We should note that while line 1 is the well-known MNE formula and line 2 is in fact identical to the Kalman Filter algorithm (Eq. 22), line 3 corresponds to a novel derivation of the Fixed Interval Smoother estimate that allows us to highlight similarities and differences between these estimates. We can see that each method builds a prediction of xt using the respective “prior” mean, and then updates this prediction using the measurement at the same time point yt by computing a residuals. In the case of MNE, the prediction discards any observed data and assumes it is zero. The Kalman Filter is an improvement over MNE since it builds a prediction based on previous data equation M99. Finally, the Fixed Interval Smoother goes further to build a prediction based on previous data equation M100 as well as future data equation M101. For all methods, MNE, KF, and FIS, once the prediction is made, the estimate is obtained by updating the prediction using the immediate data point yt by adding a term that is proportional to the residuals.

Appendix E. Computation of ROC curves

In this appendix we present the computations that define the estimates of the detection ([p with hat]D) and false alarm ([p with hat]FA) probabilities. These quantities, which depend on the detection threshold c, are given by

equation M102
(E.1)

where indic(·) in Equation E.1 is the indicator function on a set S:

equation M103
(E.2)

We recall that Nx ≈ 5000 represents the number of dipole sources and T = 200 is the number of time samples.

Footnotes

1Since we can write the feedback matrix as equation M1, where (D)i,j = di,j and D is a stochastic matrix (i.e., its row sum equals 1), then equation M2 and F is invertible, if equation M3. Since D is a stochastic matrix, the modulus of its eigenvalues is less or equal than 1, and ||D||2 ≤ 1. It follows that a > 0.5 is sufficient to make F nonsingular.

References

  • Anderson BDO, Moore JB. Optimal Filtering. Dover Publications; Mineola, NY: 2005.
  • Auranen T, Nummenmaa A, Hämäläinen MS, Jääskeläinen IP, Lampinen J, Vehtari A, Sams M. Bayesian analysis of the neuromagnetic inverse problem with l(p)-norm priors. Neuroimage. 2005;26:870–884. [PubMed]
  • Baillet S, Garnero L. A Bayesian approach to introducing anatomo-functional priors in the EEG/MEG inverse problem. IEEE Trans Biomed Eng. 1997;44:374–385. [PubMed]
  • Baillet S, Mosher JC, Leahy RM. Electromagnetic brain mapping. IEEE Signal Proc Mag. 2001;18:14–30.
  • Bertrand C, Ohmi M, Suzuki R, Kado H. A probabilistic solution to the MEG inverse problem via MCMC methods: the reversible jump and parallel tempering algorithms. IEEE Trans Biomed Eng. 2001;48:533–542. [PubMed]
  • Bolstad A, Van Veen BD, Nowak R. Space-time event sparse penalization for magneto-/electroencephalography. Neuroimage. 2009;46:1066–1081. [PMC free article] [PubMed]
  • Brockwell PJ, Davis RA. Time series: Theory and methods. 2 Springer; New York: 2006.
  • Bullock TH, McClune MC, Achimowicz JZ, Iragui-Madoz VJ, Duckrow RB, Spencer SS. Temporal fluctuations in coherence of brain waves. Proc Natl Acad Sci USA. 1995;92:11568–11572. [PubMed]
  • Buzsaki G, Draguhn A. Neuronal oscillations in cortical networks. Science. 2004;304:1926–1929. [PubMed]
  • Campi C, Pascarella A, Sorrentino A, Piana M. A Rao-Blackwellized particle filter for magnetoencephalography. Inverse Probl. 2008;24:025023.
  • Dale AM, Fischl BR, Sereno MI. Cortical surface-based analysis: I. Segmentation and surface reconstruction. Neuroimage. 1999;9:179–194. [PubMed]
  • Darvas F, Pantazis D, Kucukaltun-Yildirim E, Leahy RM. Mapping human brain function with MEG and EEG: methods and validation. Neuroimage. 2004;23(Suppl 1):S289–99. [PubMed]
  • Daunizeau J, Friston KJ. A mesostate-space model for EEG and MEG. Neuroimage. 2007;38:67–81. [PubMed]
  • Daunizeau J, Mattout J, Clonda D, Goulard B, Benali H, Lina JM. Bayesian spatio-temporal approach for EEG source reconstruction: Conciliating ECD and distributed models. IEEE Trans Biomed Eng. 2006;53:503–516. [PubMed]
  • David O, Harrison LM, Friston KJ. Modelling event-related responses in the brain. Neuroimage. 2005;25:756–770. [PubMed]
  • David O, Kiebel SJ, Harrison LM, Mattout J, Kilner JM, Friston KJ. Dynamic causal modeling of evoked responses in EEG and MEG. Neuroimage. 2006;30:1255–1272. [PubMed]
  • Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Series B Stat Methodol. 1977;39:1–38.
  • Desai NU. Master’s thesis. Massachusetts Institute of Technology; 2005. Souce localization for magnetoencephalography by spatio-temporal Kalman fitering and Fixed-Interval smoothing.
  • Destexhe A, Contreras D, Steriade M. Spatiotemporal analysis of local field potentials and unit discharges in cat cerebral cortex during natural wake and sleep states. J Neurosci. 1999;19:4595–4608. [PubMed]
  • Fischl BR, Sereno MI, Dale AM. Cortical surface-based analysis: II. Inflation, flattening, and a surface-based coordinate system. Neuroimage. 1999;9:195–207. [PubMed]
  • Fox MD, Raichle ME. Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat Rev Neurosci. 2007;8:700–711. [PubMed]
  • Fox MD, Snyder AZ, Vincent JL, Corbetta Maurizio, Van Essen DC, Raichle ME. The human brain is intrinsically organized into dynamic, anticorrelated functional networks. Proc Natl Acad Sci USA. 2005;102:9673–9678. [PubMed]
  • Friston KJ, Harrison LM, Daunizeau J, Kiebel SJ, Phillips C, Trujillo-Barreto NJ, Henson RN, Flandin G, Mattout J. Multiple sparse priors for the M/EEG inverse problem. Neuroimage. 2008;39:1104–1120. [PubMed]
  • Galka A, Yamashita O, Ozaki T, Biscay R, Valdes-Sosa PA. A solution to the dynamical inverse problem of EEG generation using spatiotemporal Kalman filtering. Neuroimage. 2004;23:435–453. [PubMed]
  • Gorodnitsky IF, George JS, Rao BD. Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. Electroencephalogr Clin Neurophysiol. 1995;95:231–251. [PubMed]
  • Green PJ. On use of the EM algorithm for penalized likelihood estimation. J R Stat Soc Series B Stat Methodol. 1990;52:443–452.
  • Greensite F. The temporal prior in bioelectromagnetic source imaging problems. IEEE Trans Biomed Eng. 2003;50:1152–1159. [PubMed]
  • Gross J, Kujala J, Hämäläinen MS, Timmermann L, Schnitzler A, Salmelin R. Dynamic imaging of coherent sources: Studying neural interactions in the human brain. Proc Natl Acad Sci USA. 2001;98:694–699. [PubMed]
  • Gusnard DA, Raichle ME. Searching for a baseline: Functional imaging and the resting human brain. Nat Rev Neurosci. 2001;2:685–694. [PubMed]
  • Hämäläinen MS, Hari R, Ilmoniemi RJ, Knuutila J, Lounasmaa OV. Magnetoencephalography - theory, intrumentation, and applications to noninvasive studies of the working human brain. Rev Modern Phys. 1993;65:413–497.
  • Hämäläinen MS, Ilmoniemi RJ. Interpreting magnetic fields of the brain: Minimum norm estimates. Med Biol Eng Comput. 1994;32:35–42. [PubMed]
  • Hämäläinen MS, Sarvas J. Realistic conductivity geometry model of the human head for interpretation of neuromagnetic data. IEEE Trans Biomed Eng. 1989;36:165–171. [PubMed]
  • Hari R, Salmelin R. Human cortical oscillations: A neuromagnetic view through the skull. Trends Neurosci. 1997;20:44–49. [PubMed]
  • Izhikevich EM, Edelman GM. Large-scale model of mammalian thalamocortical systems. Proc Natl Acad Sci USA. 2008;105:3593–3598. [PubMed]
  • Jirsa VK, Jantzen KJ, Fuchs A, Scott Kelso JA. Spatiotemporal forward solution of the EEG and MEG using network modeling. IEEE Trans Med Ima. 2002;21:493–504. [PubMed]
  • de Jong P, Mackinnon MJ. Covariances for smoothed estimates in state space models. Biometrika. 1988;75:601–602.
  • Jun SC, George JS, Paré-Blagoev J, Plis SM, Ranken DM, Schmidt DM, Wood CC. Spatiotemporal Bayesian inference dipole analysis for MEG neuroimaging data. Neuroimage. 2005;28:84–98. [PubMed]
  • Kalman RE. A new approach to linear filtering and prediction problems. Trans ASME J Basic Eng. 1960;82:35–45.
  • Kiebel SJ, Daunizeau J, Phillips C, Friston KJ. Variational Bayesian inversion of the equivalent current dipole model in EEG/MEG. Neuroimage. 2008;39:728–741. [PubMed]
  • Kiebel SJ, Garrido MI, Moran R, Chen CC, Friston KJ. Dynamic causal modeling for EEG and MEG. Hum Brain Mapp. 2009;30:1866–1876. [PubMed]
  • Kim JW, Robinson PA. Compact dynamical model of brain activity. Phys Rev E. 2007;75:031907. [PubMed]
  • Kitagawa G, Gersch W. Smoothness priors analysis of time series. Springer; New York: 1996.
  • Lamus C, Long CJ, Hämäläinen MS, Brown EN, Purdon PL. Parameter estimation and dynamic source localization for the magnetoencephalography (MEG) inverse problem. 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro; Washington, D.C., USA. 2007. pp. 1092–1095. [PMC free article] [PubMed]
  • Leopold DA, Murayama Y, Logothetis NK. Very slow activity fluctuations in monkey visual cortex: Implications for functional brain imaging. Cereb Cortex. 2003;13:422–433. [PubMed]
  • Limpiti T, Van Veen BD, Attias HT, Nagarajan SS. A spatiotemporal framework for estimating trial-to-trial amplitude variation in event-related MEG/EEG. IEEE Trans Biomed Eng. 2009;56:633–645. [PMC free article] [PubMed]
  • Long CJ, Purdon PL, Temereanca S, Desai NU, Hämäläinen MS, Brown EN. State-space solutions to the dynamic magnetoencephalography inverse problem using high performance computing. Ann Appl Stat. 2011;5:1207–1228. [PMC free article] [PubMed]
  • Mattout J, Phillips C, Penny WD, Rugg MD, Friston KJ. MEG source localization under multiple constraints: An extended Bayesian framework. Neuroimage. 2006;30:753–767. [PubMed]
  • Mendel JM. Computational requirements for a discrete Kalman Filter. IEEE Trans Automat Contr. 1971;AC16:748.
  • Moon TK, Stirling WC. Mathematical methods and algorithms for signal processing. Prentice Hall; Upper Saddle River, NJ: 2000.
  • Mosher JC, Leahy RM, Lewis PS. EEG and MEG: Forward solutions for inverse methods. IEEE Trans Biomed Eng. 1999;46:245–259. [PubMed]
  • Mosher JC, Lewis PS, Leahy RM. Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Trans Biomed Eng. 1992;39:541–557. [PubMed]
  • Nummenmaa A, Auranen T, Hämäläinen MS, Jääskeläinen IP, Lampinen J, Sams M, Vehtari A. Hierarchical Bayesian estimates of distributed MEG sources: Theoretical aspects and comparison of variational and MCMC methods. Neuroimage. 2007;35:669–685. [PubMed]
  • Numminen J, Ahlfors SP, Ilmoniemi RJ, Montonen J, Nenonen J. Transformation of multichannel magnetocardiographic signals to standard grid form. IEEE Trans Biomed Eng. 1995;42:72–78. [PubMed]
  • Nunez PL. Neocortical dynamics and human EEG rhythms. Oxford University Press; USA: 1995.
  • Ou W, Hämäläinen MS, Golland P. A distributed spatio-temporal EEG/MEG inverse solver. Neuroimage. 2009;44:932–946. [PMC free article] [PubMed]
  • Pascual-Marqui RD, Michel CM, Lehmann D. Low resolution electromagnetic tomography: A new method for localizing electrical activity in the brain. Int J Psychophysiol. 1994;18:49–65. [PubMed]
  • Phillips C, Mattout J, Rugg MD, Maquet P, Friston KJ. An empirical Bayesian solution to the source reconstruction problem in EEG. Neuroimage. 2005;24:997–1011. [PubMed]
  • Raichle ME, MacLeod AM, Snyder AZ, Powers WJ, Gusnard DA, Shulman GL. A default mode of brain function. Proc Natl Acad Sci USA. 2001;98:676–682. [PubMed]
  • Rauch HE, Tung F, Striebel CT. Maximum likelihood estimates of linear dynamic systems. AIAA J. 1965;3:1445–1450.
  • Robinson PA, Rennie CJ, Rowe DL, O’Connor SC, Gordon E. Multiscale brain modelling. Philos Trans R Soc Lond B Biol Sci. 2005;360:1043–1050. [PMC free article] [PubMed]
  • Roweis S, Ghahramani Z. A unifying review of linear gaussian models. Neural Comput. 1999;11:305–345. [PubMed]
  • Sato Ma, Yoshioka T, Kajihara S, Toyama K, Goda N, Doya K, Kawato M. Hierarchical Bayesian estimation for MEG inverse problem. Neuroimage. 2004;23:806–826. [PubMed]
  • Shumway RH, Stoffer DS. An approach to time series smoothing and forecasting using the EM algorithm. J Time Ser Anal. 1982;3:253–264.
  • Somersalo E, Voutilainen A, Kaipio JP. Non-stationary magnetoencephalography by Bayesian filtering of dipole models. Inverse Probl. 2003;19:1047–1063.
  • Sotero RC, Trujillo-Barreto NJ, Iturria-Medina Y, Carbonell F, Jimenez JC. Realistically coupled neural mass models can generate EEG rhythms. Neural Comput. 2007;19:478–512. [PubMed]
  • Trujillo-Barreto NJ, Aubert-Vázquez E, Penny WD. Bayesian M/EEG source reconstruction with spatio-temporal priors. Neuroimage. 2008;39:318–335. [PubMed]
  • Uutela K, Hämäläinen MS, Salmelin R. Global optimization in the localization of neuromagnetic sources. IEEE Trans Biomed Eng. 1998;45:716–723. [PubMed]
  • Uutela K, Hämäläinen MS, Somersalo E. Visualization of magnetoencephalographic data using minimum current estimates. Neuroimage. 1999;10:173–180. [PubMed]
  • Van Veen BD, Van Drongelen W, Yuchtman M, Suzuki A. Localization of brain electrical activity via linearly constrained minimum variance spatial filtering. IEEE Trans Biomed Eng. 1997;44:867–880. [PubMed]
  • Wipf DP, Nagarajan SS. A unified Bayesian framework for MEG/EEG source imaging. Neuroimage. 2009;44:947–966. [PubMed]
  • Wright JJ, Rennie CJ, Lees GJ, Robinson PA, Bourke PD, Chapman CL, Gordon E, Rowe DL. Simulated electrocortical activity at microscopic, mesoscopic and global scales. Int J Bifurcat Chaos. 2004;14:853–872. [PubMed]
  • Yamashita O, Galka A, Ozaki T, Biscay R, Valdes-Sosa PA. Recursive penalized least squares solution for dynamical inverse problems of EEG generation. Hum Brain Mapp. 2004;21:221–235. [PubMed]
  • Zumer JM, Attias HT, Sekihara K, Nagarajan SS. Probabilistic algorithms for MEG/EEG source reconstruction using temporal basis functions learned from data. Neuroimage. 2008;41:924–940. [PubMed]