IEEE Trans Biomed Eng. Author manuscript; available in PMC 2010 November 1.

Published online 2009 August 18. doi: 10.1109/TBME.2009.2028615

PMCID: PMC2784099

NIHMSID: NIHMS154737

K. Hild and S. Nagarajan are with the Department of Radiology, University of California, San Francisco, CA 94143 USA.

Independent Components Analysis (ICA) has previously been used to denoise EEG/MEG signals before performing neural source localization. Source localization is then performed using a method such as beamforming or dipole fitting. Here we show how ICA can also be used as a source localization method, negating the need for beamforming and dipole fitting. This type of approach is valid whenever an estimate of the forward (mixing) model for all putative source locations is available, which includes EEG and MEG applications. The proposed method consists of estimating the forward model using the laws of physics, estimating a second forward model using ICA, and then correlating the columns of the matrices that represent the two forward models. We show that, when synthetic data are used, the proposed localization method produces a smaller localization error than several alternatives. We also show localization results for real auditory-evoked MEG data.

Many data sets can be modeled as a linear multiple-input, multiple-output (MIMO) mixture of source signals. The main goal of standard Independent Components Analysis (ICA) methods is to learn a linear MIMO function that, when applied to a mixture of sources (the observations), extracts the original, unmixed sources. The MIMO function that maps the source space to the observation space is commonly referred to as the mixing matrix in the field of ICA and as the forward solution in the field of neuroscience. Likewise, the MIMO function that correctly extracts the sources from the observations is referred to as either the demixing matrix or the inverse solution.

Since its inception, ICA has been used for a variety of applications. The list of applications includes blind source separation (BSS) [1],[2], blind identification [3],[4],[5], angle of arrival estimation [6], jammer suppression [7], deconvolution [8],[9], feature extraction [10], denoising/artifact rejection [11],[12], genetic analyses [13],[14], and estimation of the number of active sources [15]. Unlike many of the applications listed above, for electroencephalographic (EEG) and magnetoencephalographic (MEG) applications it is possible to use the laws of physics to estimate the forward solution.

ICA can also be used to estimate the forward solution, but the two estimates have important differences (the differences below were previously pointed out by Knuth [29]). The physics-derived forward solution contains information on the spatial location of all of the putative neural sources, but it is not able to determine which of the putative sources are active. The ICA-derived forward solution depends only on the active sources. It does not, however, contain any information on the location of these sources. Hence, neither of these approaches can be used, individually, to locate the neural sources.

Localization is commonly performed on EEG and MEG data after applying ICA denoising using, e.g., EEGLAB [17] or its MEG counterpart known as NUTMEG [18]. The ICA denoising is used to extract meaningful components, such as evoked responses, or to remove unwanted interference sources. Localization is then performed on the processed data using techniques such as beamforming [19],[20], dipole fitting [21],[22],[23],[24],[25],[26], or standardized low resolution brain electromagnetic tomography (sLORETA) [27],[28].

In this paper, which extends our previous results [31], we show how the physics-derived and ICA-derived forward solutions can be combined to directly estimate the spatial locations of the active sources, thus negating the need for a separate localization method.

We will assume MEG data throughout, although the ideas can also be applied to EEG data with little modification. The following assumptions are made:

- H1: Quasi-static approx. of Maxwell’s equations apply
- H2: Signals are temporally discrete
- H3: Sources are mutually statistically independent
- H4: At most one source has a Gaussian distribution
- H5: Forward solution is full-column rank
- H6: No. of observations ≥ no. of sources

Assumption H1 is standard for EEG and MEG data [16], H2 is ubiquitous, and H3-H6 are standard whenever ICA is used [2]. Note, however, that the noise in each sensor represents a unique source in addition to any other sources (thus violating H6). Furthermore, real neural sources are almost certainly distributed (thus violating H5 and H6) and, based on neurophysiological arguments, we expect the time courses of neighboring voxels to be correlated (thus violating H3). Fortunately, as far as ICA is concerned, sources having a relatively small total energy in sensor space can oftentimes be neglected, thus reducing the number of effective sources (blind identification [3],[4],[5] and bias removal [32] may also help alleviate concerns about H6). In addition, multiple focal sources (used to represent a single distributed source) become one effective source (1) in the limit as the maximum pairwise distance between the constituent focal sources approaches zero or (2) in the limit as the minimum pairwise absolute normalized correlation coefficient approaches one. In the former case, it may be possible to accurately localize the effective source when ICA processing is used, but only the mean time course of the constituent focal sources can be recovered. In the latter case, it may be possible to recover the correct time course, but ICA processing necessarily biases the localization. Hence, in cases where an ICA-based method yields meaningful results, it may be that the data are represented well by a relatively few energetic effective sources (where case (1) holds for localization applications and case (2) holds for time course extraction applications) and the effective sources are approximately statistically independent (also, H3 and H4 can be traded for different assumptions commonly used in BSS, e.g., those that require the sources to have unique spectra [38], [41], or to be statistically non-stationary [42]).

Using the quasi-static approximation of Maxwell's equations [16], the observed magnetic field at time *t*, expressed as the (*N*_{b}×1) vector **b**(*t*), is given by the volume integral,

$$\mathbf{b}(t)=\int_{\Omega}\mathbf{L}(\mathbf{r})\,\boldsymbol{\psi}(\mathbf{r})\,s(\mathbf{r},t)\,d\mathbf{r}+\mathbf{y}(t)+\boldsymbol{\nu}(t)$$

(1)

where Ω represents the three-dimensional source space, **r** is a spatial location within Ω expressed in Cartesian coordinates, **L**(**r**) is the (*N*_{b}×*N*_{ψ}) lead field matrix at **r**, *ψ*(**r**) is the (*N*_{ψ}×1) orientation vector at **r**, *s*(**r**, *t*) is the source activity at **r**, **y**(*t*) is the (*N*_{b}×1) interference vector, and *ν*(*t*) is the (*N*_{b}×1) noise vector.

For temporally discrete signals the magnetic field at sample n is assumed to be given by,

$$\mathbf{b}_n=\int_{\Omega}\mathbf{L}(\mathbf{r})\,\boldsymbol{\psi}(\mathbf{r})\,s_n(\mathbf{r})\,d\mathbf{r}+\mathbf{y}_n+\boldsymbol{\nu}_n$$

(2)

where **b**_{n} = **b**(*nT*), *T* is the sampling period, and *s*_{n}(**r**), **y**_{n}, and *ν*_{n} are defined similarly. If the neural sources are assumed to be focal, so that *s*_{n}(**r**) = Σ_{i=1}^{N_s} δ(**r** − **r**_{i}) *s*_{i,n}, then Equation (2) becomes,

$$\begin{aligned}\mathbf{b}_n&=\int_{\Omega}\mathbf{L}(\mathbf{r})\,\boldsymbol{\psi}(\mathbf{r})\sum_{i=1}^{N_s}\delta(\mathbf{r}-\mathbf{r}_i)\,s_{i,n}\,d\mathbf{r}+\mathbf{y}_n+\boldsymbol{\nu}_n\\&=\sum_{i=1}^{N_s}\mathbf{L}_i\boldsymbol{\psi}_i\,s_{i,n}+\mathbf{y}_n+\boldsymbol{\nu}_n\\&=\left[\mathbf{L}_1\boldsymbol{\psi}_1\;\;\mathbf{L}_2\boldsymbol{\psi}_2\;\cdots\;\mathbf{L}_{N_s}\boldsymbol{\psi}_{N_s}\right]\mathbf{s}_n+\mathbf{y}_n+\boldsymbol{\nu}_n\end{aligned}$$

(3)

where **r**_{i} corresponds to the location of the *i*^{th} neural source, **L**_{i} = **L**(**r**_{i}), *ψ*_{i} = *ψ*(**r**_{i}), and **s**_{n} is the (*N*_{s}×1) vector whose *i*^{th} element is *s*_{i,n}.

We define the (*N*_{b}×*N*_{ψ}*N*_{s}) composite lead field matrix as,

$$\mathbf{L}_C=\left[\mathbf{L}_1\;\;\mathbf{L}_2\;\cdots\;\mathbf{L}_{N_s}\right]$$

(4)

and the (*N*_{ψ}*N*_{s}×*N*_{s}) orientation matrix as,

$$\Psi=\begin{bmatrix}\boldsymbol{\psi}_1&\mathbf{0}&\cdots&\mathbf{0}\\\mathbf{0}&\boldsymbol{\psi}_2&\cdots&\mathbf{0}\\\vdots&\vdots&\ddots&\vdots\\\mathbf{0}&\mathbf{0}&\cdots&\boldsymbol{\psi}_{N_s}\end{bmatrix}$$

(5)

where **0** is a (*N*_{ψ}×1) vector of zeros. Substituting the product of *L*_{C} and **Ψ** from Equations (4) and (5) into Equation (3) produces,

$$\mathbf{b}_n=\mathbf{L}_C\Psi\,\mathbf{s}_n+\mathbf{y}_n+\boldsymbol{\nu}_n$$

(6)

We let **F**_{s} = **L**_{C}**Ψ** and **y**_{n} = **F**_{x}**x**_{n}, where **F**_{s} is the (*N*_{b}×*N*_{s}) forward solution for the neural sources of interest, **F**_{x} is the (*N*_{b}×*N*_{x}) forward solution for the interference sources, and **x**_{n} is the (*N*_{x}×1) vector of interference sources. The final expression for the magnetic field is found by inserting these definitions into Equation (6), which produces,

$$\mathbf{b}_n=\mathbf{F}_s\mathbf{s}_n+\mathbf{F}_x\mathbf{x}_n+\boldsymbol{\nu}_n$$

(7)

The physical model given by Equation (7) expresses the observations, **b**_{n}, as a function of three different types of signals: neural sources, interference sources, and noise. The neural source vector, **s**_{n}, is defined as the application-specific sources of interest. The interference source vector, **x**_{n}, is defined as the set of signals that are of no interest and are not generated in the sensors. They may originate in the brain, other parts of the body (e.g., heart), or outside the body. The noise is defined as the set of signals that originate in the sensors. Notice that, under H6, Equation (7) is identical to the standard ICA model with additive noise, which is given by $\mathbf{b}_n=\bar{\mathbf{F}}\bar{\mathbf{s}}_n+\boldsymbol{\nu}_n$, where $\bar{\mathbf{F}}=[\mathbf{F}_s\;\mathbf{F}_x]$ is the composite forward solution and $\bar{\mathbf{s}}_n=[\mathbf{s}_n^T\;\mathbf{x}_n^T]^T$ is the composite source vector.
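As a concrete illustration, the generative model of Equation (7) can be simulated numerically. All sizes below are toy values chosen for illustration, and the random forward solutions are stand-ins for the physics-derived lead fields:

```python
import numpy as np

rng = np.random.default_rng(0)
N_b, N_s, N_x, N = 6, 2, 1, 1000  # toy sizes: sensors, neural, interference, samples

# Hypothetical forward solutions; in practice these come from the lead fields.
F_s = rng.standard_normal((N_b, N_s))
F_x = rng.standard_normal((N_b, N_x))

s = rng.standard_normal((N_s, N))         # neural sources of interest
x = rng.standard_normal((N_x, N))         # interference sources
nu = 0.1 * rng.standard_normal((N_b, N))  # sensor noise

b = F_s @ s + F_x @ x + nu  # Equation (7), one column per sample n
```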

Our goal is to localize each of the neural sources. In our model of the data, given by Equation (7), only the observations, **b**_{n}, are known. Anything else we desire must be estimated from the data. We propose to perform localization using two different estimates of **F**_{s}, which we hereafter refer to as simply the forward solution (since **F**_{x} is of no interest). We denote the ICA-derived and physics-derived estimates of the forward solution as **A** and **F**, respectively.

The proposed source localization method consists of three main steps. The steps are,

- Step 1: Estimate ICA-derived forward solution
  - Step 1a: Denoise using PFA
  - Step 1b: Demix using ICA
  - Step 1c: Select neural sources
- Step 2: Estimate physics-derived forward solution
  - Step 2a: Form grid
  - Step 2b: Compute composite lead field matrix
  - Step 2c: Estimate orientation matrix
- Step 3: Correlate columns of the two forward solutions

Many ICA algorithms estimate the inverse solution, as opposed to the forward solution. In this case we need to apply a pseudo-inverse. First, we choose values for *N*_{s} and *N*_{x}. Then we learn the three matrices that yield the source estimate vector, $\widehat{\bar{\mathbf{s}}}_n$, and the neural source estimate vector, $\widehat{\mathbf{s}}_n$, which are given by,

$$\begin{aligned}\widehat{\bar{\mathbf{s}}}_n&=\mathbf{W}_d\mathbf{W}_p\mathbf{b}_n\\\widehat{\mathbf{s}}_n&=\bar{\mathbf{I}}\,\widehat{\bar{\mathbf{s}}}_n\end{aligned}$$

(8)

where **W**_{p} is a (*N*_{p}×*N*_{b}) projection matrix, **W**_{d} is a (*N*_{p}×*N*_{p}) demixing matrix, and $\bar{\mathbf{I}}$ is a (*N*_{s}×*N*_{p}) selection matrix. The ICA-derived estimate of the forward solution is then given by the Moore-Penrose pseudo-inverse,

$$\mathbf{A}=\left(\bar{\mathbf{I}}\,\mathbf{W}_d\mathbf{W}_p\right)^{\dagger}$$

(9)
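Equation (9) can be sketched numerically. The three matrices below are random placeholders with toy dimensions; only the composition and pseudo-inversion are the point:

```python
import numpy as np

rng = np.random.default_rng(1)
N_b, N_p, N_s = 6, 3, 2  # toy sizes: sensors, projected dimension, neural sources

W_p = rng.standard_normal((N_p, N_b))  # projection matrix (Step 1a)
W_d = rng.standard_normal((N_p, N_p))  # demixing matrix (Step 1b)
I_bar = np.eye(N_s, N_p)               # selection matrix (Step 1c): keep first N_s rows

# Equation (9): ICA-derived forward solution via the Moore-Penrose pseudo-inverse
A = np.linalg.pinv(I_bar @ W_d @ W_p)

# Sanity check: A is a right inverse of the (full row rank) composite demixer
assert np.allclose((I_bar @ W_d @ W_p) @ A, np.eye(N_s))
```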

In Step 1a we learn the projection matrix, *W*_{p}. The projection matrix is intended to suppress the undesired signals, especially the noise (under the proper conditions the demixing matrix can suppress the interference sources). It also helps to control the complexity of the proposed algorithm by reducing the dimensionality of *b*_{n} prior to learning the demixing matrix. The projection matrix can be found using any desired dimension-reduction method, including methods such as Principal Components Analysis (PCA), Partitioned Factor Analysis, and a supervised method appropriate for brain computer interfaces (BCI) known as the common spatial pattern decomposition [34].
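Of the dimension-reduction options above, the PCA choice of **W**_{p} is the simplest to sketch (PFA itself is described in [35],[36] and is not reproduced here; the data below are random toy observations):

```python
import numpy as np

rng = np.random.default_rng(2)
N_b, N_p, N = 6, 3, 500
b = rng.standard_normal((N_b, N))  # toy observations, one column per sample

# PCA-based projection: the rows of W_p are the top-N_p eigenvectors of the
# sensor covariance matrix, so W_p @ b keeps the highest-variance subspace.
C = np.cov(b)
evals, evecs = np.linalg.eigh(C)   # eigenvalues returned in ascending order
W_p = evecs[:, -N_p:][:, ::-1].T   # top N_p principal directions as rows
b_reduced = W_p @ b                # (N_p x N) dimension-reduced data
```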

Partitioned Factor Analysis (PFA) denoising [35],[36] is an alternative to PCA. PFA can be used when all or part of the data can be partitioned into two regions: one where the neural sources are known to be active and one where they are known to be inactive. This condition is approximately met when the data are collected using the stimulus-evoked paradigm and then averaged over trials [36]. In this case the active region is assumed to be the post-stimulus window and the inactive region is assumed to be the pre-stimulus window. PFA is designed to suppress both the interference and the noise and it directly extracts the neural sources from the data. Thus, for PFA denoising we (ideally) use *N*_{p}=*N*_{s}.

In Step 1b we learn the demixing matrix, **W**_{d}. The demixing matrix is intended to separate the sources. ICA algorithms typically learn **W**_{d} by making the source estimates, $\widehat{\bar{\mathbf{s}}}_n$, as statistically independent as possible. The demixing matrix also enables us to suppress the interference sources through the process of selection.

In Step 1c we learn the selection matrix, $\bar{\mathbf{I}}$. Recall that there is no need for selection if PFA is used, in which case we let $\bar{\mathbf{I}}$ equal the (*N*_{s}×*N*_{s}) identity matrix. Otherwise, the rows of $\bar{\mathbf{I}}$ are chosen to select the *N*_{s} source estimates that correspond to the neural sources of interest.

The forward solution, **F**_{s}, is a function of the locations of the *N*_{b} sensors and the *N*_{s} neural sources. For source localization applications the locations of the sensors are oftentimes known. However, the locations of the neural sources are always, by definition, unknown. To address this shortcoming we form a grid of *N*_{g} putative locations and then use the data to estimate the activity of the putative neural source at each of these voxels. This is the approach taken by beamforming methods [19], which estimate an upper bound of the energy at each voxel. One consequence of using a grid is that the forward solution, **F**_{s}, and its physics-derived estimate, **F**, differ in size. The former is the (*N*_{b}×*N*_{s}) forward solution for the *N*_{s} neural sources, whereas the latter is the (*N*_{b}×*N*_{g}) matrix whose columns correspond to the *N*_{g} putative source locations.

In Step 2a we build a dense uniform grid of *N*_{g} >> *N*_{s} putative locations throughout the brain. The grid is formed over the region corresponding to the brain. The brain region is determined by the co-registration of MEG data with a structural magnetic resonance image (MRI) [37]. More specifically, we place three fiducial coils (one nasion and two opposing preauricular points) on each subject/patient before acquiring their MEG and structural MRI data. We use the NUTMEG suite [18] to load both types of data into memory and then select the brain region with the mouse by clicking on the structural MRI. Once the desired grid resolution is chosen, the NUTMEG program returns *r*_{1} to *r*_{Ng}, which are the coordinates of the *N*_{g} putative source locations.

In Step 2b we learn the (spatially oversampled) composite lead field matrix, *L*_{C}. This (*N*_{b}×N_{ψ}*N*_{g}) matrix is constructed using Equation (4) with the exception that the neural sources are assumed to be located at the *N*_{g} grid points.

In Step 2c we learn the (spatially oversampled) orientation matrix, **Ψ**. This (*N*_{ψ}*N*_{g}×*N*_{g}) matrix is constructed using Equation (5) with the exception that it must use the same set of *N*_{g} locations used to create **L**_{C}. The (*N*_{ψ}×1) orientation vector at voxel *j*, *ψ*_{j}, is chosen so that it maximizes a similarity measure between a column of **A** and the *j*^{th} column of **F**, which is given by **F**_{j} = **L**_{j}*ψ*_{j}.

The optimal orientation at voxel *j* with respect to the *i*^{th} column of **A**, $\psi_{i,j}^{*}$, is given by,

$$\psi_{i,j}^{*}=\underset{\psi_j}{\operatorname{argmax}}\,\frac{\left|\mathbf{A}_i^{T}\mathbf{F}_j\right|}{\left(\mathbf{A}_i^{T}\mathbf{A}_i\,\mathbf{F}_j^{T}\mathbf{F}_j\right)^{1/2}}=\underset{\psi_j}{\operatorname{argmax}}\,\frac{\psi_j^{T}\mathbf{L}_j^{T}\mathbf{A}_i\mathbf{A}_i^{T}\mathbf{L}_j\psi_j}{\psi_j^{T}\mathbf{L}_j^{T}\mathbf{L}_j\psi_j}$$

(10)
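Because the numerator of the Rayleigh quotient in Equation (10) is rank one, the maximizer has a closed form: $\psi_{i,j}^{*}$ is proportional to the least-squares solution of $\mathbf{L}_j\psi\approx\mathbf{A}_i$. A minimal numerical check, using toy dimensions and random data:

```python
import numpy as np

rng = np.random.default_rng(3)
N_b, N_psi = 6, 3
L_j = rng.standard_normal((N_b, N_psi))  # lead field at voxel j (toy values)
A_i = rng.standard_normal(N_b)           # i-th column of A (toy values)

def objective(psi):
    # Squared normalized correlation between A_i and F_j = L_j psi (Eq. (10))
    F_j = L_j @ psi
    return (A_i @ F_j) ** 2 / ((A_i @ A_i) * (F_j @ F_j))

# Rank-one numerator => the maximizer is the least-squares fit of A_i
psi_star = np.linalg.lstsq(L_j, A_i, rcond=None)[0]

# psi_star beats random orientations (scale does not affect the objective)
assert all(objective(psi_star) >= objective(rng.standard_normal(N_psi)) - 1e-12
           for _ in range(100))
```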

where **A**_{i} is the *i*^{th} column of **A** and **F**_{j} = **L**_{j}*ψ*_{j}. The column of **A** that is most similar to the *j*^{th} column of **F** is then found using,

$$f(j)=\underset{i}{\operatorname{argmax}}\,\frac{\left|\mathbf{A}_i^{T}\mathbf{F}_{i,j}^{*}\right|}{\left(\mathbf{A}_i^{T}\mathbf{A}_i\,\mathbf{F}_{i,j}^{*T}\mathbf{F}_{i,j}^{*}\right)^{1/2}}=\underset{i}{\operatorname{argmax}}\,\frac{\left|\mathbf{A}_i^{T}\mathbf{L}_j\psi_{i,j}^{*}\right|}{\left(\mathbf{A}_i^{T}\mathbf{A}_i\,\psi_{i,j}^{*T}\mathbf{L}_j^{T}\mathbf{L}_j\psi_{i,j}^{*}\right)^{1/2}}$$

(11)

where $\mathbf{F}_{i,j}^{*}=\mathbf{L}_j\psi_{i,j}^{*}$. We denote the overall optimum orientation vector as $\psi_{f(j),j}^{*}$.

If assumptions H1 and H2 are valid, the orientation vector at each (true) neural source location is estimated accurately, and if the true locations of the neural sources happen to fall on the chosen grid, which is guaranteed for uniform sampling in the limit as *N*_{g} → ∞, then (by definition) each of the *N*_{s} columns of **F** that correspond to a neural source is identical to one of the columns of **F**_{s}.

If, in addition, assumptions H3 to H6 are valid, the user-defined value of *N*_{s} equals the true number of neural sources, the combination of denoising and demixing successfully removes all interference and noise, and **A** is such that the neural source estimates are mutually statistically independent, then each of the *N*_{s} columns of **A** equals (to within a multiplicative scalar) one of the columns of **F**_{s}.

The consequence of the two preceding paragraphs is that each of the columns of **A** also equals (to within a multiplicative scalar) one of the columns of **F**. The value of the tomographic map at voxel *j* is therefore given by,

$$\rho_j=\frac{\left|\mathbf{A}_{f(j)}^{T}\mathbf{F}_{f(j),j}^{*}\right|}{\left(\mathbf{A}_{f(j)}^{T}\mathbf{A}_{f(j)}\,\mathbf{F}_{f(j),j}^{*T}\mathbf{F}_{f(j),j}^{*}\right)^{1/2}}=\frac{\left|\mathbf{A}_{f(j)}^{T}\mathbf{L}_j\psi_{f(j),j}^{*}\right|}{\left(\mathbf{A}_{f(j)}^{T}\mathbf{A}_{f(j)}\,\psi_{f(j),j}^{*T}\mathbf{L}_j^{T}\mathbf{L}_j\psi_{f(j),j}^{*}\right)^{1/2}}$$

(12)

where *ρ*_{j} is the absolute value of the normalized correlation coefficient (except that the means of the columns are not removed), which is guaranteed to take values between 0 and 1, and Equation (11) is used to determine *f*(*j*), which is the column of **A** that is most similar to the *j*^{th} column of **F**.
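Putting Equations (10)-(12) together, the tomographic map can be sketched as a loop over voxels. The lead fields below are random toy values, and the orientation for each (column, voxel) pair is obtained from the closed form implied by the rank-one numerator of Equation (10), i.e. a least-squares fit, rather than an explicit eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
N_b, N_psi, N_g, N_s = 6, 3, 50, 2
L = rng.standard_normal((N_g, N_b, N_psi))  # lead field L_j for each voxel j (toy)
A = rng.standard_normal((N_b, N_s))         # ICA-derived forward solution (toy)

rho = np.zeros(N_g)  # tomographic map, Eq. (12)
for j in range(N_g):
    best = 0.0
    for i in range(N_s):
        # Optimal orientation psi*_{i,j}: closed form of Eq. (10)
        psi = np.linalg.lstsq(L[j], A[:, i], rcond=None)[0]
        F_star = L[j] @ psi
        c = abs(A[:, i] @ F_star) / np.sqrt((A[:, i] @ A[:, i]) * (F_star @ F_star))
        best = max(best, c)  # Eq. (11): f(j) is the best-matching column of A
    rho[j] = best

# The N_s tallest peaks of rho are taken as the estimated source locations.
```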

Due to the well-known gain and permutation indeterminacies of ICA [2], the ICA-derived estimate of the forward solution, under the ideal conditions listed above, is related to the true forward solution by **F**_{s} = **APD**, where **P** is a (*N*_{s}×*N*_{s}) permutation matrix and **D** is a (*N*_{s}×*N*_{s}) diagonal gain matrix. The normalized correlation in Equation (12) is insensitive to **D**, and Equation (11) resolves the permutation.

For the synthetic data we use a total of 500 different evoked-response data sets, which corresponds to 50 Monte Carlo trials for each of 2 experiments and 5 different values of the input signal-to-interference ratio (SIR), which is defined by,

$$\mathrm{SIR}=10\log_{10}\left(\frac{\sum_{i=1}^{N_b}\sum_{n=0}^{N_{\text{post}}-1}\left(\sum_{k=1}^{K}\mathbf{F}_s\mathbf{s}_{n,k}\right)_i^2}{\sum_{i=1}^{N_b}\sum_{n=0}^{N_{\text{post}}-1}\left(\sum_{k=1}^{K}\mathbf{F}_x\mathbf{x}_{n,k}\right)_i^2}\right)\;(\mathrm{dB})$$

(13)
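Equation (13) can be computed directly from trial data. Below is a deterministic toy check with hypothetical identity forward solutions, in which the interference has four times the power of the signal, so the SIR is 10·log10(1/4) ≈ −6.02 dB:

```python
import numpy as np

def input_sir_db(F_s, F_x, s_trials, x_trials):
    """Equation (13): input SIR in dB. The K trials are summed before
    squaring, then the squared sensor signals are summed over sensors
    and post-stimulus samples."""
    sig = sum(F_s @ s_k for s_k in s_trials)
    intf = sum(F_x @ x_k for x_k in x_trials)
    return 10.0 * np.log10(np.sum(sig ** 2) / np.sum(intf ** 2))

F_s = F_x = np.eye(2)                 # hypothetical forward solutions
s_trials = [np.ones((2, 100))]        # unit-amplitude neural sources
x_trials = [2.0 * np.ones((2, 100))]  # interference with 4x the power
sir = input_sir_db(F_s, F_x, s_trials, x_trials)  # approx -6.02 dB
```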

where the stimulus onset is *n*=0, **s**_{n,k} and **x**_{n,k} are the (*N*_{s}×1) and (*N*_{x}×1) source vectors for trial *k*, ${\left({\sum}_{k}\mathbf{F}_s\mathbf{s}_{n,k}\right)}_i^2$ is the square of the (scalar) *i*^{th} element of the sum, and ${\left({\sum}_{k}\mathbf{F}_x\mathbf{x}_{n,k}\right)}_i^2$ is similarly defined. Spatially and temporally white Gaussian noise is added to each sensor. The power of the isotropic noise is chosen so that the input signal-to-noise ratio (SNR) is −5 dB. Each dataset is 1000 samples in length, 5/8 of which corresponds to the pre-stimulus period and 3/8 to the post-stimulus period, and the sampling frequency is 1 kHz. Each neural source has a damped sinusoidal time course having a random frequency between 5–20 Hz, whereas each (neural) interference source has an un-damped sinusoidal time course. Damping is performed such that the energy of the neural sources is concentrated within the post-stimulus period. Determination of the lead field matrices requires knowledge of the spatial locations of the sensors. For this we use the true spatial locations of the sensors corresponding to an in-house 275-channel whole head MEG system (Omega 2000, VSM MedTech Inc., Port Coquitlam, Canada). To simplify the analysis all of the sources of the synthetic data are placed within the sagittal plane that passes through the center of the head. The voxel size is 3.7 mm per side.

The performance is measured using the localization error, which is the mean distance between the true neural source locations and the estimated locations. The estimated locations correspond to the tallest two peaks of the tomographic map. Since there are 2 neural sources, there are two different ways of associating the true source locations with the estimated source locations. We use the more meaningful association, which is the one that has a smaller mean distance.
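The best-association rule described above (minimum mean distance over the possible pairings of true and estimated locations) can be sketched as follows, with hypothetical 2-D coordinates for illustration:

```python
import numpy as np
from itertools import permutations

def localization_error(true_locs, est_locs):
    """Mean distance between true and estimated source locations under the
    association (pairing) that yields the smallest mean distance."""
    return min(
        np.mean([np.linalg.norm(t - e) for t, e in zip(true_locs, perm)])
        for perm in permutations(est_locs)
    )

true_locs = [np.array([0.0, 0.0]), np.array([10.0, 0.0])]
est_locs = [np.array([10.0, 1.0]), np.array([0.0, 2.0])]  # listed in swapped order
err = localization_error(true_locs, est_locs)  # best pairing gives (2 + 1) / 2 = 1.5
```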

We compare the performance of the proposed column correlation method, 3 scalar minimum-variance beamformers (one operating on the raw data and two operating on data denoised using PCA and PFA, respectively), and 2 methods that use ICA denoising prior to localization (sLORETA and dipole fitting). The proposed column correlation method uses PFA denoising and the AR-MOG ICA algorithm with *K*_{g} = 0 and *K*_{Q} = 3 [38]. The AR-MOG algorithm assumes each source distribution is a mixture of Gaussians, the parameters of which are learned from the data. Hence, it is appropriate for EEG and MEG sources, which are thought to have multi-modal distributions [40]. For the beamforming methods we use the denoised data covariance matrix whenever denoising is used. For all 6 methods we use Bayesian regularization for inverting ill-conditioned matrices, where a matrix is defined to be ill-conditioned when the ratio of its maximum eigenvalue to its minimum eigenvalue is greater than 10^{12}.
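The specific Bayesian regularization used is not detailed here; as a simple stand-in, the stated ill-conditioning test (eigenvalue ratio above 10^12) can be paired with an eigenvalue floor before inversion:

```python
import numpy as np

def regularized_inverse(M, cond_thresh=1e12):
    """Invert a symmetric positive (semi)definite matrix; if the eigenvalue
    ratio exceeds cond_thresh (the ill-conditioning test stated in the text),
    floor the small eigenvalues first. This is a simple stand-in for the
    Bayesian regularization mentioned in the paper, not its actual method."""
    evals, evecs = np.linalg.eigh(M)
    floor = evals.max() / cond_thresh
    if evals.min() < floor:  # ill-conditioned by the stated criterion
        evals = np.maximum(evals, floor)
    return evecs @ np.diag(1.0 / evals) @ evecs.T

# A well-conditioned matrix is inverted exactly
assert np.allclose(regularized_inverse(np.diag([1.0, 2.0])), np.diag([1.0, 0.5]))
```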

First we report the computational complexities of the 6 methods, which are measured in terms of mean completion time per dataset. The mean completion times are 3.0, 3.1, 18.7, 19.3, 40.6, and 63.0 seconds for the PCA/Beamformer, Raw Data/Beamformer, PFA/Beamformer, PFA/Correlate-Columns, ICA/Dipole Fit, and ICA/sLORETA, respectively.

Figures 1 and 2 pertain to the first experiment. Figure 1 shows contour plots for 4 of the methods, where the y-axis represents left-right position and the z-axis represents vertical position. For this example, the Raw Data/Beamformer has peaks at all 5 (neural and interference) sources (the peaks at the neural sources are barely visible), the PCA/Beamformer map has a single peak occurring between 2 of the interference sources, whereas the PFA/Beamformer and PFA/Column Correlation methods have peaks near both neural sources.

Figure 1. Contour plots of the tomographic maps for a single Monte Carlo trial. The •'s and ×'s correspond to the true neural source and interference source locations, respectively. Results are shown for Raw Data/Beamformer (upper …).

Figure 2 shows the mean localization error for the first experiment as a function of the (input) SIR. The ICA/Dipole Fits method, not shown in order to reduce clutter, performs slightly better than PCA/Beamforming for low SIR and slightly worse than PCA/Beamforming for high SIR. The ICA/Dipole Fits and PCA/Beamformer methods consistently perform the worst, PFA/Column Correlation has the best performance for most values of SIR, and the Raw Data/Beamformer performs poorly for low SIR and performs the best (as expected) for extremely high SIR.

Figures 3 and 4 pertain to the second experiment. In this experiment we replace one of the focal neural sources with a distributed source. The single distributed source spans a 1 cm × 1 cm area and consists of 9 uniformly-placed focal sources, each of which has an independently-drawn time course and orientation. Each of the methods continues to assume that there are only 2 sources of interest, although there are actually 10. Figure 3 shows the contour plots of 4 of the methods. In this example, the Raw Data/Beamformer has a single large peak at one of the interference sources and the PCA/Beamformer has peaks at 2 of the interference sources, whereas PFA/Beamformer and PFA/Column Correlation have peaks near both of the neural sources and 1 or 2 extraneous peaks.

Figure 3. Contour plots of the tomographic maps for a single Monte Carlo trial. The •'s and ×'s correspond to the true neural source and interference source locations, respectively. Results are shown for Raw Data/Beamformer (upper …).

Figure 4 shows the mean localization error for the second experiment. The ICA/Dipole Fits method (not shown) performs the same as ICA/sLORETA for low SIR and slightly worse than PCA/Beamforming for high SIR. The ICA/Dipole Fits, ICA/sLORETA, and PCA/Beamformer methods consistently perform the worst and the PFA/Column Correlation consistently performs the best.

Next we present results from real auditory-evoked data from a male subject in his mid-20s. The stimulus is a 1 kHz tone of 400 ms duration, which is presented binaurally once every 1.35 s (±100 ms jitter). The sample rate is 4 kHz, the number of trials is 247, and each trial lasts 800 ms. We apply a 4 Hz high-pass filter to the data in addition to the hardware anti-aliasing filter. The structural MRI data are collected using a 1.5 T Signa Excite by General Electric (Milwaukee, WI). The MEG data are collected using the 275-channel VSM system described earlier. The voxel size is 7 mm per side and *N*_{g} = 7000. For PFA denoising we use *N*_{s} = 2 and *N*_{x} = 5 and for ICA demixing we use AR-MOG with *K*_{g} = 0 and *K*_{Q} = 3.

Figure 5 shows the trial-averaged waveforms for 20 and 247 trials of data, where the trials are aligned according to the stimulus onset prior to averaging. Also shown in this figure are the two source estimates of the proposed method found for the case that there are only 20 trials. The M100 is visible in the source estimates of the proposed method with as few as 5 trials, whereas it takes 10 or 20 trials in order for the M100 to be evident in the trial-averaged data (not shown). Also, the proposed method indicates there is significant activity around 200 ms post-stimulus using as few as 20 trials, although this later activity cannot be seen in the trial-averaged data until many more than 50 trials are used.

Figure 5. Trial-averaged data computed using the first 20 trials (upper) and using all 247 trials (middle). The two estimated sources, for the proposed method, computed using 20 trials of data (lower).

Figure 6 shows the localization results of the PFA/Column Correlation method using the three standard orthogonal projections and the neurological convention. The voxels having the highest energy are highlighted, where the energy of each voxel is found by time-averaging over the post-stimulus period. Both coronal slices of the tomographic map produced by the proposed method show activation in the primary auditory cortex.

The three methods, known to the authors, that are most similar to the proposed method are Bayesian methods by Knuth [30] and Roberts [43], and a convolutive ICA method by Took et al. [39]. These methods assume that the sources are statistically independent and they incorporate a physical forward model, as does the proposed method. The first of the methods assumes that the sources radiate spherically and the amplitude falls as the distance squared. This is unlike our model, where each source has an orientation (hence, radiation is not spherical) and the magnitude falls as the distance cubed. The second method imposes a non-negativity constraint on the ICA mixing matrix. The third method is designed for convolutive mixtures and it requires that the number of sensors is equal to or larger than the number of potential source locations. This is unlike the standard model for EEG and MEG data, which consists of an instantaneous mixture of the sources and where the number of putative source locations far exceeds the number of sensors.

Good denoising performance does not necessarily equate with good localization performance. For example, the mean PCA/Beamformer performance shown in Figures 2 and 4 is always worse than for the Raw Data/Beamformer even though PCA successfully removes noise (as measured by output SNR, the results of which are not shown here). For our data, PFA denoising outperforms PCA denoising when both use the same localization method (beamformer) and column correlation outperforms the beamformer when both use the same denoising method (PFA). The peaks in the tomographic map of the proposed method are not as sharp as those of the beamformer methods since the correlation of two columns of **F**_{s} decays slowly as the distance between the two associated voxels increases (especially if the orientations associated with the two voxels are approximately collinear). ICA denoising consistently suffers from over-fitting, which is not surprising since it must learn $N_b^2 = 75625$ parameters. Hence, the two methods that use ICA denoising, ICA/sLORETA and ICA/Dipole Fit, perform poorly. The proposed method, on the other hand, must learn approximately $N_b N_s + N_s^2 = 554$ parameters. The ICA/sLORETA, ICA/Dipole Fit, and all 3 beamformer methods require the inversion of a (*N*_{b}×*N*_{b}) matrix (*N*_{b} = 275 for our data), whereas the proposed method requires the inversion of a (*N*_{s}×*N*_{s}) matrix (*N*_{s} = 2 for our data). Finally, the magnitude of the normalized correlation between the localization and separation metrics is 0.46, which indicates that the localization performance of the proposed method is not overly sensitive to the quality of the source separation.

This work was partially supported by NIH grant F32-NS052048 and NIH Grant R01-DC004855.

**Kenneth E. Hild II** (M’91, SM’05) received a Ph.D. in Electrical Engineering from the University of Florida, Gainesville, where he studied information theoretic learning and blind source separation in the Computational NeuroEngineering Laboratory. Later, he studied biomedical informatics at Stanford University, Palo Alto. From 1995 to 1999, Dr. Hild worked in the Advanced Concepts group at Seagate Technologies. From 2000 to 2003, he taught several graduate-level classes on adaptive filter theory and stochastic processes at the University of Florida. From 2003 to 2007, he was at the Biomagnetic Imaging Laboratory at UCSF, San Francisco, where he applied variational Bayesian techniques for biomedical signal processing of encephalographic and cardiographic data. He is currently employed at the Oregon Health & Science University, Portland, where he is performing research on human and computer vision. His interests include full Bayesian models for competing versions of evolution, reinforcement learning using a hierarchy of innate concepts, and general-purpose machine learning algorithms.

**Srikantan S. Nagarajan** (S’90, M’91, SM’07) began his research career at the Applied Neural Control Laboratory of Case Western Reserve University (CWRU), where he obtained both an M.S. and a Ph.D. in Biomedical Engineering. After graduate school, he did a postdoctoral fellowship at the Keck Center for Integrative Neuroscience at the University of California, San Francisco, under the mentorship of Drs. Michael Merzenich and Christoph Schreiner. Subsequently, he was a tenure-track faculty member in the Department of Bioengineering at the University of Utah. Currently, he is the Director of the Biomagnetic Imaging Laboratory, an Associate Professor in Residence in the Department of Radiology, and a member in the UCSF/UCB Joint Graduate Program in Bioengineering. His research interests are in the area of Neural Engineering where his goal is to better understand dynamics of brain networks involved in processing, and learning, of complex human behaviors such as speech, through the development of functional brain imaging technologies.

[1] Jutten C, Herault J. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. Signal Processing. 1991 July;24(1):1–10.

[2] Cardoso JF. Blind signal separation: Statistical principles. Proc. IEEE. 1998 Oct;86(10):2009–2025.

[3] Cardoso JF. Super-symmetric decomposition of the fourth-order cumulant tensor: Blind identification of more sources than sensors; Intl. Conf. on Acoustics, Speech, and Signal Processing; Toronto, Canada. May 1991.pp. 3109–3112.

[4] Yu Y, Petropulu AP. PARAFAC-Based Blind Estimation Of Possibly Underdetermined Convolutive MIMO Systems. IEEE Trans. on Signal Proc. 2008 Jan;56(1):111–124.

[5] Comon P, Rajih M. Blind identification of under-determined mixtures based on the characteristic function. Signal Processing. 2006 Sept;86(9):2271–2281.

[6] Atashbar M, Kahaei MH. Multi-speakers direction-of-arrival finding using ICA; Intl. Conf. on Information and Communication Technologies; Damascus, Syria. Apr. 2006; Vol. 1, pp. 1212–1213.

[7] Raju K, Ristaniemi T, Karhunen J, Oja E. Jammer Suppression in DS-CDMA Arrays Using Independent Component Analysis. IEEE Trans. on Wireless Commun. 2006 Jan;5(1)

[8] Bell AJ, Sejnowski TJ. An information-maximization approach to blind separation and blind deconvolution. Neural Computation. 1995 Nov;7(6):1129–1159.

[9] Pham DT. Mutual information approach to blind separation of stationary sources. IEEE Trans. on Information Theory. 2002 July;48(7):1935–1946.

[10] Kwak N, Choi CH, Choi JY. Feature Extraction Using ICA. Lecture Notes in Computer Science; Intl. Conf. on Artificial Neural Networks; Vienna, Austria. Aug. 2001; pp. 568–573. Springer, Berlin.

[11] Vigario R, Jousmaki V, Hamalainen M, Hari R, Oja E. Independent component analysis for identification of artifacts in magnetoencephalographic recordings; Advances in Neural Information Proc. Systems; MIT Press, Cambridge, MA. Nov. 1994.pp. 229–235.

[12] James CJ, Gibson OJ. Temporally constrained ICA: An application to artifact rejection in electromagnetic brain signal analysis. IEEE Trans. on Biomedical Engineering. 2003 Sept;50:1108–1116.

[13] Chen L, Wang C, Shih IM, Wang TL, Zhang Z, Wang Y, Clarke R, Hoffman E, Xuan J. Biomarker Identification by Knowledge-Driven Multi-Level ICA and Motif Analysis; Intl. Conf. on Machine Learning and Applications; Cincinnati, Ohio. Dec. 2007.pp. 560–566.

[14] Dawy Z, Sarkis M, Hagenauer J, Mueller JC. Fine-Scale Genetic Mapping Using Independent Component Analysis. IEEE/ACM Trans. on Com. Biology and Bioinformatics. 2008 July-Sept;5:448–460.

[15] Naik GR, Kumar DK, Weghorn H. ICA based identification of sources in sEMG; Intl. Conf. on Intelligent Sensors, Sensor Networks and Information; Melbourne, Australia. Dec. 2007.pp. 619–624.

[16] Sarvas J. Basic mathematical and electromagnetic concepts of the biomagnetic inverse problem. Physics in Medicine and Biology. 1987 Jan;32(1):11–22.

[17] Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods. 2004 Mar;134:9–21.

[18] Dalal SS, Zumer JM, Agrawal V, Hild KE, Sekihara K, Nagarajan SS. NUTMEG: A neuromagnetic source reconstruction toolbox. Neurology and Clinical Neurophysiology. 2004 Nov;52:1–4.

[19] Sekihara K, Nagarajan S. Adaptive Spatial Filters for Electromagnetic Brain Imaging. Springer; NY: Jun, 2008.

[20] Sekihara K, Nagarajan S, Poeppel D, Marantz A. Asymptotic SNR of scalar and vector minimum-variance beamformers for neuromagnetic source reconstruction. IEEE Trans. on Biomedical Engineering. 2004 Oct;51(10):1726–1734.

[21] Mosher JC, Lewis PS, Leahy RM. Multiple dipole modeling and localization from spatio-temporal MEG data. IEEE Trans. on Biomedical Engineering. 1992 June;39(6):541–557.

[22] Ossadtchi A, Baillet S, Mosher JC, Thyerlei D, Sutherling W, Leahy RM. Automated interictal spike detection and source localization in magnetoencephalography using independent components analysis and spatio-temporal clustering. Clinical Neurophysiology. 2004 Mar;115(3):508–522.

[23] Tang AC, Pearlmutter BA, Malaszenko NA, Phung DB, Reeb BC. Independent components of magnetoencephalography: Localization. Neural Computation. 2002 Aug;14:1827–1858.

[24] Leahy RM, Mosher JC, Spencer ME, Huang MX, Lewine JD. A study of dipole localization accuracy for MEG and EEG using a human skull phantom. Electroencephalography and Clinical Neurophysiology. 1998 April;107(2):159–173.

[25] Kobayashi K, Akiyama T, Nakahori T, Yoshinaga H, Gotman J. Systematic source estimation of spikes by a combination of independent component analysis and RAP-MUSIC: I: Principles and simulation study. Clinical Neurophysiology. 2002 May;113:713–724.

[26] Grosse-Wentrup M, Gramann K, Wascher E, Buss M. EEG source localization for brain-computer-interfaces; IEEE Engineering in Medicine and Biology Society Conference on Neural Engineering; Arlington, VA. Mar. 2005.pp. 128–131.

[27] Pascual-Marqui RD. Review of Methods for Solving the EEG Inverse Problem. Intl. Journal of Bioelectromagnetism. 1999;1:75–86.

[28] Niedermeyer E, Lopes da Silva F. Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. 5th ed. Lippincott Williams & Wilkins; Philadelphia, PA: 2004. p. 280.

[29] Knuth K. Maximum Entropy and Bayesian Methods. Munich, Germany: 1998. Convergent Bayesian formulations of blind source separation and electromagnetic source estimation; pp. 217–226.

[30] Knuth KH. Bayesian source separation and localization; Proc. of SPIE, Bayesian Inference for Inverse Problems; San Diego, CA, SPIE Press, Bellingham, WA. July 1998.pp. 147–158.

[31] Hild KE, II, Nagarajan S. Source localization of EEG/MEG data by correlating columns of ICA solution with lead field matrix; Intl. Conf. on Acoustics, Speech, and Signal Processing; Honolulu, Hawaii. April 2007.pp. 1177–1180.

[32] Cichocki A, Amari S. Adaptive blind signal and image processing. John Wiley & Sons; Chichester, England: 2003. pp. 305–309.

[33] Mosher JC, Leahy RM, Lewis PS. EEG and MEG: Forward Solutions for Inverse Methods. IEEE Trans. on Biomedical Engr. 1999 Mar;46(3):245–259.

[34] Grosse-Wentrup M, Buss M. Multiclass Common Spatial Patterns and Information Theoretic Feature Extraction. IEEE Trans. on Biomedical Engineering. 2008 Aug;55(8):1991–2000.

[35] Nagarajan S, Attias HT, Sekihara K, Hild KE II. Independent Component Analysis and Signal Separation. Charleston, SC: Mar, 2006. Partitioned factor analysis for interference suppression and source extraction; pp. 189–197.

[36] Nagarajan SS, Attias HT, Hild KE II, Sekihara K. A probabilistic algorithm for robust interference suppression in bioelectromagnetic sensor data. Statistics in Medicine. 2007 Sept;26(21):3886–3910.

[37] Adjamian P, Barnes GR, Hillebrand A, Holliday IE, Singh KD, Furlong PL, Harrington E, Barclay CW, Route PJG. Co-registration of magnetoencephalography with magnetic resonance imaging using bite-bar-based fiducials and surface-matching. Clinical Neurophysiology. 2004 Mar;115(3):691–698.

[38] Hild KE II, Attias HT, Nagarajan S. An Expectation-Maximization method for spatio-temporal blind source separation using an AR-MOG source model. IEEE Trans. on Neural Networks. 2008 Mar;19:508–519.

[39] Took CC, Sanei S, Rickard S, Chambers J, Dunne S. Fractional Delay Estimation for Blind Source Separation and Localization of Temporomandibular Joint Sounds. IEEE Trans. on Biomedical Engr. 2008 Mar;55(3):949–956.

[40] Knuth K. Maximum Entropy and Bayesian Methods. Boise, ID: Aug, 1997. Difficulties applying recent blind source separation techniques to EEG and MEG; pp. 209–222.

[41] Weinstein E, Feder M, Oppenheim A. Multi-channel signal separation by decorrelation. IEEE Trans. on Speech and Audio Proc. 1993 Oct;1(4):405–413.

[42] Parra L, Spence C. Convolutive blind separation of non-stationary sources. IEEE Trans. on Speech and Audio Proc. 2000 May;8(3):320–327.

[43] Roberts S, Choudrey R. Deterministic and Statistical Methods in Machine Learning. Vol. 3635. Springer; Berlin: 2005. Bayesian independent component analysis with prior constraints: An application in biosignal analysis; pp. 159–179.
