The auditory oddball task is a well-studied stimulus paradigm used to investigate the neural correlates of simple target detection. It elicits several classic event-related potentials (ERPs), the most prominent being the P300 which is seen as a neural correlate of subjects' detection of rare (target) stimuli. Though trial-averaging is typically used to identify and characterize such ERPs, their latency and amplitude can vary on a trial-to-trial basis reflecting variability in the underlying neural information processing. Here we simultaneously recorded EEG and fMRI during an auditory oddball task and identified cortical areas correlated with the trial-to-trial variability of task-discriminating EEG components. Unique to our approach is a linear multivariate method for identifying task-discriminating components within specific stimulus- or response-locked time windows. We find fMRI activations indicative of distinct processes that contribute to the single-trial variability during target detection. These regions are different from those found using standard, including trial-averaged, regressors. Of particular note is strong activation of the lateral occipital complex (LOC). The LOC was not seen when using traditional event-related regressors. Though LOC is typically associated with visual/spatial attention, its activation in an auditory oddball task, where attention can wax and wane from trial-to-trial, indicates it may be part of a more general attention network involved in allocating resources for target detection and decision making. Our results show that trial-to-trial variability in EEG components, acquired simultaneously with fMRI, can yield task-relevant BOLD activations that are otherwise unobservable using traditional fMRI analysis.
Information processing during even the simplest perceptual discrimination tasks evolves through many steps, including stimulus detection, evidence accumulation, categorization, response planning and execution. One of the most well-studied perceptual discrimination tasks is the auditory oddball, which can be seen as a very simple example of task-relevant target detection. During this task, subjects are asked to discriminate target (or rare) tones from standard (or distractor) tones, and make a response indicating the detection of the target tone. The task has been well-studied electrophysiologically, with the P300 or P3b, an event-related potential (ERP) seen in the electroencephalogram (EEG), identified as a neural correlate of the underlying target detection processes (Donchin and Coles, 1988; Picton, 1992; Polich, 2007). Though traditionally EEG, due to its millisecond temporal resolution, has been the modality of choice for studying the auditory oddball and its associated ERPs such as the P300, fMRI has more recently been used as a way to localize cortical areas involved in the generation of the underlying neural activity (Friedman et al., 2008; Kiehl et al., 2005; Linden et al., 1999). Ideally, a principled integration of the two modalities would yield a more precise spatio-temporal interpretation of the constituent neural processes underlying this simple form of target detection.
Simultaneous EEG/fMRI (Benar et al., 2007; Bledowski et al., 2004; Debener et al., 2006; Debener et al., 2005; Eichele et al., 2005; Goldman et al., 2000, 2002; Linden et al., 1999; Martinez-Montes et al., 2004; Mulert et al., 2004; Mulert et al., 2008) offers the opportunity to consider such an integration. An inherent challenge in simultaneous EEG/fMRI, however, is how to couple the electrophysiological activity with the blood oxygenation level dependent (BOLD) signal in a way that provides added insight into the cortical circuitry; in other words, insight that could not be provided by either modality alone or by acquisitions that were not simultaneous.
One such way to integrate these modalities is to correlate the BOLD signal with trial-to-trial variability of brain activity measured by simultaneously acquired EEG. ERP amplitude and latency can vary on a trial-to-trial basis and this variance can potentially be exploited for teasing apart the steps in information processing. For example, Benar et al. (2007) correlated single-trial variability of P300 amplitude and latency from a single electrode with fMRI for the auditory oddball task. While isolating brain activity related to trial-to-trial variability of features from individual EEG channels may be informative, it does not exploit the information in correlations between electrodes, which can be captured from multi-channel EEG. Multivariate analysis of the EEG, for example via independent component analysis (ICA), has been used to exploit such statistical correlations between electrodes, particularly in high density arrays, to decompose ERPs into distinct components, i.e. to address the “neural cocktail party problem” (Brown et al., 2001). For example, Makeig et al. (1999) used ICA and found separate independent, and presumably functionally distinct, components within the late positive complex (e.g. the P3f, P3a, P3b and Pmp). ICA has also recently been proposed as a method for analyzing simultaneous EEG and fMRI data (Debener et al., 2005; Eichele et al., 2008; Eichele et al., 2005; Moosmann et al., 2008), as have other methods for blind source separation (Ritter et al., 2008).
One issue with the above methods is that they find components in the data blindly, and thus the identified components do not have a well-defined functional significance. Our group has addressed this using a different multivariate method to tease apart these separate processes in the EEG by finding components in different EEG time windows that maximally discriminate between two event types (Gerson et al., 2005). We have found task-discriminating components that are stimulus-locked as well as response-locked and, like Makeig et al., have also found distinct components in the late positive complex (Gerson et al., 2005). In this paper we use both stimulus-locked and response-locked single-trial analysis of the EEG to identify task-relevant components that discriminate stimulus condition in an auditory oddball task. We then use the single-trial amplitudes of the discriminative components for different time windows to construct regressors for correlation with the BOLD signal. We specifically focus on the fMRI activity correlated with the single-trial EEG variability, for this cannot be explained solely by stimulus or behavioral measures, such as event type or reaction time.
Eleven healthy normal subjects (6 female, mean age 31, range 25-38) participated in the experiment. Informed consent was obtained from all participants in accordance with the guidelines and approval of the Columbia University Institutional Review Board.
An auditory oddball paradigm was used, with standard tones of frequency 350 Hz and oddball (target) tones of frequency 500 Hz. Auditory stimuli were presented through MR compatible headphones that did not contain any electronics that might add artifact to EEG, and stimulus intensity was set to 85 dB as measured at the headphones. Tones were presented for 200 ms with an inter-stimulus interval (ISI) chosen from a uniform distribution between 2 and 3 seconds in increments of 200 ms. The probability of a standard tone was 0.8, with the probability of a target tone being 0.2. Subjects were instructed to close their eyes during all experiments and the lights in the scanner room were off—this was done to minimize artifacts in the EEG due to blinking. Subjects were also instructed to press a button with the index finger of their right hand as soon as they heard the target tone. There were a total of 50 target and 200 standard trials for each subject.
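As an illustration only, a trial sequence with the properties described above could be generated as in the following Python sketch. The function name, the random seed, and the reading of the ISI as the silent gap between one tone's offset and the next tone's onset are assumptions; the text does not specify them.

```python
import random

def make_oddball_schedule(n_targets=50, n_standards=200, tone_s=0.2, seed=0):
    """Return (onset_seconds, label) pairs for one hypothetical session."""
    rng = random.Random(seed)
    labels = ["target"] * n_targets + ["standard"] * n_standards
    rng.shuffle(labels)  # fixed counts give exactly p(target) = 0.2
    isi_choices = [2.0, 2.2, 2.4, 2.6, 2.8, 3.0]  # 2-3 s in 200 ms steps
    schedule, t = [], 0.0
    for label in labels:
        schedule.append((round(t, 1), label))
        # Assumption: the ISI is measured from tone offset to next onset.
        t += tone_s + rng.choice(isi_choices)
    return schedule

schedule = make_oddball_schedule()
```

Fixing the counts and shuffling, rather than drawing each trial independently, reproduces the stated totals of 50 targets and 200 standards exactly.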
Whole brain fMRI data were collected on a 1.5T scanner (Philips Medical Systems, Bothell, WA). Functional EPI data were acquired using 15 slices of 64 × 64 voxels with in-plane resolution of 3.125 mm and slice thickness of 8 mm. Repetition time (TR) was 3000 ms with an echo time (TE) of 50 ms. Structural scans were performed using a T1-weighted spoiled gradient recalled (SPGR) sequence (72 slices; 256 × 256; 2 mm thickness; 0.86 mm in-plane resolution).
EEG was collected simultaneously using a custom-built MR-compatible system consisting of a multi-channel magnet-compatible differential amplifier with a bipolar electrode EEG cap (Goldman et al., 2005; Sajda et al., 2007). The cap consists of a 36-electrode Ag/AgCl scalp montage including left and right mastoid. Each electrode has an in-line 10 kOhm surface-mount resistor to ensure subject safety. Leads for bipolar electrode pairs were twisted for their entire length to minimize inductive pick-up. All input impedances were < 20 kOhm (this includes the 10 kOhm surface-mount resistors on each electrode). All channels were sampled at 1 kHz. Analog-to-digital conversion of the EEG was synchronized to the MR scanner clock, to enable removal of gradient artifacts, by sending a transistor-transistor logic (TTL) trigger pulse at the start of each image TR to a field-programmable gate array (FPGA) card (National Instruments, Austin, TX), programmed to emit a pulse train that reset with each TTL trigger pulse (Anami et al., 2003; Cohen et al., 2001; Goldman et al., 2005; Mandelkow et al., 2006).
Data for the auditory oddball paradigm was also collected on 8 of the subjects outside the scanner in order to compare single-trial discrimination performance with EEG data acquired during fMRI (Sajda et al., 2007). Subjects remained on the scanner bed and were wheeled away from the scanner bore past the 5 Gauss line. Auditory stimuli were presented identically to that of the inside-scanner method, and in addition, scanner noises of the functional EPI sequence were played over speakers directly behind the subject's head.
A software-based 0.5 Hz high-pass filter was used to remove DC drift. Gradient artifacts were then removed by aligning data for each bipolar channel to the start of each TR and subtracting the mean across TRs. A ten-point (10 ms) median filter was then applied to eliminate the minimal remaining RF artifacts. Software based 60 Hz and 120 Hz (harmonic) notch filters were applied to remove line noise artifacts. All filters were designed to be linear-phase to prevent delay distortions.
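The gradient artifact step (align each channel to the start of every TR, then subtract the mean across TRs) can be sketched as below. This is a minimal NumPy illustration on synthetic data, not the authors' code; the function name and the shortened TR length (300 samples instead of the 3000 implied by a 3 s TR at 1 kHz) are assumptions chosen to keep the example small.

```python
import numpy as np

def remove_gradient_artifact(channel, tr_onsets, tr_len):
    """Subtract the mean TR-locked waveform from one channel (1-D array)."""
    cleaned = channel.astype(float).copy()
    segments = np.stack([cleaned[s:s + tr_len] for s in tr_onsets])
    template = segments.mean(axis=0)  # average artifact across TRs
    for s in tr_onsets:
        cleaned[s:s + tr_len] -= template
    return cleaned

# Synthetic check: a repeating artifact plus a small non-repeating signal.
rng = np.random.default_rng(0)
tr_len, n_tr = 300, 40
artifact = 50.0 * np.sin(np.linspace(0, 20 * np.pi, tr_len))
brain = rng.normal(0.0, 1.0, tr_len * n_tr)
raw = brain + np.tile(artifact, n_tr)
onsets = np.arange(n_tr) * tr_len
clean = remove_gradient_artifact(raw, onsets, tr_len)
```

The approach relies on the artifact being identical in every TR, which is why synchronizing the EEG sampling to the scanner clock matters.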
Ballistocardiogram (BCG) artifacts were then estimated by low-pass filtering the data at 4 Hz and using principal component analysis (PCA) to find the first two principal components across bipolar EEG channels. The sensor weights derived from PCA were applied to the original EEG (not filtered at 4 Hz), and this BCG estimate was projected into each electrode and subtracted from the data. Motor response and stimulus events recorded on separate channels were delayed to match latencies introduced by digital filtering of the EEG.
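A rough sketch of this PCA-based BCG removal follows. It is an approximation under stated assumptions: a moving-average filter stands in for the 4 Hz low-pass (whose design is not given in the text), the function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def boxcar_lowpass(eeg, width):
    """Crude moving-average low-pass, a stand-in for the 4 Hz filter."""
    kernel = np.ones(width) / width
    return np.stack([np.convolve(ch, kernel, mode="same") for ch in eeg])

def remove_bcg(eeg, n_components=2, width=125):
    """eeg: (channels, samples) sampled at 1 kHz. Returns the cleaned array."""
    low = boxcar_lowpass(eeg, width)
    low = low - low.mean(axis=1, keepdims=True)
    # Left singular vectors are the principal directions across channels.
    u, _, _ = np.linalg.svd(low, full_matrices=False)
    weights = u[:, :n_components]      # sensor weights, (channels, k)
    sources = weights.T @ eeg          # weights applied to the unfiltered EEG
    return eeg - weights @ sources     # project back into channels, subtract

rng = np.random.default_rng(1)
topography = rng.normal(size=(8, 1))                         # made-up coupling
pulse = 20.0 * np.sin(2 * np.pi * np.arange(2000) / 1000.0)  # 1 Hz, synthetic
eeg = topography * pulse + rng.normal(0.0, 1.0, (8, 2000))
cleaned = remove_bcg(eeg)
```

Estimating the weights on low-passed data but applying them to the unfiltered EEG, as the text describes, lets the slow cardiac rhythm define the spatial pattern while the full-bandwidth artifact is removed.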
We epoched the EEG data into trials for each event and analyzed the data in two ways, first stimulus-locked and thus aligned to the onset of the auditory tone, and then response-locked and so aligned to the onset of the subject's button press. Because subjects did not press a button for standard tones, for the response-locked case, response times (RTs) for the standard tones were randomly chosen from the distribution of RTs for the target tones. These randomly chosen RTs were only used to perform the response-locked single-trial EEG analysis.
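The two epoching schemes, including the surrogate response times for standard tones, might look like the following sketch. All onsets, RTs, and epoch limits here are made-up values for illustration, and at 1 kHz one sample is taken to equal one millisecond.

```python
import numpy as np

def epoch(eeg, event_samples, start, stop):
    """Cut (channels, samples) data into (trials, channels, stop - start)."""
    return np.stack([eeg[:, s + start:s + stop] for s in event_samples])

rng = np.random.default_rng(0)
eeg = rng.normal(size=(4, 20000))                      # synthetic, 1 kHz
target_onsets = np.array([1000, 6000, 11000])          # stimulus samples
standard_onsets = np.array([3500, 8500, 13500, 16000])
target_rts = np.array([395, 430, 410])                 # ms, made-up values

# Standards have no button press: draw surrogate RTs from the target RTs.
standard_rts = rng.choice(target_rts, size=standard_onsets.size, replace=True)

stim_events = np.concatenate([target_onsets, standard_onsets])
resp_events = np.concatenate([target_onsets + target_rts,
                              standard_onsets + standard_rts])
stim_locked = epoch(eeg, stim_events, 0, 1000)
resp_locked = epoch(eeg, resp_events, -250, 200)
```

Sampling with replacement from the observed target RTs gives the standard trials the same latency distribution, so any response-locked discrimination cannot be driven by a difference in alignment jitter alone.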
We used single-trial analysis of the EEG to discriminate between presentation of standard and target tones on a subject by subject basis. Our method has been described in depth previously (Parra et al., 2002; Parra et al., 2005) but will be explained briefly here. The approach looks to identify a projection of the multivariate EEG, within a short time window, that maximally discriminates target trials from standard trials. Specifically, denote with the vector x(t) the multidimensional EEG data at time t. A weighting vector (spatial filter) w is used to generate a one-dimensional projection y(t) from D channels of EEG,

$$y(t) = \mathbf{w}^{T}\mathbf{x}(t) = \sum_{i=1}^{D} w_i x_i(t) \tag{1}$$
The key to our method is to “learn” a w which results in maximal separation (discrimination) of target and standard trials along the projection y(t). We formulate this learning problem in terms of logistic regression. The assumption in logistic regression is that the data, when projected onto coordinate y(t), is distributed according to a logistic function, i.e., the likelihood that sample x belongs to the class of positive examples (c = +1 is a target trial, c = -1 is a standard trial) follows

$$p(c = +1 \mid \mathbf{x}) = \frac{1}{1 + e^{-y}}, \qquad y = \mathbf{w}^{T}\mathbf{x} \tag{2}$$
Given this formulation, we can learn w using maximum likelihood, i.e., by maximizing the likelihood of the data with respect to the model parameters:

$$\mathbf{w} = \arg\max_{\mathbf{w}} \sum_{i=1}^{T} \log p(c_i \mid \mathbf{x}_i) \tag{3}$$
where we sum over T trials. We can use this basic approach to learn a w for specific latencies and window sizes of the data. For example, for both stimulus-locked and response-locked data we defined a training window centered at a time τ with a width (duration) of δ = 50 ms, estimating the spatial weighting vector wτ,δ as in (3). Substituting the learned wτ,δ into equation (1) we get:

$$y_{\tau}(t) = \mathbf{w}_{\tau,\delta}^{T}\mathbf{x}(t) \tag{4}$$
The result is a “discriminating component” yτ (t), which is specific to activity correlated with one condition (target trials) while minimizing activity correlated with the alternate condition (standard trials). Note that in Equation (4) we add the subscript τ to y(t) to denote that we have discriminating components at different latencies.
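To make the learning step concrete, the sketch below fits a spatial filter by logistic regression on synthetic data. It is not the authors' implementation: plain gradient ascent, the offset term b, the small L2 penalty, the balanced classes, and the made-up target topography are all assumptions introduced for illustration.

```python
import numpy as np

def fit_spatial_filter(X, c, n_iter=500, lr=0.1, l2=1e-3):
    """X: (samples, channels), mean-centered; c: labels in {+1, -1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        y = X @ w + b
        # d/dy of log(1 / (1 + exp(-c * y))) is c / (1 + exp(c * y)).
        g = c / (1.0 + np.exp(c * y))
        w += lr * (X.T @ g / n - l2 * w)   # ascend the penalized log-likelihood
        b += lr * g.mean()
    return w, b

rng = np.random.default_rng(0)
n, d = 400, 6
c = np.repeat([1, -1], n // 2)                       # balanced synthetic labels
pattern = np.array([3.0, 1.5, 0.0, 0.0, -1.5, 0.0])  # made-up target topography
X = rng.normal(size=(n, d)) + (c[:, None] == 1) * pattern
X -= X.mean(axis=0)
w, b = fit_spatial_filter(X, c)
y = X @ w + b                                        # discriminating component
accuracy = np.mean(np.sign(y) == c)
```

In the analysis described here, one such filter would be trained per 50 ms window, using only the EEG samples that fall inside that window.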
In order to determine the evolution of the discrimination vector across time, we systematically shifted the training window latency τ from 0 ms to 1000 ms post-stimulus onset for the stimulus-locked analysis and from −250 ms to 200 ms around the button press for the response-locked analysis, all in 50 ms increments centered on the window time. For each training window, the amplitude of the discriminating component could be assessed on a single-trial basis (Figure 1, top).
For each subject's EEG data, we quantified the performance of the discriminating component for each window by generating an ROC curve (Green and Swets, 1966) using a leave-one-out procedure (Duda et al., 2001), where the area under the ROC curve, or Az, was the probability of trials being correctly classified as targets or standards for that window. We estimated a null distribution for Az by performing the leave-one-out test after randomizing the truth labels of our target and standard trials. We repeated this randomization process 3351 times across all the subjects for the stimulus-locked 250, 350, and 450 ms windows to compute the Az value corresponding to p = 0.01. Az values for each time window were then computed as the average Az across subjects.
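The Az statistic and a label-shuffling significance threshold can be sketched as follows. Note the simplification: this example permutes labels against a fixed set of scores, whereas the procedure described above retrains the discriminator in a leave-one-out loop for every randomization. Function names and the synthetic scores are assumptions.

```python
import numpy as np

def az_score(scores, labels):
    """Area under the ROC curve via pairwise comparisons (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def permutation_threshold(scores, labels, n_perm=1000, alpha=0.01, seed=0):
    """Az value exceeded by a fraction alpha of label-shuffled datasets."""
    rng = np.random.default_rng(seed)
    null = [az_score(scores, rng.permutation(labels)) for _ in range(n_perm)]
    return float(np.quantile(null, 1.0 - alpha))

scores = np.array([0.1, 0.4, 0.35, 0.8])
labels = np.array([-1, -1, 1, 1])
az = az_score(scores, labels)  # 0.75: three of four target/standard pairs ordered correctly

rng = np.random.default_rng(0)
null_labels = np.array([1] * 50 + [-1] * 200)   # 50 targets, 200 standards
null_scores = rng.normal(size=250)              # scores carrying no information
threshold = permutation_threshold(null_scores, null_labels)
```

Because this shortcut skips retraining, its threshold need not match the one reported in the text; it only illustrates the logic of deriving a cutoff from shuffled labels.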
Given the linearity of our model, we can create a plot of the scalp topography for the discriminating components by estimating a forward model for each component (Parra et al., 2002; Parra et al., 2005). The forward model (scalp topography) aτ is given by

$$\mathbf{a}_{\tau} = \frac{\mathbf{X}\,\mathbf{y}_{\tau}}{\mathbf{y}_{\tau}^{T}\mathbf{y}_{\tau}} \tag{5}$$
where we now write the EEG data and discriminating components in matrix-vector notation for convenience (i.e. time is a dimension of the matrix/vector). Equation (5) describes the electrical coupling aτ of the discriminating component yτ that explains most of the activity X. We then can plot aτ as a scalp topography.
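Assuming the forward model takes the usual least-squares form, in which each channel of X is regressed onto the component y, it can be computed in one line. The sketch below checks this on synthetic data with a known, made-up coupling vector.

```python
import numpy as np

def forward_model(X, y):
    """X: (channels, samples); y: (samples,). Least-squares scalp projection."""
    return X @ y / (y @ y)

rng = np.random.default_rng(0)
a_true = np.array([1.0, 0.5, -0.2, 0.0])     # made-up coupling per channel
y = rng.normal(size=5000)                    # synthetic component time course
X = np.outer(a_true, y) + 0.1 * rng.normal(size=(4, 5000))
a_est = forward_model(X, y)                  # close to a_true
```

Unlike the filter weights w, which also act to cancel correlated noise, the entries of a indicate how strongly the component is expressed at each electrode, which is why a rather than w is plotted as the topography.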
In addition to the single-trial analysis, we computed traditional ERPs for the EEG data. Trials were epoched off-line from 100 ms pre- to 900 ms poststimulus. Grand means were computed across the individual standard and target averages. Scalp plots for these trial averaged ERPs have been included in the Supplementary Material (Figure S.1).
In order to localize activity in the EEG that discriminated standard from target tones (i.e. to localize the discriminating components identified using the methods described above), we performed a general linear model (GLM) analysis (Beckmann et al., 2003; Bullmore et al., 1996; Woolrich et al., 2001; Worsley and Friston, 1995) using FSL (Smith et al., 2004). Specifically, we constructed EEG-derived fMRI regressors on a subject by subject basis for each significant stimulus-locked and response-locked 50 ms time window by using the single-trial variability seen in that subject's EEG to model the amplitude of individual events. In our fMRI analysis of the EEG components, we only consider those for which the Az value of the discriminating EEG component was ≥ 0.75. This criterion ensures not only that the discriminability is significantly above chance (p < 0.01) but also that it is substantial and that the component variability is likely not purely due to noise. Figure 1 illustrates how we construct separate regressors for each subject for each of 2 stimulus-locked windows. For a given temporal window of interest, the output of the linear discriminator yτ has dimension M × T, where M is the number of trials and T the number of training samples (50 in this case). We averaged across all training samples (indexed by j) to compute:

$$\bar{y}_{\tau,i} = \frac{1}{T}\sum_{j=1}^{T} y_{\tau,i,j} \tag{6}$$
where i is the trial index. We then used the amplitude of $\bar{y}_{\tau,i}$ for each trial to model each regressor event (Figure 1). The onset of each event was determined by the onset of the temporal window of interest.
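A toy version of this step, averaging the discriminator output within a window to get one amplitude per trial and timing each event at the window onset, is sketched below. The function name is hypothetical, and taking the onset of a window centered at 450 ms with 50 ms width to be 425 ms after the stimulus is an assumption about how the window onset is defined.

```python
import numpy as np

def stv_events(Y, stim_onsets_s, window_start_s):
    """Y: (trials, samples) discriminator output inside one training window."""
    amplitudes = Y.mean(axis=1)                 # one amplitude per trial
    onsets = np.asarray(stim_onsets_s, dtype=float) + window_start_s
    return onsets, amplitudes

Y = np.array([[1.0, 3.0], [0.0, -2.0], [4.0, 4.0]])   # 3 trials, 2 samples (toy)
onsets, amplitudes = stv_events(Y, [10.0, 12.4, 15.2], window_start_s=0.425)
```

The resulting onset and amplitude pairs are the ingredients of an amplitude-modulated event regressor; how they are written to disk for a particular fMRI package is outside the scope of this sketch.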
The analysis/modeling of the auditory oddball data for each stimulus-locked and response-locked window was performed as follows. Two traditional event-related design regressors were used to model the average brain response (Event-Related Average Response, or ERAR) to the target (ERAR-Targ) and standard (ERAR-Stand) tones (i.e. constant amplitude of 1 and duration equal to the amount of time each tone was played), and were used to calculate a targets vs. standards (ERAR-TargVsStand) contrast. Another two regressors were derived from the single-trial EEG logistic regression, constructed with amplitude as outlined above and were used to model single-trial variability (STV) for the oddball and standard tones for both the stimulus-locked (S-STV-Targ, S-STV-Stand) and the response-locked (R-STV-Targ, R-STV-Stand) case. These STV regressors were each orthogonalized to their corresponding traditional regressor (STV-Targ to ERAR-Targ, STV-Stand to ERAR-Stand) in order that they modeled single-trial variability around the mean. A fifth regressor modeled response time variability (RT) (only for the target tones) with event-related impulses of height proportional to the de-meaned reaction time (normalized to maximum de-meaned reaction time and thus ranging from -1 to 1) and duration 100 ms. Response time variability was included in the model to explicitly separate activity related to single-trial variability as measured by the EEG from reaction time variability. Thus, for each stimulus-locked and response-locked window, the analysis modeled mean activity to targets (ERAR-Targ), mean activity to standards (ERAR-Stand), single-trial variability of targets ((S or R)-STV-Targ), single-trial variability of standards ((S or R)-STV-Stand), and response time variability (RT). Another 6 regressors, the motion parameter time series (3 rotations and 3 shifts) generated from fMRI image motion correction, were used as regressors of no interest. 
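Two of the regressor manipulations described above can be illustrated at the level of per-trial amplitudes. This is a simplified sketch with made-up numbers: in practice the orthogonalization is applied by the GLM software to the full regressor time series, and scaling the RTs by the largest absolute deviation is one reading of "normalized to maximum de-meaned reaction time".

```python
import numpy as np

def orthogonalize(stv, erar):
    """Remove from stv the part explained by erar (one Gram-Schmidt step)."""
    return stv - (erar @ stv) / (erar @ erar) * erar

def normalize_rt(rt_ms):
    """De-mean RTs and scale by the largest absolute deviation."""
    d = np.asarray(rt_ms, dtype=float) - np.mean(rt_ms)
    return d / np.max(np.abs(d))

erar = np.ones(5)                                   # constant amplitude of 1
stv = np.array([0.8, 1.4, 0.2, 1.1, 0.5])           # made-up trial amplitudes
stv_orth = orthogonalize(stv, erar)                 # equals stv - stv.mean() here
rt_amp = normalize_rt([380, 413, 450, 401, 421])    # made-up RTs in ms
```

When the reference regressor has constant amplitude, orthogonalizing against it reduces to de-meaning, which is why the STV regressors capture variability around the mean response and nothing else.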
The full three level (scan, subject, group) fMRI analysis was run separately for each single-trial EEG temporal window that passed the Az threshold for EEG discrimination.
Note that only the variability in amplitude of the single-trial EEG component was used to model the BOLD response—the spatial information in the discriminating EEG component was not used in any way in the fMRI analysis. Thus, the scalp topography and the fMRI map were completely independent methods for identifying the spatial distribution of the single-trial variability for each discriminating window.
Prior to applying the GLM analysis, the MRI data was pre-processed with the following: slice-timing correction, motion correction, spatial smoothing using a kernel of 8 mm full-width at half-maximum, and high-pass filtering with the high-pass cutoff at 100 s. We analyzed the fMRI data using a mixed effects approach as implemented in FSL (Smith et al., 2004).
At first, all activated regions for each fMRI image contrast were considered significant at the group level for an uncorrected voxel threshold of p<0.005 and a cluster size of >10 voxels in order to compare our results with previously published data (e.g. (Benar et al., 2007)). Because the model for the traditional event-related regressors (ERAR-Targ, ERAR-Stand, RT) was the same for the separate analyses conducted for each single-trial temporal window, multiple contrast maps (one for each 50 ms SL window and one for each 50 ms RL window) were generated for the ERAR and RT regressors that differed only slightly due to the model fit for that window. To display the results for these contrasts, contrast masks of clusters that passed the image threshold of p<0.005 and a cluster size of >10 voxels were generated for each window, and these were summed across windows. Thus, results for the ERAR and RT regressors are given in units of number of contrast maps above threshold at each voxel (range 1-12).
In order to determine an image threshold and cluster size that would minimize false positives for the single-trial EEG correlated contrasts, we estimated the false positive probability distribution by re-running the full analysis replacing the single-trial regressor amplitudes with uniformly distributed random values between −1 and 1, such that the model had the same timing as the STV-Targ and STV-Stand model, but had randomly generated amplitudes. This was done once for each of 18 temporal windows (stimulus-locked 200 ms to 600 ms, response-locked −200 ms to 200 ms, Az > 0.67 (p < 0.01)). We then used the maximum cluster size of voxels (voxel size = 2 × 2 × 2 mm³ in group space) above a per-voxel threshold of p<0.005 from these random-input single-trial contrasts as the cluster size threshold for the single-trial results. Thresholding and clustering are further discussed in the results and discussion sections.
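The idea of taking the largest cluster found in random-input maps as the cluster-size cutoff can be sketched as follows. This is a toy illustration, not the pipeline used here: the breadth-first search, the choice of 6-connectivity (faces only), and the tiny hand-made "null maps" are all assumptions, and real analyses would use the clustering tool of the fMRI package.

```python
import numpy as np
from collections import deque

def max_cluster_size(mask):
    """Largest 6-connected cluster of True voxels in a 3-D boolean array."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    best = 0
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in steps:
                n = (x + dx, y + dy, z + dz)
                inside = all(0 <= n[i] < mask.shape[i] for i in range(3))
                if inside and mask[n] and not seen[n]:
                    seen[n] = True
                    queue.append(n)
        best = max(best, size)
    return best

# Toy supra-threshold masks standing in for the random-input contrast maps.
null_maps = [np.zeros((4, 4, 4), dtype=bool) for _ in range(2)]
null_maps[0][0, 0, 0:3] = True      # one 3-voxel cluster
null_maps[1][2, 2, 2] = True        # two voxels touching only diagonally,
null_maps[1][3, 3, 3] = True        # so they count as separate clusters
cluster_threshold = max(max_cluster_size(m) for m in null_maps)
```

Any cluster in the real single-trial maps would then need to exceed `cluster_threshold` to be reported.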
Subjects performed the auditory oddball task with high accuracy (percentage correctly detected oddball tones was 98.36% ± 1.75%; percentage correctly rejected standard tones was 99.82% ±0.34%, N=11), and a reaction time of 413 ± 43 ms.
In the EEG acquired simultaneously with fMRI, single-trial discrimination for the stimulus-locked windows passed significance (p<0.01, Az = 0.67) for the consecutive 50 ms windows from 150 ms to 600 ms. Single-trial discrimination was also significant for the response-locked consecutive 50 ms windows from −200 ms to 200 ms (Figure 2).
Across all windows (0-1000 ms), single-trial discrimination performance for data recorded during fMRI, as indicated by the window's Az value, was greater than 90% of that recorded outside the scanner (Goldman et al., 2005; Sajda et al., 2007). Figure 2 shows the stimulus-locked and response-locked ERPs for the EEG data acquired during fMRI and demonstrates that the windows that passed the Az significance threshold for task discrimination occur during the P300. Twelve (12) windows, 5 stimulus-locked and 7 response-locked, had an Az > 0.75 (see Figure 2), and these were used in the fMRI analysis.
Traditional ERPs for targets and standards for EEG recorded during fMRI are shown in Figure 3. Clear in the stimulus-locked data are the N1 and P300 ERPs, with the P300 extending from approximately 300-600 ms and prominent on the parietal (Pz) electrode. For the response-locked data, we also see the P300, and also observe that the mean behavioral response time falls close to the peak time of the response-locked P300. Scalp topographies of ERPs across all electrodes are given in the Supplementary Material Figure S.1.
At the group level, the single-trial EEG component variability (S-STV-Targ, R-STV-Targ) for each of the 50 ms windows was not correlated with RT (p = 0.05 corrected for multiple comparisons). When runs were considered individually, there were only 3 runs out of the 264 that were significantly correlated with RT, and these were just below the p = 0.05 level corrected for multiple comparisons. Note that none of the discriminating components for which runs yielded a significant non-zero correlation with RT resulted in fMRI activations which passed our z-score and cluster criteria.
Brain regions that passed significance for the traditional event-related model in the contrast of target vs. standard tones (ERAR-TargVsStand) are shown in Figure 4. These areas were activated on average to the presentation of target vs. standard tones. These include, among others, left and right auditory cortex, anterior cingulate gyrus, superior frontal gyrus, posterior cingulate gyrus, left and right temporal pole, and left and right thalamus. Hand motor cortex in the left hemisphere is also activated, as expected, since subjects responded to target tones with a right-hand button press.
Figure 5 shows areas that correlated significantly with response time variability (RT). Positive correlations with BOLD signal were found in areas including the precuneus/intracalcarine cortex, right superior lateral occipital cortex, right cuneus, left lateral occipital cortex, right middle temporal gyrus, right temporal fusiform cortex, left and right superior lateral occipital cortex, and left angular/middle temporal gyrus. Negative correlations with BOLD signal were found in the genu and the left forceps minor of the corpus callosum.
In the 18 “activation” maps generated using the random-input single-trial regressors, 55 clusters passed the image threshold of per-voxel p<0.005 and cluster>10 voxels. The largest cluster size was 72. Thus, to minimize false positives for the STV results for both stimulus- and response-locked windows, we used an image threshold of per-voxel p<0.005 and cluster>73 voxels, a volume equivalent to a sphere of 1cm diameter, in the data presented here. However, for completeness, the lower threshold STV results are included as Supplemental Material.
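The stated equivalence between a 73-voxel cluster and a sphere of roughly 1 cm diameter can be checked with a few lines of arithmetic, using the 2 × 2 × 2 mm group-space voxel size given in the methods. Variable names are illustrative.

```python
import math

voxel_mm3 = 2 * 2 * 2                      # group-space voxel volume
cluster_mm3 = 73 * voxel_mm3               # 584 mm^3
sphere_mm3 = math.pi / 6 * 10 ** 3         # volume of a 10 mm diameter sphere
equiv_diameter_mm = (6 * cluster_mm3 / math.pi) ** (1 / 3)
```

A 10 mm sphere holds about 524 mm³, and a sphere matching the 584 mm³ cluster volume would be about 10.4 mm across, so the description as approximately 1 cm holds.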
While many of the stimulus-locked windows with a significant Az yielded BOLD fMRI maps with areas of significant activation for the S-STV-Targ contrast at the lower cluster size (10 voxels, see Supplemental Figure S2 and Supplemental Table ST1), only the 450 ms window (S-STV450-Targ) passed the more conservative 73 voxel cluster size. For this window, a significant negative correlation with the BOLD signal was found in the left post-central gyrus (cluster size = 101 voxels) (Figure 6). No significant BOLD correlations were seen for S-STV-Standards in the windows above the Az threshold of 0.75.
Also shown in Figure 6 is the scalp topography of the stimulus-locked discriminating EEG component for the 450 ms temporal window. The scalp topography for this window shows peak activity in an area consistent with the fMRI activation.
As in the stimulus-locked case, many of the response-locked windows with a significant Az that had BOLD fMRI maps with areas of significant activation at the lower cluster size threshold (10 voxels, see Supplemental Figure S3 and Supplemental Table ST2) no longer showed any significant regions at the larger cluster size cutoff. Only two windows, R-STV50-Targ and R-STV150-Targ, yielded activations above the more conservative 73 voxel cluster size. For both of these windows, a negative correlation with the BOLD signal was seen in the right inferior lateral occipital cortex (R-STV50 cluster size = 203 voxels, R-STV150 cluster size = 108 voxels) (Figure 7).
The scalp topographies of the discriminating EEG components for the response-locked windows in the time range 50-150 ms showed a similar pattern of activation in an area consistent with the lateral occipital complex (Figure 7).
Trial-to-trial variability in neuronal activity, though sometimes attributed to noise, can be functionally significant, potentially being a signature of task-relevant brain-state changes. Here we present results from simultaneous EEG and fMRI of an auditory oddball experiment. We used a multivariate analysis of the EEG for each subject to learn spatial filters, at specific stimulus-locked and response-locked time windows, which maximally discriminated target trials from standard trials. For each time window, we constructed fMRI regressors based on the trial-to-trial fluctuations of the EEG discriminators' output and used these to model the trial-to-trial variability of the events (S-STV, R-STV). These regressors were combined with a regressor to model reaction time variability (RT) as well as traditional event-related regressors to model the mean activation (ERAR). All these regressors were convolved with a canonical hemodynamic response function and used as explanatory variables in a general linear model analysis of the fMRI. The traditional event-related regressors, ERAR-TargVsStand, produced a contrast similar to those previously seen for this task (Benar et al., 2007; Friedman et al., 2008; Kiehl et al., 2005), as did the reaction time regressor, RT (Yarkoni et al., 2009). In contrast, the different trial-to-trial variabilities specific to particular stimulus-locked (S-STV) and response-locked (R-STV) time windows yielded focally distinct hemodynamic activations that were not explained by traditional event-related modeling either of the stimulus (ERAR) or of behavioral measures such as reaction time (RT). Further, the forward model EEG scalp topographies determined from the discriminant functions for each time window showed patterns consistent with the observed activated areas even though no spatial information from the single-trial EEG components was used in the fMRI modeling (Figs. 6 & 7).
In their recent study of the auditory oddball paradigm using simultaneous EEG/fMRI, Benar and colleagues (Benar et al., 2007) used a univariate analysis and visual inspection to identify single-trial latency and amplitude variations in EEG, and correlated these with the BOLD signal. Comparing our results to those of Benar et al., we see some similar activation patterns. For example, the right LOC cluster that we find negatively correlated with our R-STV50-Targ and R-STV150-Targ regressors, Benar et al. also find negatively correlated with P300 amplitude, although the activation we see is a bit more anterior and lateral (see Benar fig 7B). Benar et al. also find a negative correlation in the left LOC, but in a smaller cluster. The cluster they see positively correlated with P300 latency, we see positively correlated with our RT regressor (compare Benar figure 7C to our figure 4, z=+42). This is not surprising since Benar et al. find a significant correlation between P300 latency and RT (see Benar figure 4), and thus we would expect our RT regressor to identify similar brain regions. Because we included response time variability explicitly in our model and our STV regressors for each window did not correlate with RT, we were able to separate activity related to single-trial variability as measured by the EEG from activity related to reaction time variability, and thus were able to dissociate single-trial variability during the P300 from motor (RT) related correlations.
Of note, we find a region in the corpus callosum (CC) where BOLD signal is negatively correlated with RT, though not with the trial-to-trial variability of our EEG-derived discriminating components. Benar et al. see a similar activation, though positively correlated with the amplitude variation of the P300. They localize the activation to left anterior cingulate, though there is clearly substantial overlap with the CC (see their Fig. 7a (Benar et al., 2007)). Though BOLD activations in white matter are not commonly reported in the literature, activation in the same region of the CC has been found in fMRI studies of interhemispheric transfer (Tettamanti et al., 2002; Weber et al., 2005).
Our finding, in many ways consistent with Benar et al, that the LOC is highly correlated with the trial-to-trial variability of EEG components discriminative of auditory target detection seems at first surprising, for the LOC is typically associated with visual object perception (Grill-Spector et al., 2001; Grill-Spector et al., 1999; Philiastides and Sajda, 2007; Sehatpour et al., 2008) and spatial attention (Hopf et al., 2006; Murray and Wojciulik, 2004). Though there is little evidence of direct activation of LOC by auditory stimuli, there is substantial evidence of LOC being modulated by attention (Bingel et al., 2007; Glascher et al., 2007; Hopf et al., 2006; Murray and He, 2006; Murray and Wojciulik, 2004). Given that one interpretation of the single-trial variability of our EEG components is as a surrogate for attentional engagement in the task, our results are consistent with the hypothesis that though the LOC is not activated by the auditory stimuli, the ongoing waxing and waning of attention modulates a large number of areas, including those not directly participating in the information processing for the task. This in fact may also explain why we see our single-trial EEG components producing negative BOLD correlations. As attention shifts to the auditory task it reduces attentional resources available for visual processing (a push-pull effect (Shomstein and Yantis, 2004))—i.e. an increase of the single-trial regressor for a trial represents an increase in auditory attention at the cost of a decrease in visual attention, with the result being a negative correlation in those visual areas highly modulated with attention, as is the LOC. Additional simultaneous EEG/fMRI experiments which explicitly probe both auditory and visual target detection within the context of push-pull attentional resources will be needed to further substantiate this hypothesis. 
However, our results provide compelling evidence that single-trial analysis of simultaneous EEG/fMRI can be used to measure the effects of latent brain states, such as attention, on areas not directly participating in the information processing of the task at hand.
Our S-STV450-Targ regressor resulted in negatively correlated fMRI activation in the left postcentral gyrus, consistent with an interpretation that it reflects variability in somatosensory activity that is discriminative for the task (target detection), with the location on the sensory homunculus in the left hemisphere consistent with a right-handed button press for all subjects. Note that this regressor is for a window that is very close to the mean RT (413 ms ± 43 ms). It is interesting to note that this activity does not arise from RT variability but from EEG-derived discriminating component variability. Given that the time window for the discriminating component lies within the period of the P300, it may reflect the target detection process rather than the motor response and sensory feedback. There has been increasing interest in teasing apart the contributions of different aspects of RT-related fMRI activations (such as amplitude changes vs temporal shifts), since they may reflect different underlying cognitive processes, such as delay in processing vs time-on-task (Yarkoni et al., 2009). The methods we present in this paper suggest one such avenue for teasing apart different neural processes overlapping with reaction time, namely by separating fMRI activity due to the trial-by-trial variability in behavioral measures from the underlying task-related electrophysiological variability.
Our results clearly demonstrate that single-trial variability in task-relevant EEG components can identify novel cortical areas different from those found via traditional event-related stimulus/behaviorally derived regressors. There are, however, several techniques for identifying EEG components and thus several types of single-trial variability one can consider. Our approach has been to use supervised machine learning to identify EEG components that are task discriminating at specific moments in time, relative to stimulus onset and response. As mentioned earlier, Benar et al. focused explicitly on latency and amplitude variation, on a trial-to-trial basis, of the P300 in individual electrodes. Though each is interesting in its own right, there are two important differences between their method and ours for identifying meaningful single-trial variations in the EEG. The first is that the method of Benar et al. is inherently more subjective, in that the peak-picking procedure they use requires visual inspection. Our methods for identifying single-trial components and the resulting variability are completely data driven, with choices made strictly based on statistical significance measures. The second is that the method of Benar et al. is not able to directly address the nature or existence of subcomponents which make up the P300. In our case, however, the components are determined from the data to be discriminating for the task in the temporal window in which they are trained, and thus we can examine the details of the relationship of the components to the task.
There are of course other multivariate machine learning methods that can be used to capture single-trial variability in the EEG. Unsupervised machine learning, such as independent component analysis (ICA), has been widely used to extract components in EEG (Makeig et al., 1997) and more recently has been used as a technique for coupling the single-trial variability of the components with the BOLD signal for simultaneously acquired EEG and fMRI. Debener et al. (Debener et al., 2006; Debener et al., 2005) used ICA to identify the single-trial amplitudes of the error-related negativity (ERN), and correlated these with the BOLD response, finding activity in the rostral cingulate zone. Though such correlations between the ERN and BOLD activity in the anterior cingulate (ACC) are interesting in their own right, it should be noted that the single-trial amplitudes were also correlated with reaction time (positively for the current trial and negatively for the subsequent trial) and thus the fMRI activation seen in the ACC could also potentially be explained by reaction time variability. In addition, the ICA method requires visual inspection of the individual components in each subject to identify the component associated with the ERN and can thus introduce bias.
The linear discriminant component method we describe, though relatively simple, comes with certain limitations in terms of the types of EEG components it can extract and therefore the BOLD correlates it can identify. First, the method learns only a single component for each time window and thus is able to extract only a single discriminating component at any given time. This is in contrast to ICA, which typically addresses the issue of extracting multiple simultaneous (temporally overlapping) independent components. Related to this, our linear discriminant method is trained for a fixed window length, with all samples in the window treated as independent and identically distributed (i.i.d.), and therefore changes in the polarity of the EEG within the window would cancel each other. Thus the method can potentially miss components that are defined by rapid (relative to the temporal window length) polarity changes, which in our case would be those with frequencies higher than 25 Hz. Finally, our linear discriminant method extracts discriminating components based only on amplitude changes in the EEG. Other features, such as power in specific frequency bands, have also been shown to be useful in single-trial analysis of EEG (Pfurtscheller and Andrew, 1999). None of these issues, however, is a fundamental limitation of discriminant component methods, and we and others have developed algorithms which enable the extraction of both multiple overlapping (in both space and time) discriminating components (Dyrholm et al., 2007) as well as discriminating components defined by differences in frequency band power (Christoforou et al., 2008). In addition, these newer methods utilize a bilinear decomposition, where both spatial and temporal profiles (i.e. filters) are extracted without utilizing a temporal window for analysis; that is, the methods are better suited for extracting components which are defined by EEG channels that rapidly change polarity.
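The polarity-cancellation limitation noted above can be illustrated with a toy calculation. The sketch below averages a unit sinusoid over a short window, which is effectively what a discriminator sees when every sample in the window is treated as an interchangeable training point. The sampling rate, window length, and test frequencies are hypothetical values chosen only to make the effect visible; they are not the acquisition or analysis parameters of this study.

```python
import numpy as np

fs = 1000.0                   # hypothetical sampling rate (Hz), for illustration only
n_win = 50                    # hypothetical window of 50 samples (50 ms at fs)
t = np.arange(n_win) / fs     # sample times inside the window

def window_mean(freq_hz, phase=0.3):
    """Mean of a unit sinusoid over the window (what an i.i.d. average retains)."""
    return float(np.mean(np.sin(2 * np.pi * freq_hz * t + phase)))

slow = window_mean(2.0)    # polarity is constant inside the window, so the mean survives
fast = window_mean(40.0)   # two full cycles fit in the window, so the lobes cancel

print(f"slow component mean: {slow:.3f}")
print(f"fast component mean: {fast:.3e}")
```

A component whose sign flips within the window contributes almost nothing to the windowed average, which is why a windowed amplitude-based discriminator can miss it.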
The discriminating component method we describe in this paper is a linear discriminator based on logistic regression. It is relatively simple and is able to address the issues of predictability and interpretability of its results. Predictability is considered in that the model yields good discrimination of task-relevant variables (stimulus type; oddball vs standard). Interpretability is addressed in that the linear discrimination method yields components with functional/neural significance, and we can localize these activities in time and easily construct forward models of the resulting component, e.g. the scalp plots in Figures 6 & 7. Future work will consider more complex, yet flexible, discriminant models such as those mentioned above.
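As a minimal sketch of this kind of windowed linear discriminator, the following Python example trains a logistic-regression weight vector on synthetic multichannel data, averages the projection over the window to obtain a single-trial amplitude, scores discrimination with Az (area under the ROC curve), and computes a forward model. The synthetic data, the plain gradient-descent fit, the regularization constant, and the forward-model expression a = Xᵀy / (yᵀy) are assumptions made for illustration and are not taken from the analysis pipeline used in this study.

```python
import numpy as np

def fit_logistic(X, labels, n_iter=500, lr=0.1, l2=1e-3):
    """Gradient descent on an L2-penalized logistic loss (illustrative optimizer).
    X: (n_samples, n_channels); labels: (n_samples,) with values 0 or 1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - labels) / len(labels) + l2 * w)
        b -= lr * np.mean(p - labels)
    return w, b

def az_score(y, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos, neg = y[labels == 1], y[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def forward_model(X, y):
    """Assumed forward model a = X^T y / (y^T y): coupling of y to each channel."""
    return X.T @ y / (y @ y)

# Synthetic stand-in for one time window: 200 trials, 5 samples, 8 channels.
rng = np.random.default_rng(0)
n_trials, n_samp, n_chan = 200, 5, 8
labels = (rng.random(n_trials) < 0.2).astype(int)     # rare targets
topography = np.zeros(n_chan)
topography[2] = 1.0                                   # hypothetical source channel
eeg = rng.standard_normal((n_trials, n_samp, n_chan))
eeg += 1.5 * labels[:, None, None] * topography       # targets add a fixed pattern

# Every sample in the window is used as a separate training example.
X = eeg.reshape(-1, n_chan)
sample_labels = np.repeat(labels, n_samp)
w, b = fit_logistic(X, sample_labels)

y_trial = (eeg @ w + b).mean(axis=1)   # one component amplitude per trial
az = az_score(y_trial, labels)         # discrimination performance
a = forward_model(X, X @ w)            # scalp projection of the component

print(f"Az = {az:.3f}, strongest forward-model channel = {np.argmax(np.abs(a))}")
```

In this toy setting the per-trial values in `y_trial` play the role of the single-trial variability that would later modulate an fMRI regressor, and `a` is the quantity one would plot as a scalp map.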
We used a resampling procedure to determine a statistically significant cluster size and found that only large clusters, with a threshold of 73 voxels, were significant. We note that this cluster threshold is substantially larger than what others have reported in the literature (Benar et al., 2007; Debener et al., 2005). Though our motivation for using resampling was to be conservative in the activations we identified (activations at lower thresholds are given in Supplemental Material), we believe that our finding brings up an important issue related to significance testing in the analysis of single-trial simultaneous EEG and fMRI. If one employs a Bayesian argument, then we can say we have some belief in the regressor model we choose for identifying activations in the fMRI. Traditional regressors typically are linked to some characteristic of the stimulus or response, both of which are relatively easy to measure and have low measurement noise and uncertainty. Thus we can say we are highly confident in the traditional event-related model regressor and our prior on the regressor model has low variance. On the other hand, when regressors are constructed using trial-to-trial fluctuations of an underlying EEG measurement, as in the case of our analysis and those of Benar et al. and Debener et al., there is more uncertainty about what in fact constitutes the “correct” regressor. This increases our uncertainty and implies a prior distribution over possible regressors with a larger variance (relative to the stimulus/behavior derived regressors). A proper statistical analysis requires sampling this prior in order to incorporate the uncertainty of the particular regressor into the estimation of the significance of the fMRI activations. When we do this we see that a larger cluster threshold size is required to achieve the desired statistical significance. While this method certainly minimizes false positives, it also leads to a higher likelihood of false negatives for small clusters.
Further development of thresholding methods for these new single-trial EEG/fMRI techniques will help to address these issues.
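To make the idea of a resampling-derived cluster threshold concrete, the sketch below builds a null distribution of the maximum cluster size by shuffling a single-trial regressor across trials and recomputing a thresholded statistical map each time. It is a toy under stated assumptions: the "brain" is a one-dimensional line of spatially smoothed voxels, each voxel has one response value per trial, trial shuffling stands in for whatever resampling scheme is actually used, and the function names and parameters are hypothetical. It is not the procedure that produced the 73-voxel threshold reported here.

```python
import numpy as np

def max_run_length(mask):
    """Longest run of consecutive True values (a 1-D stand-in for a cluster)."""
    best = run = 0
    for m in mask:
        run = run + 1 if m else 0
        best = max(best, run)
    return best

def cluster_threshold(data, regressor, n_perm=200, z_thresh=2.81, alpha=0.05, seed=0):
    """Smallest cluster size exceeding the (1 - alpha) quantile of the null
    maximum-cluster-size distribution obtained with shuffled regressors.
    data: (n_trials, n_voxels); regressor: (n_trials,).
    z_thresh = 2.81 corresponds roughly to two-sided per-voxel p < 0.005."""
    rng = np.random.default_rng(seed)
    n = len(regressor)
    zdata = (data - data.mean(axis=0)) / data.std(axis=0)
    max_sizes = np.empty(n_perm, dtype=int)
    for i in range(n_perm):
        r = rng.permutation(regressor)             # break the trial correspondence
        r = (r - r.mean()) / r.std()
        corr = zdata.T @ r / n                     # Pearson correlation per voxel
        z = np.arctanh(corr) * np.sqrt(n - 3)      # Fisher z-transform
        max_sizes[i] = max_run_length(np.abs(z) > z_thresh)
    return int(np.quantile(max_sizes, 1 - alpha)) + 1

# Toy data: 60 trials, 400 voxels with spatial smoothness from a moving average.
rng = np.random.default_rng(0)
n_trials, n_vox = 60, 400
noise = rng.standard_normal((n_trials, n_vox + 8))
kernel = np.ones(9) / 9
data = np.array([np.convolve(row, kernel, mode="valid") for row in noise])
regressor = rng.standard_normal(n_trials)

k = cluster_threshold(data, regressor)
print(f"clusters of at least {k} contiguous voxels exceed the null at alpha = 0.05")
```

The trade-off described in the text is visible in this construction: the threshold controls how often any null cluster survives, so small true clusters fall below it and are missed.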
A caveat to our approach, as well as that of others, is that the coupling of the EEG components and BOLD response should not be interpreted as spatially localizing the EEG components. Since the basic general linear model (GLM) (Beckmann et al., 2003; Bullmore et al., 1996; Smith et al., 2004; Woolrich et al., 2001; Worsley and Friston, 1995) we use, and which is standard in fMRI analysis, is a correlation method, our results highlight those voxels in the fMRI whose trial-to-trial fluctuations in BOLD co-vary with the trial-to-trial fluctuations of specific EEG components. As is the case with any correlation-based analysis, localization (and thus causation) cannot be inferred since we might be observing the result of some indirect covariation via a latent process (which in fact is our hypothesis for the LOC activations). Some research groups have focused their simultaneous EEG/fMRI studies more on the localization problem, for example by trying to more directly integrate the spatial activations in fMRI with the localized dipole sources in the EEG (Bonmassar et al., 2001; Scarff et al., 2004). Instead of focusing on localization, our approach is aimed at using single-trial variability of task-discriminating EEG components as a surrogate for changes in latent brain states, such as the waxing and waning of attention, which are measurable only at high temporal resolution. We then use the magnitude of the variability on each trial, as measured by the EEG, as a way to identify voxels in the fMRI that are modulated in the same way by trial-to-trial activity. These maps highlight areas related to the variability of brain activity across trials, and provide complementary information to traditionally derived event-related fMRI findings that show regions activated on average to a task.
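The way a single-trial EEG amplitude can enter a GLM alongside a conventional event-related regressor can be sketched as follows. The example places a stick at each trial onset, scales one copy of the sticks by the demeaned single-trial amplitude, convolves both with a haemodynamic response function, and fits a simulated voxel by least squares. The double-gamma HRF shape, the timing values, the simulated voxel, and all function names are assumptions for illustration; the actual regressor construction and GLM software used in this study may differ.

```python
import numpy as np
from math import gamma

def hrf(t, peak=6.0, under=16.0, ratio=1.0 / 6.0):
    """Double-gamma haemodynamic response with assumed shape parameters."""
    def g(k):
        return t ** (k - 1) * np.exp(-t) / gamma(k)
    return g(peak) - ratio * g(under)

def build_regressors(onsets, amps, n_scans, tr, dt=0.1):
    """Return an (n_scans, 2) array: event-related and single-trial regressors."""
    step = int(round(tr / dt))
    n_fine = n_scans * step
    idx = np.round(onsets / dt).astype(int)
    sticks = np.zeros((2, n_fine))
    sticks[0, idx] = 1.0                     # same height on every trial
    sticks[1, idx] = amps - amps.mean()      # demeaned trial-to-trial variability
    kernel = hrf(np.arange(0.0, 32.0, dt))
    conv = np.array([np.convolve(s, kernel)[:n_fine] for s in sticks])
    return conv[:, ::step][:, :n_scans].T    # sample once per scan

# Hypothetical design: 150 scans at TR = 2 s, one target every 9 s.
rng = np.random.default_rng(1)
tr, n_scans = 2.0, 150
onsets = np.arange(10.0, 280.0, 9.0)
amps = rng.standard_normal(len(onsets))      # stand-in for EEG component amplitudes

R = build_regressors(onsets, amps, n_scans, tr)
X = np.column_stack([R, np.ones(n_scans)])   # add a constant term

# Simulated voxel whose response shrinks when the EEG amplitude grows.
bold = X @ np.array([1.0, -0.5, 100.0]) + 0.01 * rng.standard_normal(n_scans)
beta, *_ = np.linalg.lstsq(X, bold, rcond=None)

print(f"event beta = {beta[0]:.2f}, single-trial beta = {beta[1]:.2f}")
```

Demeaning the amplitudes keeps the second regressor from simply duplicating the average event-related response, so its coefficient reflects covariation with trial-to-trial fluctuations only. A significant coefficient indicates covariation, not that the voxel generates the EEG component.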
It is hard to over-emphasize the importance of our finding that these highlighted areas show excellent spatial consistency with the scalp topographies independently derived from the forward model of the discriminating component, for both time windows in which we observed a significant correlation, even though spatial information in the scalp topographies is not used in any way for localizing the fMRI activations. This finding provides even greater confidence that the EEG variability that generates fMRI activations is in fact related to the task and the cortical areas which are localized via fMRI. In summary, our approach shows that we can identify meaningful EEG-derived fMRI activation maps that are not based on pre-defined labels or observed behavioral responses but rather on task-discriminating and subject-specific electrophysiological components and their trial-to-trial variability.
The following tables and figures show activations for per-voxel p<0.005 and the more liberal cluster threshold of cluster>10 voxels.
Table ST1: Summary of areas where BOLD signal correlated with stimulus-locked single-trial variability to presentation of target tones, S-STV-Targ, with image threshold at per-voxel p<0.005 and cluster>10 voxels and single-trial EEG discriminating component Az threshold=0.75.
Table ST2: Summary of areas where BOLD signal correlated with response-locked single-trial variability to presentation of target tones, R-STV-Targ, with image threshold at per-voxel p<0.005 and cluster>10 voxels and single-trial EEG discriminating component Az threshold=0.75.
Figure S1: Scalp topographies of ERPs computed for the same temporal windows in which discriminating component fMRI activations were found (S-STV450-Targ, R-STV50-Targ, and R-STV150-Targ). Note that for stimulus-locked plots, the scalp topographies represent the difference signals between targets and standards. For the response-locked plot, the scalp topographies are for the ERPs of targets only, since there are no response times for standards.
Figure S2: fMRI activations for stimulus-locked discriminating components (Az > 0.75) showing (A) positive correlations and (B) negative correlations between BOLD signal and the stimulus-locked single-trial EEG variability to targets (S-STV-Targ) above a threshold of per-voxel p<0.005, cluster>10 voxels. The stimulus-locked windows showing activity above threshold are 350ms (blue), 400ms (green), and 450ms (orange).
Figure S3: fMRI activations for response-locked discriminating components (Az > 0.75) showing (A) positive correlations and (B) negative correlations between BOLD signal and the response-locked single-trial EEG variability to targets (R-STV-Targ) above a threshold of per-voxel p<0.005, cluster>10 voxels. The response-locked windows showing activity above threshold are -100ms (orange), -50ms (light blue), 0ms (red), 50ms (blue), 150ms (green), 500ms (orange), and 200ms (yellow).
This work was supported by grants from the National Institutes of Health (EB004730, AG005213, and HD14959). We thank Mark Cohen, Amir Abrishami, William Thomas, and Eric Black for their contributions to designing and building the EEG system, and Charles Brown III for programming and technical assistance.