|Home | About | Journals | Submit | Contact Us | Français|
Several studies show that the amplitudes of human brain oscillations are modulated during the performance of visual working memory (VWM) tasks in a load-dependent manner. Less is known about the dynamics and identities of the cortical regions in which these modulations take place and hence their functional significance has remained unclear. We used magneto- and electroencephalography (M/EEG) together with minimum-norm-estimate- (MNE) based source modeling to study the dynamics of ongoing brain activity during a parametric VWM task. Early stimulus processing and memory encoding were associated with a memory-load dependent spread of neuronal activity from occipital to temporal, parietal, and frontal cortical regions. During the VWM retention period, the amplitudes of oscillations in θ/α- (5–9 Hz), high-α- (10–14 Hz), β- (15–30 Hz), γ- (30–50 Hz), and high-γ- (50–150 Hz) frequency bands were suppressed below baseline levels and yet, in fronto-parietal regions, load dependently strengthened. However, in occipital and occipito-temporal structures, only β, γ, and high-γ amplitudes were robustly strengthened by memory load. Individual behavioral VWM capacity was predicted by both the magnitude of the N1 evoked-response component in early visual regions and by the amplitudes of fronto-parietal high-α and high-γ band oscillations. Thus both early stimulus processing and late retention period activities may influence the behavioral outcome in VWM tasks. These data support the notion that β- and γ-band oscillations support the maintenance of object representations in VWM whereas α-, β-, and γ-band oscillations together contribute to attentional and executive processing.
Working memory (WM) maintains and manipulates information (Baddeley, 1996) and is based on the retention of integrated neuronal object representations (Luck and Vogel, 1997, Zhang and Luck, 2008). The binding of sensory feature representations into object representations and the active maintenance of these representations in WM may depend on millisecond-range synchronization of neuronal network oscillations (Singer, 2009, Fries, 2009, Palva et al., 2010a). In human MEG and EEG recordings, the maintenance of visual (Medendorp et al., 2007, Osipova et al., 2006, Tallon-Baudry et al., 1998, Haenschel et al., 2009, Jokisch and Jensen, 2007), auditory (Leiberg et al., 2006, Kaiser et al., 2008), and somatosensory (Haegens et al., 2010) information in WM is associated with increased β- and γ-frequency band amplitudes. However, also the amplitudes of θ- and α-band oscillations are modulated during WM maintenance (Osipova et al., 2006, Haenschel et al., 2009, Jokisch and Jensen, 2007, Leiberg et al., 2006, Busch and Herrmann, 2003, Jensen et al., 2002, Onton et al., 2005). Prior studies have used several distinct WM tasks and memory retention intervals, which make it difficult to obtain an integrative view on the insight given by these studies into brain functioning. The difficulties in inferring the physiological roles of scalp-recorded oscillations also stem from the absence of source modeling in most studies. Studies using MEG/EEG source modeling suggest that the amplitude modulations in visual WM (VWM) tasks arise in occipital and parietal brain regions (Medendorp et al., 2007, Osipova et al., 2006, Jokisch and Jensen, 2007). Functional magnetic resonance imaging (fMRI) has, in addition, revealed VWM-maintenance related activity in several frontal and temporal regions as well as in the cingulate and insular cortices (Pessoa et al., 2002, Munk et al., 2002, Axmacher et al., 2007, Mohr et al., 2006, Prabhakaran et al., 2000, Rowe et al., 2000, Sakai et al., 2002, Todd and Marois, 2004, Linden et al., 2003). Intracranial recordings of epileptic patients also show amplitude modulations during VWM maintenance in widespread occipital, temporal, parietal, and frontal cortical regions (Howard et al., 2003, Mainy et al., 2007, Raghavachari et al., 2001, Sederberg et al., 2007, Meltzer et al., 2008) and in the hippocampus (Axmacher et al., 2007). Indeed, by using fMRI priors in source modeling, sustained event-related activity (Robitaille et al., 2010, Scheeringa et al., 2009, Wibral et al., 2008) as well as theta- (Morgan et al., 2010) and alpha-band (Grimault et al., 2009) amplitude modulations have been identified in occipital, temporal, and frontal cortical regions.
In the present study, we set out to characterize comprehensively the neuronal substrates underlying a parametric and psychophysically well validated VWM task using concurrently acquired MEG and EEG (M/EEG) data. We used data-driven statistical source-space analyses to map in individual cortical anatomy the dynamics of early stimulus processing and later amplitude modulations of ongoing oscillations. The investigation aimed at identifying the cortical regions and time-frequency windows where brain activity is modulated by VWM task performance, correlated with memory load, and/or predictive of individual behavioral performance.
We used a delayed-matching-to-sample task where the subjects’ were presented a Sample stimulus that contained one to six 6 squares and that had to be sustained in WM and then compared with a Test stimulus (Luck and Vogel, 1997, Palva et al., 2010a,b) (Suppl. Fig. 1A). The Sample stimulus was presented for 0.1 s, and was followed by a 1 s memory-retention period, after which a Test stimulus was presented for 0.5 s. The subjects indicated with a forced-choice left- or right-hand thumb twitch whether the stimuli were “same” or “different”. The hand-response correspondence was randomized across subjects. After this primary response, the subjects indicated whether or not they felt sure about their response with a second go-/no-go, respectively, twitch of the same thumb.
Accuracy was assessed as the proportion of correct responses from all responses (Suppl. Fig. 1B). Reaction time was given by the time from the Sample stimulus onset to the onset of the electromyogram (see below) activity corresponding to the primary response (Suppl. Fig. 1C). All analyses were based on artifact-free trials with correct behavioral responses.
13 healthy right-handed (age 28 ± 3, mean ± SD, 4 females) volunteers with normal or corrected-to-normal vision participated in this study. 366-channel M/EEG data were recorded with 204 planar gradiometers, 102 magnetometers, and 60 EEG electrodes (Elekta Neuromag Ltd., Finland) at 600 Hz sampling rate throughout the experiment (Suppl. Fig. 1D). The thumb-twitch responses were recorded with electromyography (EMG) of abductor/flexor pollicis brevis and detected with an automatic algorithm. All EMG responses were also verified visually. The electro-oculogram was used to detect ocular artifacts. Trials with the electro-oculogram signal exceeding 50 µV were excluded from further analysis. The MaxFilter software (Elekta Neuromag Ltd., Finland) was used to suppress extra-cranial noise and to co-localize the signal space data from different recording sessions and subjects. For cortical surface reconstructions, we recorded T1-weighted (MP-RAGE) anatomical MR images at a ≤ 1×1×1-mm resolution with a 1.5 T MRI scanner (Siemens, Germany) (Suppl. Fig. 1E). This study was approved by the ethical committee of Helsinki University Central hospital and was performed according to the Declaration of Helsinki. Written informed consent was obtained from each subject prior to the experiment.
Each channel, yj(t), of the single-trial M/EEG time-series data, Y(t), with nc channels j = 1… nc, was filtered into 36 frequency bands, f, f = 3…90 Hz, with a bank of Morlet wavelets h(t,f) so that the complex filtered signal yF(t,f) is given by yF(t,f) = y(t) * h(t,f), where * denotes convolution and h(t,f) = A exp(−t2/2σt2) exp(2iπft) (Suppl. Fig. 1G). The time-domain standard deviation of the wavelet is given by σt = m/2πf, where we selected m = 5, to define a compromise between time and frequency resolution, and f is the center frequency of the wavelet.
In the second stage of this study, we used a set of paired high- and low-pass finite-impulse-response (FIR) filters that gave a better temporal resolution than the Morlet wavelets and represented the spectral content captured by four adjacent wavelets with a single filter in each of the four frequency-bands-of-interest delineated by the prior wavelet analysis. The stop- and pass-bands of the high-pass filters, and the pass- and stop-bands of the low-pass filters, respectively, were (in Hz): θ/α [2.375, 4.75, 6.65, 9.5], h-α [4.75, 9.5, 13.3, 19.0], β [9.5, 19.0, 26.6, 38.0], γ [14.25, 28.5, 37.05, 49.875], and h-γ [48.75, 75.0, 112.5, 150.0]. The parameters for the FIR-filters used in evoked-response analyses were (in Hz) [0.01, 0.1, 25, 45] and [1, 2, 25, 45].
The FreeSurfer-anatomical-MRI analysis software (http://surfer.nmr.mgh.harvard.edu/), was used for automatic volumetric segmentation of the MRI data, surface reconstruction, flattening, cortical parcellation, and labeling with the Freesurfer/Destrieux atlas (Dale et al., 1999, Fischl et al., 1999, Desikan et al., 2006, Fischl et al., 2004, Fischl et al., 2002, Segonne et al., 2004, Fischl et al., 2001) (Suppl. Fig. 1H). The MNE software (http://www.nmr.mgh.harvard.edu/martinos/userInfo/data/sofMNE.php) was used for creating three-layer boundary element conductivity models, cortically constrained source models, for the M/EEG-MRI co-localization, and for the preparation of the forward and inverse operators (Hämäläinen and Sarvas, 1989, Nenonen et al., 1994, Hämäläinen and Ilmoniemi, 1994, Mosher et al., 1999, Lin et al., 2006) (Suppl. Fig. 1I, J). The source models for each individual were based on tessellated and decimated cortical surfaces where the dipoles were fixed normal to the surface and had an approximately 7 mm source-to-source separation. Depending on the brain size, the source models had a total of nd = 6000–8500 sources in the two cerebral hemispheres.
M/EEG sensor signals, Y, are linearly related to current strengths in nd source dipoles, X = [xk], k = 1… nd, so that Y(t) = ΓX(t) + N(t), where Γ is the lead field matrix, i.e., the forward operator, and N denotes noise. We obtained X(t) from measured Y(t) by using a minimum-norm estimator (Hämälainen and Sarvas, 1989, Nenonen et al., 1994) so that X(t) = MY(t) = CsΓT(ΓCs ΓT + λ2Cn) −1 Y(t), where M is the inverse operator, λ2 is a regularization parameter, Cs is the source covariance matrix, and Cn is the noise covariance matrix. We used λ2 = 0.05 and a multiple of the identity matrix as R, scaled so that the norm of ΓCs ΓT, whitened spatially using Cn, equals the number of channels, see MNE manual. The inverse operator, M, was prepared separately for each wavelet- or FIR-filter frequency with the noise covariance matrices obtained from filtered single-trial pre-stimulus baseline windows of all trials (Suppl. Fig. 1K). The complex inverse solution, XF = [xF,k], was obtained from the inverse estimates of the real, YF,RE, and imaginary, YF,IM, parts of the filtered data YF so that XF = MF YF,RE(t) + i MF YF,IM(t), where i is the imaginary unit (Suppl. Fig. 1L).
We used two cortical parcellation strategies in this study. The aims of source-model parcellation were to compress single-trial data and to transform the data into nameable anatomical regions that were common to all subjects. A parcellation denotes the grouping of vertices into a number of patches. In the Morlet-wavelet-based analyses, we employed the 156 labels of the FreeSurfer-atlas (Fischl et al., 2002) covering the entire cortical surface (see Suppl. Fig. 1H) and then further split the largest patches to obtain the anatomical parcellation, PAN240, comprising a total of 240 patches (Suppl. Fig. 1M). The splitting was carried out iteratively so that in each iteration the patch having the greatest mean number of vertices was selected and split along the one of three axes (anterior-posterior, lateral-medial, ventral-dorsal) that had the largest mean variance of vertex locations. This procedure gives an anatomical parcellation wherein the cut directions are identical for every subject. The individual anatomical data were thus directly comparable in group statistics of the subject population without a need for morphing the data across subjects (Palva et al., 2010a). The FIR-filter-based analyses were performed with two parcellations: the single-trial data were compressed to and averaged with an individual cluster parcellation of 700 clusters, PCL700. Group statistics were then performed with an anatomical parcellation with 700 patches, PAN700, after morphing the data each individuals’ PCL700 to that individuals’ PAN700(Palva et al., 2010a). The individual subjects’ parcellations were obtained by (1) inverse-modeling baseline data that was FIR filtered into a frequency band from 150–224 Hz (filter parameters as above: [96.85, 149.0, 223.5, 298.0]), (2) evaluating vertex-by-vertex phase-locking-value-based-1:1-phase-synchrony matrices within each of the 150 patches of the Destrieux atlas (3) performing hierarchical clustering of source space vertices so that always the pair of vertices (or vertex clusters) that had the greatest phase-locking value were clustered. The hierarchical clustering was continued until 700 clusters remained. The hierarchical clustering approach was thus similar to the one described earlier (Palva et al., 2010a), but here the clusters were limited to the 150 anatomical patches of the Destrieux atlas. This clustering compressed single-trial source space data with minimal loss of information because the clusters reflected vertices that were highly correlated in the original recording and/or in the inverse transform. The 150–224 Hz band was used for this purpose because the amplitude (and signal-to-noise ratio) of neuronal activity therein is negligible and hence the phase-synchrony estimates describe very well the correlations inherently present in the M/EEG inverse solutions.
The time series of the ~7000 source vertices were collapsed into time series of the cortical patches (Suppl. Fig. 1N) so that the collapsed inverse solution, XF,P = [xF,P,l], l = 1…np, is given by XF,P = Π(XF, P), where Π is a collapse operator, P is the parcellation, P = [pl], and np is the number of patches, pl, in P. In the Morlet-wavelet analyses, we defined Π so that the patch time series was obtained from vertex time series simply by averaging over the time series of the individual vertices of within that patch. In the FIR-filter-based analyses, Π was defined so that the amplitudes of the individual vertex-time series were averaged as above but the phases of a subset of vertices were shifted by π. This was done because in a source model with fixed dipole orientations, the time series for a single dipole in one wall will be 180 degree out-of-phase with the time series of the dipole in the opposite wall. These to-be-shifted vertices were identified from the within-patch phase synchrony estimates as that minority of vertices that had a phase difference of ~π with the majority of vertices having a zero-lag phase difference. Vertices with an opposite polarity arise in cortical patches where the patch, for instance, contains both sides of the same sulcus. The phase shifting prevents vertices with opposite polarities from canceling each other when the vertex time series are averaged into the patch time series.
We used the collapsed inverse estimates, XF,P,r(t,f) of single trials r, r = 1… ns, for cortex-wide mapping of evoked responses and the phase locking of ongoing activity to the Sample stimuli (stimulus locking) as well as for the mapping of event-related amplitude dynamics (Suppl. Fig. 1O). Stimulus locking was quantified with the phase-locking value, PLV(t,f) (also known as the phase-locking factor), that was given for each patch a = 1…np by PLV = ns−1|Σr (xF,P,a / |xF,P,a|)|(Sinkkonen et al., 1995). The averaged event-related amplitude envelopes, A(t,f), were given by A = ns−1Σr (|xF,P,a|) (Tallon-Baudry et al., 1996; Palva et al., 2005) and the evoked responses, ER(t,f), simply by the averaging the real parts of the time series so that ER = ns−1Σr [Re(xF,P,a)],
Before the statistical group analyses (Figures 1,,22,,44,,55,,7,7, Suppl. Fig. 1P, 2–8), the individual subject’s PLV(t,f), A(t,f), and absolute valued evoked responses (ER) were baseline corrected by subtracting the mean of the pre-Sample-stimulus period from −450 ms to −150 ms (Morlet-wavelet data) or from −700 ms to −100 ms (FIR-filter data). The data were then pooled across subjects by applying a statistical test to obtain PLVS and AS for each time-frequency element of PLV and A, and separately for each VWM load, LM = 1, 2, 3, 4, 5, and 6. In the Average condition and for all VWM loads separately, the baseline-corrected PLV, A, and ER were tested against a null hypothesis of PLV = 0, A = 0, and ER = 0 by using the Wilcoxon-signed-rank or student’s t-test. These conditions thus address whether the studied values are significantly different from the mean baseline value. The correlations of the PLV, A¸ and ER values in each time, frequency, and cortical location with memory load (Load condition) were estimated by using the Spearman’s rank correlation test across the memory loads LM = 1, 2, 3, 4, 5, and 6 so that the values were first rank transformed within each subject and then used the in the statistical group analysis across subjects. Individual variability in response magnitudes thus did not influence the outcome of the analysis.
Post-hoc analyses were performed to a selection of brain regions showing significant modulation by the VWM load (Figures 3 and and66 and Supplementary Figure 9–10). The peak latencies and amplitudes were estimated from the peaks searched from the signal in two time windows (80–120 ms and 160–200 ms for the first and second peak, respectively). The mean ER magnitudes were estimated from the same time windows, whereas the mean oscillation amplitudes were estimated from the late retention period time window (700–1000 ms). The selection of time window did not affect the results qualitatively.
The correlations of the ER and oscillation amplitude with the individual memory capacity were identified so that we first obtained mean ER and A values for separately for each of the six memory loads from 1 to 6, fitted a third order polynomial to this mean-as-a-function-of-load series, and subtracted the linear trend. The “predicted capacity values” were then obtained by finding the first positive peak. The correlation between the predicted and real capacity values was estimated by the Spearman rank correlation test.
During the M/EEG recordings, the subjects performed a delayed-match-to-sample-VWM task (Suppl. Fig. 1A) (Luck and Vogel, 1997, Palva et al., 2010a, Palva et al., 2010b) wherein they memorized a 0.1 s Sample stimulus containing one to six colored squares. A Test stimulus appeared one second after the offset of the Sample and in 50 % of the trials, one square in the Test had a different color than that square in the Sample. The subjects indicated with a forced-choice left- or right-hand thumb twitch whether or not the Sample was identical to the Test. As reported for earlier (Luck and Vogel, 1997) and the present (Palva et al., 2010a) data, the behavioral accuracy decreases with increasing memory load (Suppl. Fig. 1B) while the reaction times become longer (Suppl. Fig. 1C). Thus, this VWM task yields quantitatively similar psychophysical data during repeated M/EEG recordings as obtained outside the neuroimaging context.
The aim of this study was to identify the cortical sources and dynamics of VWM-related neuronal activities and to address their functional significance in VWM by finding the components that are correlated with memory load and/or predict individual psychophysical performance. The processing of the Sample and Test stimuli was examined first by evaluating absolute-valued evoked responses (ERs) with two broad-band FIR band-pass filters (1–45 Hz and 0.01–45 Hz) across the six memory load conditions. The data were summarized into waveforms indicating the fraction of brain regions where the ERs were significantly (p < 0.001, FDR corrected) above the baseline level. Both Sample and Test stimuli were associated with ERs lasting roughly the first 400 ms from stimulus onset (Fig. 1A). Interestingly, the ERs obtained with the lower high-pass filter (0.01–45 Hz) remained above the baseline throughout the retention period and thus contained a slow baseline-shift-like component. We then estimated the effect of memory load to the evoked responses by using Spearman’s rank correlation test (p < 0.001, FDR corrected), which revealed that the ER components from 60 to 200 ms, but not those after 200 ms from stimulus onset, were strongly modulated by the number of objects in Sample stimuli (Fig. 1B). One should note that in the present experiment, the number of objects covaried with the overall complexity of the stimulus and hence the strengthening of ERs with increasing memory load may be at least partly attributed to a concurrent change in physical stimulus properties.
ERs are generally thought to be generated by additive evoked activity but may also be influenced by event-related amplitude changes of non-zero mean oscillations (Nikulin et al., 2007) (see also de Munck and Bijma, 2010) and phase resetting (Makeig et al., 2002) of ongoing activity. To consider the contributions of these mechanisms into the generation of the ERs in the present VWM task, we quantified the phase locking of ongoing activity to the Sample and Test stimulus onsets (hereafter called “Stimulus Locking” (SL)) and evaluated the peri-stimulus amplitude dynamics by using the same broad-band filters that were used for ERs (Palva et al., 2005). The dynamics of SL was similar to that of ERs but SL was significant in a greater fraction of cortical regions (Fig 1C). The amplitude of ongoing cortical broad-band activity, on the contrary, was only briefly strengthened above the baseline level at around 100 ms after the Sample stimulus onset and thereafter suppressed below the baseline level for the rest of trial. Like ERs, both SL and the early amplitude enhancement were robustly correlated with memory load (Fig 1D).
We next identified the cortical regions involved in the processing of Sample stimuli. The first significant ER components (p < 0.001, FDR corrected) were found at around 70 ms. As the broad-band filter provided a good temporal resolution, we identified the ER sources in 10–20 ms steps (Suppl. Figs. 2, 3). The evoked activity spread rapidly from V1/V2 to higher-level extrastriate, occipito-temporal and parietal regions. One should note here, however, that the inverse modeling method used here (MNE) provides a distributed source estimate that is likely to be more widespread than the true cortical source. Hence the source reconstructions contain cross-talk between neighboring cortical patches in spatial scales of a few centimeters (Hauk et al., 2011). The P1 peak at 120-ms was observed in distributed visual areas in the occipital and inferior temporal cortices including the lateral occipito-temporal cortex (LOT), inferotemporal gyrus (ITG), and the intraparietal sulcus (IPS), as well as in the insular and cingulate cortices (Fig. 2A). The N1 peak at 190-ms was observed also in distributed regions in the posterior parietal cortex (PPC) as well as in central and superior temporal sulci. This constellation of occipital-, temporal-, and parietal regions was sustained for roughly 50 ms after which it gradually degraded. Broad-band SL was strongest in the same network of brain regions as the evoked response, but with a more smeared temporal evolution. This was expected because the phase estimates of ongoing activity, by definition, incorporate information across a time window that is dependent on the characteristic time scales of the activity itself. This may, in part, also explain the greater sensitivity of SL compared to ERs. It is interesting to also note that SL revealed early stimulus processing in several frontal cortical regions including the frontal eye fields (FEF) and lateral prefrontal cortex (LPFC) as well as in anterior insula (Suppl. Figs. 2, 3). In contrast, the brief broad-band amplitude enhancement was limited to striate cortex and extrastriate regions surrounding the occipital pole (Suppl. Figs. 2, 3).
VWM load strengthened ERs, amplitudes, and SL mostly in the same regions that were identified in the average condition (Fig 2B, Suppl. Figs 4, 5). While the memory load effect at P1 peak was limited to occipital, occipito-temporal and posterior parietal cortices, the VWM load strengthened N1 peak additionally in superior temporal, superior and orbito-frontal, insular, and cingulate cortices. The correlation of amplitude with memory load was robust throughout the posterior cortical networks and well co-localized with the spatio-temporal evolution of ERs (Suppl. Figs 4, 5). The SL, in contrast, was strongly correlated with memory load also in several more anterior structures including the anterior cingulate and insula as well as paracentral and superior precentral sulci (supplementary motor area (SMA) and FEF, respectively). These results hence suggest that in these data in posterior brain regions, ERs could have been generated by additive stimulus-evoked activity but in the anterior brain regions phase resetting of ongoing oscillations contributed to the generation of ERs.
Finally, we also localized the sustained slow evoked component that was salient in the ER obtained with 0.01–45 Hz filter (see Fig. 1A). After the termination of the transient ER and in the time window from 600 to 1100 ms, the slow component was strongest in the lateral prefrontal (LPFC) and cingulate cortices as well as in para-, pre-, and post-central areas (Fig. 2C). More posteriorly, the slow component was also observed in PPC and lingual gyrus.
Moreover, while the slow retention period ER component was largely co-localized with frontal load-dependent oscillation amplitude modulations (see below), it was uncorrelated with memory load, which suggests that this component might not be related to the amplitude changes of α or faster oscillations through the non-zero mean mechanism. The insensitivity to VWM load of the slow ER component suggests that it could reflect arousal- or alertness-related fluctuations (Schroeder and Lakatos, 2009, Monto et al., 2008).
To shed light on the functional significance of neuronal activity reflected in the ERs, we characterized the early components in more detail with post hoc analyses separately for each memory load condition in those cortical regions wherein the ERs were significantly correlated with memory load (see Fig. 2B). We first averaged the absolute-valued-ER time-series across this selection (Fig. 3A) and then used peak detection to estimate the mean latencies and amplitudes of the P1 and N1 peaks (at around 120 ms and 190 ms, respectively) across the subject population (Fig. 3B). The peak latencies were relatively stable across the memory loads and while N1 appeared to be earlier for higher VWM loads, this tendency was not significant. The peak amplitudes were enhanced with VWM load (Fig. 3C). We then focused the analysis into two anatomically defined regions of interest: one roughly containing V1 and V2 (calcarine fissure, cuneus, occipital pole, lingual gyrus) and another comprising the remaining extrastiate visual regions in lateral and inferior occipital (LO / IO) cortices (Fig. 2B and Fig. 3D). The strength of P1 was larger in the V1/V2 than in the LO/IO selection (Fig. 3D) but in both regions of interest, the strength of P1 and N1 were increased with the VWM load (Fig. 3E). Interestingly, while the amplitude of the P1 was essentially linearly dependent on the number of objects in the Sample stimulus, the amplitude of N1 increased from one to three objects and plateaued thereafter. This plateau onset is near the mean behavioral memory capacity in these data (4.1 ± 0.2, mean ± SEM; Palva et al., 2010a) and we hence investigated whether this plateau onset in individual subjects’ ERs would be correlated with their VWM capacity. We estimated the plateau onset by detecting the first positive peak in the difference of third and second order polynomials that were least-squares fitted to the series of ER peak amplitudes for the six memory loads (see Methods). The correlation of this ER-predicted capacity value with the real behavioral capacity value was then evaluated with Spearman’s correlation test. Intriguingly, this analysis showed that strength of N1 was, indeed, positively correlated with memory capacity in V1/V2 but not in LO/IO (Fig. 3F). P1 was uncorrelated with behavioral capacity in both V1/V2 and LO/IO.
Because early ER components are well known to reflect such physical stimulus properties, the memory-load dependence of P1 in these data cannot be used to infer a role for in VWM encoding. However, the correlation of N1 with individual behavioral memory capacity indicates that ERs in this latency range already reflect VWM-related processing and are not fully determined by the physical properties of the stimuli.
The principal aim of this study was to identify the cortical regions where non-stimulus locked ongoing oscillations are correlated with the VWM retention, memory load, or with behavioral performance. ERs, reflecting phasic stimulus-locked brain activity, lasted until ~400 ms from the Sample-stimulus onset and were VWM-capacity sensitive already at around ~200 ms suggesting that memory encoding could be completed before 400 ms. We thus consider the VWM-retention period in these data to be from 400 to 1100 ms from the Sample-stimulus onset. The majority of studies on the role of oscillations in VWM have used retention periods of up to three seconds. It has, however, been suggested that the mechanism of memory maintenance may change after first second of memory retention (Todd and Marois, 2004). The present study explores the neuronal substrates of this one-second memory retention with an experimental paradigm that has been psychophysically well validated (Luck and Vogel, 1997, Todd and Marois, 2004, Vogel and Machizawa, 2004, Vogel et al., 2005). To grasp the temporal and spectral evolution of event-related ongoing activity, we first computed time-frequency (TF) representations of SL and oscillation amplitudes by using a bank of 36 Morlet wavelets from 3 to 90 Hz. Similarly to the analyses of fMRI data (Kriegeskorte et al., 2009), TF-analyses of MEG data are susceptible to statistical inflation arising from circular data inspection and ROI selection. To have a non-circular discovery strategy, we first pooled the statistical TF data across all brain regions by plotting for each TF element the fraction of statistically significant patches (or M/EEG sensors). To control for multiple comparisons, we removed the number of significant TF elements predicted to be false discoveries at the chosen alpha level, A. Because VWM is known to involve a widely distributed cortical network, this approach was expected to reveal those TF regions where large-scale cortical activity was modulated by VWM processing.
We first evaluated the changes from baseline level with Wilcoxon-signed rank test for data averaged across the six memory loads. TFR for SL showed that frequencies up to 40 Hz were phase-locked to the stimulus onset for several hundreds of milliseconds from stimulus onset without a clear oscillatory component (A = 0.001, FDR corrected, Fig 4A). The TFR of oscillation amplitudes revealed an amplitude increase in the θ/α band during the first 400 ms after stimulus onset, which coincided with the period of strong stimulus-locked activity (A = 0.01, FDR corrected, Fig. 4B). After this transient, the oscillation amplitudes were suppressed below the baseline level throughout the analyzed frequency range from θ/α- to γ-frequency bands. This suppression was also evident in broad-band filtered data (see 1E). We also confirmed that the recording method would not affect the observed pattern of SL or oscillation amplitudes by computing the TFRs separately for MEG gradiometers, magnetometers, and EEG electrodes. All three recording methods converged with source modeling on a similar result for both SL and amplitude dynamics. The fraction of significant sensors was greater than the fraction of significant source patches suggesting that the source modeling was successful in disentangling cortical sources from their mixed sensor level signals (cf. Fig 4A–B, Suppl. Fig. 6).
To investigate the impact of memory load to the oscillatory dynamics, we identified the brain regions in which SL or amplitude dynamics were correlated with memory load by using the Spearman’s rank correlation test with the null hypothesis that the SL or amplitude values in the six individual memory-load conditions were uncorrelated among the memory loads from 1 to 6 objects. Prior to testing, the individual subjects’ data were rank transformed to ensure comparability across the population. The TFR of SL revealed an early positive correlation with memory load (A = 0.001, FDR corrected, Fig. 4C). The TFR of memory-load dependent amplitude changes showed an interesting pattern. Similarly to SL and ERs, the amplitudes of θ/α- and h-α-frequency bands were positively correlated with memory load during the first 200 ms after stimulus onset (A = 0.01, FDR corrected, Fig. 4D). The early positive effect thereafter changed into a negative correlation in the θ/α, h-α, and β bands, which indicates that the amplitude suppression was even deeper at higher memory loads. The negative correlation, however, was sustained through the VWM retention period only in the θ/α band. The amplitudes of h-α and β bands were positively correlated with memory load from 400 ms to the end of the VWM retention period even though they were was suppressed below the baseline level in the Average-condition.
On the basis of the TF-mapping, we designed a set of broad-band FIR filters to capture the θ/α-, h-α-, β-, γ-, and h-γ-frequency bands with a temporal resolution greater than that provided by the wavelets and we also used a patch collection with a spatial resolution higher than was used in the wavelet-based analysis (see Methods). The amplitude time series obtained with these filters reproduced very well the phenomena discovered with wavelets (A = 0.01, FDR corrected, Fig. 4E), which shows that the results presented here are not qualitatively affected by the filtering or cortical parcellation approaches. However, the dynamics obtained with these broad-band filters revealed that also γ- and h-γ-band amplitudes were strengthened with increasing memory load.
We used the FIR filtered θ/α-, h-α, β-, γ- and h-γ-band data to identify the cortical regions underlying the amplitude modulations during VWM retention. In data averaged across all six memory loads, the amplitudes were suppressed in distributed visual regions in striate, extrastriate, temporal, and parietal cortices (A = 0.01, Fig. 5A). In the β and γ bands, the suppression was most pronounced in medial occipital and parietal cortices, but also observed in occipito-temporal, and temporal visual areas. In addition, the suppression was observed in the same cortical regions during the early (400–700 ms from Sample stimulus onset) and late (700–1000 ms from Sample stimulus onset) parts of the retention period (Suppl. Fig. 7).
Increasing memory load was associated with a further suppression of amplitudes in the θ/α- and h-α bands in the occipital-, occipito-temporal-, and parietal regions (Fig. 5B). The amplitudes in the h-α, β, γ, h-γ bands, and very weakly in the θ/α band, were strengthened in widespread fronto-parietal cortical regions including the LPFC, pre- and post-central regions, and cingulate, insular, and orbito-frontal cortices. Interestingly, the β- and γ-band amplitudes were, in addition, strengthened in several visual regions in the occipital and occipito-temporal, and temporal cortices that are known to play a central role in the formation of visual object representations (Konen and Kastner, 2008, Riesenhuber and Poggio, 2002, Grill-Spector and Malach, 2004). While the anterior positive correlations remained very similar at A = 0.001, the posterior positive correlations in β and γ bands did not exceed this higher alpha level. High-γ-band amplitudes were positively correlated with memory load in LPFC, posterior parietal, temporal, insular cortical regions. Importantly, the network of load-dependent h-γ oscillations was spatially stable throughout the retention period (see Suppl. Fig. 7).
The Morlet-wavelet-analysis approach revealed essentially the same fronto-parietal pattern indicating that the parcellation scheme did not affect the primary results (A = 0.01, FDR corrected, Suppl. Fig. 8). However, the memory-load dependent strengthening of β- and γ-band amplitudes in visual regions did not exceed the significance threshold, which is likely to be caused by a greater spectral variability in these bands across the subjects.
The results have so far clearly indicated that the amplitudes of h-α-, β-, γ-, and h-γ-band oscillations were positively correlated with memory load in a fronto-parietal network (see Fig. 5B). To further characterize this phenomenon with post-hoc analyses, we estimated the amplitudes in all frequency bands separately for each memory load and individual subject as an average across the cortical regions that were positively correlated with memory load (see Fig. 5B). These data corroborated that the amplitudes were indeed memory-load dependent during the memory retention period as a function of memory load (Fig. 6A). Evaluation of mean amplitudes averaged over the later-half of the retention period (700–1000 ms) showed that oscillation amplitudes were systematically increased with memory load, but with a biphasic shape and a plateau at 3–4 objects (Fig. 6B). The analysis for the Morlet-wavelet filtered data showed similarly that both the strengths and fraction of significant cortical regions were systematically modulated by the increasing memory load (Suppl. Fig. 9A–D).
To address the link between the retention period amplitude modulations and individual behavioral VWM capacity, we used the same approach as above for ERs (Fig. 3). In the bilateral fronto-parietal regions, the plateau onsets of the h-α-band amplitudes were negatively correlated with memory capacity (Fig. 6C) implying that in high-capacity subjects, the load-dependent alpha strengthening leveled off at lower memory loads than in low-capacity subjects. On the other hand, despite a similar negative trend, neither β- nor γ-band amplitudes were significantly correlated with the memory capacity. Interestingly, although the h-γ-band amplitudes were only weakly correlated with the VWM load, the plateau onsets in h-γ band were clearly positively correlated with the individual VWM capacity.
We also investigated the correlation of the load-dependent amplitude suppression with memory capacity. We first estimated the relative amplitudes from those cortical regions that were negatively correlated with the memory load (Supplementary Figure 10A). The amplitudes were then averaged over the retention period (Supplementary Figure 10B). Lastly, we estimated the correlation between the predicted and real capacity values (Supplementary Figure 10C). The amplitude suppression was not, however, significantly correlated with the individual memory capacity in any of the studied frequency bands.
Finally, to corroborate the group analysis approach, we asked whether low- and high-capacity subject populations were associated with qualitatively and phenomenologically similar retention-period oscillation dynamics. The subject group was divided into two halves by their individual memory capacity and cortical regions where oscillation amplitudes were correlated memory load were identified with Spearman’s rank correlation test (p < 0.01, FDR corrected) as above (Fig. 7). The left-lateralized but bilaterally highly significant fronto-parietal h-α-, β- and γ-band oscillations were observed in both subject populations although they were more pronounced for the high capacity subjects. On the other hand, only the high-capacity group was associated with a significant memory-load correlation in the high-γ band. The apparent absence of memory-load dependent high-γ in the low-capacity group may, however, be associated the overall poor signal-to-noise-ratio in this frequency band. Interestingly, the amplitude suppression in the h-α-band was less pronounced for the high- than the low-capacity subjects.
The processing of Sample stimuli was associated with a rapid spread of activity during the first ~200 ms from occipital to temporal and parietal cortices. These regions are known to underlie the formation of visual object representations (Konen and Kastner, 2008, Riesenhuber and Poggio, 2002, Grill-Spector and Malach, 2004, Singer, 1999). The P1 component (~120 ms) was linearly strengthened by the VWM load, whereas the magnitude of the N1 component (~190 ms) plateaued with increasing memory load. This plateau onset was positively correlated with individual behavioral VWM capacity in the early visual regions. Interestingly, during the N1 component, activity in visual regions was accompanied with concurrent activation of frontal and parietal structures. It is thus likely that already the stage of stimulus processing and memory encoding reflected in the N1 component, fronto-parieto-sensory interaction and fronto-parietal top-down modulation of early visual regions give rise to individual behavioral memory capacity limits.
Evoked stimulus processing was followed by amplitude suppression in posterior, central, and frontal VWM-related cortical areas in frequency bands from 5 to 90 Hz. A similar wide-band amplitude suppression has been observed to follow sensory stimuli in MEG (Palva et al., 2005) and to characterize the VWM retention period in human EEG (Medendorp et al., 2007) and in direct cortical (Axmacher et al., 2007, Meltzer et al., 2008) recordings. In the present and prior VWM MEG/EEG data, increasing memory load strengthened the amplitudes of h-α-, (Haenschel et al., 2009, Leiberg et al., 2006, Busch and Herrmann, 2003, Grimault et al., 2009), β- (Leiberg, et al., 2006), and γ- (Osipova et al., 2006, Tallon-Baudry et al., 1998, Haenschel et al., 2009) frequency bands. In the present data, the amplitudes of h-α, β-,γ-, and h-γ-frequency bands were positively correlated with memory load in several frontal, parietal, and temporal regions as well as in the cingulate and insular cortices. Prior MEG studies suggest that the positive memory-load dependence in the α band originates from occipito-parietal brain regions (Medendorp et al., 2007, Osipova et al., 2006, Jensen et al., 2002), which is in contrast with the frontal sources found here. However, the central difference in these studies is the time window of the effect. In the present study, the retention period effects are observed in a time window of 0.5 to 1 s and in those cited above, in a time window from 1 to 3 s. In fact, these findings together constitute neurophysiological support for the notion that there is transition in VWM maintenance mechanisms at around 1 s (Todd and Marois, 2004).
These data thus show load-dependently strengthened oscillation amplitudes in several cortical regions which have earlier been identified in VWM studies using fMRI (Pessoa et al., 2002, Munk et al., 2002, Prabhakaran et al., 2000, Rowe et al., 2000, Todd and Marois, 2004, Linden et al., 2003). Although specific frontal (DLPFC and FEF) and parietal (S/IPG, IPS) regions are well recognized cortical substrates for WM function, also a number of other areas have been revealed by fMRI and here with M/EEG. While the functional roles of cingulate, orbital, and insular cortices in VWM have remained unclear, it is intriguing that their multi-band amplitude dynamics were in the present study similar to those observed in fronto-parietal regions. The cingulate, orbital, and insular structures were also strongly phase synchronized with the fronto-parietal networks in these frequency bands in our prior work (Palva et al., 2010a). These results together suggest that in addition to classical VWM regions, also these regions contribute to the VWM maintenance in sub-second time scales.
Three aspects of the present data yield insight into the functional roles of oscillatory activity in VWM: memory-load dependence, anatomical localization, and correlation with behavior. Frontal and parietal regions are thought to underlie attentional and central executive functions in VWM as indexed by fMRI (Prabhakaran et al., 2000, Rowe et al., 2000, Sakai et al., 2002, Petrides, 2005, Curtis and D'Esposito, 2003) and lesion (Voytek and Knight, 2010) studies. Hence, the fronto-parietal localization of the positive memory-load dependence of h-α, β, γ, and h-γ oscillations implies that they are associated with attentional functions of VWM retention. This finding thus supports earlier suggestions for a role for h-α- (Palva and Palva, 2007, Klimesch et al., 2007), β- (Gross et al., 2004, Buschman and Miller, 2007), γ-, and h-γ- (Fries, 2009) oscillations in attention. β- and γ-frequency band oscillations were strengthened also over several visual cortical regions including the infero- and occipito-temporal regions that are associated with visual object processing (Konen and Kastner, 2008, Riesenhuber and Poggio, 2002, Grill-Spector and Malach, 2004), which is in line with the idea that β- and γ-band oscillations support the maintenance of object representations in VWM (Palva et al., 2010a, Tallon-Baudry et al., 1998, Jokisch and Jensen, 2007). Thus, in terms of activity localized to sensory brain regions, the present data are in line with the currently prevailing idea of distinct functional roles for alpha vs. beta-gamma oscillations. Alpha oscillation amplitude dynamics are often interpreted along the lines of the inhibition hypothesis (Klimesch et al., 2007, Pfurtscheller, 2003), which posits that large alpha amplitudes in a given region reflect the disengagement of that region from task-relevant processing and that α-band amplitude suppression reflects a release from inhibition of that region. However, both the overall amplitude suppression and the positive memory-load dependence in h-α, β, γ, and h-γ bands was largely co-localized in the fronto-parietal network, which is difficult to reconcile with the inhibition hypothesis. It is thus possible that the phenomenology and functional roles of alpha and perhaps of also other oscillations depend on the level of cortical hierarchy (Bollimunta et al., 2008).
A number of fMRI studies report that activity in IPS and LOC may predict the behavioral VWM capacity (Todd and Marois, 2004, Robitaille et al., 2010, Xu and Chun, 2006). In MEG and EEG, the lateralization of ER components (Robitaille et al., 2010, Vogel and Machizawa, 2004), posterior α-band amplitude modulations (Sauseng et al., 2009), and the strength of inter-areal phase synchrony in α- and β-bands (Palva et al., 2010a) are also correlated with the individual VWM capacity. In the present study, we found that the plateau in the load-dependent magnitude of the N1 evoked response component in early visual regions was positively correlated with individual capacity. During VWM retention period, the memory-load dependent strengthening of h-α-band amplitudes in the fronto-parietal regions was negatively correlated with the individual capacity showing that h-α amplitude peaked or plateaued at progressively lower memory loads for higher-capacity subjects. This suggests that enhanced h-α oscillations are associated with attentional control when the number of to-be-memorized items exceeds the individual memory capacity. In contrast, the plateau in the load-dependent strengthening of h-γ oscillations was positively correlated with individual VWM capacity, which supports the concept that h-γ oscillations underlie integrated neuronal object representations (Singer, 1999) that are sustained in an active state during VWM maintenance. Accordingly, the memory-load dependence of h-γ oscillations was also more pronounced in high- than in low-capacity subjects. However, the high- and low-capacity subject groups were overall characterized by qualitatively similar patterns of amplitude modulations, which indicates that these data do not warrant strongly dichotomous interpretations on subject classification. Overall, a growing body of evidence demonstrates that the neuronal mechanisms underlying the limited human VWM capacity may be more heterogeneous and complex than previously conceived and that these mechanisms include both encoding- and retention-related neuronal activities.
We have recently shown that inter-areal 1:1-phase synchrony in α, β, and γ bands is load-dependently strengthened in a fronto-parietal network (Palva et al., 2010a) wherein also the oscillation amplitudes were load-dependently strengthened. This raised the question of whether the synchronization estimates were biased by an improved signal-to-noise ratio associated with the strengthening of amplitudes. The concurrent changes in phase synchrony were, however, much stronger than those predicted by SNR changes (Palva et al., 2010a), suggesting that the positive correlation of fronto-parietal oscillation amplitudes with memory load could be a reflection of the co-localized strengthening of short-range phase synchronization. In line with this notion, the network topologies in the h-α and β bands were more clustered and small-world-like than those in theta or gamma frequency bands (Palva et al., 2010b). Hence, phase synchrony in dense core-like structures is likely to be positively correlated with local amplitude changes while in the topologically less clustered brain regions, phase synchrony and local amplitude dynamics can be more independent. In line with previous studies (Freunberger et al., 2008, Freunberger et al., 2009, Doesburg et al., 2009), inter-areal synchronization was enhanced above baseline levels throughout the memory retention period (Palva et al., 2010a) whereas the oscillation amplitudes were suppressed below the baseline. This important phenomenological dichotomy between oscillation amplitudes and synchrony shows that local amplitude modulations and long-range inter-areal phase synchronization are dissociable phenomena that conceivably reflect local processing vs. large-scale interactions.
This study was supported by the Academy of Finland, by the Helsinki University Research Funds and NCRR Center for Functional Neuroimaging Technologies P41 RR14075 and by NIH grants 1R01EB009048 and 1R01 EB006385