In this study, we focus our investigation on task-specific cognitive modulation of early cortical auditory processing in human cerebral cortex. During the experiments, we acquired whole-head magnetoencephalography data while participants were performing an auditory delayed-match-to-sample (DMS) task and associated control tasks. Using a spatial filtering beamformer technique to simultaneously estimate multiple source activities inside the human brain, we observed a significant DMS-specific suppression of the auditory evoked response to the second stimulus in a sound pair, with the center of the effect being located in the vicinity of the left auditory cortex. For the right auditory cortex, a non-invariant suppression effect was observed in both DMS and control tasks. Furthermore, analysis of coherence revealed a beta band (12~20Hz) DMS-specific enhanced functional interaction between the sources in left auditory cortex and those in left inferior frontal gyrus, which has been shown to be involved in short-term memory processing during the delay period of DMS task. Our findings support the view that early evoked cortical responses to incoming acoustic stimuli can be modulated by task-specific cognitive functions by means of frontal–temporal functional interactions.
Modulation of auditory cortical responses evoked by acoustic stimuli has been widely observed in both animal and human research. Studies using anesthetized or awake animals have shown modulation effects induced by acoustic context (Condon and Weinberger, 1991; Ulanovsky et al., 2003; Bartlett and Wang, 2005), attention (Fritz et al., 2003), behavioral state (Gottlieb et al., 1989; Fritz et al., 2005), and self-initiated vocalization (Eliades and Wang, 2003). A broad spectrum of excitatory and/or inhibitory modulation effects have been observed in studies with different focuses and different experimental manipulations. In humans, such a broad spectrum of modulation effects has also been reported to occur in a number of evoked cortical responses, including the M100/N1 response in magnetoencephalographic (MEG) and electroencephalographic (EEG) studies (Hillyard et al., 1973; Stanny and Elfner, 1980; Näätänen, 1990; Woldorff et al., 1993; Jääskeläinen et al., 2004; Ahveninen et al., 2006; Sabri et al., 2006; Manuel et al., 2010).
As one of the early MEG/EEG evoked cortical responses with a latency of around 100ms after stimulus onset, the M100/N1 is believed to be correlated with the detection of changes in the acoustic environment (Näätänen and Picton, 1987; Hari, 1990). Modulation of this transient response has shown both enhancement and suppression effects in previous experiments. Passive listening (PSL) tasks showed adapted M100/N1 response to repetitively presented stimuli (May et al., 1999; May and Tiitinen, 2004). In behavioral paradigms requiring active manipulation of attention to a task-related auditory domain, such as dichotic listening (Hillyard et al., 1973; Woldorff et al., 1993; Brancucci et al., 2004) and selective attention tasks (Fujiwara et al., 1998), enhancement of the M100/N1 response to the attended and relative suppression of the responses to the unattended stimuli/features (Sabri et al., 2006) have been observed. Self-initiated tones (Schafer and Marcus, 1973; Martikainen et al., 2005) or speech sounds (Houde et al., 2002) have displayed exclusively suppressive effects. Active task performance paradigms requiring memory processing, such as discrimination (Melara et al., 2005) and working memory tasks (Lu et al., 1992; May and Tiitinen, 2004; Luo et al., 2005), have shown a mixture of modulation effects – increases, decreases, or both have been observed. Hypotheses concerning the mechanistic interpretation of these findings include forward masking (Wehr and Zador, 2005), “repetitive suppression” (Näätänen et al., 2001; Ulanovsky et al., 2003), and feedback modulation from downstream neural populations (Miller and Cohen, 2001; Friston, 2005). 
Forward masking and repetitive suppression hypotheses emphasize the intrinsic automatic adaptation to repeated stimulus presentations (for a review, see Grill-Spector et al., 2006 and the comment in Baldeweg, 2006) and insensitivity to different cognitive and behavioral conditions, whereas feedback modulation can arise from functional interactions between multiple regions involved in specific cognitive functions.
In this study we use MEG and analysis of the current sources inside the brain to investigate the modulation of evoked responses in human auditory cortex during performance of a delayed-match-to-sample (DMS) task (see Abbreviation Table in Appendix for a list of all major abbreviations we use). The analysis relied on comparison with two control tasks: PSL and simple counting (CNT). Performing the DMS task involves formation, maintenance, and manipulation of the short-term memory (STM) of the first sound (S1) in a pair of acoustic stimuli during a silent delay period (Gottlieb et al., 1989; Zatorre and Samson, 1991; Lu et al., 1992; Pasternak and Greenlee, 2005), as well as decision-making and motor responses based on the comparison to the perceived second stimulus (S2; Postle et al., 1999). By contrast, the PSL task does not require the active maintenance of the STM trace, although participants still need to pay attention and listen to the sounds; the CNT task requires participants to maintain the numeric memory of the presence of the sounds, but not the memory of their acoustic features, which is required during performance of the DMS task. We hypothesized that a task-specific modulation of the auditory evoked responses (AER) to S2, possibly related to maintenance/retrieval of the STM of S1 and anticipation of the upcoming S2, would be observed in the DMS task.
In addition, it has been suggested that during cognitive task performance, anterior–posterior oscillations in a broad spectrum of frequency bands are involved in memory processing (Klimesch, 1999; Lutzenberger et al., 2002; Palva and Palva, 2007). By measuring coherence between the cortical current sources in frequency bands from delta to gamma, we investigate the DMS-specific functional interactions between cortical regions to explore the involvement of these top-down neural mechanisms in the DMS-specific modulation of human auditory cortex.
Healthy right-handed adults (n=12; age, 23–35 years; six females) with normal or corrected-to-normal vision and normal hearing participated in the experiments. For each participant, MEG and structural MRI data were acquired in separate scans. Informed consent was obtained from each participant before each scan. The consent forms were approved by the NIDCD-NINDS IRB (protocol NIH 92-DC-0178) and University of Maryland, College Park IRB (IRB#01566).
Each MEG scan had nine recording sessions. Six of them were task sessions with three types of task conditions: PSL, counting (CNT), and a delayed-match-to-sample (DMS) task. Each task had two sessions with two different types of stimuli for each. The stimuli (Figure 1A) were pure tones (Tone) and tonal contours (TC). Each stimulus is an acoustic sound with a duration of 350ms. Each Tone has one frequency component. Each TC consists of two 125-ms up or down frequency modulated (FM) sweeps separated by a 100-ms tone. We kept the tasks in the order of PSL→CNT→DMS to avoid a potential CNT or DMS task performance influence on PSL. The order of Tone or TC sessions with the same task type was randomly assigned and counter-balanced among participants. Each recording session had 100 trials. Each trial (Figure 1B) was 3.7s in duration, started with a 500-ms silent period (baseline), followed by a pair of stimuli (S1 and S2, respectively) with a 1-s silent period (delay) between S1 and S2, and a 1.5-s inter-trial interval (ITI) after offset of S2. The ITI period also served as the response time (RT) in the DMS task. Within each recording session, match (exactly identical S1 and S2) and non-match (different S1 and S2) trials were randomly mixed and counter-balanced. The sound stimuli were presented to a participant at a fixed level between 65 and 75dBA, which was determined by testing the participant before the MEG scan to make sure the participant could hear the sounds clearly and comfortably. Each session began with a visual instruction presented on a screen that informed the participant about the task condition, response requirement, and type of stimuli. The instruction also informed the participants of our requirement of fixating on a cross mark at the center of the screen during each trial.
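The trial timeline described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code; the names `TRIAL_SEGMENTS_MS` and `segment_bounds` are hypothetical, and durations are kept as integer milliseconds to avoid floating-point rounding.

```python
# Trial timeline taken from the text; durations in integer milliseconds.
TRIAL_SEGMENTS_MS = [
    ("baseline", 500),   # silent period before S1
    ("S1", 350),         # first stimulus
    ("delay", 1000),     # silent delay between S1 and S2
    ("S2", 350),         # second stimulus
    ("ITI", 1500),       # inter-trial interval / response period
]


def segment_bounds(segments):
    """Return {name: (onset_ms, offset_ms)} measured from trial start."""
    bounds = {}
    t = 0
    for name, duration in segments:
        bounds[name] = (t, t + duration)
        t += duration
    return bounds


timeline = segment_bounds(TRIAL_SEGMENTS_MS)
trial_duration_ms = timeline["ITI"][1]  # end of the last segment
```

The segment durations sum to the 3.7-s trial length stated in the text, and S2 onset falls 1850ms after trial start.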
In the PSL sessions, participants were instructed to relax, stay still, and listen to the sounds without any response; in the CNT sessions, participants were instructed to count the number of sounds and report how many they had heard at the end of each corresponding session; in the DMS sessions, participants were instructed to compare the two sounds in each trial, and press the left button with the left thumb for a match and press the right button with the right thumb for a non-match. The button box was held in both hands in all sessions. Therefore, each experimental condition is a combination of task type (PSL, CNT, or DMS), sound type (Tone or TC), and S1/S2 matching type (match or non-match). In addition to the task sessions, each participant had two DMS training sessions and one click-counting session. The DMS training sessions were before the DMS tasks; each had 40 trials with either Tones or TC to familiarize the participant with the task. In the click-counting session, which was used to determine the peak latency and time window of the M100 response, we played 50ms 1kHz clicks and instructed the participant to count the number of the sounds.
Participants lay in a supine position during the MEG scans. MEG signals were recorded with a CTF Omega2000 275-channel whole-head MEG System (CTF Systems, Inc., Coquitlam, Canada) placed in a magnetically shielded room (Vacuumschmelze, Germany) inside the MEG Laboratory of the National Institute of Mental Health (Bethesda, MD, USA). The ongoing MEG signals were sampled at 600Hz, filtered with a 150-Hz low pass analog filter, balanced with third gradient coils for noise reduction, and then stored for off-line analysis. Temporal events, such as stimuli onsets and button presses (DMS sessions only) in each trial, were on-line marked. In a separate scan, we acquired the anatomical map of the same participant's brain with a T1-weighted protocol (MPRAGE; 24cm×24cm FOV; 128 axial slices; 1mm×1mm×1.2mm voxel size), using a 3-Tesla Signa MR scanner (General Electric, Waukesha, WI, USA). For the purpose of spatial alignment between the MEG sensors and the anatomical structures, three fiducial points (one nasion and two preauricular) were marked for each participant. On these points, head coils were fixed during the MEG scanning and Vitamin E capsules were attached during the MRI scanning to mark their locations. In addition, we localized the head coils at the beginning and the end of each MEG recording session to detect head motions. When head movements exceeded 0.5cm during a session, the whole session was discarded and the subject was rescanned.
With the stored raw MEG signal, we took four preprocessing steps to reduce noise and artifact contamination: (1) remove the DC offset based on the whole trial trend; (2) remove the power line noise plus harmonics with notch filters centered at 60, 120, 180, and 240Hz (fourth order paired band elimination filters with width=8Hz); (3) remove the low-frequency fluctuations with a high-pass filter (stop frequency=0.5Hz); and (4) remove artifacts (EKG, EOG, and motion related signals) using an automatic clustering method based on independent component analysis (ICA; Rong and Contreras-Vidal, 2006). MEG signals from three subjects (one male, two females) were removed from further analysis due to incomplete experiments or excessive artifact contamination. The noise-reduced and artifacts-cleaned datasets of the remaining nine subjects (four females) were then partitioned on a single-trial basis for further analysis. For each task trial, a 3.7-s epoch time-locked to the onset of S1 was extracted (Figure 1B). The epoch includes a 0.5-s baseline period at the beginning, followed by the first sound stimulus (S1, 0.35s), the delay period (1s), the second sound stimulus (S2, 0.35s), and the response period/ITI (1.5s). For each of the click-counting trials, the epoch was 1.05s time-locked to the stimulus onset with a 0.5-s baseline.
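The first three preprocessing steps can be illustrated with SciPy. This is a rough sketch under stated assumptions rather than the pipeline actually used: the Butterworth design, the zero-phase filtering, the treatment of 0.5Hz as a cutoff, and the function name `preprocess` are all assumptions, and the ICA-based artifact removal of step (4) is not reproduced.

```python
import numpy as np
from scipy import signal

FS = 600.0  # sampling rate in Hz, as stated in the text


def preprocess(data, fs=FS):
    """Filter MEG data of shape (n_channels, n_samples); returns a new array.

    Assumed design: Butterworth filters applied forward and backward
    (zero phase). The text does not specify the filter family.
    """
    # (1) Remove the DC offset, here as the per-channel mean of the record.
    out = data - data.mean(axis=-1, keepdims=True)

    # (2) Band-stop filters, 8 Hz wide, at the power-line frequency and its
    #     harmonics. A 2nd-order band-stop prototype yields a 4th-order filter.
    for f0 in (60.0, 120.0, 180.0, 240.0):
        sos = signal.butter(2, [f0 - 4.0, f0 + 4.0], btype="bandstop",
                            fs=fs, output="sos")
        out = signal.sosfiltfilt(sos, out, axis=-1)

    # (3) High-pass filter to remove slow fluctuations (0.5 Hz taken as cutoff).
    sos = signal.butter(2, 0.5, btype="highpass", fs=fs, output="sos")
    out = signal.sosfiltfilt(sos, out, axis=-1)
    return out
```

Second-order sections are used here because very low normalized cutoffs (0.5Hz at a 600-Hz sampling rate) can be numerically fragile in transfer-function form.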
In this study, we were particularly interested in task-related modulation of the M100. The M100 response is usually seen as a deflection in the epochs of the averaged field strength with its peak at ~100ms after sound stimulus onset (Figure A1A in Appendix: Sensor Space Analysis of the Modulation Effect). At the peak latency, it usually shows a bilateral dipole-like contour pattern of the magnetic field with a “source” and a “sink” located at fronto-temporal and parieto-temporal regions (Figure A1B in Appendix: Sensor Space Analysis of the Modulation Effect). We used data from the click-counting session, which is independent of the task sessions, to determine a subset of representative sensors for M100 analysis in each participant. By examining the averaged epochs from the click-counting session, 20 sensors (10 per hemisphere) surrounding the centers of the “sources” and “sinks” of the peak M100 contour were selected as the representative sensors for the participant (cf. Luo et al., 2005). Based upon the signal from these representative sensors, we calculated the root mean square (RMS) of the averaged magnetic field time course, identified the peak RMS value at ~100ms after stimulus onset, and defined that time point as the peak latency of the AER to each stimulus in each experimental condition. In addition, a 50-ms time window centered at the peak latency was defined as the window of AER. Therefore, we obtained one peak latency and one corresponding AER window for each stimulus under each experimental condition. Analysis of the sensor space data is presented in Section “Sensor Space Analysis of the Modulation Effect” in Appendix.
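The peak-latency and window definition above can be sketched as follows. The search range of 50 to 150ms around the nominal 100-ms latency and the name `aer_window` are assumptions made for illustration; the text only says the peak lies at about 100ms.

```python
import numpy as np


def aer_window(evoked, times, search=(0.05, 0.15), width=0.05):
    """Locate the RMS peak near 100 ms and build a window centered on it.

    evoked : (n_sensors, n_samples) trial-averaged field of the
             representative sensors.
    times  : (n_samples,) time in seconds relative to stimulus onset.
    search : assumed range (s) in which the M100 peak is looked for.
    width  : window width in seconds (50 ms in the text).
    """
    # Root mean square across sensors at every time point.
    rms = np.sqrt(np.mean(evoked ** 2, axis=0))
    # Restrict the peak search to the assumed latency range.
    candidates = np.flatnonzero((times >= search[0]) & (times <= search[1]))
    peak_idx = candidates[np.argmax(rms[candidates])]
    peak = times[peak_idx]
    return peak, (peak - width / 2, peak + width / 2)
```

Applied once per stimulus and experimental condition, this yields the one peak latency and one window per condition that the text describes.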
In addition to determination of peak latencies and AER windows in sensor space, we estimated the multiple source activities distributed across the brain using the all-sensor MEG epochs. The sources were imaged with an event-related beamformer algorithm based on the linearly constrained minimum variance (LCMV) method (Van Veen et al., 1997), for which the forward source–sensor relationship was modeled by a multiple local-sphere head model (Huang et al., 1999). Each model was a 20cm×20cm×17cm spatial grid composed of 5mm×5mm×5mm cubic voxels covering the participant's head. The integrated intracellular synaptic current of the neuronal population inside each voxel was estimated by a source dipole whose origin was located at the center of the cube. Each source dipole's activity was quantified by a measure of normalized power (“neural activity indices” – NAI). Using this imaging method, we took the following steps to quantify the AER and modulation effects in each source: (1) we computed a time course of NAI values for each source on a single-trial basis; (2) AER to S1 and S2 in each trial were quantified as integrated NAI in the AER window of the corresponding experimental condition, then normalized to baseline by subtracting the averaged NAI during the baseline period; (3) the modulation effect of each experimental condition was measured as a modulation index (MI) value calculated from the normalized AER values:
$$\mathrm{MI} = \frac{\mathrm{AER}_1 - \mathrm{AER}_2}{\mathrm{AER}_1 + \mathrm{AER}_2} \qquad (1)$$
where $\mathrm{AER}_1$ and $\mathrm{AER}_2$ represent the normalized quantification of AER to S1 and S2, respectively. The MI values range from −1 to 1, where positive values indicate a decreased evoked response to S2 as compared to the evoked response to S1, and negative values indicate the opposite effect. Hence, if the mean MI value from one condition is significantly greater than zero, a significant suppressive modulation effect is inferred.
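As a small worked example, the index can be computed as below. The sketch assumes the usual normalized-difference form, (AER1 − AER2)/(AER1 + AER2), which matches the stated range and sign convention; the function name `modulation_index` is hypothetical.

```python
import numpy as np


def modulation_index(aer1, aer2):
    """Normalized difference between the responses to S1 and S2.

    Assumed form: (AER1 - AER2) / (AER1 + AER2). Positive values mean the
    response to S2 is smaller than the response to S1 (suppression).
    The result stays within [-1, 1] when both inputs are non-negative.
    """
    aer1 = np.asarray(aer1, dtype=float)
    aer2 = np.asarray(aer2, dtype=float)
    return (aer1 - aer2) / (aer1 + aer2)
```

For instance, a response to S2 that is one third the size of the response to S1 gives an MI of 0.5, while equal responses give 0.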
With the quantified AER and MI values, we took two independent approaches to test the hypothesis that the modulation of the AER in the DMS tasks is significantly different from the effect in the control tasks. One approach applied within-participant analysis by using paired t-tests to compare the normalized AER to S1 and S2 for each experimental condition. The sources that showed a significant difference (FDR corrected p<0.05) were then taken as sources demonstrating within-participant significant modulation of the evoked responses for the corresponding condition. With the resulting probability images, the sources in bilateral temporal cortices with maximal absolute t values in the DMS tasks were selected as the representative sources for further statistical analysis. For each representative source, an MI value was computed using Eq. 1 for each experimental condition. With the MI values from all participants, we tested the hypothesis statistically by applying repeated measures ANOVA with three factors: task (PSL, CNT, DMS), sound type (Tone, TC), and trial type (match, non-match), which was followed by post hoc comparison between the mean MI values of single experimental conditions using the Tukey–Kramer method. We used SAS v9.1 (SAS Institute Inc., Cary, NC, USA) for statistical analyses of this approach.
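The within-participant step (a paired t-test per source followed by FDR correction) might look like the sketch below. The text does not say which FDR procedure was used; Benjamini-Hochberg is assumed here, and the function names `fdr_bh` and `sources_with_modulation` are hypothetical.

```python
import numpy as np
from scipy import stats


def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Returns a boolean mask marking the p-values declared significant
    at false discovery rate q.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with q * k / m.
    below = p[order] <= q * np.arange(1, m + 1) / m
    keep = np.zeros(m, dtype=bool)
    if below.any():
        k = np.flatnonzero(below).max()   # largest rank passing the test
        keep[order[: k + 1]] = True       # everything up to that rank
    return keep


def sources_with_modulation(aer1, aer2, q=0.05):
    """Paired t-test of AER to S1 vs. S2 for every source.

    aer1, aer2 : (n_trials, n_sources) baseline-normalized responses.
    Returns the t values and a mask of FDR-significant sources.
    """
    t_vals, p_vals = stats.ttest_rel(aer1, aer2, axis=0)
    return t_vals, fdr_bh(p_vals, q)
```

A positive t value for a significant source indicates a smaller response to S2 than to S1, so the sign of `t_vals` distinguishes suppression from enhancement.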
In addition to assessment of the modulation effects by selecting a single source to represent the auditory cortical cluster showing a significant difference, we employed another approach to visualize the spatial extent of the DMS-specific modulation effect by analysis of all sources. With this approach, we computed an MI image including all sources for each experimental condition, and used a two-way three-dimensional ANOVA (type 4 3dANOVA3) provided by AFNI (Analysis of Functional NeuroImages; Cox, 1996; NIMH, Bethesda, MD, USA; see also http://afni.nimh.nih.gov/) to analyze the group-level modulation effect across all sources. The variance analysis was performed with two factors: task (PSL and DMS) and sound type (Tone and TC). To correct for statistical comparison of multiple sources, Monte Carlo simulation with estimation of the between-source spatial correlation (Forman et al., 1995) was used to determine the criteria (the threshold cluster size and uncorrected probability value for each source within the cluster) of statistical significance (corrected p<0.05).
We restricted our analysis of interregional functional interactions to coherence between the representative sources and all other sources in the brain. For each participant, we selected the representative source that demonstrated the DMS-specific modulation effect as a reference, and computed the coherence values using the dynamic imaging of coherent sources (DICS) method (Gross et al., 2001). The coherence values were computed on a single-frequency basis over a broad frequency range from 2 to 50Hz, with a step size of 2Hz. We then averaged the coherence values in frequency bands of delta (2~4Hz), theta (4~8Hz), alpha (8~12Hz), beta (12~20Hz), high beta (20~30Hz), and gamma (30~50Hz). For each frequency band, the modulation related changes of the functional interactions were quantified as the ratio of coherence change (RCC) values, which were computed as normalized differences between the coherence values obtained from the late delay period (0.5~1s after offset of S1, which is a 500-ms window before onset of S2) and the coherence values obtained from the baseline period (the 500-ms window before onset of S1):
$$\mathrm{RCC} = \frac{\mathrm{Ldelay} - \mathrm{baseline}}{\mathrm{Ldelay} + \mathrm{baseline}} \qquad (2)$$
where Ldelay and baseline represent the coherence values during the late delay and baseline periods, respectively. The RCC value ranges from −1 to 1, where positive RCC values represent increased late delay period coherence as compared to the baseline period. We then used the two-way three-dimensional ANOVA method described in the previous section to analyze the RCC values to test our hypothesis that during the delay period, frontal brain regions related to cognitive functions recruited for performance of the DMS task would show increased functional interaction with the temporal sources that have shown DMS-specific modulation of the evoked responses. The factors included task (PSL and DMS) and sound type (Tone and TC). Monte Carlo simulation was also used to estimate the criteria of statistical significance for both ANOVA and contrast between experimental conditions. Only the clusters showing significant task or task×sound type effect, and significant difference in contrast between PSL and DMS conditions, were considered as clusters demonstrating DMS-specific functional interaction with the reference sources. Threshold statistics for each individual source are F1,8>14.64 for ANOVA and t>3.826 (df=8) for simple contrasts, corresponding to uncorrected p<0.005.
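The band averaging and the RCC measure described above can be sketched as follows, assuming the RCC is the normalized difference (Ldelay − baseline)/(Ldelay + baseline), consistent with the stated range and sign. Treating both band edges as inclusive is also an assumption (the text lists adjacent bands that share an edge frequency), and the names `band_average` and `rcc` are hypothetical.

```python
import numpy as np

FREQS = np.arange(2, 52, 2)  # 2 to 50 Hz in 2-Hz steps, as in the text
BANDS = {
    "delta": (2, 4),
    "theta": (4, 8),
    "alpha": (8, 12),
    "beta": (12, 20),
    "high_beta": (20, 30),
    "gamma": (30, 50),
}


def band_average(coh, freqs=FREQS, bands=BANDS):
    """Average coherence within each band.

    coh : array whose last axis runs over `freqs`.
    Band edges are treated as inclusive on both sides (an assumption).
    """
    averaged = {}
    for name, (lo, hi) in bands.items():
        sel = (freqs >= lo) & (freqs <= hi)
        averaged[name] = coh[..., sel].mean(axis=-1)
    return averaged


def rcc(late_delay, baseline):
    """Ratio of coherence change; positive when delay coherence is higher."""
    late_delay = np.asarray(late_delay, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    return (late_delay - baseline) / (late_delay + baseline)
```

Because coherence is non-negative, the ratio stays between −1 and 1, and a value of 0.5 corresponds to late-delay coherence three times the baseline coherence.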
In the counting task, all participants recalled the number of sounds they heard with counting error within ±2 in each session. In the DMS task, all participants showed accuracy above 84%. A significant sound type×trial type interaction was observed (two-way ANOVA, F1,8=12.9, p=0.007), which could be accounted for by the lower performance level on the TC non-match trials (TC_N, 91.1±0.95%, mean±SEM) than on the other three conditions (Tone_M: 99.8±0.95%; Tone_N: 98.7±0.95%; TC_M: 98.9±0.95%). RT in each trial was measured as the time elapsed from the onset of S2 to the button press in the DMS task. Analysis of variance revealed a significant sound type effect (Figure 2) on RT (one-way ANOVA, F1,8=6.1, p=0.039), where the RT for TC stimuli (812±36.4ms, mean±SEM) was significantly longer than the RT to Tones (754±36.3ms). No significant effect of trial type or sound type×trial type interaction was observed. Our observation of longer RT for TC is consistent with the results in an fMRI study using the same set of stimuli (Husain et al., 2004).
Figure 3A provides an example of the within-participant comparisons between the AERs to S1 and S2 under the three experimental conditions. The data are from the matched trials using TC stimuli for participant #4. Overlaid on a standard anatomical atlas (Talairach and Tournoux, 1988), the three probability maps highlight the clusters of the left hemisphere sources that demonstrate significant differences between the evoked responses to S1 and S2 in the PSL, CNT, and DMS tasks, respectively. The blobs with bright colors indicate the spatial locations of the clusters. In each task condition, the probability map displays multiple clusters of sources with significant difference between AER to S1 and S2: the cluster in the superior temporal region (where auditory cortex is located) shows up in all three tasks and contains more voxels for the DMS task than for the control tasks, indicating a more extensive suppressive modulation effect during performance of the DMS task than during the control tasks. In contrast, the anterior cluster, which also shows up in all three conditions, contains fewer sources for the DMS task than for the control tasks, indicating a weaker modulation effect for the frontal sources in the DMS task. Unlike the above two clusters, the posterior clusters appear only in the CNT and DMS tasks. In these clusters the sign of the modulation effect is reversed (a greater response to S2 than to S1), which suggests enhancement of the evoked responses to S2 rather than suppression for these current sources. Though most within-participant analyses display more than one left hemisphere cluster showing significantly different AERs to S1 and S2 among the experimental conditions, only the left temporal cluster showed consistent patterns of task-specific modulation effects. The number and spatial location of the voxels in this cluster differ among participants.
In addition to the within-participant analysis, group analysis of the left representative sources demonstrated a DMS-specific suppressed AER to S2, as displayed by the grand mean activity waveforms of the left representative sources averaged across all participants (Figure 3B). The locations of these representative sources (Talairach coordinates: [−52±9.3, −24±7.8, 8±4.7], mean±SD) are within the vicinity of the left primary auditory cortex (Heschl's gyrus) and adjacent planum temporale region (Hall et al., 2003), consistent with the distribution of the superior temporal sources for M100 responses that have been described in previous studies (Hari, 1990; Herdman et al., 2003). Variance analysis of the MI values from the representative sensors confirmed this finding. It demonstrated a significant task effect (one-way ANOVA, F2,16=9.64, p=0.0018). No other main factor or interaction effects were observed. For each experimental condition, the mean MI values for DMS_Tone (t=4.48, df=8, p=0.002) and DMS_TC (t=7.80, df=8, p<0.0001) demonstrated significant suppressive modulations of the AER to S2 as compared to the AER to S1, whereas none of the mean MI values from the control tasks was significantly different from zero (Figure 3C). Post hoc comparisons of MI values between experimental conditions revealed that the mean MI value of DMS_TC was significantly greater than both PSL_TC (p<0.01, Tukey–Kramer method) and CNT_TC (p<0.01), which indicates a greater suppression of the left auditory AER to S2 during performance of the DMS task with TC stimuli than during the control tasks. We did not observe any significant difference between the mean MI values with Tone stimuli. Furthermore, the significantly greater mean MI value for DMS_TC than for DMS_Tone (p<0.05, Tukey–Kramer method) suggests a greater suppression of the AER to TCs than to Tones during performance of the DMS task.
Examination of individual data showed consistent task-specific modulation patterns in the left auditory cortex – seven out of nine participants display greater MI values for the DMS_TC than for the PSL_TC condition (Figure 3D).
Modulation of AER in right auditory cortex showed different patterns from modulation effects displayed in the left auditory cortex. As an example, Figure 4A illustrates the cluster(s) of sources in the right hemisphere of participant #4 that showed a significant difference between evoked responses to S1 and S2. The data are from the tasks with TC stimuli. In contrast to the left hemisphere, the cluster in the right temporal region displays a similar modulation pattern across all three tasks for this participant. The locations of the right representative sources are roughly mirror-symmetric to the left representative sources (Talairach coordinates: [57±6.5, −24±6.1, 9±7.9], mean±SD), with the center coordinates falling in the vicinity of the right auditory cortex. While the spatial location of the representative sources in each hemisphere demonstrated a rough symmetry, the averaged activity waveforms from the right representative sources displayed a pattern different from what was seen on the left side: suppression of the AERs to S2 was observed in all three tasks, although for the Tone stimuli, the CNT and DMS tasks showed a reduced suppressive modulation effect (Figure 4B). Group analysis shows no significant difference in the mean MI values across all three tasks (Figure 4C; one-way ANOVA, F2,16=2.44, p=0.12). Individual MI values from the right representative sensors also displayed smaller differences in the MI values between the DMS_TC and PSL_TC conditions than what was demonstrated by the left representative auditory sources (Figure 4D).
Statistical analysis using the MI values across all sources revealed results in the left temporal region consistent with the modulation effects demonstrated by the analysis of the representative sources: a cluster of sources in left auditory cortex with significant suppression of the AER to S2 in the DMS tasks as compared to the PSL conditions (Figure 5A) was observed. This cluster extended from left superior temporal gyrus (STG) (BA22) to left insula (BA13). In addition to the left temporal cluster, two other clusters also displayed greater suppressive modulation effect during performance of the DMS task than during the PSL conditions: one was located in the left orbital frontal region (Figure 5B) and the other one in the premotor area of the right middle frontal cortex (Figure 5C). These additional clusters suggest involvement of corresponding regions in the network dynamics specifically correlated with performing the auditory DMS tasks.
Analysis of the modulation effect in cortical source activities demonstrated a DMS-specific suppressive modulation of the AER in response to S2 in the left auditory cortex. We asked whether there existed correlated task-specific functional interaction changes between the left auditory cortex and other brain regions. Among the frequency bands from delta to gamma that had been covered by analysis of RCC values, a single cluster of sources showed stronger functional interaction during the DMS task than the PSL task in the beta band (12~20Hz). The cluster had 176 voxels extending from BA 44 to BA 46 in the left inferior frontal gyrus (IFG; Figure 6A). Analysis of the RCC values demonstrated a significant task effect (FWE corrected p<0.05, with threshold cluster size of 21), and post hoc comparison (FWE corrected p<0.05, with threshold cluster size of 101) showed a significant difference between the PSL_TC and DMS_TC conditions (Figure 6B). Increased RCC values in the DMS tasks suggest enhanced functional interaction between the frontal cluster and the left temporal cortical sources during the late delay period of the DMS task, as compared to the PSL conditions. Examination of the coherence values in each frequency showed greater late delay vs. baseline differences in the beta band for the DMS tasks with TC stimuli, in which the greatest difference was found at 18Hz (Figure 6C). Furthermore, delay period activity of the sources in the frontal cluster showed a greater magnitude in DMS tasks than in the PSL conditions (Figure 6D), indicating its involvement in DMS-specific memory processing.
The current experiment investigated the task-specific modulation of human auditory cortex during performance of an auditory DMS task, which specifically emphasized the maintenance of STM during the delay period and decision-making/motor response based on comparison between the STM trace and perception of the acoustic stimuli (Posner, 1967). In comparison to the control tasks, the observed DMS-specific modulation effect involved a suppression of the AER with latency around 100ms. The auditory current sources showing this effect were lateralized to the left hemisphere. The cluster of the significant sources covered the region extending from primary to association auditory cortices (Figures 3A and 5A) with the center sources located in the STG. Furthermore, this effect was greater in the DMS task for sounds with multiple frequency components (TC) than for sounds with only one frequency component (pure tones), indicating a close relationship between STM load and the observed modulation effect. Along with the observed modulation effect, enhanced functional interactions between left auditory reference sources and sources in left inferior frontal regions were observed during the late delay period of the DMS task in the beta band (12~20Hz), suggesting involvement of a DMS-specific frontal–temporal interaction in the observed modulation effect.
These results provide experimental evidence in humans that supports the hypothesis of task-specific top-down modulation of auditory information processing during the DMS task (for a recent review, see Scheich et al., 2007). With measurements and analyses of the temporally sensitive MEG signals, our findings reveal two important aspects of this modulation effect: (a) left lateralization of the observed DMS-specific suppression of the transient early cortical AERs, and (b) a close relationship with STM processing, as revealed by significantly stronger modulation of the AER to TC stimuli than to Tones, and greater beta-band functional interaction between the left auditory cortical sources and the left IFG.
Measured by MEG/EEG, with peak latency around 100ms after stimulus onset, the M100/N1 response is believed to be involved in detection of changes in the acoustic environment, and can be influenced by both upstream and downstream auditory subcortical/cortical regions (Näätänen and Picton, 1987; Hari, 1990). Suppression of this response has been observed by passive listening to repetitively presented stimuli (Näätänen and Picton, 1987) and by active auditory perception during task performance (Hillyard et al., 1973; Woldorff et al., 1993; Luo et al., 2005; Martikainen et al., 2005). Recent studies have correlated the modulation effect with improved performance in healthy adults (SanMiguel et al., 2008; Lijffijt et al., 2009; Alain et al., 2010; Navarro Cebrian and Janata, 2010), and dampened or diminished modulation with behavioral deficits in schizophrenia patients (Heinks-Maldonado et al., 2007; Lukhanina et al., 2009; Dale et al., 2010).
To account for these observations, a broad spectrum of interpretations, from pre-attentive habituation (Tiitinen et al., 1994) to cognition-related top-down modulation (Fritz et al., 2007; Scheich et al., 2007), has been proposed. With supportive experimental results mainly obtained from mismatch negativity (MMN) studies (Näätänen, 1990), the habituation hypothesis postulates that stimulus-specific adaptation to repetitively presented sounds suppresses the evoked response to an upcoming stimulus, provided that the upcoming stimulus has similar salient features. This hypothesis suggests that hierarchical, gradual, and implicit procedures of memory establishment and a pre-attentive intrinsic adaptation mechanism underlie the observed suppressive effect. Consequently, this view indicates that the suppression should not differ between PSL and active task performance.
In contrast, active performance of cognitive tasks also displays suppression of the AER without reliance on repetitively presenting identical sounds/features. Examples include relative suppression of M100 responses to unattended stimuli (Hillyard et al., 1973; Woldorff et al., 1993; Poghosyan and Ioannides, 2008; Atiani et al., 2009), features (Ahveninen et al., 2006; Kaiser et al., 2009), and modality (Oatman, 1976; Alho et al., 1994; Eimer et al., 2004) in selective attention tasks, suppression of M100/N1 responses to self-initiated tones (Schafer and Marcus, 1973; Martikainen et al., 2005) or speech sounds (Houde et al., 2002), and suppression of the M100/N1 response to the second sound of the pair in behavioral paradigms employing the DMS task with a broad spectrum of sound stimuli, from simple sounds such as tones and TC (Lu et al., 1992) to complex speech sounds such as vowels and consonant-vowel syllables (Luo et al., 2005; Lijffijt et al., 2009). It is believed that the prediction of the afferent sensory signal by the top-down attentive, motor, or memory-related efference signals is involved in the observed inhibitory modulation effect (Blakemore et al., 1998; Heinks-Maldonado et al., 2006; Fritz et al., 2007). This evidence suggests that an explicit, active, and task-specific mechanism underlies the observed suppression effects: the cognitive task-specific neural processing selectively modulates the sensory-evoked responses.
In this study, we focused our research on the task-specific AER suppression during performance of an auditory DMS task and hypothesized that the DMS-related cognitive functions play active roles in the observed modulation effect. Comparison with control tasks such as PSL and counting revealed both task-specific and non-specific suppressive modulation effects on the early cortical AER. In the right auditory cortex, a similar suppressive modulation of the AER among the tasks agrees with the habituation hypothesis. In the left auditory cortex, by controlling the habituation effect with identical timelines for each trial (a sound pair separated by a 1-s silent delay period) and the attention effect by instructing subjects to listen to the sounds during both control and DMS conditions, we have demonstrated a suppressive AER modulation effect specifically correlated to performance of a DMS task that involved overt STM maintenance and manipulation. Furthermore, the relatively greater suppression effect in the DMS task than in the counting task not only strengthens the task-specificity of this effect, but also suggests that this effect is specifically related to the STM processing of the acoustic features of the sound stimuli, given that performing the counting task also required participants to hold a simpler format (numbering) of the STM trace of each sound stimulus (Nieder and Dehaene, 2009).
In addition to the task-specificity, we observed left lateralization and selectivity to TC stimuli of this modulation effect. Task-specific hemispheric asymmetry has also been shown in previous MEG studies using other task paradigms (Poeppel et al., 1996; Chait et al., 2004). Furthermore, a recent fMRI study demonstrated that BOLD activation related to working memory of FM tones was lateralized to the left auditory cortex (Brechmann et al., 2007), which overlapped with the location of the significant sources observed in our study. For interpretation of the lateralization phenomenon, both hemispheric functional specificity (Zatorre and Belin, 2001; Grimm et al., 2006) and temporal scale sensitivity (Poeppel et al., 2004) have been proposed. Because we did not design this experiment to investigate the functional asymmetry of auditory information processing between the two hemispheres, further investigation is needed to explore these hypotheses.
In our results, the DMS-specific suppression affected the AER to the second stimulus in a sound pair (Figure 3B), suggesting that the neural dynamics during the delay period and the first 100ms of S2 presentation are most likely behind this modulation effect. Therefore, we focused on the late delay period to investigate the DMS-specific functional interactions between the left auditory cortex sources and other sources in the brain. 3dANOVA analysis on the baseline-corrected coherence during the late delay period (the RCC values) revealed a cluster of sources in left inferior frontal cortex that showed significantly enhanced functional interaction with the left auditory reference sources during the late delay period (Figures 6A,B). The effect was observed in the beta band (12~20Hz) and peaked at 18Hz (Figure 6C). Moreover, the sources in the frontal cluster displayed greater delay period activity in the DMS tasks than during PSL (Figure 6D), indicating involvement of this region in memory processing during the DMS task performance.
Correlation of left inferior frontal activity and auditory memory processing has been shown in a wide variety of studies: positron emission tomography (PET; Jonides et al., 1998) and functional MRI (Husain et al., 2004, 2006) studies have shown increased left inferior frontal oxidative metabolism in DMS tasks. Recent event-related fMRI data found involvement of this region in all stages, from coding through maintenance to the response period (Strand et al., 2008). MEG studies found increased frontal activity during the delay period and the following response phase in DMS tasks (Luo et al., 2005; Grimault et al., 2009; Kaiser et al., 2009). Patients with brain disorders such as schizophrenia (Stevens et al., 1998; Menon et al., 2001) and dyslexia (Dufor et al., 2007) showed decreased left inferior frontal activity in working memory tasks. In addition to functions related to auditory memory processing, studies have also correlated this region with other cognitive functions including response selection (Binder et al., 2004), phonological processing in speech (Hickok and Poeppel, 2007), lexical processing in music (Peretz et al., 2009), and attention control (Ross et al., 2010).
The functional role(s) of beta-band interaction between left IFG and left auditory cortex remain poorly understood. Some may argue that our observation of the DMS-specific functional interaction in the beta band is due to bottom–up afferent input to a memory center in left IFG, not top-down modulation of left auditory cortex as we have hypothesized. However, data from patient studies support our proposal that left IFG can modulate the AER by means of frontal–temporal functional interactions. For example, frontal lobe patients show a correlated increase of AER magnitude with the degree of behavioral deficiency during performance of auditory DMS tasks (Knight et al., 1999), and schizophrenia patients display decreased beta-band frontal–temporal coherence along with deficient gating of the N1 response (Rosburg et al., 2009). Furthermore, recent visual-motor studies suggested that enhanced beta-band coherence between frontal and visual cortices plays a functional role in cognitive tasks that require top-down anticipation of upcoming sensory events (Buschman and Miller, 2007; for a review, see Engel and Fries, 2010). Though there is little direct evidence with healthy participants in previous studies, our results support the left IFG's role in task-specific top-down suppression of the AER.
The current study used an auditory DMS task to investigate the task-specificity of top-down modulation in human auditory cortex and the neural mechanisms underlying the observed modulation effects. Besides the demonstration of a DMS-specific suppressive modulation of the early phase AER, we also observed increased functional interaction between the modulated auditory cortex and left IFG, in which the frontal sources showed increased activity in DMS tasks, indicating their involvement during STM processing. Our results suggest that a task-specific interactive network including both auditory and frontal cortical regions is necessary for successful performance of the auditory DMS task, where the frontal regions can exert influences on the early phase of auditory cortical processing. The latency of these influences could be as early as tens of milliseconds after stimulus onset. Therefore, the findings from this and previous studies lead us to propose that cortical responses to auditory stimuli are affected by task-specific networks in which processing of the relevant information can be enhanced and retained, and processing of the irrelevant information can be suppressed through task-specific frontal–temporal functional interactions.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the NIDCD Intramural Research Program. We thank Dr. Hung Thai-Van for help with data collection, Dr. Gang Chen for help with data analysis, and Drs Jane Clark, Todd Troyer, and Jonathan Simon for their thoughtful comments on the work.
|AER – auditory evoked response|
|ANOVA – analysis of variance|
|CNT – counting|
|DICS – dynamic imaging of coherent sources|
|DMS – delayed-match-to-sample|
|IFG – inferior frontal gyrus|
|LCMV – linearly constrained minimum variance|
|MI – modulation index|
|PSL – passive listening|
|RCC – ratio of coherence change|
|RMS – root mean square|
|TC – tonal contour|
|STM – short-term memory|
In sensor space, the measurements of the AERs were derived from the representative sensors for each participant. For each trial, we averaged the field strength across the representative sensors, and quantified the AERs as the root mean square (RMS) value of the peak field strength at a latency of ~100ms (M100) after stimulus onset. The AER values were then normalized by subtracting the averaged RMS of field strength during the baseline period. Therefore, for each trial, we obtained a normalized AER value for each stimulus. After quantification of the AERs to S1 and S2 for each experimental condition, we evaluated the modulation index (MI) values using Eq. 1. Statistical analysis of the MI values used a repeated measures ANOVA with three factors: task (PSL, CNT, DMS), sound type (Tone, TC), and trial type (match, non-match). After ANOVA, the mean MI values obtained from different experimental conditions were further compared by using the Tukey–Kramer method. SAS v9.1 (SAS Institute Inc., Cary, NC, USA) was employed for the statistical analyses.
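As an illustration only, the sensor-space quantification described above can be sketched as follows. This is one possible reading of the procedure, not the analysis code used in the study: the function name `normalized_aer`, the baseline interval, and the peak search window around 100ms are all hypothetical choices, and the MI of Eq. 1 is deliberately not computed here.

```python
import math

def normalized_aer(sensors, times, baseline=(-0.2, 0.0), peak_window=(0.08, 0.12)):
    """Return a baseline-normalized M100 amplitude for one trial.

    sensors: list of per-sensor field-strength series (one list per sensor).
    times: sample times in seconds relative to stimulus onset.
    baseline, peak_window: hypothetical intervals, not taken from the paper.
    """
    n = len(sensors)
    # RMS of the field strength across the representative sensors, per sample
    rms = [math.sqrt(sum(s[i] ** 2 for s in sensors) / n) for i in range(len(times))]
    # RMS samples in the pre-stimulus baseline (start inclusive, end exclusive)
    base = [r for r, t in zip(rms, times) if baseline[0] <= t < baseline[1]]
    # RMS samples inside the search window around 100 ms
    peak = [r for r, t in zip(rms, times) if peak_window[0] <= t <= peak_window[1]]
    if not base or not peak:
        raise ValueError("baseline or peak window contains no samples")
    # Peak RMS minus the mean baseline RMS
    return max(peak) - sum(base) / len(base)

# Toy trial with two sensors of opposite polarity (made-up numbers)
times = [-0.10, -0.05, 0.0, 0.05, 0.10, 0.15]
trial = [[1, -1, 1, 2, 5, 2], [-1, 1, -1, -2, -5, -2]]
print(normalized_aer(trial, times))  # 4.0
```

Taking the RMS across sensors before locating the peak keeps sensors of opposite field polarity from cancelling each other, which is why the sketch does not use a plain mean across sensors.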
In the left hemisphere, the grand mean RMS of the field strength obtained from the representative sensors across all participants showed a pattern of decreased M100 responses to S2 as compared to the responses to S1 in all experimental conditions except the non-match trials with Tone stimuli. The match trials in the DMS task with TC stimuli showed the greatest suppression (Figure A1C). Consistently, a mean MI value greater than zero (27.1±5.76%) was obtained in the match condition of the DMS task with TC stimuli but not in the other experimental conditions (Figure A1D). Furthermore, analysis of the MI values demonstrated significant sound type (one-way ANOVA, F(1,8)=12.18, p=0.008) and trial type (F(1,8)=7.74, p=0.02) main effects and a significant task×sound type×trial type interaction (three-way ANOVA, F(2,16)=8.93, p=0.002). However, no significant task effect was demonstrated (one-way ANOVA, F(2,16)=0.18, p=0.84), and post hoc comparison did not show any significant simple contrasts between experimental conditions.
In the right hemisphere, the averaged RMS waveforms showed decreased M100 responses to S2 in all conditions except the non-match condition of the DMS task with Tone stimuli (Figure A1E). A main effect of trial type was observed (one-way ANOVA, F(1,8)=10.88, p=0.011), where suppression of the M100 response to S2 for the match trials was greater than for the non-match trials. No main effect of task or sound type, nor any interaction effect, was revealed by the statistical analysis. No mean MI value was significantly different from zero in any experimental condition (Figure A1F).
To summarize the results in sensor space, a significant suppression of the M100 response to S2 as compared to the response to S1 was revealed by the left representative sensors in the match condition of the DMS task with TC stimuli. However, statistical analysis of the MI values showed no significant differences between tasks for either hemisphere. The lack of task-specific differences in the sensor-space MI values may be caused by the different task-related dynamics of the multiple cortical sources that contribute to the M100 response (Hari, 1990), which are located not only in the superior temporal plane, but also in other anterior and posterior regions. Thus, analysis of the MI values computed from the source activity in bilateral superior temporal cortices is necessary to assess the task-specificity of the modulation of AERs in a more spatially focused manner.