Responses to acoustic input were recorded from human temporal cortex using subdural electrodes in order to investigate in greater anatomical detail how attentional load modulates exogenous auditory responses. Four patient-volunteers performed a dichotic listening task in which they listened for rare frequency deviants in a series of tones presented to both ears at interstimulus intervals (ISIs) of 400, 800 and 2000 msec. Across all ISIs, stimuli presented contralateral to electrode location produced the strongest deflections in the averaged ERP at approximately 90 and 170 msec post-stimulus on average (labeled N90stg and P170stg). Maximal recording sites for these peaks most often occurred over the Sylvian fissure or the upper bank of the posterior superior temporal gyrus. Neither ISI nor selective attention exhibited substantial effects on peak latencies. However, as presentation rates increased (decreasing ISI), overall averaged event-related potential (ERP) amplitudes declined significantly, while attending to the contralateral stimulus significantly increased both the N90stg and P170stg peaks for most patients. This effect of attention increased with decreasing ISI for both components most clearly in the difference between the grand-average ERPs for attending to vs. ignoring the contralateral stimulus, and even more dramatically in the percentage ratio of that difference over the mean peak amplitude. This amplifying effect of attention with increasing load, along with its anatomical location, suggests that attention can enhance exogenous sources in auditory cortex.
It is well known that attention directed toward particular aspects of sensory input (i.e., selective attention) can modulate the baseline physiological response to that input. Human electrophysiological research has resulted in two competing models of the effects of auditory selective attention. In one view, Hillyard and colleagues have proposed an early enhancement of the exogenous auditory event-related potential (ERP) waveform deflection occurring approximately 70–120 msec post-stimulus, commonly labeled N1 (i.e., the “N1 effect”, Hillyard et al., 1973; Näätänen & Picton, 1987; Woods, 1990). Its primary auditory source is believed to reside along the supratemporal plane (STP) (Liégeois-Chauvel et al., 1994; Picton et al., 1999; Yvert et al., 2001; Godey et al., 2001), which, along with its early latency, tonotopicity, and sensitivity to the state of the listener, has led many to view the N1 as an obligatory (i.e., exogenous) sensory response (see Näätänen, 1992, for a review). Its attentional modulation would therefore indicate that selective attention can intervene early and directly in sensory processing.
Conversely, Näätänen and colleagues have proposed that the attentional modulation of late auditory ERP deflections arises primarily from an endogenous generator independent of that producing the N1 (Näätänen, 1992). In this view, selective attention adds a distinct negative component to the ERP labeled the “processing negativity” (PN), which is considered a physiological index of an endogenous perceptual comparison process. The broad temporal effect of the PN can not only add negativity to attended ERPs at N1 latencies, thus creating the N1 effect, but also to later positive deflections such as the P2, which would thus shift negatively in attend relative to ignore conditions (Näätänen, 1992; Teder et al., 1993).
A fast stimulus presentation rate was initially viewed as essential for creating the N1 effect because it was thought to increase information load to the point where attention must intervene early in sensory processing to achieve target set selection (Hillyard et al., 1973; Schwent et al., 1976; Hansen & Hillyard, 1984). For example, Woldorff and colleagues reported an early attentional modulation of the positive mid-latency deflection occurring around 20–50 msec post-stimulus (known alternately as P20–50 or P1), but only under conditions of high load created with very short ISIs (e.g., 200 msec) (Hackley et al., 1990; Woldorff & Hillyard, 1991; Woldorff et al., 1993). The generators of these enhanced ERP components have also been localized to the STP, further supporting the view that exogenous auditory responses can be directly modulated by attention (Woldorff et al., 1993; Woods et al., 1994; Giard et al., 2000). Reports of a more positive P2/P190, along with the more positive P20–50 and more negative N1, also suggest that attention generally enhances responses in sensory cortex beyond the contributions of putative endogenous components (Woldorff & Hillyard, 1991). This association of greater attentional load or cognitive effort with greater activation in sensory cortex has found growing support in both EEG (Alcaini et al., 1995; Sussman et al., 2003) and neuroimaging studies (O’Leary et al., 1997; Alho et al., 1999; Jäncke et al., 1999; Zatorre et al., 1999; Petkov et al., 2004; Shomstein & Yantis, 2004).
This model has been questioned by other researchers, however, who have countered that fast stimulus rates do not increase the exogenous N1 peak directly but rather the onset of an attention-related negativity (i.e., PN); in such cases, the early PN onset overlaps with the N1 to create an apparent N1 effect (Parasuraman, 1980; Näätänen et al., 1981; Teder et al., 1993). The attention-related negativity has since been subdivided into earlier and later subcomponents, as measured in the negative difference wave (Nd) formed by subtracting non-attended from attended ERPs (Näätänen et al., 1981; Hansen & Hillyard, 1984; Woldorff & Hillyard, 1991; Alcaini et al., 1995; Giard et al., 2000). There is evidence that earlier Nd components may reflect true enhancements of an exogenous, modality-specific component of the N1 in auditory cortex, while later peaks arise from endogenous sources, often of more frontal origin (Näätänen et al., 1981; Hansen & Hillyard, 1984; Woldorff & Hillyard, 1991; Woods et al., 1994; Alcaini et al., 1995; Giard et al., 2000). Unfortunately, the inherent temporal and/or spatial limitations of non-invasive electromagnetic methods make it difficult to completely isolate exogenous contributions to the scalp-recorded N1 effect from potentially distinct endogenous components such as the PN (Parasuraman, 1980; Woods et al., 1991; Teder et al., 1993; Alho et al., 1994).
Patients undergoing evaluation for the surgical treatment of medically intractable epilepsy provide a rare opportunity to measure the effects of selective attention directly from human cortex. As part of their treatment, intracranial recordings (electrocorticograms, ECoG) are often made to identify the locus of seizure activity and/or map cortical areas involved in speech and language. Intracranial recordings are not susceptible to the temporal and spatial filtering of electrical sources by the skull and scalp that is seen in EEG recordings (Srinivasan, 1999). They further limit the superimposition of distant electrical sources and therefore enhance the measurement of local generators. In the present work, electrodes were implanted for clinical purposes in patient-volunteers over the lateral temporal cortex and perisylvian areas. The increased resolution of ECoG recordings from this region should help to isolate the contribution of potentially modality-specific contributors to the scalp-recorded N1 over frontal or deeper subcomponents. The goal of the current work was to investigate intracranially the effects of selective auditory attention and attentional load on human perisylvian cortex as produced in the classic dichotic auditory oddball paradigm.
Table 1 presents overall mean percent correct and d-prime values in the deviant detection task averaged across both attention conditions, and from all sessions for all four patients. Previous EEG studies measuring deviant detection performance at varying ISIs reported strong trends of improved detection at faster rates (Parasuraman, 1980; Näätänen et al, 1981; Teder et al, 1993). Performance differences in the present results were not significant across ISI, however, suggesting that task difficulty increased with increasing stimulus rates for participants, countering the potential detection advantages that can occur during rapid tone presentation (Alain & Woods, 1993).
Data from four patients are reported here. Analysis will focus on changes due to attention in ERPs formed by averaging individual ECoG epochs recorded in response to standard tones presented contralateral to grid hemisphere. The electrode exhibiting the maximal negative ERP peak in the time-window 70–120 msec post-stimulus at 800 msec ISI is defined as the modal electrode at which ERPs from all ISIs will be analyzed for each patient. In accordance with a previous report using this methodology (Neelon et al, 2006), this peak is labeled N90stg based on its polarity, average latency and anatomical location. The subsequent positivity recorded from the same electrode occurring between 121–220 msec post-stimulus is likewise labeled the P170stg. Figure 1 presents, for the four patients, schematics of electrode array locations superimposed on lateral surface MRIs, contour plots of the averaged response recorded between 70–120 msec post-stimulus superimposed on the electrode arrays, and sites of the maximal N90stg recorded for the three ISIs.
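The modal-electrode criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `modal_electrode`, the dictionary layout, and the toy voltage values are hypothetical and do not come from the recordings reported here.

```python
def modal_electrode(erps, times_ms, t_min=70.0, t_max=120.0):
    """Return (channel, peak_value, peak_latency_ms) for the channel whose
    averaged ERP reaches the most negative value inside [t_min, t_max].

    erps: dict mapping channel number -> list of averaged ERP samples
    times_ms: list of sample times (msec post-stimulus), same length as each ERP
    """
    best = None
    for channel, erp in erps.items():
        for t, v in zip(times_ms, erp):
            if t_min <= t <= t_max and (best is None or v < best[1]):
                best = (channel, v, t)
    return best


# Hypothetical averaged ERPs (arbitrary units) for three channels.
times_ms = [60, 70, 80, 90, 100, 110, 120, 130]
erps = {
    49: [0, -5, -20, -42, -30, -10, -2, 5],
    50: [0, -3, -12, -25, -18, -6, 0, 4],
    57: [-70, -10, -8, -6, -4, -2, 0, 3],  # large negativity, but outside the window
}
print(modal_electrode(erps, times_ms))  # (49, -42, 90)
```

Channel 57 has the largest negativity overall, but it falls at 60 msec and is therefore ignored by the 70–120 msec criterion.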
For 3 of the 4 patients, the location of the maximal N90stg electrode position moves inferiorly along the STG as ISI lengthens. For patients 6 and 9 (P6, P9), this movement is from electrodes just superior to the Sylvian fissure (SF) for the 400 msec ISI N90stg to electrodes just inferior to the SF for longer ISIs. For P15, the 2000 msec ISI N90stg electrode is more inferior and anterior along the middle STG than those for the 400 or 800 msec ISIs (P15 also exhibited another, nearly equivalent maximum for the 400 msec ISI N90stg at electrode 54). The lack of a maximal 400 msec ISI N90stg superior to the SF for P15 may be due to this patient’s relatively antero-inferior electrode array placement.
For consistency, only ERPs from the modal, 800ms ISI N90stg electrode will be used to compare effects of attention across the different ISIs. This criterion results in comparing ERPs recorded from electrodes 49, 63, 63 and 7 for patients P6, P9, P15 and P42, respectively, all of which lie along the upper bank of the STG. (Recordings for the first session at 2000ms ISI for P9 exhibited amplifier blocking at electrode 64, which was thus removed from further analysis.) Figure 2 presents for each patient the ERPs for each ISI (400 – solid, 800 – dashed, 2000 – dotted) and each attention condition (attend toward (AT) tone series in ear contralateral to grid –black; attend away (AA) to series in ipsilateral ear –grey) recorded from these electrodes. Table 2 presents the peak latencies and mean peak values (average of ERP in 20 msec time-window centered on peaks) of the corresponding N90stg and P170stg deflections, where single and double asterisks denote values which differ significantly due to attention condition at p ≤ 0.05 and ≤ 0.01, respectively, using a bootstrap sampling procedure.
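The "mean peak value" used in Table 2 (the average of the ERP in a 20 msec window centered on the peak) can be illustrated as follows. This is a minimal sketch, not the analysis code used for the reported data: the function name, the synthetic triangular waveform, and the 5 msec sample spacing are assumptions made only for the example.

```python
def mean_peak_value(erp, times_ms, search_ms, polarity, half_width_ms=10.0):
    """Return (peak_latency_ms, mean_value) for one ERP deflection.

    polarity < 0 looks for the most negative sample inside search_ms,
    polarity > 0 for the most positive one. The mean is taken over all
    samples within half_width_ms of the peak (20 msec total by default).
    """
    lo, hi = search_ms
    in_search = [(v, t) for t, v in zip(times_ms, erp) if lo <= t <= hi]
    if not in_search:
        raise ValueError("search window contains no samples")
    _, peak_t = min(in_search) if polarity < 0 else max(in_search)
    window = [v for t, v in zip(times_ms, erp) if abs(t - peak_t) <= half_width_ms]
    return peak_t, sum(window) / len(window)


# Synthetic ERP: negative triangle peaking at 90 msec, positive one at 170 msec.
times_ms = list(range(0, 301, 5))
erp = [-max(0, 40 - abs(t - 90)) + max(0, 30 - abs(t - 170)) for t in times_ms]

print(mean_peak_value(erp, times_ms, (70, 120), polarity=-1))   # (90, -34.0)
print(mean_peak_value(erp, times_ms, (121, 220), polarity=+1))  # (170, 24.0)
```

Averaging over a short window rather than taking the single extreme sample makes the measure less sensitive to noise in any one sample.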
Figure 3 presents, for each subject at each ISI: the minimum N90stg (A) and maximum P170stg (B) mean peak values across the two attention conditions from Table 2 (patient numbers), the difference between these values (AT – AA; black lines), and the group means (grey dashed lines). Separate 2×3 ANOVAs were performed on these N90stg and P170stg data, with the mean peak and difference values as levels of the first factor and the three ISIs as levels of the second factor. Separate analyses were performed because a single, 3-factor ANOVA would have obscured main effects or lower-level interactions by averaging across the sign differences between the two components. (All reported p-values are based on Greenhouse-Geisser adjusted degrees of freedom.)
The analyses reveal that overall ERP mean peak values increase significantly with increasing ISIs (N90stg ISI main effect: F = 6.62, p ≤ 0.03; P170stg ISI marginal main effect: F = 5.75, p ≤ 0.06). There is also a significant interaction between ISI and the mean peak vs. difference values for both components (N90stg values × ISI: F = 15.88, p ≤ 0.03; P170stg values × ISI: F = 6.55, p ≤ 0.05). This interaction can generally be seen in both panels of Figure 3, for which there is a trend of decreasing difference values with increasing mean peak values. The trend is clearer, however, for the N90stg than the P170stg peak, and diverges between the two peaks at the 2000 msec ISI level. These interactions indicate that even as overall ERP level declines with decreasing ISI, the differences between the mean peak values of the two attention conditions (i.e., the effect of attention) tend to increase. As a result, the attentional effect expressed as a percentage of peak ERP values grows even larger with decreasing ISI (see Table 3).
Another measure of the overall effect of attention as a function of ISI can be obtained from the grand-average ERP, formed from the entire pool of data epochs recorded from each patient’s modal N90stg electrode. Figure 4 presents the grand-average ERPs from all patient epochs for each ISI (400 – solid, 800 – dashed, 2000 – dotted) and each attention condition (AT – black, AA – grey). Table 3 presents the peak latencies and mean peak values of the corresponding N90stg and P170stg deflections in the grand-average ERP, where asterisks denote significant differences using the bootstrap sampling procedure as per Table 2. Table 3 also includes a measure of the “attention effect”, calculated as the percentage ratio of the mean peak difference (AT – AA) relative to the minimum (N90stg) or maximum (P170stg) mean peak value, for each component at each ISI.
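As a worked illustration of the percentage "attention effect", the sketch below computes the ratio of the AT – AA difference to the minimum (N90stg) or maximum (P170stg) mean peak value. The function name and the example amplitudes are hypothetical and are not taken from Table 3.

```python
def attention_effect_percent(at_peak, aa_peak, polarity):
    """Percentage ratio of (AT - AA) to the larger-magnitude mean peak.

    polarity < 0 (N90stg): reference is the minimum of the two mean peaks.
    polarity > 0 (P170stg): reference is the maximum of the two mean peaks.
    """
    diff = at_peak - aa_peak
    reference = min(at_peak, aa_peak) if polarity < 0 else max(at_peak, aa_peak)
    return 100.0 * diff / reference


# Hypothetical mean peak values in arbitrary units.
print(attention_effect_percent(-50.0, -40.0, polarity=-1))  # 20.0
print(attention_effect_percent(30.0, 24.0, polarity=+1))    # 20.0
print(attention_effect_percent(20.0, 25.0, polarity=+1))    # -20.0
```

With this sign convention, a more negative N90stg or a more positive P170stg under attention yields a positive percentage, whereas a P170stg that shifts negatively under attention (the PN-like pattern) yields a negative one.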
Confirming the trends seen individually in Table 2, the effect of attention on both intracranial components increases with decreasing ISI, though again the differences in P170stg due to attention are not as consistent as those for the N90stg. This latter phenomenon is chiefly because P42’s attention effect corresponds more to a PN, in which the later positive peak is more negative when actively attending toward the input. The remaining patients, on the other hand, show a positive increase in the P170stg peak in the AT versus the AA conditions. These opposing trends work to mitigate significant effects of attention for the P170stg in the grand-average ERP at longer ISIs.
The results of this study confirm previously reported intracranial findings that the presentation of brief tones produces a strong, multiphasic response in the human perisylvian area (Celesia, 1976; Richer et al., 1989; Liégeois-Chauvel et al., 1994; Halgren et al., 1995; Howard et al., 2000; Rosburg et al., 2004; Edwards et al., 2005). Lateral surface recordings from the temporal lobe in the present patients exhibited the largest negative peak in the averaged ERP waveform over the upper bank of the STG or the SF, with the maximal recording site more likely to occur over the SF as ISI decreases. This peak and the following large positive peak have been labeled the N90stg and P170stg, respectively, based on polarity, latency, and peak anatomical location (Neelon et al., 2006). The major auditory sub-component of the scalp-recorded N1, labeled the N1b, is believed to arise generally from the STP (Näätänen & Picton, 1987; Liégeois-Chauvel et al., 1994; Alcaini et al., 1995; Picton et al., 1999; Godey et al., 2001). The present results suggest that the lateral grids used here may be measuring some portion of an exogenous auditory generator in posterior STG/SF contributing to the scalp-recorded N1/P2 complex (see also Richer et al., 1989, and Edwards et al., 2005, for similar findings based on depth and lateral recordings, respectively). The generally earlier latencies of the N90stg relative to those of the scalp-recorded N1 also likely reflect this sub-component nature. EEG recordings, unlike MEG or intracranial recordings, also summate responses from sources outside auditory cortex and from both hemispheres, which are likely to further delay scalp ERP peaks.
The primary goal of this work was to investigate how manipulating attentional load, via changes in stimulus presentation rate, would affect intracranial ERPs as measured by the N90stg and P170stg deflections. The most apparent result was that slower presentation rates (i.e., larger ISIs) produced much larger N90stg/P170stg peaks, and overall averaged ERPs, than did faster presentation rates. This effect of ISI on peak magnitudes accords well with the conclusions drawn from the electrophysiological literature of a refractory period for the underlying component sources (Näätänen, 1992; Rosburg et al., 2004). While selective attention had little effect on the individual latency values of either peak, it did significantly increase the magnitude of either the N90stg or P170stg (sometimes both) for all patients for at least one ISI presentation rate. This effect most often took the form of increasing the negativity or positivity of the N90stg or P170stg peaks, respectively, when patients attended toward the stimulus in the target ear (contralateral to grid location) compared to when they attended toward the non-target (ipsilateral) ear input.
These effects also became larger at faster stimulus presentation rates, suggesting that the modulating effects of selective attention strengthen as load is increased through greater task difficulty. This increasing effect of attention can be seen even more clearly in the grand average ERP, which was formed from all the individual epochs recorded from each patient’s modal, 800-msec ISI N90stg electrode. Even as overall mean peak levels of the ERPs declined with increasing stimulus rate, the attentional modulation of the N90stg and P170stg mean peaks increased. Measured as the percentage ratio of the mean peak differences relative to the mean peak values, the magnitude of this “attention effect” grew consistently with increasing attentional load (see rightmost column Table 3).
Though the N90stg and P170stg are strongly correlated in response to acoustic input, the slight differences in effect strength between the two peaks for individual patients suggest that the two components may arise from distinct generators which respond somewhat differently to the influence of selective attention. In both individual patients (Table 2) and the grand-average percent “attention effect” (Table 3), the N90stg peak exhibited a clear increase in amount of attentional modulation with faster stimulus rates. Though the magnitude of the grand-average P170stg attention effect followed a similar increasing trend with decreasing ISI, this tendency was not as clear across individual patients. Table 2 data superficially suggest that the effect of attention on the P170stg peak may be more independent of ISI, and by extension, attentional load. This may indicate that less attentional effort is required to modify the P170stg relative to the N90stg, or that the electrodes across the lateral surface of the STG are more sensitive to any such changes in the generator producing the P170stg versus that producing the N90stg. This latter view would be especially true if the two peaks arise from different underlying neural generators. A similar source distinction has also been proposed for the scalp-recorded N1 and P2 components (Knight et al., 1980, 1988; McCarley et al., 1991; Godey et al., 2001; Crowley & Colrain, 2004), to which the N90stg and P170stg likely contribute.
Nonetheless, only one patient, P42, exhibited an overall increased negativity of the averaged ERP due to attention, which may be more characteristic of an endogenous attentional component such as the PN. For the other three patients and in the overall grand average ERP, however, the significantly amplified positive values of the P170stg at higher attentional loads (i.e., short ISIs), coupled with the simultaneously amplified negative N90stg peaks, support a generally enhancing function of selective attention as has been reported in other auditory and visual attention studies (Hillyard et al., 1998; Kastner & Ungerleider, 2000; Maunsell & Cook, 2002; Petkov et al., 2004; Shomstein & Yantis, 2004). This result, along with the location of the maximal peaks over the SF and upper STG, suggests that attention can amplify exogenous responses in auditory cortex.
We previously reported an inverse correlation between performance in the deviant detection task and the amount by which attention increased the N90stg peak, such that the attentional enhancement of the N90stg weakened as subjects performed better (Neelon et al., 2006). That trend is not evident here, as attention effects increase as ISI decreases, relative to overall ERP levels, despite a slight (though non-significant) improvement in performance. The previous finding was calculated only for data collected at the 800 msec ISI, however, and it was noted that practice effects over multiple sessions at the same ISI may have played a role in the decreasing effects of attention with increasing performance (Weissman et al., 2002). Repeating the task under the same ISI conditions may have led to a greater level of automaticity and less cognitive effort, which could have led to increased performance even as the effects of attention declined. Previous studies measuring deviant detection performance at varying ISIs have also reported improved detection at faster stimulus presentation rates (Parasuraman, 1980; Näätänen et al., 1981; Alain & Woods, 1993; Teder et al., 1993). The fact that our subjects did not perform significantly better at faster rates (see Hansen & Hillyard, 1984, for similar results) suggests that the increased presentation speed may have added other response pressures to counteract the possible detection advantages of rapid tone presentation.
Better task performance should be expected under conditions of greater attentional focus, ceteris paribus. The present results suggest a complicated interaction of potentially increased focus at faster ISIs leading to increased attentional effects, while simultaneously preserving performance levels even as the task becomes ostensibly more difficult. In support of the hypothesis that longer ISIs require less attentional focus, at least one patient reported that responding to deviants at longer ISIs was more difficult than expected because long time gaps between targets caught this listener off-guard. This would suggest that the relatively low performance in the task at 2000 ms ISI reflects a lack of attentional focus, and would then also underlie the generally equivalent AT and AA peaks at this ISI.
It should be noted that even the largest percentage attention effect reported here (e.g., Table 3, 400 msec ISI) is still generally smaller than the percentage increase reported for the scalp N1 effect, where the attended N1 peak may be several times greater than the corresponding peak in the unattended condition (see Näätänen, 1992, for a review of such effects). This difference in size of attentional effect between the intracranial and scalp ERPs may be due to several reasons. First, due to a hardware limitation, all data were high-pass filtered near 2 Hz (see Experimental Procedures). This filtering should attenuate the contribution to the present results of slow-wave effects (e.g., PN) often reported in EEG attention studies. Also, the scalp N1 effect is most likely a product of several distinct neural sources, all of which may contribute to the relatively large reported differences due to attention. Intracranial recordings, though superior to scalp EEG montages for measuring local neural generators of ERP components, exhibit a poor ability to measure distant generators which together may all contribute to the scalp N1 component. Electrode placements in intracranial studies are determined by clinical need and therefore may not always be located in areas also adding to the scalp N1 effect. Hence, the present subdural grid locations may only have recorded one such source whose contribution to the total N1 effect is limited.
Another mitigating issue is that there appears to be a trade-off, as a function of ISI, between the clarity and strength of the ERP waveform and any observed attention effects. For short ISIs, the effect of attention on the ERPs may be strongest, but the more variable ERP waveforms simultaneously hinder finding statistically significant differences in individual patients. In a related fashion, patients may also have adopted different listening strategies at the different ISIs which may have prevented measuring stronger attentional effects. As noted earlier, some patients may not have exerted much attentional focus during the task at presentation rates of 2000 ms ISI. On the other hand, rapidly presented stimuli could create an auditory streaming effect in which patients might only have to detect deviant “pop-outs” from the tone stream (Alain & Woods, 1993; Bregman et al., 2000; Fishman et al., 2004). Though patients would be under more time pressure to respond quickly at short ISIs, the increased streaming effect may facilitate deviant detection at fast stimulus rates, thus potentially weakening attentional effects. Such differences in listening strategies may also explain why the average deviant detection percent correct performances did not differ more greatly across ISI. More data needs to be collected under these varying conditions to determine if the trends reported here are reliable across future listeners.
Subjects were volunteers recruited from a patient population undergoing diagnostic and surgical procedures for medically intractable epilepsy. Protocols were approved by the University of Wisconsin - Madison and Middleton Veteran’s Affairs Hospital institutional review boards and all patients provided informed consent. Electrodes were placed intracranially according to the clinical need of each patient. Electrodes were implanted in the subdural space and leads were tunnelled transcutaneously. ECoG signals were split so clinical recording proceeded unimpeded. After implantation, each patient was returned to the Epilepsy Monitoring ward, where they remained for a period of one to two weeks. Patients were fully awake during testing and able to follow complex instructions. Subjects (2 males and 2 females) ranged in age from 32–42 years (median = 38) and in composite IQ from 80–132 (median = 102.5). Prior to surgery, individual patients were on one or two of the following anticonvulsants: levetiracetam, phenytoin, carbamazepine, lamotrigine, phenobarbital, or topiramate. During monitoring, patients were weaned off their medications for clinical purposes. Seizure onset locations for the four patients were generally recorded from sites in the anterior lateral temporal lobe, inferior lateral temporal lobe, or ventral temporal lobe. Seizure activity was not predominantly located over regions of interest in the current experiment (i.e., postero-superior STG/SF).
The subdural electrode array consisted of a grid of 64 platinum-iridium contacts encased in a silicone matrix. Intercontact distance was five millimeters (center-to-center) and contact diameter of each circular electrode was 2.3 millimeters. The electrodes were connected to an array of four 16-channel, electrically isolated pre-amplifiers (RA16PA, Tucker-Davis Technologies, Alachua, FL), which in turn were connected to four signal processing modules (RA-16). During data acquisition, ECoG signals were sampled at a rate of approximately 6kHz, bandpass filtered between 1.6 and 100Hz, notch-filtered at 60Hz, then down-sampled by a factor of 4 (P6–P11) or 6 (P15, P42) before saving to hard disk for later analysis. Recordings were made with reference to a ground tied to a single electrode placed either on the contralateral scalp (P6 & P9) or in the extra-cranial sub-galeal space over the ipsilateral parietal or frontal lobe (P15 & P42). Because reference electrodes were extracranial and significantly distant from the cortical electrodes, the recordings reported here are considered monopolar. This was confirmed by comparing ERPs directly averaged from each electrode recording with those averaged after sample-by-sample subtraction of a common-average reference formed from the grand mean across all remaining channels (Crone et al., 2001). No significant differences were observed between the two types of averaged ERPs, or, more importantly, in the values and latencies of the N90stg and P170stg peaks. These results indicate that the references used here were inactive relative to the signal levels recorded intracranially.
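The common-average reference check described above can be sketched as follows. This is an illustration only: it assumes the reference is the sample-by-sample mean over all accepted channels, which is one reading of "all remaining channels", and the function name and toy values are hypothetical.

```python
def common_average_reference(data):
    """Subtract the sample-by-sample mean across channels from every channel.

    data: dict mapping channel -> list of samples (all lists the same length)
    Returns a new dict with the re-referenced signals.
    """
    channels = list(data)
    n_samples = len(data[channels[0]])
    if any(len(data[ch]) != n_samples for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    car = [sum(data[ch][i] for ch in channels) / len(channels)
           for i in range(n_samples)]
    return {ch: [x - c for x, c in zip(data[ch], car)] for ch in channels}


# Toy recording: three channels, two samples each (arbitrary units).
raw = {1: [1.0, 2.0], 2: [3.0, 4.0], 3: [5.0, 9.0]}
print(common_average_reference(raw))
# {1: [-2.0, -3.0], 2: [0.0, -1.0], 3: [2.0, 4.0]}
```

If the original reference electrode were electrically active, ERPs averaged from the raw and re-referenced signals would differ; the comparison reported above found no such difference.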
The anatomic location of the implanted electrode grid was schematized by co-locating intraoperative photographs taken during initial surgical placement and/or resection onto a 3D, surface rendered magnetic resonance image (MRI) of each patient obtained pre-operatively. In two patients (P6 and P9), no pre-operative MRI was available, so photographs for these patients were localized onto a template MRI from a different patient (not analyzed here). For P15 and P42, electrode grid location was determined by co-registering the patient’s pre-implant MRI with a post-implant CT scan (Stealth Station, Medtronic Sofamor Danek, Memphis, TN, USA). Grids were found to cover at least a portion of the superior temporal gyrus in all patients.
A dichotic auditory oddball paradigm was used to investigate selective auditory attention in these experiments. A single experimental trial was defined as the dichotic presentation of a series of 200 tone bursts (30 msec duration; 5 msec rise/fall time; 100 in each ear) over ER-6 insert earphones (Etymotic, Elk Grove Village, IL) at nominal levels of 85 to 95 dB SPL. These levels were used to ensure clear audibility over the incidental noises of the patient’s hospital room. Depending upon experimental condition, mean ISI was set to either 400, 800 or 2000 msec, while actual onset times were randomly jittered from the mean ISI based on either a normal distribution with a standard deviation of 100 msec (P6 & P9) or a uniform distribution with a range of 300 msec (P15 & P42).
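The two jitter schemes can be illustrated with a short sketch. Several details are assumptions made for the example rather than facts from the procedure: the ISI is treated as onset-to-onset, the "range of 300 msec" is read as ±150 msec around the mean, and a 30 msec floor (the tone duration) is imposed so that a rare extreme normal draw cannot produce overlapping tones. The function name is hypothetical.

```python
import random


def jittered_onsets(n_tones, mean_isi_ms, jitter="normal",
                    sd_ms=100.0, range_ms=300.0, min_isi_ms=30.0, seed=None):
    """Return a list of tone onset times (msec), starting at 0."""
    rng = random.Random(seed)
    onsets, t = [], 0.0
    for _ in range(n_tones):
        onsets.append(t)
        if jitter == "normal":
            isi = rng.gauss(mean_isi_ms, sd_ms)
        elif jitter == "uniform":
            isi = rng.uniform(mean_isi_ms - range_ms / 2, mean_isi_ms + range_ms / 2)
        else:
            raise ValueError("jitter must be 'normal' or 'uniform'")
        t += max(isi, min_isi_ms)  # assumed floor, see lead-in
    return onsets


onsets = jittered_onsets(200, 400.0, jitter="uniform", seed=1)
isis = [b - a for a, b in zip(onsets, onsets[1:])]
print(len(onsets), round(min(isis)), round(max(isis)))
```

Jittering the onsets keeps listeners from anticipating the exact arrival time of each tone while preserving the intended mean presentation rate.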
Attention was manipulated by requiring the subjects to attend to the tone series in a specified ear while ignoring the tone series presented concurrently in the other ear. Recordings made when subjects listened to the ear contralateral to the hemisphere of electrode location were labeled attend-toward (AT) conditions; likewise, recordings made while listening to the ipsilateral ear were labeled attend-away (AA). Directional terms were used to label these two attention conditions rather than “attend/ignore” in order to emphasize that any effects of attention direction were not due to general effects of arousal in auditory cortex, which could occur when comparing such conditions to passive listening or intermodal attention trials (Woods et al., 1992). Electrode arrays were placed on the left hemisphere in all patients except for P15, who was implanted on the right hemisphere.
The subject was asked to detect rare deviations in the frequency of the attended tones by pressing a hand-held response button. The number of deviant tones was targeted to comprise 10% of the stimulus sequence on average, while “standard” tones constituted the remainder (actual number was randomly determined at the start of each session). Frequencies of standard tones were 1500 Hz (right ear) and 2300 Hz (left ear), and frequencies of deviants were set at the start of testing for each patient to 5% (P6, P9) or 10% (P15, P42) greater than the corresponding standard for each ear. P15 and P42 consistently achieved 100% performance at these starting levels, so deviants were adjusted in later sessions to 5% (P15) and 3% (P42) greater than the standard frequencies in an attempt to equate performance across subjects. For P15 and P42, only data from sessions with the smaller pitch difference deviants were analyzed in this report.
A single experimental block consisted of one AT trial and one AA trial, with the order of attention direction randomized within single testing blocks. Each trial presented 200 tones which were later segmented into individual recording epochs for averaging purposes (see below). Order of tone presentation was pseudo-randomized within a single trial, with the restriction that two deviants could not occur in a row; for a single experimental block, however, presentation order of stimuli was held constant across both attention direction trials. All patients were briefly exposed to the dichotic listening task prior to data collection. Multiple blocks of data were often recorded on separate days for each patient (range: 1–8 days after implant surgery). Data for the present analyses were taken from the following numbers of individual epochs recorded for each patient at each ISI (400/800/2000 msec): P6 – 148/316/235 epochs; P9 – 79/237/161; P15 – 101/200/50; and P42 – 160/160/39. Data averaging procedures weighted results equally across patients, rather than by number of epochs.
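One way to satisfy the no-consecutive-deviants restriction is sketched below. It treats the tones as a single stream and places each deviant in a distinct gap between standards; how the original stimulus software enforced the restriction, and whether it applied within or across ears, is not stated, so this is an illustrative assumption. The function name `oddball_sequence` is hypothetical.

```python
import random


def oddball_sequence(n_tones, n_deviants, rng):
    """Return a list of 'S' (standard) and 'D' (deviant) with no two 'D' adjacent."""
    n_std = n_tones - n_deviants
    if n_deviants > n_std + 1:
        raise ValueError("too many deviants to keep them apart")
    # There are n_std + 1 gaps around the standards; each chosen gap gets one deviant,
    # so at least one standard always separates any two deviants.
    gaps = set(rng.sample(range(n_std + 1), n_deviants))
    seq = []
    for g in range(n_std + 1):
        if g in gaps:
            seq.append("D")
        if g < n_std:
            seq.append("S")
    return seq


# Example: 200 tones with 10% deviants; the same order would be reused for the
# AT and AA trials of one block, as described in the text.
block_order = oddball_sequence(200, 20, random.Random(0))
```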
Performance in the attention task was assessed using percent correct (p(C)) and d-prime (Woldorff & Hillyard, 1991; Teder et al., 1993). A correct response (Hit) was any response occurring within 1.5 sec after a deviant was presented in the attended ear; responses occurring at any other time were labeled false alarms (FAs; multiple FAs between deviants were ignored to prevent negative d-prime estimates). The hit rate (equivalent to p(C)) was the number of correct responses divided by the number of deviants; the false alarm rate was the number of false alarms divided by the number of standard tones for the attended channel. For trials where either the hit rate was 100% or the FA rate was 0%, these measures were adjusted in calculating d-prime to avoid infinite values (see Macmillan & Creelman, 1991, for details).
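The d-prime computation can be sketched as follows. The text does not say which correction was applied to hit rates of 100% or FA rates of 0%; the sketch assumes the common substitution of 1/(2N) for a rate of 0 and 1 − 1/(2N) for a rate of 1, which is only one of several conventions. The function name `d_prime` is hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, n_deviants, false_alarms, n_standards):
    """Return (hit_rate, fa_rate, d') for one attended channel."""
    hit_rate = hits / n_deviants          # equivalent to p(C) in the text
    fa_rate = false_alarms / n_standards
    # Assumed correction: keep both rates off 0 and 1 so that z-scores stay finite.
    adj_hit = min(max(hit_rate, 1 / (2 * n_deviants)), 1 - 1 / (2 * n_deviants))
    adj_fa = min(max(fa_rate, 1 / (2 * n_standards)), 1 - 1 / (2 * n_standards))
    z = NormalDist().inv_cdf              # inverse of the standard normal CDF
    return hit_rate, fa_rate, z(adj_hit) - z(adj_fa)


# Example: perfect detection with no false alarms still yields a finite d'.
rates_and_dprime = d_prime(hits=10, n_deviants=10, false_alarms=0, n_standards=90)
```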
Only ERPs formed from responses to standard tones presented contralateral to grid hemisphere were used in the following analyses (i.e., right ear standard stimuli for all patients but P15). Data recorded from all channels were considered valid and incorporated in the following results unless voltage values indicated amplifier blocking. Data from entire trials were also rejected if there was extreme noise in the ECoG or there were no responses to any deviants for either an AT or AA trial (which may arise due to equipment failure, patient alertness, etc.). Due to inherent features of the TDT recording hardware used in the present experiment, all signals were high-pass filtered during recording at a cutoff frequency near 1.6 Hz (3-dB cutoff between 1.6 and 2.6 Hz). Entire ECoG data streams for accepted channels were bandpass filtered again off-line between 2–35 Hz (−3 dB cutoff points) using a phase-corrected FIR filter in order to alleviate potential delays normally caused by low- and band-pass filters. This added filtering may attenuate slow-wave attention-related negativities such as the PN; however, it should also help to isolate exogenous responses occurring in the alpha-band region, such as the N1 peak, which are of interest to the goals of this work. Individual ECoG epochs were isolated according to the recorded onset time of each tone burst, and began 50 msec prior to stimulus onset and continued for 300 msec post-stimulus. ERPs were then formed by averaging all epochs of the same stimulus/channel type.
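The epoching and averaging steps can be sketched as below for a single, already filtered channel. The sampling rate is not given in this section, so `fs_hz` is a free parameter; the function names and the decision to skip epochs that run past the ends of the recording are illustrative assumptions.

```python
def extract_epochs(signal, onset_samples, fs_hz, pre_ms=50.0, post_ms=300.0):
    """Cut epochs from 50 msec before to 300 msec after each tone onset."""
    pre = int(round(pre_ms * fs_hz / 1000.0))
    post = int(round(post_ms * fs_hz / 1000.0))
    epochs = []
    for onset in onset_samples:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue  # assumed handling: drop epochs that are not fully recorded
        epochs.append(signal[start:stop])
    return epochs


def average_erp(epochs):
    """Average equal-length epochs sample by sample to form the ERP."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


# Example with a toy ramp signal sampled at an assumed 1000 Hz.
toy_signal = [float(i) for i in range(1000)]
toy_erp = average_erp(extract_epochs(toy_signal, [100, 500], fs_hz=1000.0))
```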
Using a non-parametric bootstrap statistical test (Efron & Tibshirani, 1993), the effects of attention direction were assessed for each subject and for each ISI separately by comparing the mean peak voltages from the AT and AA averaged ERPs taken in 20 msec time-windows centered on the maximal negative and positive peaks observed between 70–120 and 121–220 msec post-stimulus, respectively (these peaks were subsequently labeled N90stg and P170stg). For data recorded for each ISI separately, all epochs from the modal N90stg electrode from the two attention conditions (including those recorded over multiple days) were first combined into one master data matrix for each patient. 10,000 AT and AA bootstrap epochs (bAT, bAA) were formed by drawing randomly with replacement from this combined collection of AT and AA epochs. Selecting with equal likelihood from the combined set of recorded epochs approximated the null-hypothesis that no difference exists between the observed AT and AA conditions (Di Nocera & Ferlazzo, 2000). Bootstrap ERPs were then formed by averaging across the bAT and bAA datasets, and the bootstrap peak means and latencies for the major negative and positive peaks during the aforementioned time-window were computed in the same manner as for the observed data. Finally, the p-values of the observed negative and positive peaks for the original AT and AA ERPs were computed by finding the number of bootstrap mean component values greater or less than (2-tailed statistical test) the observed mean component values, divided by the total number of bootstrap samples (10,000).
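A minimal sketch of this per-subject test follows. It reads the procedure as drawing, on each of the bootstrap iterations, as many epochs as were observed in each condition from the pooled set, and it uses the AT minus AA difference in windowed peak means as the test statistic; both readings are assumptions, as are all function names. Epochs are plain lists of voltages that share the time axis `times_ms`.

```python
import random


def average(epochs):
    """Sample-by-sample mean of equal-length epochs."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def peak_window_mean(erp, times_ms, search, polarity, half_width_ms=10.0):
    """Mean voltage in a 20 msec window centred on the extreme peak within `search`."""
    candidates = [i for i, t in enumerate(times_ms) if search[0] <= t <= search[1]]
    pick = min if polarity == "neg" else max
    peak = pick(candidates, key=lambda i: erp[i])
    window = [v for v, t in zip(erp, times_ms) if abs(t - times_ms[peak]) <= half_width_ms]
    return sum(window) / len(window)


def bootstrap_attention_p(at_epochs, aa_epochs, times_ms, search, polarity, n_boot, rng):
    """Return (observed AT-AA difference, two-tailed bootstrap p-value)."""

    def stat(a, b):
        return (peak_window_mean(average(a), times_ms, search, polarity)
                - peak_window_mean(average(b), times_ms, search, polarity))

    observed = stat(at_epochs, aa_epochs)
    pooled = at_epochs + aa_epochs          # null hypothesis: condition labels are arbitrary
    extreme = 0
    for _ in range(n_boot):
        b_at = rng.choices(pooled, k=len(at_epochs))   # drawn with replacement
        b_aa = rng.choices(pooled, k=len(aa_epochs))
        if abs(stat(b_at, b_aa)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_boot
```

For the N90stg peak the call would use `search=(70, 120)` and `polarity="neg"`, and for the P170stg peak `search=(121, 220)` and `polarity="pos"`, with `n_boot=10000` to match the text.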
The following procedure was performed separately on data recorded for each ISI. Data from all AA and AT epochs from the maximal N90stg electrodes of all patients were combined into two grand AT and AA datasets. The bootstrap procedure as described above would require drawing randomly with uniform probability from these combined epoch pools to form a null-hypothesis bootstrap N90/P170stg difference distribution. The observed grand average N1 difference, taken by simply averaging the total epochs in each pool and subtracting them from each other, would then be compared to this null-hypothesis distribution. However, this would over-represent data from those patients who were able to participate in many experimental sessions (e.g., P6 and P9 2000 msec ISI data), and result in weighting the bootstrap and observed grand average ERPs by epochs rather than by patient.
An alternative method was used here in which each null-hypothesis bootstrap grand average ERP was created by first forming a list equal in number to the total number of pooled epochs (for each ISI independently) of the 4 patient numbers selected with uniform probability; then, for each subject number in that list, individual epochs were drawn with uniform probability from that subject’s subset of AA and AT epochs. This process created 2 pools of null-hypothesis bootstrapped “AA” and “AT” epochs in which each subject’s epochs were equally represented. The procedure was repeated 10,000 times until the null-hypothesis bootstrap N90/P170stg difference distribution was formed. Finally, the observed grand average ERP was found by averaging the ERPs from the maximal N90stg electrode separately averaged for each block for each patient. This method weighted the averaging and bootstrap formation processes equally across subjects rather than by number of epochs.
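The patient-balanced resampling step can be sketched as below for one bootstrap iteration. Each draw first picks a patient with uniform probability and then picks one epoch from that patient's combined AA and AT epochs, so a patient with few epochs is represented as often as one with many. Splitting the draws into pools of the observed AT and AA sizes is an assumption, and the function name `balanced_null_pools` is hypothetical.

```python
import random


def balanced_null_pools(epochs_by_patient, n_at, n_aa, rng):
    """Return null-hypothesis 'AT' and 'AA' pools weighted equally across patients.

    epochs_by_patient maps a patient label to that patient's combined AT and AA epochs.
    """
    patients = sorted(epochs_by_patient)

    def draw(n):
        pool = []
        for _ in range(n):
            pid = rng.choice(patients)                       # uniform over patients
            pool.append(rng.choice(epochs_by_patient[pid]))  # uniform within patient
        return pool

    return draw(n_at), draw(n_aa)
```

Averaging each returned pool and taking the difference of the windowed peak means gives one value of the null-hypothesis difference distribution; repeating the draw 10,000 times builds the full distribution.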
Tests were assessed at critical significance levels of pcrit ≤ 0.05 (single asterisk) and pcrit ≤ 0.01 (double asterisk). Significant effects of attention on the N90stg peak are reflected in observed p-values less than pcrit, which indicate that the negative peak of the AT waveform was more negative than the corresponding AA peak because very few bootstrapped component values were less than the observed component value (and vice versa for the p-value of the P170stg peak). The effects of attention on component latencies were also assessed in the same manner by forming bootstrap distributions of the latencies of the peak deflections in the previously defined time windows.
Funding for this research was provided in part by NIH/NIDCD 5K23DC006415 (PCG). The authors would like to thank Prakash Khanikar, Lisa Rhuelow, and Sue Busta for their assistance in organizing, collecting and analyzing portions of the data reported in this paper. Finally, this work would not have been possible without the generosity of the patients who volunteered to participate in these experiments during their clinical treatment.