displays the location of electrodes on the 2D MRI images for both patients while Talairach coordinates are provided in . In Patient 1, the electrodes in the left hemisphere were on the posterior-medial border between Heschl's gyrus and Planum Temporale. The electrodes in the right hemisphere were in the posterior medial portion of Heschl's gyrus, most probably in primary auditory cortex as indicated by probabilistic histological mapping [Rademacher et al., 2001]. In Patient 2, the electrode in the left hemisphere was in the anterior lateral part of Heschl's gyrus and the electrode in the right was in the anterior medial part of Heschl's gyrus. These regions are slightly anterior to where primary auditory cortex is typically located [Rademacher et al., 2001].
First, we assessed to what extent the recorded unit activity was evoked by our stimulus (see Methods). We recorded from a total of 30 cells (20 single units and 10 multi-units), of which 25 were responsive to the movie soundtrack, with an average inter-run correlation of 0.25 ± 0.14 (mean ± S.D.; ).
The first question we addressed was which temporal resolution best captures the sensory-evoked responses, given the naturalistic nature of our stimuli. To that end, we calculated for each neuron the correlation between two repeats of the same stimulus under different levels of temporal smoothing. This was done separately for the normal, double, and quadruple speeds. shows the results for the entire population of responsive neurons. We found a highly consistent result for all three presentation speeds tested: the correlation reached a plateau at bin sizes of ~500 ms (for the complete set of spiking time courses see Supporting Information Figs. S4–S5). At the population level, responses seemed to be phase-locked to the stimulus envelope across all stimulus modulation rates used in our experiment (Supporting Information Figs. S4–S5). The Pearson correlation coefficient (r value) between the smoothed population spiking activity and stimulus envelope was 0.68 (df = 1,080, P ), 0.66 (df = 540, P ), and 0.68 (df = 270, P ) for the normal, double, and quadruple speeds, respectively. For individual neurons, the correlations were 0.19 ± 0.11, 0.21 ± 0.11, and 0.23 ± 0.12 (mean ± S.D.) and significant at P level. In addition, we calculated the latency between population spiking activity and the stimulus soundwave envelope. To that end, we computed the time lag at which the cross-correlation between the minimally smoothed population spiking activity and the stimulus soundwave was maximal (see Methods). The latencies and correlations across the different modulation rates were 40 ms (r = 0.17), 41 ms (r = 0.18), and 40 ms (r = 0.21) for the normal, double, and quadruple speeds, respectively.
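The two computations described above (inter-run correlation as a function of smoothing, and the lag of maximal cross-correlation) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code. It assumes spike trains stored at 1 ms resolution and implements smoothing as summation into non-overlapping bins; the actual smoothing and cross-correlation procedures are those given in Methods. All function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bin_counts(train, bin_ms):
    """Sum a 1-ms-resolution train into non-overlapping bins (assumed smoothing)."""
    usable = len(train) - len(train) % bin_ms
    return [sum(train[i:i + bin_ms]) for i in range(0, usable, bin_ms)]

def inter_run_correlation(run_a, run_b, bin_ms):
    """Correlation between two repeats of a stimulus at a given bin size."""
    return pearson(bin_counts(run_a, bin_ms), bin_counts(run_b, bin_ms))

def peak_lag(response, stimulus, max_lag):
    """Lag (in samples) at which the response best matches the earlier stimulus."""
    best_lag, best_r = 0, -2.0
    for lag in range(max_lag + 1):
        r = pearson(response[lag:], stimulus[:len(stimulus) - lag])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Synthetic demo: two runs share a slowly varying firing probability,
# but their individual spikes are drawn independently.
random.seed(0)
n_ms = 60_000
rate = [0.005 + 0.004 * math.sin(2 * math.pi * t / 4000) for t in range(n_ms)]
run_a = [1 if random.random() < p else 0 for p in rate]
run_b = [1 if random.random() < p else 0 for p in rate]
for bin_ms in (10, 50, 100, 500, 1000):
    print(bin_ms, round(inter_run_correlation(run_a, run_b, bin_ms), 3))
```

In this toy setting, coarser bins average out the independent spike-timing noise while preserving the shared slow modulation, which is why the inter-run correlation would be expected to grow with bin size.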
In addition to the spiking activity, we tested what aspect of the LFP signals was evoked by our stimulus. We first filtered the LFP signals into six different frequency bands (1–4 Hz, 4–8 Hz, 8–16 Hz, 16–32 Hz, 32–64 Hz, and 64–128 Hz) and extracted the power changes in each frequency band by rectifying the filtered signals. We then assessed the level of correlation across the two normal speed runs (). As shown in the graph, only LFPs at the high gamma band frequencies (64–128 Hz) displayed significant correlation across the two normal speed runs (r = 0.26 ± 0.10, P ; for complete set of gamma band LFP time courses see Supporting Information Figs. S6–S7). Next, similar to the spiking activity, we examined how the level of correlation between runs depends on the bin size of temporal smoothing (). As was the case for spiking activity, sensory-evoked correlations increased with increasing temporal smoothing and reached a plateau around 500 ms. Consequently, we used 500 ms as our temporal bin size for further analysis of both spikes and the power changes in the high gamma band LFPs. The correlation between the soundwave envelope and high gamma LFP power modulations was r = 0.52, 0.55, and 0.49 for the normal, double, and quadruple speeds. The latencies and correlations between the minimally smoothed LFPs and soundwave envelope across the modulation rates were 62 ms (r = 0.26), 68 ms (r = 0.28), and 69 ms (r
We next examined whether neuronal firing rates were affected by the modulation rate of the stimulus. Overall, the total number of spikes emitted during the entire stimulus presentation was proportional to the duration of the stimulus (), indicating that firing rates were preserved across modulation rates. Firing rates for the normal, double, and quadruple speeds were 3.46 ± 3.1, 3.45 ± 2.80, and 3.62 ± 3.11 Hz, respectively (mean ± S.D.). These differences were not significant (one-way ANOVA, F(2, 147) = 0.05, P = 0.94), suggesting that firing rate is invariant to stimulus modulation rate.
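As an illustration of the comparison above, the sketch below computes a firing rate as spike count divided by stimulus duration, and a one-way ANOVA F statistic from first principles. The spike counts are invented for the example, and the function names are hypothetical. Note that the reported degrees of freedom, F(2, 147), correspond to three groups and 147 + 3 = 150 observations in total.

```python
def firing_rate_hz(spike_count, duration_s):
    """Mean firing rate over the whole stimulus presentation."""
    return spike_count / duration_s

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical neuron whose spike count scales with stimulus duration,
# so that the rate is the same at every speed.
presentations = {"normal": (1860, 540.0), "double": (930, 270.0), "quadruple": (465, 135.0)}
for speed, (count, duration_s) in presentations.items():
    print(speed, round(firing_rate_hz(count, duration_s), 2), "Hz")
```

In practice a library routine such as `scipy.stats.f_oneway` would also return the P value; the hand-written version is used here only to keep the example dependency-free.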
A trivial explanation for the preservation of firing rates across the different stimulus modulation rates is that during most of the stimulation period the neurons simply did not respond to the stimulus, and hence neuronal firing was dominated by spontaneous activity. To examine this possibility, we checked whether firing rate invariance held consistently throughout the entire experiment. To this end, we first divided the 9-min (4.5-min) spike trains emitted during the normal (double) speed stimulation into shorter blocks of 20 s (10 s). Next, for each neuron and each time segment, we compared the firing rate during normal speed stimulation with the firing rate during the corresponding double speed stimulation (). As shown in the graphs, there was a linear correspondence (with slopes close to 1) between the firing rates at the different speeds regardless of the absolute firing rate level. Most importantly, even during periods of high firing rate at both speeds, presumably evoked by the stimulus, the firing rate remained the same across stimulation speeds, indicating that firing rate invariance was sensory-driven rather than driven by spontaneous activity.
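The block-wise comparison can be sketched as below: spike times are assigned to 20 s blocks for the normal speed and to the corresponding 10 s blocks for the double speed, and an ordinary least-squares slope relates the two sets of block rates. The spike times are synthetic, and the use of least squares for the slope is an assumption made for illustration; the source does not state how the slopes were obtained.

```python
import random

def segment_rates(spike_times_s, total_s, block_s):
    """Firing rate (Hz) in consecutive non-overlapping blocks of block_s seconds."""
    n_blocks = int(total_s // block_s)
    counts = [0] * n_blocks
    for t in spike_times_s:
        idx = int(t // block_s)
        if 0 <= idx < n_blocks:
            counts[idx] += 1
    return [c / block_s for c in counts]

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Synthetic example: the double-speed run is modelled as the normal-speed run
# compressed in time, with each spike retained with probability 0.5.
random.seed(1)
normal = sorted(random.uniform(0, 540) for _ in range(1800))
double = [t / 2 for t in normal if random.random() < 0.5]
rates_normal = segment_rates(normal, 540, 20)   # 27 blocks of 20 s
rates_double = segment_rates(double, 270, 10)   # 27 matching blocks of 10 s
print(len(rates_normal), len(rates_double))
print(round(ols_slope(rates_normal, rates_double), 2))
```

A slope near 1 across blocks with both low and high rates is the pattern the text describes as firing rate invariance.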
While high firing rates imply evoked activity, a more demanding test for such sensory-evoked responses is the degree of reproducibility across repeated stimulation. We therefore performed a second analysis as follows. First, the spike trains of each neuron evoked during normal (double) speed stimulation were divided into 20 s (10 s) segments. For each time segment we calculated (a) the degree of correlation between the two runs of the normal speed stimulation, and (b) the firing rate ratio across speeds (average firing rate during double speed stimulation divided by average firing rate during normal speed stimulation). If the firing rate invariance across speeds was merely due to noise, then we would expect it to depend on the degree of reproducibility, i.e., that during time segments with reproducible large fluctuations in the sensory-driven responses (high correlation values across the two normal speed runs), the firing rate ratio across speeds would be different from 1. Alternatively, if the firing rate during sensory-evoked responses was invariant to stimulus modulation rate, we should expect to see similar firing rates across speeds (ratio of 1) during highly reproducible time segments. displays the degree of firing rate invariance as a function of the reproducibility across repeated presentations. The results indicate that firing rate invariance was maintained even for highly reproducible responses, providing further support for the notion that the linear reduction in spike count across stimulation speeds was not due to spontaneous activity.
Next, we compared the distribution of inter-spike intervals (ISIs) across the different speeds (). An increase in the proportion of spikes with short ISIs during faster stimulation would imply an increase in the evoked instantaneous firing rates. The overall distribution of ISIs across stimulus modulation rates was largely similar. Moreover, neurons tended to fire in bursts, and very short ISIs (between 4 and 6 ms) were most frequent. Even when focusing on these very short ISIs we found no significant effect of speed (one-way ANOVA on the percent of spikes with ISI between 4 and 6 ms, F(2, 147) = 0.24, P = 0.78). In addition, we computed the autocorrelation of the population spike trains recorded during the different speeds (). Evoked spike trains of similar duration should result in autocorrelation functions of similar width, while a decrease in the duration of evoked activity should result in a narrower autocorrelation function. We found a clear decrease in the width of the autocorrelation function as a function of stimulus modulation rate, suggesting that the duration of evoked responses matched the duration of the stimulus.
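The ISI measures above can be illustrated with a short sketch. It assumes spike times in milliseconds, normalizes by the total number of spikes (as stated in the Figure 4 caption), and treats the 4 to 6 ms range as inclusive at both ends, which is an assumption. The function names and the example spike times are hypothetical.

```python
def inter_spike_intervals(spike_times_ms):
    """Differences between consecutive spike times (ms)."""
    ordered = sorted(spike_times_ms)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def isi_fraction(spike_times_ms, lo_ms, hi_ms):
    """Fraction of spikes whose ISI lies in [lo_ms, hi_ms], per total spike count."""
    isis = inter_spike_intervals(spike_times_ms)
    in_range = sum(1 for d in isis if lo_ms <= d <= hi_ms)
    return in_range / len(spike_times_ms)

def isi_histogram(spike_times_ms, bin_ms, max_ms):
    """ISI histogram with bins [0, bin_ms), [bin_ms, 2*bin_ms), ..., normalized by spike count."""
    n_bins = int(max_ms // bin_ms)
    hist = [0] * n_bins
    for d in inter_spike_intervals(spike_times_ms):
        idx = int(d // bin_ms)
        if idx < n_bins:
            hist[idx] += 1
    return [h / len(spike_times_ms) for h in hist]

# Made-up spike train with two short bursts.
spikes = [0, 5, 10, 100, 104, 300]
print(inter_spike_intervals(spikes))
print(isi_fraction(spikes, 4, 6))
```

Computing `isi_fraction` per neuron and per stimulus repetition at each speed would give the per-group values that enter the one-way ANOVA reported in the text.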
Figure 4. Effect of stimulus rate on spike train parameters. (A) For each neuron and each stimulus repetition we computed the distribution of inter-spike intervals (ISIs) normalized by the total number of spikes. The graph represents the average distribution of ...
To further explore the notion that the duration of neural responses matched the duration of the auditory stimuli, we directly examined whether the responses to faster stimulus modulation rates could be modeled as the responses to normal speed stimulation “condensed” in time. To this end, the double speed spike trains were first “stretched” in time so that they were the same length as the normal speed spike trains (i.e., from 4:30 min to 9:00 min). This procedure simply duplicated in time each millisecond time bin (containing 0 or 1, signifying spike occurrence) of the double speed spike train. Thus, both the duration and the spike count of the original double speed spike train were doubled. Next, the spike trains were smoothed with a 500 ms square bin (see above, and Methods) and the correlation between the smoothed normal speed and smoothed “stretched” double speed spike trains was computed.
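The stretch procedure can be sketched as below: each 1 ms bin is duplicated, which doubles both the length and the spike count, and the stretched train is then binned and correlated with the normal speed train. The 500 ms smoothing is implemented here as summation into non-overlapping bins, which is an assumption; function names are hypothetical and the data are synthetic.

```python
import math

def stretch(train):
    """Duplicate every 1-ms bin, doubling both duration and spike count."""
    return [b for b in train for _ in range(2)]

def bin_counts(train, bin_ms):
    """Sum a 1-ms-resolution train into non-overlapping bins (assumed smoothing)."""
    usable = len(train) - len(train) % bin_ms
    return [sum(train[i:i + bin_ms]) for i in range(0, usable, bin_ms)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def stretched_correlation(normal_train, fast_train, bin_ms=500, times=1):
    """Stretch fast_train `times` times, then correlate the binned trains."""
    stretched = fast_train
    for _ in range(times):
        stretched = stretch(stretched)
    n = min(len(normal_train), len(stretched))  # guard against off-by-one lengths
    return pearson(bin_counts(normal_train[:n], bin_ms),
                   bin_counts(stretched[:n], bin_ms))

# Tiny example of the bin duplication itself.
print(stretch([0, 1, 0]))  # → [0, 0, 1, 1, 0, 0]
```

For the normal versus quadruple speed comparison, the text applies the procedure twice, which corresponds to `times=2` in this sketch.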
depicts the smoothed population activity of the normal speed superimposed with the “stretched” population activity of the double speed. The graph presents a time segment of 120 s, while full-length time courses are provided in Supporting Information Figure 8. As can be seen, there is a tight correspondence between the two signals (r = 0.66, df = 1,080, P < 0.01; right panel). The data of the double speed and quadruple speed were compared in a similar manner and also exhibited high correspondence (r = 0.68, df = 540, P < 0.01). In addition, we compared the normal and quadruple speeds by applying the stretch procedure twice to the quadruple speed spike train (r = 0.59, df = 270, P < 0.01). In addition to examining population activity, we also conducted the above analysis on the spike trains of individual neurons ( left panel). The correlations of individual neurons were substantially reduced (although still highly significant) compared to the population responses ( right panel). Similarly, the high gamma LFP power modulations also exhibited strong correlation between the normal speed and double speed time courses after applying the same stretch procedure (). The full-length high gamma LFP power modulation time courses for the entire experiment are provided in Supporting Information Fig. 9. The correlation and latency between the population spiking activity and the high gamma LFP power modulation were r = 0.52 and 6 ms for the normal speed, r = 0.56 and 9 ms for the double speed, and r = 0.6 and 6 ms for the quadruple speed (LFPs preceding the spikes).
Finally, since neurons in auditory cortex and superior temporal gyrus have been shown to have audio-visual interactions [e.g., Ghazanfar et al., 2005; Kayser et al., 2008; Reale et al., 2007], we wanted to see whether the visual component of our stimulus could have driven our results. Therefore, we recorded neural activity from these electrodes during movie presentation either with or without sound (i.e., audio-visual or visual-only stimulation, respectively; see Methods). We recorded from a total of 34 cells in four sessions (Patient 1, three sessions; Patient 2, one session; Supporting Information Table S1). Critically, while 25/30 cells displayed significantly reproducible responses upon consecutive presentations of the audio-visual stimulus (see above), no cells displayed reproducibility between their activity during audio-visual and visual-only stimulation (r = 0.03 ± 0.03; see also Supporting Information Table S1 and Methods). This lack of reproducibility indicates that the recorded cells respond differently to audio-visual stimulation and visual-only stimulation. However, it does not rule out the possibility that these cells respond in a consistent manner to the visual content of the stimulus and in a consistent, albeit different, manner when the visual content is presented simultaneously with the auditory content of the stimulus. To examine this possibility, we recorded two repetitions of the visual-only stimulation (as opposed to one repetition) in two sessions. The reproducibility between the first and second visual-only stimulation was again extremely weak (r = 0.03 ± 0.04, N = 10 cells; Supporting Information Table S1). Supporting Information Figure S10 displays the population spiking activity time courses during audio-visual and visual-only stimulation during one session.
Figure 5. Stretch analysis. (A) Average temporal profile of spikes (N = 25 cells) during normal speed presentation (blue traces) superimposed with the temporal profile during double speed presentation stretched in time (red traces; see Stretch analysis in Methods).
Finally, we tested whether the LFP power exhibited consistent responses between audio-visual and visual-only stimulation. Although high gamma band LFP responses were found to be highly reproducible across repeated audio-visual stimulations (see above), correlations between audio-visual and visual-only stimulations were near zero for all frequencies including the high gamma band (Supporting Information Figs. S11–S12). Overall, both spiking activity and high gamma band LFP power indicate that, in the regions we recorded from, responses were predominantly evoked by the auditory content of our stimuli.